Introduction

This series of posts discusses expression transformations, with a focus on evaluation-based transformations. Its objectives are twofold.

First, it proposes a classification for such transformations and shows where the tactics offered by Coq fit therein. My hope is that it may help you gain insight into the strengths, weaknesses and quirks of the tactics offered by vanilla Coq, so that it may inform your future tooling choices.

Second, it offers a critique of some shortcomings of the prover. I go on to propose a partial answer to these shortcomings. The changes are discussed and motivated throughout this series.

This is all related to or directly part of my exchange project I did at SYSTEMF, Clément Pit-Claudel's lab at EPFL in roughly the first half of 2023. I found him to be a great advisor and I learned a lot under his supervision.

1. What even are expression transformations?

Expression transformations are functions from expressions to expressions. This is a very general concept — in fact, too much so to be useful. We will restrict our attention to the subset of transformations whose output is equivalent to their input, for some notion of equivalence (usually functional equivalence, but possibly something stronger or weaker — just not so much so that it stops being practical).

When do these transformations crop up in practice? Well, there are many ways of describing some given computation, with some of them being the best fit for some purpose but not a great choice for some others. For instance, a high-level formulation of a program may be easy to grasp but slow to run, and the opposite may be true for its low-level counterpart. Between these two versions of this program, something is preserved: they should both be embodiments of the same abstract program. Going from the high-level form to the low-level one (or the other way around) is a functional equivalence preserving expression transformation.

In fact, you interact with such transformations as part of your everyday programming experience:

Compilation is an expression transformation. It typically takes a program written in a high-level language (e.g. C) to an equivalent program in a lower-level language (e.g. machine language). Here, the notion of equivalence should include both functional correctness and preservation of side-effects. Also, the transformation turns a readable, high-level program into an efficient, machine-executable one. Note that individual optimization passes (constants inlining, loops unrolling, register allocation, dead-code elimination, …) are also expression transformations in their own right.
Evaluation of programs or expressions are transformations which take an input to a value (+ side effects, usually).

From the next section on, consider that "expression transformations" means "equivalence preserving expression transformations", for some notion of equivalence. In fact, all expression transformations are equivalence preserving expression transformations for the most general notion of equivalence (the one which identifies all expressions). Of course, we will stick with actually useful notions of equivalence.

2. Why I care

So, expression transformations are related to compilers and interpreters. However, this series is not about compilers and interpreters, but about the process of compiling and interpreting. This is an important nuance: how do you go from 5 + 3 + 2 to 10? You could assume some arbitrary parenthesizing, say (5 + 3) + 2 and unfold the definition of + following the classical PEMDAS order of operations. Alternatively, you could pick the other parenthesizing, i.e. 5 + (3 + 2)! So there is a notion of choice, and a natural question at this stage is whether or not all such choices (for all operations, not just addition) are equivalent with regards to performance. Can you think about a situation where there is a difference? Spoilers in the next paragraph.

The most familiar example is perhaps short circuiting: true && <something slow to compute> can be reduced instantaneously to true, and in fact commonly is (as in e.g. C, Python or Rust). 0*<something slow to compute> could also be reduced instantaneously to 0, although I don't know a language which does so in practice.

Here's a more complex example: consider the expression if <x> then <y> else <z>. Assume that <x> actually evaluates to true, and that both <y> and <z> evaluate to 5. Here are two ways of showing that whole expression reduces to 5:

First, evaluate <x>. As it is true, we know that we only care about the first branch of the conditional. By hypothesis, this branch evaluates to 5.
If the two branches of a conditional can be reduced to the same value, we don't even have to check the value of the condition expression <x>: the result will be 5 no matter what.

Now, we know what <x>, <y> and <z> evaluate to, but not how efficiently. Assume that <x> evaluates very slowly and the other expressions evaluate very fast. Then, the second method of evaluation is the most efficient. On the other hand, if it is <z> which evaluates very slowly and the rest which evaluates very fast, then the first method of evaluation is the better one!

In the general case, we can devise interesting heuristics to try to evaluate expressions as efficiently as possible, but they are bound to be imperfect. Leaving the user a way of manually controlling the order in which reductions occur is very helpful in some situations, especially in proof assistants.

In fact, thinking back about our first example, we can already find the seed of a similar issue: we said that we unfolded the definition of + according to the order of priority of addition. But that is also a choice! In Coq, the definition of a + is a recursive function. Had we unfolded it in the opposite order we would have ended up using more memory for state than necessary, a bit like losing tail recursivity (we need the value of the innermost additions before we can compute the rest). Just to be clear, we wouldn't be violating the order of operations, functionally speaking. Pedantically, we would go from (5 + 3) + 2 to:

(fix add n m := match n with | 0 => m | S p => S (add p m) end ) (5 + 3) 2

Unlike when we were following the traditional order, we can't immediately simplify this expression: the arguments have not been reduced to values. In this simple example, things don't blow up. But what about adding up 10000 terms this way? This may seem like an artificial remark: of course no sane person would ever mess with the order of evaluation for expressions involving only additions! Good for them, but I personally wouldn't hesitate to violate the classical order of reductions in some contexts, as in Coq, it often is the case that you need your expression to match some very specific form. How would you go about proving forall a, 0 + (a + 5) = a + 5? One easy way would be to unfold the definition of the outermost + and to simplify the result.

Additionally, it should be noted that changing a Coq term to another form by using a tactic such as change may come at a great cost. This is related to the following phenomenon:

Definition x = <something slow to compute>. Definition y = x. (* Example a *) Goal y = x. reflexivity. (* fast! *) Qed. (* Example b *) Goal x = y. reflexivity. (* slow! *) Qed.

The two goals are equivalent as equality is commutative, however the first proof runs much faster. How come? Well, when you call reflexivity, Coq does the following:

1. Check if the head of the terms on both sides of the equal sign are equal.
2. They are in neither of our examples, but both of them are references to terms defined in the environment. Who knows, maybe the left term reduces to the right term? Unfold the definition in the left term.
3. Are both terms equal at this point?

3.example a. Yes. The proof is over.
3.example b. No. But who knows, maybe the left term reduces to the right term?

3.b.1. [Time lost in an useless computation]. No it doesn't, but now we know what the left term looks like when it is fully reduced. Who knows, maybe unfolding the definition in the right term gives us the same thing?
3.b.2. No it doesn't, but it is now another reference, namely x. Who knows, maybe it is this one that unfolds to the left term? Unfold! (at this point, we have lost track of the fact that the left term ever was x)
3.b.3. Still no dice! But who knows, maybe the right term reduces to the left one. Try that!
3.b.4. [Another equivalent and useless slow computation]. Now the left term equals the right one! The proof is over.

When you call change in Coq, you implicitly introduce a conversion in the proof object. After all, the fact that your proposed change is licit has to be checked! And, following the above pattern, change-ing y to x takes a long time compared to its virtually instantaneous x to y counterpart. Being able to control the reductions more finely would allow you to get the performance of example a in example b.

To answer this section's title, I care about expression transformations because they give me a model through which I understand why some Coq computations are needlessly slow and how to circumvent such pain points. The way I like to think about it is that in an expression, you have several locus points at which some basic simplification step can be applied: unfolding a definition/fixpoint/cofixpoint, passing some argument to a function, applying a match, destroying a let, … Woops, did I just say "…"? I meant ".", as this was in fact an exhaustive list of primitive Coq reduction steps, a generating set of Coq expression reductions of sorts (edit: I note that the if example is not covered… It would make sense to add a reduction primitive which returns the single value all branches of a match reduce to, assuming there is one). Applying a reduction removes a possible locus point, but may add others! By default, Coq applies heuristics which work well in most cases. The thing is, I also care about the rest of the cases. As was shown, letting the user apply these reductions by hand is crucial in many situations. Alas it is not possible in vanilla Coq as of this writing. A workaround will be proposed towards the end of this series.

Besides that, Coq still offers plenty of tactics through which more or less specialized forms of expression transformations can be performed. Learning about expression transformations in general will help us get a better grasp on these tactics.

3. Rewriting systems

3.1. Definition, examples and remarks

Expression transformations are commonly studied through the lens of rewriting systems. At the core of such systems stands a set of objects which represent the expressions, alongside a relation showing which rewriting steps can be applied to an expression. The relation should only make it possible to reach equivalent expressions (again, for some appropriate notion of equivalence).

Fig. 1: A first example of a rewriting system in action

The expression 5 + 3 + 2 is getting progressively simplified to the value 10. Arrows represent the rewriting steps allowed in our system starting from a given state. The expressions inside all of the boxes are equal — this notion of equality is the equivalence relation in this case!

Note that the set of objects does not have to be homogeneous: it can comprise elements of different types. In Fig. 1 we had expressions (5+3+2, 5+(3+2), 8+2, …) and values (10). Assuming that the language depicted in this example only supports concrete expressions (i.e. fully reducible ones; in particular, it does not include variables), we should always be able to reduce any expression to a value by following steps authorized by our rewriting system (equivalently, by following arrows in the associated graph).

In this first example, the concrete values are a subset of general expressions. In more complex situations, such as the aforementioned compilation of C programs down to machine language, the set could contain objects of very distinct natures - C programs, machine code, and even intermediate representations. To compile a C program, you could pick its representative and follow arrows until you reach an object corresponding to machine code, going through intermediate representations along the way.

A rewriting system allows you to take steps in the equivalence class of your terms according to the structure it imposes. In particular, you usually can't reach every single point of your equivalence class. Typically, rewriting systems are designed so as to allow only transitions which simplify your term in some sense, i.e. taking it closer and closer to some normal form.

In fact, the way the relation at the core of a rewriting system is structured has some interesting consequences. For instance, a compiler can be automatically derived from a rewriting system if the following conditions are met:

The relation is strictly acyclic (you can't get stuck in a loop).

✗

✗

✓
The relation respects the descending chain condition (you can't make progress forever — eventually, you have to reach a sink state). Put together, these first two conditions entail that the relation in question is well-founded.
The relation is injective (for any object, you can reach at most one object in one step — you never have a choice regarding your next action).

✗

✓
Starting from an object of the input language, the sink reached through repeated application of the relation belongs to the right subset (e.g., if you are compiling a valid C program down to machine language, you should obtain a machine program as output, and certainly not some other C program). Let's call this property well-typedness.

Here's a universal compiler based on such well-behaved rewriting systems:

fun compile (e: expression) (r: relation) := let reachable_objects := get_reachable_one_step e r in if reachable_objects.size() == 0 # Have we reached a sink state already? then e else compile reachable_objects[0] r # There can't be more than one element

In fact, this code works for a slightly less restrictive class of rewriting systems. In particular, we limited ourselves to injective relations, but we can let go of this requirement in certain situations — those in which we could say a bit too casually that "choices don't matter", in the sense that no matter which choice you take at every step, you always end up in the same sink (this is the aptly-named confluence property — this is related to the Church-Rosser theorem).

(Some other criterions can be relaxed to some extent, courtesy of the fact that our starting expressions all live in a specific subset of the objects. For instance, the existence of loops isn't an issue as long as you can't actually reach them from the starting expressions.)

You may have been lead to conflate rewriting systems with the graphs I represented them as. These are merely a convenient representation, often showing only a subset of a rewriting system, when these are in fact defined for whole spaces of states. The spaces in question are generally so large that it wouldn't make much sense to try and precompute them so as to get such graphs. The only thing that really matters is that we are able to obtain the transitions available at some point of the space — in our case, the transformations available for a given expression. In fact we don't even require all transitions to be easily obtainable: for all we know, there could be an infinite amout of transitions — we hope that at least some of them can be found efficiently. Nevertheless, such graphs are a nice tool and a useful mental model in many situations.

Also, most interesting programs are not simple expressions. In particular, they don't tend to be pure. In this case, representing their entire evaluation process as a rewriting system gets trickier: you may for instance want to keep track of some state for side-effects, and nondeterminism may make you run into infinite (or at least very large) sets of allowed transitions. My post on impurity (and Haskell) is tangentially related to this question, as it touches upon the limits what compilation can achieve, with nondeterminism (introduced by external impurity) being the limiting factor.

It should also be noted that not all expression transformations are about reducing expressions. Sometimes you have to make them bigger (e.g. loop unrolling in imperative programs, or unfoldings to match a form expected by a lemma in a proof assistant), or to turn them into seemingly unrelated, but somehow still equivalent expressions (rewrite tactic in Coq).

3.2. Rewriting systems in Coq

Interestingly, we can interpret Coq as a rewriting system at different levels:

at the expressions level (such as a in goal or a hypothesis): Coq expressions can be simplified, replaced with something proved equivalent, … We will only focus on this interpretation as it is the most practical one, but I thought that the other interpretations are fun, for some loose enough definition of fun.
at the proof state level: applying a tactic to a proof state yields a new proof state to play with. The rewriting relation is given by the set of tactics you can successfully apply in given proof state. Caveat: if you assume coherence to be your equivalence relation, then you shouldn't include unsafe tactics such as admit or change_no_check (those which may anger Qed).
globally: Coq maintains a safe environment containing all accessible definitions, verified theorems and whatnot. As long as you don't admit silly theorems or import wonky axioms, we can expect this environment to be correct (it contains only true theorems and well-typed definitions). Again, this doesn't play nice with Admitted and friends.

We will stick to the expressions-level interpretation. Coq offers many tactics, and a lot of them secretly correspond to expression transformations following this interpretation: think about vm_compute, cbn, simpl, rewrite, … However, it is important to understand that not all tactics are about such transformations! Think about left or exfalso, for instance. In their 2010 tutorial "An introduction to small scale reflection in Coq", Georges Gonthier and Assia Mahboubi distinguish three classes of Coq tactics:

rewriting ones, which correspond to expression transformations.
deductive ones, which can replace expressions with something that is not equivalent to them (left, reflexivity, apply, …). These "directly specify part of the construction of the proof [term], either [through forward steps or backwards steps]".
bookkeeping ones, which are about managing the proof context (clear, split, …).

Admittedly, many tactics are not that pure. For instance, it could be argued that destruct matches all three of these categories in certain aspects. Still, it can be viewed as a composite of simpler tactics.

As to the reduction by application of basic reduction steps I mentioned earlier, there is some good news: it is possible obtain the list of locus point in a given term at which our basic reduction steps may be applied, with the additional desirable property that there are no positions in the term at which more than one reduction type applies.

The structure of this series

Part 1: building a classification of expression transformations.
Part 2: showing where Coq's tactics fit in this classification.
Part 3: discussing properties of Coq tactics that are orthogonal to expression transformations.
Part 4: presenting and discussing our atomic reduction tactics proposal and implementation in more details.

In the next episode…

I will propose a classification for different forms of expression transformations. I won't really discuss Coq in that post, but I will have some things to say about evaluation strategies.