Control mechanisms in Coq

We have seen which tactics can be used to transform expressions in Coq, but I didn't go into as much detail as I could have about how these tactics can be controlled. For instance, many tactics offer ways to select the subterm(s) to which they apply, and these control mechanisms are usually shared between different tactics.

This post introduces all these mechanisms; along the way, the classification from the previous posts will be enriched with this related information.

1. How transformations are applied

Up to this point I mostly focused on what sort of transformation the tactics do, but not on how they actually reach their results. In other words, I stuck to the functional aspects of these transformations. In this section, we are going to see how tactics in the same category may differ.

1.1. Laziness and eagerness

I already mentioned laziness and eagerness in the first classification I gave, as well as in my introductory post about evaluation strategies. Laziness and eagerness determine when an argument is evaluated: either as soon as it is first encountered (eagerly), or only once we are sure that its value is required (lazily). In some contexts, the latter is much more efficient (although it comes with a small overhead).

These properties are chiefly associated with partial or total evaluation transformations: after all, they are an important characteristic of the strategies used to pick the next reduction step among the ones available at a given point. For instance, here is a lazy reduction sequence, where arguments are substituted without being evaluated first:

  (fun x y z => x + z + z) 2 (<something complex>) (1 + 2 + 3)
= (fun y z => 2 + z + z) (<something complex>) (1 + 2 + 3)
= (fun z => 2 + z + z) (1 + 2 + 3)
= 2 + (1 + 2 + 3) + (1 + 2 + 3)

Actually, what the above example showed is call-by-name, a form of lazy evaluation where the value of an input may be computed several times, as was the case for (1 + 2 + 3). For functions with complex inputs, this can lead to serious performance issues! Therefore, actual lazy strategies are usually based on call-by-need, where the value of each input is computed once, if and only if it is actually required. It looks a bit like this:

  (fun x y z => x + z + z) 2 (<something complex>) (1 + 2 + 3)
= let x := 2 in (fun y z => x + z + z) (<something complex>) (1 + 2 + 3)
= let x := 2 in let y := (<something complex>) in (fun z => x + z + z) (1 + 2 + 3)
= let x := 2 in let y := (<something complex>) in let z := 1 + 2 + 3 in x + z + z
= let y := (<something complex>) in let z := 1 + 2 + 3 in 2 + z + z
= let y := (<something complex>) in let z := 6 in 2 + z + z
= let y := (<something complex>) in 2 + 6 + 6
= let y := (<something complex>) in 14
= 14

Most strategies in Coq are eager. In fact, the only lazy tactic is, well, lazy — and so is the reduction strategy employed by the kernel (which is not a tactic!).
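
To see the difference concretely, here is a small experiment (a sketch: expensive stands for any closed term that is costly to normalize, and the timings are merely illustrative). Both tactics close the goal in the same way, but cbv evaluates the discarded argument while lazy never touches it:

Definition expensive := 100 * 100 * 100. (* costly to evaluate on unary naturals *)

Goal (fun (x y : nat) => x) 1 expensive = 1.
  Time lazy. (* fast: the argument bound to y is dropped without being evaluated *)
  reflexivity.
Qed.

Goal (fun (x y : nat) => x) 1 expensive = 1.
  Time cbv.  (* slower: the argument is evaluated eagerly, then discarded *)
  reflexivity.
Qed.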

1.2. Acceleration

Most Coq strategies run "inside of Coq". That is, they directly manipulate Coq terms without stepping through complex intermediate representations.

The two notable exceptions are vm_compute and native_compute. The former works by converting terms to a form which is then executed by a virtual machine written in C (of course, the result also has to be converted back to Coq). The latter converts the term to OCaml code which is then compiled and run.

These tactics compute very efficiently but come with some overhead: terms have to be converted between representations, and native_compute additionally pays for compilation. native_compute is the faster of the two once running, but its overhead is also larger.
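
As a rough experiment (a sketch: timings depend on the machine, and Coq may warn about large nat literals, which is harmless here):

Goal 1000 * 1000 = 1000000.
  Time cbv.        (* noticeably slower on unary naturals *)
Restart.
  Time vm_compute. (* much faster, despite the conversion overhead *)
  reflexivity.
Qed.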

1.3. Impact on the proof term

Calls to Coq tactics build a proof term. When Qed is then called, the kernel verifies that this term has the type corresponding to the property being proved, as per Curry-Howard. For instance, a successful call to reflexivity replaces the hole in the proof term which corresponds to the current goal with eq_refl (with the appropriate implicit parameters). But what about expression transformations?

Let's first turn our attention to evaluation-based tactics. To a first approximation, these tactics are useless: they take a term to something equivalent to it. Still, they guide the reduction process. The way Coq keeps track of their effect is by inserting an explicit conversion in the proof term. Concretely:

Set Printing Implicit.
Goal 2 + 3 = 5.
  Show Proof. (* (?Goal) *)
  cbv.
  Show Proof. (* (?Goal : 2 + 3 = 5) *)
  reflexivity.
  Show Proof. (* (@eq_refl nat 5 : 2 + 3 = 5) *)
Qed.

However, I mentioned that vm_compute and native_compute are more efficient than e.g. cbv. This matters because the conversions these tactics perform happen not once but twice: once during the call to the tactic, and again during the typechecking of the generated proof term (at Qed time).

Now, if the accelerated tactics were to leave ordinary, non-accelerated conversions in the proof term, calls to Qed would end up taking a problematic amount of time. Consequently, the kernel also handles accelerated conversions:

Set Printing Implicit.
Goal 2 + 3 = 5.
  Show Proof. (* (?Goal) *)
  vm_compute.
  Show Proof. (* (?Goal <: 2 + 3 = 5) *)
  reflexivity.
  Show Proof. (* (@eq_refl nat 5 <: 2 + 3 = 5) *)
Qed.

native_compute leaves a similar mark, denoted as <<:.

This comes at a cost which is paid in trust: the Coq kernel now contains a compiler and a virtual machine. So much for minimality. The efficiency of these tactics makes the tradeoff acceptable, and it is always possible to use the very restrictive program coqchk to check proofs while avoiding these facilities.

The information left in the proof term is rather limited. Consequently, it may be that you were careful to prove something in a smart way which steers clear of performance traps, only for the proof term checker to fall head first into them because not enough information was remembered. Luckily, this should not happen with e.g. your carefully picked cbv flags, due to the way the default conversion used by the kernel works. Read the code below to get a sense of how the reduction process works. In it, we do one proof, and then another one which is very close to the first, yet has a very different performance profile.

Definition slow_comp (n : nat) := <something which takes n seconds to compute 0>.
Definition slow := slow_comp 5.
Axiom sc_5_0 : slow_comp 5 = 0.
Axiom s_0 : slow = 0.

(* Example A *)
Goal slow = 0.
  change_no_check (slow_comp 5 = 0).
  apply sc_5_0.
  Show Proof. (* sc_5_0 *)
Qed. (* Almost instantaneous *)

(* Example B *)
Goal slow_comp 5 = 0.
  change_no_check (slow = 0).
  apply s_0.
  Show Proof. (* s_0 *)
Qed. (* ~10s *)

I am not 100% certain about the finer details, but my interpretation of what happens is as follows:

Also note how such conversions applied to very large terms can lead to huge proof terms — a very large term and another large term (equal to the first after some simplification) both appear in the proof; with 50 such passes, there are 51 large terms (TODO sharing probably mitigates this issue — is it really a problem in practice?).

Not all tactics use conversions to leave a mark in the proof term. For instance, rewrite uses eq_ind_r, a function which, given a proof of y = x, turns a proof of P x into a proof of P y:

Axiom foo : 2 + 2 = 5.
Goal 2 + 2 = 5.
  Show Proof. (* ?Goal *)
  rewrite foo. (* 5 = 5 *)
  Show Proof. (* (eq_ind_r (fun n : nat => n = 5) ?Goal foo) *)
  reflexivity. (* All Goals Completed. *)
  Show Proof. (* (eq_ind_r (fun n : nat => n = 5) eq_refl foo) *)
Qed.

In summary, the tactics behave like this:

However, this is only true when the tactics are applied to the goal! When targeting hypotheses, almost all expression transformation tactics are traceless! In fact, the only exceptions are rewrite and subst (I didn't check every single tactic, but I believe they are it). This has some nasty consequences:

Lemma t : slow = slow_comp 20 -> slow = 0.
Proof.
  intro. (* Fast *)
  vm_compute slow_comp in H. (* Fast (accelerated) *)
  rewrite H. (* Fast *)
  reflexivity. (* Fast *)
Qed. (* Slow *)

With rewrite, changes to hypotheses are dealt with by introducing let bindings that shadow the argument which is the Curry-Howard analog of the hypothesis. I don't know why the same isn't done for e.g. conversions — I don't see why it wouldn't work, and it would make the performance of Coq much more predictable.

Also, it's perfectly fine for change never to leave a trace in the term: this tactic searches for a proof of the replacement with performance similar to what the kernel does at Qed time anyway (TODO exactly the same? what about laziness?).
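
For instance (a quick check; the displayed proof term may vary slightly between Coq versions):

Goal 2 + 3 = 5.
  change (5 = 5).
  Show Proof. (* ?Goal, with no cast recorded *)
  reflexivity.
Qed.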

2. Where transformations are applied

Most tactics offer a way to target subterms, hypotheses, … In this section, I catalogue the targeting methods available to tactics.

2.1. Occurrences

If you were to design a tactic language for Coq, how would you enable the user to refer to subterms? Asking for an explicit path in the syntax tree works, sure, but it is really not pleasant to deal with, and it gets worse the deeper the targeted subterm is. Coq's answer is based on the notion of occurrences: give a pattern, and Coq will either associate a number to each matching subterm or apply your tactic to all matches.

Reference occurrences

Some Coq tactics let you refer to occurrences of some definitions:

Goal 3 + (1 - 1) = 2 - (1 + 1).
  vm_compute plus. (* 3 = 2 - 2 *)

In this example, the - in the subterm to the left of the = sign was also simplified, but the one in the right subterm was not. This is because vm_compute identified all occurrences of + and fully evaluated the corresponding applications, including their subterms.

Reference occurrences also understand notations:

Goal 3 + (1 - 1) = 5 - (1 + 1).
  vm_compute "+". (* 3 = 5 - 2 *)

Pattern occurrences

Pattern occurrences work by identifying subterms matching a given pattern in a term:

Goal 3 + (1 - 1) = 5 - (1 + 1).
  vm_compute (3 + _). (* 3 = 5 - (1 + 1) *)

Or even:

Goal 3 + (1 - 1) = 5 - (1 + 1).
  vm_compute (_ 1). (* 3 + 0 = 5 - 2 *)
  (* This matches any application of a function to the argument 1 *)

And (more complex):

Goal (5 - 1) + (1 - 1) = (1 + 4) - (1 + 1).
  vm_compute (_ 1). (* 4 + 0 = (fun m : nat => S m) 4 - 2 *)

As you can see, unlike reference occurrences, pattern occurrences matching a function only simplify the function itself, not the full application. In particular, they leave the arguments passed to the function unchanged:

Goal plus 1 2 = 3.
  vm_compute (plus).
  (* (fix Ffix (x x0 : nat) {struct x} : nat :=
        match x with
        | 0 => x0
        | S H => S (Ffix H x0)
        end) 1 2 = 3 *)

Also note that the universal pattern matches every subterm (and therefore leads to a result no different from a simple call to vm_compute).

Goal plus 1 2 = 3.
  vm_compute (_). (* 3 = 3 *)

The language for expressing patterns only includes the wildcard symbol _. This is not sufficient to express every sort of pattern: for instance, you cannot express the non-linear pattern f ?a ?a, which would require both arguments to be the same term.

Occurrence numbers

Occurrence numbers are a targeting mechanism where subterms are designated by a number which depends on the position of a match of some query in a term. For instance:

Goal ((2 + (2 + 2)) + (2 + 2)) = 10.
  vm_compute plus at 3. (* (2 + 4) + (2 + 2) = 10 *)

The occurrence number is the rank of the match in a depth-first, left-to-right traversal of the syntax tree of the term.
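
To make this concrete, here is the same goal annotated with the numbers the traversal assigns (and another occurrence picked out):

Goal ((2 + (2 + 2)) + (2 + 2)) = 10.
  (* Occurrences of plus, in depth-first order:
       1: (2 + (2 + 2)) + (2 + 2)   -- the whole left-hand side
       2: 2 + (2 + 2)
       3: 2 + 2                     -- nested inside occurrence 2
       4: 2 + 2                     -- the right-hand summand *)
  vm_compute plus at 4. (* (2 + (2 + 2)) + 4 = 10 *)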

Multiple occurrences in the same term

It is possible to indicate a group of occurrences at once. There are two ways of doing this. First, the positive one:

Goal 3 + (1 + 1) = 2 + (2 + 1).
  vm_compute "+" at 2 4. (* 3 + 2 = 2 + 3 *)
  (* The tactic is applied at each given position *)

And then the negative one:

Goal 3 + (1 + 1) = 2 + (2 + 1).
  vm_compute "+" at - 1 3. (* 3 + 2 = 2 + 3 *)
  (* The tactic is applied at all but the given positions *)

Occurrences in different terms

Up to this point, we only saw occurrences applied to the goal, but they can also target hypotheses:

Goal 1 + 1 = 2 + 1 -> 2 + 2 = 5.
  intros.
  vm_compute "+" in H at 2. (* H becomes 1 + 1 = 3 *)

Multiple hypotheses can be targeted at once:

Goal 2 + 3 = 4 + 1 -> (1 + 3) - (1 + 1) = 2 + 1 -> 2 + 2 = 5.
  intros.
  vm_compute "+" in H at 2, H0 at 1 3.
  (* H becomes 2 + 3 = 5, H0 becomes 4 - (1 + 1) = 3 *)

All hypotheses:

Goal 2 + 3 = 4 + 1 -> (1 + 3) - (1 + 1) = 2 + 1 -> 2 + 2 = 5.
  intros.
  vm_compute plus in * |-.
  (* Fully reduces every hypothesis, but leaves the goal untouched *)

Some hypotheses and goal:

Goal 2 + 3 = 4 + 1 -> 1 + 1 = 2 + (1 + 1) -> 2 + 2 = 5 + (4 - 1).
  intros.
  vm_compute plus in H at 1, H0 at -2 |- * at 1.
  (* H : 5 = 4 + 1, H0 : 2 = 2 + 2, goal : 4 = 5 + (4 - 1) *)

Simple occurrences

For some reason, some tactics accept occurrences but not occurrence numbers ("simple occurrences" — that is, you can pass a pattern, but you can't focus on occurrence x or y). This is not enforced at the syntactic level: you won't get a parsing error when passing an at clause to such a tactic, but you will get an error message telling you that the tactic doesn't take at clauses.

2.2. Other methods for bunched applications

Besides mundane occurrences, there are some mechanisms for bunched applications that are exclusive to some tactics.

unfold

Definition a : nat := 2.
Definition b : nat := 2.
Goal a = b.
  unfold a, b. (* 2 = 2 *)

In fact, unfold takes a list of reference occurrences as arguments, so it also works with notations such as "+".
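
For instance (the exact shape of the printed fixpoint depends on the Coq version):

Goal 1 + 1 = 2.
  unfold "+".
  (* (fix add (n m : nat) {struct n} : nat :=
        match n with
        | 0 => m
        | S p => S (add p m)
        end) 1 1 = 2 *)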

fold

fold works a bit similarly, but without commas between arguments (life wouldn't be much fun if everything was predictable).

Definition a : nat := 2.
Definition b : nat := 3.
Goal 2 = 3.
  fold b a. (* a = b *)
Restart.
  fold a b. (* a = S a *)

pattern

pattern takes commas:

Axiom a : nat.
Goal 2 = a.
  pattern 2, a. (* (fun n n0 : nat => n = n0) 2 a *)

TODO rewrite/replace

rewrite takes a comma-separated list of the form (-> | <-)? (<n>)? (? | !)? <rewrite>. These are called oriented rewriters. They are applied in the order of the list; if one of them fails, then so does the whole tactic. The first component of an oriented rewriter is the orientation — simple enough. Then comes an optional natural number indicating how many rewrites to perform, whose meaning depends on the following component: <n>! (or just <n>) means exactly <n> rewrites, while <n>? means at most <n>, where performing 0 rewrites is not considered a failure. Without a number, ? means rewrite as many times as possible (possibly zero) and ! means as many times as possible but at least once; with neither, exactly one rewrite is performed. Note that only the last component, the rewrite itself, is required.
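
To illustrate my reading of these rules (a sketch; plus_0 is a local axiom introduced just for the example, and per the above, 2!plus_0 should perform exactly two rewrites):

Axiom plus_0 : forall n, n + 0 = n.

Goal (1 + 0) + (2 + 0) = 1 + 2.
  rewrite 2!plus_0. (* exactly two rewrites: first 1 + 0, then 2 + 0 *)
  reflexivity.
Qed.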

TODO finish. Can't get the natural number to behave like I expect it to. Present rewrite *, impact of occurrences on success

subst

subst doesn't take commas:

Goal forall a b, a = 2 -> b = 2 -> a = b.
  intros.
  subst a b. (* 2 = 2 *)

2.3. Opacity

There are situations where you know that unfolding some definition will generally lead to trouble in a given context. The thing is, Coq tactics can be complex beasts which do a lot of things implicitly, and it would take a lot of care to pass the appropriate flags to ensure that no tactic ever starts unfolding said definition. Wouldn't it be nicer to simply mark the term as not to be unfolded in the current context? This is precisely what opacity is about.

The Opaque command marks some definitions as not to be unfolded by tactics:

Definition a := 2.
Opaque a.
Goal a = 2.
  cbv [a]. (* no change: a = 2 *)
  unfold a. (* fails with "a is opaque." *)

The opposite of Opaque is Transparent.

Definition a := 2.
Opaque a.
Transparent a.
Goal a = 2.
  cbv [a]. (* 2 = 2 *)

Some tactics do not respect opacity:

Definition a := 2.
Opaque a.
Goal a = 2.
  vm_compute. (* 2 = 2 *)

Another context which features opacity is calls to Qed. Closing a proof with Qed makes the proof permanently opaque (it cannot be made transparent again with Transparent). For most proofs, that is perfectly fine, but when it is not, the proof should be closed with Defined instead, which keeps everything transparent.
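
For instance (a sketch: two constants defined interactively, one closed with Defined and one with Qed):

Definition two_t : nat. Proof. exact 2. Defined.
Lemma two_o : nat. Proof. exact 2. Qed.

Goal two_t = 2.
  reflexivity. (* works: two_t is transparent *)
Qed.

Goal two_o = 2.
  Fail reflexivity. (* conversion is not allowed to unfold two_o *)
Abort.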

2.4. Strategy

The Strategy command (man) is a finer-grained version of Opaque and Transparent: it assigns an unfolding level to constants, which the conversion algorithm uses to decide which of two constants to unfold first. Lower levels are unfolded first; expand behaves like level -∞ (unfold eagerly), ordinary transparent constants sit at level 0, and opaque behaves like +∞ (never unfolded by tactics).
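
A sketch of the syntax (the constants and levels here are arbitrary, just to show the shapes):

Definition cheap := 2.
Definition pricey := 2 * 2.

Strategy expand [cheap].   (* like level -infinity: unfold eagerly *)
Strategy 100 [pricey].     (* unfolded late during conversion *)
Strategy opaque [pricey].  (* or: never unfolded by tactics *)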

2.5. Arguments

The Arguments command (man) comes in handy for controlling when functions are evaluated by tactics. For instance, I have often had issues with functions which blow up when Coq tries to evaluate them while e.g. their second argument is not concrete. In such cases, we would like to evaluate these functions when their second argument is concrete, but not otherwise. A perfect use case for Arguments!

Arguments is also used in other contexts (its most common use case is making some arguments of a function implicit); here, we focus only on its uses related to controlling the unfolding process. Note that this mechanism is only taken into account by simpl and cbn.

Arguments can block the unfolding of a constant, like a worse version of Opaque.

Arguments plus n m : simpl never.
Goal 2 + 3 = 8 - 3.
  cbn. (* 2 + 3 = 5 *)
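
As mentioned above, only simpl and cbn take these directives into account. A quick check (assuming the simpl never directive from the snippet above is still in effect):

Goal 2 + 3 = 8 - 3.
  cbn. (* 2 + 3 = 5 : the directive is respected *)
  cbv. (* 5 = 5 : cbv ignores Arguments directives *)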

It can also tell simpl and cbn to force the unfolding of a function definition once the arguments before the / marker are provided:

Definition foo (a b c : nat) :=
  match a with
  | 0 => 4
  | _ => match b with
         | 0 => 2
         | _ => c
         end
  end.

Goal foo 1 2 = foo 3 4.
  cbn. (* foo 1 2 = foo 3 4 *)
  Arguments foo a b / c.
  cbn. (* (fun c : nat => c) = (fun c : nat => c) *)

In my experience, the most interesting feature of Arguments is its ability to block evaluation:

Axiom a : nat.
Arguments minus n m.
Goal 2 - (a + 2) = 3 - (2 + 1).
  cbn.
  (* match a + 2 with
     | 0 => 2
     | 1 => 1
     | S (S _) => 0
     end = 0 *)
Restart.
  Arguments minus !n !m.
  cbn. (* 2 - (a + 2) = 0 *)

!<arg> means "before unfolding the function, make sure that argument <arg> can be evaluated down to a concrete (constructor-headed) value". It is very useful for avoiding explosions.

Finally, Arguments can be used to block evaluations which would leave a new match construct at the head of the simplified subterm.

Axiom a : nat.
Arguments minus n m.
Goal 2 - (a + 2) = 3 - (2 + 1).
  cbn.
  (* match a + 2 with
     | 0 => 2
     | 1 => 1
     | S (S _) => 0
     end = 0 *)
Restart.
  Arguments minus n m : simpl nomatch.
  cbn. (* 2 - (a + 2) = 0 *)

In this example, it helps us get the exact same behavior as in the previous one which made use of ! instead.

3. Until when transformations are applied — and in which order

3.1. A word on reduction steps

3.2. Restricting the reduction types

3.3. Normal forms

Normal form (NF): no reduction step applies anywhere in the term; it is fully evaluated.

Head normal form (HNF): no reduction step applies in head position; the term is a series of abstractions over an application whose head cannot be reduced further, although the arguments may still contain redexes.

Weak head normal form (WHNF): like HNF, except that reduction does not go under binders; in particular, any abstraction is already in weak head normal form.
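
Coq gives direct access to head reduction (the hnf tactic and Eval hnf); a quick illustration of the difference between a normal form and a head normal form (the exact printed output may vary between versions):

Eval cbv in 2 + 2. (* = 4 : the normal form *)
Eval hnf in 2 + 2. (* = S (1 + 2) : reduction stops once a head constructor appears *)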

4. Recap

In the next episode…

We will make a case for the availability of single reduction steps — and propose a solution.