Let us consider several ways of implementing the natural-number-predecessor function. Recall the definition from the standard library:
Returning 0 as the predecessor of 0 can come across as somewhat of a hack. In some situations, we might like to be sure that we never try to take the predecessor of 0. We can enforce this by giving pred a stronger, dependent type.
Lemma zgtz : 0 > 0 -> False.
Proof. lia. Qed.

Definition pred_strong1 {n : nat} : n > 0 -> nat :=
  match n with
  | O => fun pf : 0 > 0 => match zgtz pf with end
  | S n' => fun _ => n'
  end.
We expand the type of pred to include a proof that its argument n is greater than 0. When n is 0, we use the proof to derive a contradiction, which we can use to build a value of any type via a vacuous pattern match. When n is a successor, we have no need for the proof and just return the answer. The proof argument can be said to have a dependent type, because its type depends on the value of the argument n.
Coq's Compute command can execute particular invocations of pred_strong1 just as easily as it can execute more traditional functional programs.
(* goal (lemma name elided): 2 > 0 *)
lia. Qed.
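As a minimal sketch of such an invocation: the snippet below repeats zgtz and pred_strong1 so that it stands alone, and assumes a hypothetical lemma name two_gt0 for the proof that 2 > 0 (that name does not come from the text above).

```coq
Require Import Lia.

Lemma zgtz : 0 > 0 -> False.
Proof. lia. Qed.

Definition pred_strong1 {n : nat} : n > 0 -> nat :=
  match n with
  | O => fun pf : 0 > 0 => match zgtz pf with end
  | S n' => fun _ => n'
  end.

(* two_gt0 is a hypothetical name for the proof that 2 > 0. *)
Lemma two_gt0 : 2 > 0.
Proof. lia. Qed.

(* The implicit argument n is inferred as 2 from the type of two_gt0. *)
Compute pred_strong1 two_gt0.
(* = 1 : nat *)
```

The proof is opaque (it ends with Qed), but evaluation never needs to inspect it: only the S branch of the match is taken.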
One aspect in particular of the definition of pred_strong1 may be surprising. We took advantage of Definition's syntactic sugar for defining function arguments in the case of n, but we bound the proofs later with explicit fun expressions. Let us see what happens if we write this function in the way that at first seems most natural.
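A sketch of what that more natural attempt might look like, with the proof bound as an ordinary argument. The name pred_strong1_naive is hypothetical, and the Fail prefix is used here only so that the snippet can be processed: it succeeds exactly when the wrapped command is rejected.

```coq
Require Import Lia.

Lemma zgtz : 0 > 0 -> False.
Proof. lia. Qed.

(* Hypothetical name. The proof pf is bound before the match, so its
   type stays n > 0 inside the O branch, where zgtz expects 0 > 0. *)
Fail Definition pred_strong1_naive (n : nat) (pf : n > 0) : nat :=
  match n with
  | O => match zgtz pf with end
  | S n' => n'
  end.
```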
The term zgtz pf fails to type-check. Somehow the type checker has failed to take into account information that follows from which match branch that term appears in. The problem is that, by default, match does not let us use such implied information. To get refined typing, we must always rely on match annotations, either written explicitly or inferred.
In this case, we must use a return annotation to declare the relationship between the value of the match discriminee and the type of the result. There is no annotation that lets us declare a relationship between the discriminee and the type of a variable that is already in scope; hence, we delay the binding of pf, so that we can use the return annotation to express the needed relationship.
We are lucky that Coq's heuristics infer the return clause (specifically, return n > 0 -> nat) for us in the definition of pred_strong1, leading to the following elaborated code:
Definition pred_strong1' (n : nat) : n > 0 -> nat :=
match n return n > 0 -> nat with
| O => fun pf : 0 > 0 => match zgtz pf with end
| S n' => fun _ => n'
end.
By making explicit the functional relationship between value n and the result type of the match, we guide Coq toward proper type checking. The clause for this example follows by simple copying of the original annotation on the definition. In general, however, the match annotation inference problem is undecidable. The known undecidable problem of higher-order unification reduces to the match type inference problem. Over time, Coq is enhanced with more and more heuristics to get around this problem, but there must always exist matches whose types Coq cannot infer without annotations.
Let us now take a look at the OCaml code Coq generates for pred_strong1.
The proof argument has disappeared! We get exactly the OCaml code we would have written manually. This is our first demonstration of the main technically interesting feature of Coq program extraction: proofs are erased systematically.
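A minimal sketch of how such an extraction might be requested, with zgtz and pred_strong1 repeated so that the snippet stands alone. The Extraction command targets OCaml by default; its printed output is not reproduced here.

```coq
Require Import Lia.
Require Extraction.

Lemma zgtz : 0 > 0 -> False.
Proof. lia. Qed.

Definition pred_strong1 {n : nat} : n > 0 -> nat :=
  match n with
  | O => fun pf : 0 > 0 => match zgtz pf with end
  | S n' => fun _ => n'
  end.

(* Prints the OCaml code generated for pred_strong1. *)
Extraction pred_strong1.
```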
We can reimplement our dependently typed pred based on subset types, defined in the standard library with the type family sig.
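For reference, the shape of that type family can be sketched with a local copy. The names my_sig and my_exist are hypothetical, chosen only to avoid clashing with the library's sig and exist; Print sig shows the real definition.

```coq
(* A local copy mirroring the standard-library definition of sig:
   a value x of type A packaged with a proof that P holds of x. *)
Inductive my_sig (A : Type) (P : A -> Prop) : Type :=
| my_exist : forall x : A, P x -> my_sig A P.

(* Display the library definition itself. *)
Print sig.
```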
We rewrite pred_strong1, using some syntactic sugar for subset types, after we deactivate some clashing notations for set literals.
Definition pred_strong2 (s : {n : nat | n > 0}) : nat :=
  match s with
  | exist _ O pf => match zgtz pf with end
  | exist _ (S n') _ => n'
  end.
To build a value of a subset type, we use the exist constructor, and the details of how to do that follow from the output of our earlier Print sig command, where we elided the extra information that parameter A is implicit.
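A short sketch of building such a value with exist and passing it to pred_strong2. The supporting definitions are repeated so that the snippet stands alone, and two_gt0 is a hypothetical name for a proof that 2 > 0.

```coq
Require Import Lia.

Lemma zgtz : 0 > 0 -> False.
Proof. lia. Qed.

Definition pred_strong2 (s : {n : nat | n > 0}) : nat :=
  match s with
  | exist _ O pf => match zgtz pf with end
  | exist _ (S n') _ => n'
  end.

(* two_gt0 is a hypothetical name for the proof that 2 > 0. *)
Lemma two_gt0 : 2 > 0.
Proof. lia. Qed.

(* The underscore stands for the predicate, which Coq infers from the
   argument type of pred_strong2. *)
Compute pred_strong2 (exist _ 2 two_gt0).
(* = 1 : nat *)
```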
We arrive at the same OCaml code as was extracted from pred_strong1, which may seem surprising at first. The reason is that a value of sig is a pair of two pieces, a value and a proof about it. Extraction erases the proof, which reduces the constructor exist of sig to taking just a single argument. An optimization eliminates uses of datatypes with single constructors taking single arguments, and we arrive back where we started.
We can continue on in the process of refining pred's type. Let us change its result type to capture that the output is really the predecessor of the input.
Definition pred_strong3 (s : {n : nat | n > 0}) : {m : nat | proj1_sig s = S m} :=
  match s return {m : nat | proj1_sig s = S m} with
  | exist _ 0 pf => match zgtz pf with end
  | exist _ (S n') pf => exist _ n' (eq_refl _)
  end.
A value in a subset type can be thought of as a dependent pair (or sigma type) of a base value and a proof about it. The function proj1_sig extracts the first component of the pair. It turns out that we need to include an explicit return clause here, since Coq's heuristics are not smart enough to propagate the result type that we wrote earlier.
By now, the reader is probably ready to believe that the new pred_strong3 leads to the same OCaml code as we have seen several times so far, and Coq does not disappoint.
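Inside Coq, the refined function can also be run directly. The sketch below repeats the supporting definitions so that it stands alone, assumes a hypothetical lemma name two_gt0, and uses proj1_sig to project the numeric component out of the result.

```coq
Require Import Lia.

Lemma zgtz : 0 > 0 -> False.
Proof. lia. Qed.

Definition pred_strong3 (s : {n : nat | n > 0})
  : {m : nat | proj1_sig s = S m} :=
  match s return {m : nat | proj1_sig s = S m} with
  | exist _ 0 pf => match zgtz pf with end
  | exist _ (S n') pf => exist _ n' (eq_refl _)
  end.

(* two_gt0 is a hypothetical name for the proof that 2 > 0. *)
Lemma two_gt0 : 2 > 0.
Proof. lia. Qed.

(* Keep only the number; the accompanying proof says that 2 = S 1. *)
Compute proj1_sig (pred_strong3 (exist _ 2 two_gt0)).
(* = 1 : nat *)
```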
We have managed to reach a type that is, in a formal sense, the most expressive possible for pred. Any other implementation of the same type must have the same input-output behavior. However, there is still room for improvement in making this kind of code easier to write. Here is a version that takes advantage of tactic-based theorem proving. We switch back to passing a separate proof argument instead of using a subset type for the function's input, because this leads to cleaner code. (False_rec is a library function that can be used to produce a value in any type given a proof of False. It's defined in terms of the vacuous pattern match we saw earlier.)
Definition pred_strong4 : forall n : nat, n > 0 -> {m : nat | n = S m}.

  n : nat
  g : 0 > 0
  ============================
  False

  n, n' : nat
  g : S n' > 0
  ============================
  S n' = S n'
We build pred_strong4 using tactic-based proving, beginning with a Definition command that ends in a period before a definition is given. Such a command enters the interactive proving mode, with the type given for the new identifier as our proof goal.
We do most of the work with the refine tactic, to which we pass a partial "proof" of the type we are trying to prove. There may be some pieces left to fill in, indicated by underscores. Any underscore that Coq cannot reconstruct with type inference is added as a proof subgoal. In this case, we have two subgoals.
We can see that the first subgoal comes from the second underscore passed to False_rec, and the second subgoal comes from the second underscore passed to exist. In the first case, we see that, though we bound the proof variable with an underscore, it is still available in our proof context. Both subgoals are easy to discharge, so we can back up and ask to prove all subgoals automatically.
(* goal: forall n : nat, n > 0 -> {m : nat | n = S m} *)
refine (fun n =>
  match n with
  | O => fun _ => False_rec _ _
  | S n' => fun _ => exist _ n' _
  end); lia || congruence.
As another alternative, we can fill out the missing pieces of the proof:
- In the O case, we apply zgtz to the proof of 0 > 0.
- In the S n' case, we supply eq_refl as the proof of S n' = S n' when constructing the subset type.
Following this approach, the term leaves no subgoals behind, so exact could complete the proof in place of refine; the version below keeps refine, which works too.
refine (fun n =>
  match n with
  | O => fun H => False_rec _ (zgtz H)
  | S n' => fun _ => exist _ n' eq_refl
  end).
Defined.
We see the code we entered, with some (pretty long!) proofs filled in.
We are almost done with the ideal implementation of dependent predecessor. We can use Coq's syntax-extension facility to arrive at code with almost no complexity beyond a Haskell or ML program with a complete specification in a comment. In this book, we will not dwell on the details of syntax extensions; the Coq manual gives a straightforward introduction to them.
Notation "!" := (False_rec _ _).
Notation "[ e ]" := (exist _ e _).

(* goal: forall n : nat, n > 0 -> {m : nat | n = S m} *)
refine (fun n =>
  match n with
  | O => fun _ => !
  | S n' => fun _ => [n']
  end); congruence || lia.
Defined.
Recall the definition of length-indexed lists.
Module ilist.
Section ilist.
  Context {A : Set}.

  Inductive ilist : nat -> Set :=
  | Nil : ilist O
  | Cons : forall {n}, A -> ilist n -> ilist (S n).
End ilist.

Arguments ilist A n : clear implicits.
End ilist.
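Suppose we want to get rid of the n in the type of ilist. Is it possible? Here is an attempt we explored previously, which indexes the list by a copy of nat living in Prop. There is a fly in the ointment, though, due to how Prop works in Coq: the attempt below to prove P0 <> PS p has to be abandoned, whereas the corresponding fact for nat, 0 <> S n, follows by inversion.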
Module ilist_propnat.
Section ilist_propnat.
  Context {A : Set}.

  Inductive pnat : Prop :=
  | P0
  | PS : pnat -> pnat.

  Inductive ilist : pnat -> Type :=
  | Nil : ilist P0
  | Cons (a: A) {n: pnat} (il: ilist n) : ilist (PS n).

  Inductive pnat_nat_equiv : forall (p: pnat) (n: nat), Prop :=
  | PN0: pnat_nat_equiv P0 0
  | PNS: forall p n, pnat_nat_equiv p n -> pnat_nat_equiv (PS p) (S n).

  Fixpoint length {n} (il: ilist n) : nat :=
    match il with
    | Nil => 0
    | Cons a il => S (length il)
    end.

  (* goal (lemma name elided), with n : pnat and il : ilist n:
     pnat_nat_equiv n (length il) *)
  induction il; constructor; eauto. Qed.

  Definition hd {p} (il : ilist p) (Hneq:p <> P0) :=
    match il as il' in (ilist p') return (p = p' -> A) with
    | Nil => fun Heq => False_rec A (Hneq Heq)
    | Cons h t => fun _ => h
    end eq_refl.

  (* goal (lemma name elided): forall p : pnat, P0 <> PS p
     remaining goal: False, with p : pnat and H : P0 = PS p *)
  Abort.

  (* goal (lemma name elided): forall n : nat, 0 <> S n
     remaining goal: False, with n : nat and H : 0 = S n *)
  inversion H. Qed.
End ilist_propnat.
End ilist_propnat.
In many situations, it is useful to apply a technique that Adam Chlipala calls "the convoy pattern." Recall that match annotations only make it possible to describe a dependence of a match result type on the discriminee. There is no automatic refinement of the types of free variables. However, it is possible to effect such a refinement by finding a way to encode free variable type dependencies in the match result type, so that a return clause can express the connection.
Module MoreIlist.
Import ilist.

Definition firstElements {n A B} (ls1 : ilist A n) (ls2 : ilist B n) : option (A * B) :=
  match ls1 in ilist _ N return ilist B N -> option (A * B) with
  | Cons v1 _ => fun ls2 =>
      Some (v1,
            match ls2 in ilist _ N
                  return match N with | O => unit | S _ => B end with
            | Cons v2 _ => v2
            | Nil => tt
            end)
  | Nil => fun _ => None
  end ls2.
Note the use of a struct annotation to tell Coq which argument should decrease across recursive calls. It is an artificial choice here, since those annotations are usually inferred. Here we are making an effort to demonstrate a decently common problem!
Fixpoint zip {n A B} (ls1 : ilist A n) (ls2 : ilist B n) {struct ls1} : ilist (A * B) n :=
  match ls1 in ilist _ N return ilist B N -> ilist (A * B) N with
  | Cons v1 ls1' => fun ls2 =>
      match ls2 in ilist _ N
            return match N with
                   | O => unit
                   | S N' => (ilist B N' -> ilist (A * B) N') -> ilist (A * B) N
                   end with
      | Cons v2 ls2' => fun zip_ls1' => Cons (v1, v2) (zip_ls1' ls2')
      | Nil => tt
      end (zip ls1')
  | Nil => fun _ => Nil
  end ls2.
End MoreIlist.
Module fixlist.
Section fixlist.
  Context {A : Type}.

  Fixpoint fixlist n: Type :=
    match n with
    | 0 => unit
    | S n => A * fixlist n
    end.

  Definition hd {n} (v: fixlist (S n)) := fst v.

  Fixpoint app {n0} (v0: fixlist n0) {n1} (v1: fixlist n1): fixlist (n0 + n1) :=
    match n0 return fixlist n0 -> fixlist (n0 + n1) with
    | 0 => fun _ => v1
    | S n => fun v0 => (fst v0, app (snd v0) v1)
    end v0.

  Fixpoint inject (ls : list A): fixlist (length ls) :=
    match ls return fixlist (length ls) with
    | [] => tt
    | hd :: tl => (hd, inject tl)
    end.

  Fixpoint unject {n} (v : fixlist n): list A :=
    match n return fixlist n -> list A with
    | 0 => fun _ => []
    | S n => fun v => fst v :: unject (snd v)
    end v.

  (* goal (lemma name elided): forall l : list A, unject (inject l) = l *)
  induction l; simpl; congruence. Qed.
End fixlist.

Arguments fixlist A n : clear implicits.
Without peeking, try to define the functions zip, map, and map2.
Fixpoint zip {A B n} (va: fixlist A n) (vb: fixlist B n) : fixlist (A * B) n :=
  match n return fixlist A n -> fixlist B n -> fixlist (A * B) n with
  | 0 => fun _ _ => tt
  | S n => fun va vb => ((fst va, fst vb), zip (snd va) (snd vb))
  end va vb.

Fixpoint map {A B n} (f: A -> B) :=
  match n return fixlist A n -> fixlist B n with
  | 0 => fun _ => tt
  | S n => fun fl => (f (fst fl), map f (snd fl))
  end.

Fixpoint map2 {A B C n} (f: A -> B -> C) :=
  match n return fixlist A n -> fixlist B n -> fixlist C n with
  | 0 => fun _ _ => tt
  | S n => fun fla flb => (f (fst fla) (fst flb), map2 f (snd fla) (snd flb))
  end.
End fixlist.
Next, define an analog to Fin.t in the style of fixlist (fixnat).
Import fixlist.

Fixpoint fixnat (n: nat): Set :=
  match n with
  | 0 => False
  | S n => option (fixnat n)
  end.

Fixpoint fixlist_nth {A n} (fl: fixlist A n) (idx: fixnat n) {struct n}: A :=
  match n return fixlist A n -> fixnat n -> A with
  | 0 => fun _ idx => False_rect A idx
  | S n => fun fl idx =>
      match idx with
      | Some idx => fixlist_nth (snd fl) idx
      | None => (fst fl)
      end
  end fl idx.

Fixpoint fixnat_to_nat {n} :=
  match n return fixnat n -> nat with
  | 0 => fun f => False_rect _ f
  | S n => fun f =>
      match f with
      | None => 0
      | Some f => S (fixnat_to_nat f)
      end
  end.