Lecture 20: Even more dependent types

Author: Adam Chlipala, with modifications by CS-628 and CS 6115 staff.
License: No redistribution allowed (usage by permission in CS-628 and CS 6115).

More Subset Types
The Convoy Pattern
Exercise: a different take on length-indexed lists

More Subset Types

Let us consider several ways of implementing the natural-number-predecessor function. Recall the definition from the standard library:

Print Nat.pred.Nat.pred =
fun n : nat => match n with
               | 0 => n
               | S u => u
               end
     : nat -> nat

Arguments Nat.pred n%nat_scope

Extraction pred.(** val pred : nat -> nat **)

let pred n = match n with
| O -> n
| S u -> u

Returning 0 as the predecessor of 0 can come across as somewhat of a hack. In some situations, we might like to be sure that we never try to take the predecessor of 0. We can enforce this by giving pred a stronger, dependent type.

Lemma zgtz : 0 > 0 -> False.0 > 0 -> False
Proof.0 > 0 -> False
  lia.
Qed.

Definition pred_strong1 {n : nat} : n > 0 -> nat :=
  match n with
  | O => fun pf : 0 > 0 => match zgtz pf with end
  | S n' => fun _ => n'
  end.

We expand the type of pred to include a proof that its argument n is greater than 0. When n is 0, we use the proof to derive a contradiction, which we can use to build a value of any type via a vacuous pattern match. When n is a successor, we have no need for the proof and just return the answer. The proof argument can be said to have a dependent type, because its type depends on the value of the argument n.

Coq's Compute command can execute particular invocations of pred_strong1 just as easily as it can execute more traditional functional programs.

Theorem two_gt0 : 2 > 0.2 > 0
Proof.2 > 0
  lia.
Qed.

Compute pred_strong1 two_gt0.= 1
: nat

One aspect in particular of the definition of pred_strong1 may be surprising. We took advantage of Definition's syntactic sugar for defining function arguments in the case of n, but we bound the proofs later with explicit fun expressions. Let us see what happens if we write this function in the way that at first seems most natural.

Fail Definition pred_strong1' (n : nat) (pf : n > 0) : nat :=
  match n with
  | O => match zgtz pf with end
  | S n' => n'
  end.The command has indeed failed with message:
In environment
n : nat
pf : n > 0
The term "pf" has type "n > 0"
while it is expected to have type "0 > 0".

The term zgtz pf fails to type-check. Somehow the type checker has failed to take into account information that follows from which match branch that term appears in. The problem is that, by default, match does not let us use such implied information. To get refined typing, we must always rely on match annotations, either written explicitly or inferred.

In this case, we must use a return annotation to declare the relationship between the value of the match discriminee and the type of the result. There is no annotation that lets us declare a relationship between the discriminee and the type of a variable that is already in scope; hence, we delay the binding of pf, so that we can use the return annotation to express the needed relationship.

We are lucky that Coq's heuristics infer the return clause (specifically, return n > 0 -> nat) for us in the definition of pred_strong1, leading to the following elaborated code:

Definition pred_strong1' (n : nat) : n > 0 -> nat := 
  match n return n > 0 -> nat with
  | O => fun pf : 0 > 0 => match zgtz pf with end
  | S n' => fun _ => n'
  end.

By making explicit the functional relationship between value n and the result type of the match, we guide Coq toward proper type checking. The clause for this example follows by simple copying of the original annotation on the definition. In general, however, the match annotation inference problem is undecidable. The known undecidable problem of higher-order unification reduces to the match type inference problem. Over time, Coq is enhanced with more and more heuristics to get around this problem, but there must always exist matches whose types Coq cannot infer without annotations.

Let us now take a look at the OCaml code Coq generates for pred_strong1.

Extraction pred_strong1.(** val pred_strong1 : nat -> nat **)

let pred_strong1 = function
| O -> assert false (* absurd case *)
| S n' -> n'

The proof argument has disappeared! We get exactly the OCaml code we would have written manually. This is our first demonstration of the main technically interesting feature of Coq program extraction: proofs are erased systematically.

We can reimplement our dependently typed pred based on subset types, defined in the standard library with the type family sig.

Print sig.Inductive sig (A : Type) (P : A -> Prop) : Type :=
    exist : forall x : A, P x -> {x : A | P x}.

Arguments sig [A]%type_scope P%type_scope
Arguments exist [A]%type_scope P%function_scope x _

We rewrite pred_strong1, using some syntactic sugar for subset types, after we deactivate some clashing notations for set literals.

Locate "{ _ : _ | _ }".Notation "{ x : A | P }" := (sig (fun x => P))
  : type_scope (default interpretation)

Definition pred_strong2 (s : {n : nat | n > 0} ) : nat :=
  match s with
  | exist _ O pf => match zgtz pf with end
  | exist _ (S n') _ => n'
  end.

To build a value of a subset type, we use the exist constructor, and the details of how to do that follow from the output of our earlier Print sig command, where we elided the extra information that parameter A is implicit.

Compute pred_strong2 (exist _ 2 two_gt0).= 1
: nat

Extraction pred_strong2.(** val pred_strong2 : nat -> nat **)

let pred_strong2 = function
| O -> assert false (* absurd case *)
| S n' -> n'

We arrive at the same OCaml code as was extracted from pred_strong1, which may seem surprising at first. The reason is that a value of sig is a pair of two pieces, a value and a proof about it. Extraction erases the proof, which reduces the constructor exist of sig to taking just a single argument. An optimization eliminates uses of datatypes with single constructors taking single arguments, and we arrive back where we started.

We can continue on in the process of refining pred's type. Let us change its result type to capture that the output is really the predecessor of the input.

Definition pred_strong3
  (s : {n : nat | n > 0}) : {m : nat | proj1_sig s = S m} :=
  match s return {m : nat | proj1_sig s = S m} with
  | exist _ 0 pf => match zgtz pf with end
  | exist _ (S n') pf => exist _ n' (eq_refl _)
  end.

Compute pred_strong3 (exist _ 2 two_gt0).= exist
    (fun m : nat =>
     proj1_sig (exist (lt 0) 2 two_gt0) = S m) 1
    eq_refl
: {m : nat
  | proj1_sig (exist (lt 0) 2 two_gt0) = S m}

A value in a subset type can be thought of as a dependent pair (or sigma type) of a base value and a proof about it. The function proj1_sig extracts the first component of the pair. It turns out that we need to include an explicit return clause here, since Coq's heuristics are not smart enough to propagate the result type that we wrote earlier.

By now, the reader is probably ready to believe that the new pred_strong leads to the same OCaml code as we have seen several times so far, and Coq does not disappoint.

Extraction pred_strong3.(** val pred_strong3 : nat -> nat **)

let pred_strong3 = function
| O -> assert false (* absurd case *)
| S n' -> n'

We have managed to reach a type that is, in a formal sense, the most expressive possible for pred. Any other implementation of the same type must have the same input-output behavior. However, there is still room for improvement in making this kind of code easier to write. Here is a version that takes advantage of tactic-based theorem proving. We switch back to passing a separate proof argument instead of using a subset type for the function's input, because this leads to cleaner code. (False_rec is a library function that can be used to produce a value in any type given a proof of False. It's defined in terms of the vacuous pattern match we saw earlier.)

Definition pred_strong4 :
  forall {n : nat}, n > 0 -> {m : nat | n = S m}.forall n : nat, n > 0 -> {m : nat | n = S m}
Proof.forall n : nat, n > 0 -> {m : nat | n = S m}
  refine (fun n =>
    match n with
    | O => fun _ => False_rec _ _
    | S n' => fun _ => exist _ n' _
    end).n: nat
g: 0 > 0
False
n, n': nat
g: S n' > 0
S n' = S n'

We build pred_strong4 using tactic-based proving, beginning with a Definition command that ends in a period before a definition is given. Such a command enters the interactive proving mode, with the type given for the new identifier as our proof goal.

We do most of the work with the refine tactic, to which we pass a partial "proof" of the type we are trying to prove. There may be some pieces left to fill in, indicated by underscores. Any underscore that Coq cannot reconstruct with type inference is added as a proof subgoal. In this case, we have two subgoals.

We can see that the first subgoal comes from the second underscore passed to False_rec, and the second subgoal comes from the second underscore passed to exist. In the first case, we see that, though we bound the proof variable with an underscore, it is still available in our proof context. Both subgoals are easy to discharge, so we can back up and ask to prove all subgoals automatically.

  Undo.forall n : nat, n > 0 -> {m : nat | n = S m}
  refine (fun n =>
    match n with
    | O => fun _ => False_rec _ _
    | S n' => fun _ => exist _ n' _
    end); lia || congruence.
  Undo.forall n : nat, n > 0 -> {m : nat | n = S m}

As another alternative, we can fill out the missing pieces of the proof: - In the O case, we apply zgtz to the proof of 0 > 0 - In the S n' case, we supply eq_refl as the proof of S n' = S n' when constructing the subset type. Following this approach, we can complete the proof using exact rather than refine (though refine would work too).

  refine (fun n =>
    match n with
    | O => fun H => False_rec _ (zgtz H)
    | S n' => fun _ => exist _ n' eq_refl
    end).
Defined.

Print pred_strong4.pred_strong4 =
fun n : nat =>
match
  n as n0 return (n0 > 0 -> {m : nat | n0 = S m})
with
| 0 =>
    fun H : 0 > 0 =>
    False_rec {m : nat | 0 = S m} (zgtz H)
| S n' =>
    fun _ : S n' > 0 =>
    exist (fun m : nat => S n' = S m) n' eq_refl
end
     : forall n : nat, n > 0 -> {m : nat | n = S m}

Arguments pred_strong4 {n}%nat_scope _

We see the code we entered, with some (pretty long!) proofs filled in.

Compute pred_strong4 two_gt0.= exist (fun m : nat => 2 = S m) 1 eq_refl
: {m : nat | 2 = S m}

We are almost done with the ideal implementation of dependent predecessor. We can use Coq's syntax-extension facility to arrive at code with almost no complexity beyond a Haskell or ML program with a complete specification in a comment. In this book, we will not dwell on the details of syntax extensions; the Coq manual gives a straightforward introduction to them.

Notation "!" := (False_rec _ _).
Notation "[ e ]" := (exist _ e _).

Definition pred_strong5 :
  forall {n : nat}, n > 0 -> {m : nat | n = S m}.forall n : nat, n > 0 -> {m : nat | n = S m}
Proof.forall n : nat, n > 0 -> {m : nat | n = S m}
  refine (fun n =>
    match n with
    | O => fun _ => !
    | S n' => fun _ => [n']
    end); congruence || lia.
Defined.

Compute pred_strong5 two_gt0.= [1]
: {m : nat | 2 = S m}

Recall the definition of length-index lists.

Module ilist.
  Section ilist.
    Context {A : Set}.

    Inductive ilist : nat -> Set :=
    | Nil : ilist O
    | Cons : forall {n}, A -> ilist n -> ilist (S n).

  End ilist.

  Arguments ilist A n : clear implicits.

End ilist.

Should we want to get rid of the n in the type of ilist. Is it possible? Here is an attempt we explored previously... but there is a fly in the ointment, due to how Prop works in Coq.

Module ilist_propnat.
  Section ilist_propnat.
    Context {A : Set}.

    Inductive pnat : Prop :=
    | P0
    | PS : pnat -> pnat.

    Inductive ilist : pnat -> Type :=
    | Nil : ilist P0
    | Cons (a: A) {n: pnat} (il: ilist n) : ilist (PS n).

    Fail Fixpoint nat_of_pnat (p: pnat) : nat :=
      match p with
      | P0 => 0
      | PS p => S (nat_of_pnat p)
      end.The command has indeed failed with message:
Incorrect elimination of "p" in the inductive type
"pnat":
the return type has sort "Set"
while it should be SProp or Prop.
Elimination of an inductive object of sort Prop
is not allowed on a predicate in sort "Set"
because proofs can be eliminated only to build proofs.

    Inductive pnat_nat_equiv : forall (p: pnat) (n: nat), Prop :=
    | PN0: pnat_nat_equiv P0 0
    | PNS: forall p n, pnat_nat_equiv p n ->
                  pnat_nat_equiv (PS p) (S n).

    Fixpoint length {n} (il: ilist n) : nat :=
      match il with
      | Nil => 0
      | Cons a il => S (length il)
      end.

    Lemma length_ok {n} (il: ilist n) :
      pnat_nat_equiv n (length il).A: Set
n: pnat
il: ilist n
pnat_nat_equiv n (length il)
    Proof.A: Set
n: pnat
il: ilist n
pnat_nat_equiv n (length il)
      induction il; constructor; eauto.
    Qed.

    Definition hd {p} (il : ilist p) (Hneq:p <> P0) :=
      match il as il' in (ilist p') return (p = p' -> A) with
      | Nil =>
          fun Heq => False_rec A (Hneq Heq)
      | Cons h t =>
          fun _ => h
      end eq_refl.

    Lemma sanity_pnat: forall p, P0 <> PS p.A: Set
forall p : pnat, P0 <> PS p
    Proof.A: Set
forall p : pnat, P0 <> PS p
      unfold "<>".A: Set
forall p : pnat, P0 = PS p -> False
      intros.A: Set
p: pnat
H: P0 = PS p
False
      inversion H.A: Set
p: pnat
H: P0 = PS p
False
      Abort.

    Lemma sanity_nat: forall n, 0 <> S n.A: Set
forall n : nat, 0 <> S n
    Proof.A: Set
forall n : nat, 0 <> S n
      unfold "<>".A: Set
forall n : nat, 0 = S n -> False
      intros.A: Set
n: nat
H: 0 = S n
False
      inversion H.
    Qed.

  End ilist_propnat.
End ilist_propnat.

The Convoy Pattern

In many situations, it is useful to apply a technique that Adam Chlipala calls "the convoy pattern." Recall that match annotations only make it possible to describe a dependence of a match result type on the discriminee. There is no automatic refinement of the types of free variables. However, it is possible to effect such a refinement by finding a way to encode free variable type dependencies in the match result type, so that a return clause can express the connection.

Module MoreIlist.
  Import ilist.

  Fail Definition firstElements
       {n A B}
       (ls1 : ilist A n)
       (ls2 : ilist B n) : option (A * B) :=
    match ls1 with
    | Cons v1 _ =>
        Some (v1,
               match ls2 in ilist _ N
                     return match N with
                            | O => unit
                            | S _ => B end with
               | Cons v2 _ => v2
               | Nil => tt
               end)
    | Nil => None
    end.The command has indeed failed with message:
In environment
n : nat
A : Set
B : Set
ls1 : ilist A n
ls2 : ilist B n
n0 : nat
v1 : A
i : ilist A n0
The term
 "Some
    (v1,
     match
       ls2 in (ilist _ N)
       return match N with
              | 0 => unit
              | S _ => B
              end
     with
     | Nil => tt
     | Cons v2 _ => v2
     end)" has type
 "option (A * match n with
              | 0 => unit
              | S _ => B
              end)" while it is expected to have type
 "option (A * B)".

  Definition firstElements
             {n A B} (ls1 : ilist A n) (ls2 : ilist B n)
    : option (A * B) :=
    match ls1 in ilist _ N return ilist B N -> option (A * B) with
    | Cons v1 _ =>
        fun ls2 =>
          Some (v1,
                 match ls2 in ilist _ N
                       return match N with
                              | O => unit
                              | S _ => B end with
                 | Cons v2 _ => v2
                 | Nil => tt
                 end)
    | Nil => fun _ => None
    end ls2.

Note use of a struct annotation to tell Coq which argument should decrease across recursive calls. It's an artificial choice here, since usually those annotations are inferred. Here we are making an effort to demonstrate a decently common problem!

  Fail Fixpoint zip
       {n A B} (ls1 : ilist A n) (ls2 : ilist B n)
       {struct ls1} : ilist (A * B) n :=
    match ls1 in ilist _ N return ilist B N -> ilist (A * B) N with
    | Cons v1 ls1' =>
        fun ls2 =>
          match ls2 in ilist _ N return
                match N with
                | O => unit
                | S N' => ilist A N' -> ilist (A * B) N
                end with
          | Cons v2 ls2' =>
              fun ls1' => Cons (v1, v2) (zip ls1' ls2')
          | Nll => tt
          end ls1'
    | Nil => fun _ => Nil
    end ls2.Unused variable Nll might be a misspelled constructor.
Use _ or _Nll to silence this warning.
[unused-pattern-matching-variable,default]
The command has indeed failed with message:
Recursive definition of zip is ill-formed.
In environment
zip :
forall (n : nat) (A B : Set),
ilist A n -> ilist B n -> ilist (A * B) n
n : nat
A : Set
B : Set
ls1 : ilist A n
ls2 : ilist B n
n0 : nat
v1 : A
ls1' : ilist A n0
ls0 : ilist B (S n0)
n1 : nat
v2 : B
ls2' : ilist B n1
ls1'0 : ilist A n1
Recursive call to zip has principal argument equal to
"ls1'0" instead of "ls1'".
Recursive definition is:
"fun (n : nat) (A B : Set) (ls1 : ilist A n)
   (ls2 : ilist B n) =>
 match
   ls1 in (ilist _ N)
   return (ilist B N -> ilist (A * B) N)
 with
 | Nil => fun _ : ilist B 0 => Nil
 | @Cons _ n0 v1 ls1' =>
     fun ls3 : ilist B (S n0) =>
     match
       ls3 in (ilist _ N)
       return
         match N with
         | 0 => unit
         | S N' => ilist A N' -> ilist (A * B) N
         end
     with
     | Nil => tt
     | @Cons _ n1 v2 ls2' =>
         fun ls1'0 : ilist A n1 =>
         Cons (v1, v2) (zip n1 A B ls1'0 ls2')
     end ls1'
 end ls2".

  Fixpoint zip {n A B} (ls1 : ilist A n) (ls2 : ilist B n)
           {struct ls1} : ilist (A * B) n :=
    match ls1 in ilist _ N return ilist B N -> ilist (A * B) N with
    | Cons v1 ls1' =>
        fun ls2 =>
          match ls2 in ilist _ N
                return match N with
                       | O => unit
                       | S N' => (ilist B N' ->
                                  ilist (A * B) N') ->
                                  ilist (A * B) N
                       end with
          | Cons v2 ls2' =>
              fun zip_ls1' => Cons (v1, v2) (zip_ls1' ls2')
          | Nil => tt
          end (zip ls1')
    | Nil => fun _ => Nil
    end ls2.
End MoreIlist.

Exercise: a different take on length-indexed lists

Module fixlist.
  Section fixlist.
    Context {A : Type}.

    Fixpoint fixlist n: Type :=
      match n with
      | 0 => unit
      | S n => A * fixlist n
      end.

    Definition hd {n} (v: fixlist (S n)) :=
      fst v.

    Fixpoint app {n0} (v0: fixlist n0) {n1} (v1: fixlist n1):
      fixlist (n0 + n1) :=
      match n0 return fixlist n0 -> fixlist (n0 + n1) with
      | 0 => fun _ => v1
      | S n => fun v0 => (fst v0, app (snd v0) v1)
      end v0.

    Fixpoint inject (ls : list A): fixlist (length ls) :=
      match ls return fixlist (length ls) with
      | [] => tt
      | hd :: tl => (hd, inject tl)
      end.

    Fixpoint unject {n} (v : fixlist n): list A :=
      match n return fixlist n -> list A with
      | 0 => fun _ => []
      | S n => fun v => fst v :: unject (snd v)
      end v.

    Theorem inject_unject_identity : forall l, unject (inject l) = l.A: Type
forall l : list A, unject (inject l) = l
    Proof.A: Type
forall l : list A, unject (inject l) = l
      induction l; simpl; congruence.
    Qed.

  End fixlist.

  Arguments fixlist A n : clear implicits.

Without peeking, try to define the functions zip, map, and map2.

  Fixpoint zip {A B n} (va: fixlist A n) (vb: fixlist B n)
    : fixlist (A * B) n :=

    match n return fixlist A n -> fixlist B n -> fixlist (A * B) n with
    | 0 => fun _ _ => tt
    | S n => fun va vb => ((fst va, fst vb), zip (snd va) (snd vb))
    end va vb.

  Fixpoint map {A B n} (f: A -> B) :=
    match n return fixlist A n ->
                   fixlist B n with
    | 0 => fun _ => tt
    | S n =>
        fun fl => (f (fst fl), map f (snd fl))
    end.

  Fixpoint map2 {A B C n} (f: A -> B -> C) :=
    match n return fixlist A n ->
                   fixlist B n ->
                   fixlist C n with
    | 0 => fun _ _ => tt
    | S n =>
        fun fla flb =>
          (f (fst fla) (fst flb),
            map2 f (snd fla) (snd flb))
    end.

End fixlist.

Next, define an analog to Fin.t in the style of fixlist (fixnat).

Import fixlist.

Fixpoint fixnat (n: nat): Set :=
  match n with
  | 0 => False
  | S n => option (fixnat n)
  end.

Fixpoint fixlist_nth {A n}
  (fl: fixlist A n) (idx: fixnat n) {struct n}: A :=
  match n return fixlist A n -> fixnat n -> A with
  | 0 => fun _ idx => False_rect A idx
  | S n => fun fl idx =>
            match idx with
            | Some idx => fixlist_nth (snd fl) idx
            | None => (fst fl)
            end
  end fl idx.

Fixpoint fixnat_to_nat {n} :=
  match n return fixnat n -> nat with
  | 0 => fun f => False_rect _ f
  | S n => fun f =>
            match f with
            | None => 0
            | Some f => S (fixnat_to_nat f)
            end
  end.