ADTAbstract Data Types

An Abstract Data Type comprises:
  • A type with a hidden representation.
  • Interface functions that operate on that type.
  • Axioms about the interaction of those functions.
Require Import Omega.
Let's consider the concept of lookup tables, indexed by keys that are numbers, mapping those keys to values of arbitrary (parametric) type. We can express this in Coq as follows:
Module Type TABLE.
 Parameter V: Type.
 Parameter default: V.
 Parameter table: Type.
 Definition key := nat.
 Parameter empty: table.
 Parameter get: keytableV.
 Parameter set: keyVtabletable.
 Axiom gempty: k, (* get-empty *)
       get k empty = default.
 Axiom gss: k v t, (* get-set-same *)
      get k (set k v t) = v.
 Axiom gso: j k v t, (* get-set-other *)
      jkget j (set k v t) = get j t.
This means: in any Module that satisfies this Module Type, there's a type table of lookup-tables, a type V of values, and operators empty, get, set that satisfy the axioms gempty, gss, and gso.
It's easy to make an implementation of TABLE, using Maps. Just for example, let's choose V to be string.
From VFA Require Import Maps.
Require Import String.
Open Scope string_scope.
First, let's specialize the definition of TABLE to have values that are strings. Why? Let's return to that question after we define the module.
Module Type TABLE_STRING := TABLE with Definition V := string.
To define a Module that implements a Module Type, you need to provide a Definition or Theorem for each Parameter or Axiom.
Module MapsTable : TABLE_STRING.
 Definition V := string.
 Definition default: V := "".
 Definition table := total_map V.
 Definition key := nat.
 Definition empty : table := t_empty default.
 Definition get (k: key) (m: table) : V := m k.
 Definition set (k: key) (v: V) (m: table) : table :=
    t_update m k v.
 Theorem gempty: k, get k empty = default.
   Proof. intros. reflexivity. Qed.
 Theorem gss: k v t, get k (set k v t) = v.
   Proof. intros. unfold get, set. apply t_update_eq. Qed.
 Theorem gso: j k v t, jkget j (set k v t) = get j t.
   Proof. intros. unfold get, set. apply t_update_neq.
End MapsTable.
The specialization of TABLE to TABLE_STRING exposes the definition of V as string to the world outside the module. Without it, all that would be known is what TABLE exposes, which is that V is some Type.
You could instead write Module MapsTable <: TABLE.. That would cause Coq to check that MapsTable does conform to the TABLE type, but would leave all the definitions of MapsTable exposed to the outside world. That is, there would be no encapsulation.
Now we can see that V is known to be string, but table is abstract: all that is known is that it is some Type.
Print MapsTable.V.
(* ====>
   MapsTable.V = string
        : Type *)

Print MapsTable.table.
(* ====>
   ***  MapsTable.table : Type  *)

As an example, let's put 1 and 3 into a map, then lookup 1.
Definition ex_t :=
  MapsTable.set 3 "three"
                (MapsTable.set 1 "one"

Example test_maps_table : MapsTable.get 1 ex_t = "one".
Proof. Fail reflexivity. Abort.
Why does the proof fail? Because like table, get and set and empty are also abstract. Outside the module, nothing is known about their implementation except what is revealed in TABLE. Fortunately, TABLE reveals some useful theorems.
Example test_maps_table : MapsTable.get 1 ex_t = "one".
  unfold ex_t.
  rewrite MapsTable.gso.
  rewrite MapsTable.gss.
The problem with MapsTable is that the Maps implementation is very inefficient: linear time per get operation. If you do a sequence of N get and set operations, it can take time quadratic in N. For a more efficient implementation, let's use our search trees.
From VFA Require Import SearchTree.

Module TreeTable : TABLE.
 Definition V := string.
 Definition default : V := "".
 Definition table := tree V.
 Definition key := nat.
 Definition empty : table := empty_tree V.
 Definition get (k: key) (m: table) : V := lookup V default k m.
 Definition set (k: key) (v: V) (m: table) : table :=
     insert V k v m.
 Theorem gempty: k, get k empty = default.
   Proof. intros. reflexivity. Qed.

 Theorem gss: k v t, get k (set k v t) = v.
   Proof. intros. unfold get, set.
     destruct (unrealistically_strong_can_relate V default t)
        as [m H].
     assert (H0 := insert_relate V default k v t m H).
     assert (H1 := lookup_relate V default k _ _ H0).
    rewrite H1. apply t_update_eq.

Exercise: 3 stars, standard (TreeTable_gso)

Prove this using techniques similar to the proof of gss just above.
 Theorem gso: j k v t, jkget j (set k v t) = get j t.
 (* FILL IN HERE *) Admitted.
End TreeTable.
But suppose we don't have an unrealistically strong can-relate theorem? Remember the type of the "ordinary" can_relate:
Check can_relate.
  (*  : forall (V : Type) (default : V) (t : tree V),
       SearchTree V t ->
       exists m : total_map V, Abs V default t m *)

This requires that t have the SearchTree property, or in general, any value of type table should be well-formed, that is, should satisfy the representation invariant. We must ensure that the client of an ADT cannot "forge" values, that is, cannot coerce the representation type into the abstract type; especially ill-formed values of the representation type. This "unforgeability" is enforced in some real programming languages: ML (Standard ML or OCaml) with its module system; Java, whose Classes have "private variables" that the client cannot see.

A Brief Excursion into Dependent Types

Let's briefly look at a different way of enforcing the representation invariant: every value of the representation type will carry along with it a proof that the value satisfies the representation invariant.
To do this, we'll use sigma types, aka dependent sums, which are defined in the Coq standard library. Suppose P is a predicate on type A, that is, P : A Prop. Then sig A P is the type of elements of A that satisfy P. You could think of this as a subset: sig A P is the subset of values of A that satisfy P. Based on that intuition, there is a notation {x : A | P} that means sig A P.
Check sig.
(* ====>
   sig : forall A : Type, (A -> Prop) -> Type *)

(* As an example, here is a type for even natural numbers. Nat.Even : nat Prop is defined in the standard library. The A argument to
   sig is implicit, so we don't write it below. *)

Definition even_nat := sig Nat.Even.
Print even_nat.
(* ====>
   even_nat = {x : nat | Nat.Even x}
        : Set

A value of type sig A P is a like a "package" of two things: a value x : A, and a proof of P x, that is, that x satisfies P. Such values are created with the constructor exist.
Print sig.
(* ====>
   Inductive sig (A : Type) (P : A -> Prop) : Type :=
       exist : forall x : A, P x -> {x : A | P x}

Theorem zero_even : Nat.Even 0.
  unfold Nat.Even. 0. omega.

Definition even_zero : even_nat :=
  exist Nat.Even 0 zero_even.
Sigma types and existential quantification no doubt seem quite similar. They are! We can see that by looking at their definitions side by side, and removing some of the notational sugar:
   Inductive sig (A : Type) (P : A -> Prop) : Type :=
       exist    : forall x : A, P x -> sig A (fun y => P y)

   Inductive ex  (A : Type) (P : A -> Prop) : Prop :=
       ex_intro : forall x : A, P x -> ex  A (fun y => P y)

The essential difference is that sig creates a Type whereas ex creates a Prop. That is, the former is computational in content, whereas the latter is logical. Therefore we can write functions to recover the witness from a sig, even though we cannot do the same from an ex. The standard library defines those functions to project the first or second component of a sig:
Check proj1_sig.
(* ====>
   forall {A : Type} {P : A -> Prop},
     {x | P x} -> A

Check proj2_sig.
(* ====>
   forall (A : Type) {P : A -> Prop} (e : {x | P x}),
     P (proj1_sig e)

Definition the_witness : nat := proj1_sig even_zero.
Compute the_witness. (* = 0 : nat *)

Definition the_proof : Nat.Even 0 := proj2_sig even_zero.
Compute the_proof. (* = zero_even : Nat.Even 0 *)

Search Trees using Sigma Types

Let's apply this idea to search trees. The type A will be tree V. The predicate P x will be SearchTree x.
Module TreeTable2 : TABLE.
 Definition V := string.
 Definition default : V := "".
 Definition table := {x | SearchTree V x}.
 Definition key := nat.
 Definition empty : table :=
   exist (SearchTree V) (empty_tree V) (empty_tree_SearchTree V).
 Definition get (k: key) (m: table) : V :=
          (lookup V default k (proj1_sig m)).
 Definition set (k: key) (v: V) (m: table) : table :=
   exist (SearchTree V) (insert V k v (proj1_sig m))
          (insert_SearchTree _ _ _ _ (proj2_sig m)).

 Theorem gempty: k, get k empty = default.
   Proof. intros. reflexivity. Qed.

 Theorem gss: k v t, get k (set k v t) = v.
  Proof. intros. unfold get, set.
    unfold table in t.
Now: t is a package with two components: The first component is a tree, and the second component is a proof that the first component has the SearchTree property. We can destruct t to see that more clearly.
    destruct t as [a Ha].
    (* Watch what this simpl does: *)
    (* Now we can use can_relate instead of
       unrealistically_strong_can_relate: *)

    destruct (can_relate V default a Ha) as [m H].
    pose proof (insert_relate V default k v a m H).
    pose proof (lookup_relate V default k _ _ H0).
    rewrite H1. apply t_update_eq.

Exercise: 3 stars, standard (TreeTable_gso)

Prove this using techniques similar to the proof of gss just above; don't use unrealistically_strong_can_relate.
 Theorem gso: j k v t, jkget j (set k v t) = get j t.
 (* FILL IN HERE *) Admitted.
End TreeTable2.
(End of the brief excursion into dependent types.)

Summary of Abstract Data Type Proofs

Variable V: Type.
Variable default: V.
Step 1. Define a representation invariant. (In the case of search trees, the representation invariant is the SearchTree predicate.) Prove that each operation on the data type preserves the representation invariant. For example:
Check (empty_tree_SearchTree V).
  (*  SearchTree V (empty_tree V) *)
Check (insert_SearchTree V).
  (* forall (k : key) (v : V) (t : tree V),
       SearchTree V t -> SearchTree V (insert V k v t) *)

Notice two things: Any operator (such as insert) that takes a tree parameter can assume that the parameter satisfies the representation invariant. That is, the insert_SearchTree theorem takes a premise, SearchTree V t.
Any operator that produces a tree result must prove that the result satisfies the representation invariant. Thus, the conclusions, SearchTree V (empty_tree V) and SearchTree V (empty_tree V) of the two theorems above.
Finally, any operator that produces a result of "base type", has no obligation to prove that the result satisfies the representation invariant; that wouldn't make any sense anyway, because the types wouldn't match. That is, there's no "lookup_SearchTree" theorem, because lookup doesn't return a result that's a tree.
Step 2. Define an abstraction relation. (In the case of search trees, it's the Abs relation. This relates the data structure to some mathematical value that is (presumably) simpler to reason about.
Check (Abs V default). (* tree V -> total_map V -> Prop *)
For each operator, prove that: assuming each tree argument satisfies the representation invariant and the abstraction relation, prove that the results also satisfy the appropriate abstraction relation.
Check (empty_tree_relate V default). (*
       Abs V default (empty_tree V) (t_empty default)    *)

Check (lookup_relate' V default). (* forall k t m,
       SearchTree V t ->
       Abs V default t m ->
       lookup V default k t = m (Id k)  *)

Check (insert_relate' V default). (*     : forall k v t m,
       SearchTree V t ->
       Abs V default t m ->
       Abs V default (insert V k v t) (t_update m (Id k) v) *)

Step 3. Using the representation invariant and the abstraction relation, prove that all the axioms of your ADT are valid. For example...
Check TreeTable2.gso. (*
      : forall (j k : TreeTable2.key) (v : TreeTable2.V)
         (t : TreeTable2.table),
       j <> k ->
       TreeTable2.get j (TreeTable2.set k v t) = TreeTable2.get j t  *)


Exercise in Data Abstraction

The rest of this chapter is optional.
Require Import List.
Import ListNotations.
Here's the Fibonacci function.
Fixpoint fibonacci (n: nat) :=
 match n with
 | 0 ⇒ 1
 | S imatch i with 0 ⇒ 1 | S jfibonacci i + fibonacci j end

Eval compute in map fibonacci [0;1;2;3;4;5;6].
Here's a silly little program that computes the Fibonacci function.
Fixpoint repeat {A} (f: AA) (x: A) n :=
 match n with Ox | S n'f (repeat f x n') end.

Definition step (al: list nat) : list nat :=
 List.cons (nth 0 al 0 + nth 1 al 0) al.

Eval compute in map (repeat step [1;0;0]) [0;1;2;3;4;5].

Definition fib n := nth 0 (repeat step [1;0;0] n) 0.

Eval compute in map fib [0;1;2;3;4;5;6].
Here's a strange "List" module.
Module Type LISTISH.
 Parameter list: Type.
 Parameter create : natnatnatlist.
 Parameter cons: natlistlist.
 Parameter nth: natlistnat.

Module L <: LISTISH.
 Definition list := (nat*nat*nat)%type.
 Definition create (a b c: nat) : list := (a,b,c).
 Definition cons (i: nat) (il : list) := match il with (a,b,c) ⇒ (i,a,b) end.
 Definition nth (n: nat) (al: list) :=
   match al with (a,b,c) ⇒
      match n with 0 ⇒ a | 1 ⇒ b | 2 ⇒ c | _ ⇒ 0 end
End L.

Definition sixlist := L.cons 0 (L.cons 1 (L.cons 2 (L.create 3 4 5))).

Eval compute in map (fun iL.nth i sixlist) [0;1;2;3;4;5;6;7].
Module L implements approximations of lists: it can remember the first three elements, and forget the rest. Now watch:
Definition stepish (al: L.list) : L.list :=
 L.cons (L.nth 0 al + L.nth 1 al) al.

Eval compute in map (repeat stepish (L.create 1 0 0)) [0;1;2;3;4;5].

Definition fibish n := L.nth 0 (repeat stepish (L.create 1 0 0) n).

Eval compute in map fibish [0;1;2;3;4;5;6].
This little theorem may be useful in the next exercise.
Lemma nth_firstn:
  A d i j (al: list A), i<jnth i (firstn j al) d = nth i al d.
induction i; destruct j,al; simpl; intros; auto; try omega.
apply IHi. omega.

Exercise: 4 stars, standard, optional (listish_abstraction)

In this exercise we will not need a representation invariant. Define an abstraction relation:
Inductive L_Abs: L.listList.list natProp :=

Definition O_Abs al al' := L_Abs al al'.

(* State these theorems using O_Abs, not L_Abs.
   You'll see why below, at "Opaque". *)

Lemma create_relate : True. (* change this line appropriately *)
(* FILL IN HERE *) Admitted.

Lemma cons_relate : True. (* change this line appropriately *)
(* FILL IN HERE *) Admitted.

Lemma nth_relate : True. (* change this line appropriately *)
(* FILL IN HERE *) Admitted.
Now, we will make these operators opaque. Therefore, in the rest of the proofs in this exercise, you will not unfold their definitions. Instead, you will just use the theorems create_relate, cons_relate, nth_relate.
Opaque L.list.
Opaque L.create.
Opaque L.cons.
Opaque L.nth.
Opaque O_Abs.

Lemma step_relate:
  al al',
   O_Abs al al'
   O_Abs (stepish al) (step al').
(* FILL IN HERE *) Admitted.

Lemma repeat_step_relate:
 n al al',
 O_Abs al al'
 O_Abs (repeat stepish al n) (repeat step al' n).
(* FILL IN HERE *) Admitted.

Lemma fibish_correct: n, fibish n = fib n.
Proof. (* No induction needed in this proof! *)
(* FILL IN HERE *) Admitted.

Exercise: 2 stars, standard, optional (fib_time_complexity)

Suppose you run these three programs call-by-value, that is, as if they were ML programs. fibonacci N fib N fibish N What is the asymptotic time complexity (big-Oh run time) of each, as a function of N? Assume that the plus function runs in constant time. You can use terms like "linear," "N log N," "quadratic," "cubic," "exponential." Explain your answers briefly.
fibonacci: (* fill in here *)
fib: (* fill  in here *)
fibish: (* fill in here *)
(* Thu May 2 14:47:51 EDT 2019 *)