Recitation 8: Program equivalence and structural induction

Overview

This recitation explores how we can use the substitution model of evaluation and structural induction to prove equivalences between OCaml programs.

Program Equivalence

Deciding whether two programs are equivalent is a fundamental question that arises in a wide range of contexts from debugging to compilation to optimization and complexity analysis. But what does it mean for two programs to be equivalent? Intuitively, they should produce identical outputs when they are provided with identical inputs, where an output can either be a value, side effects, an exception, or an infinite loop. As a simplification, if the programs are purely functional — that is, they do not exhibit side effects such as printing to the console or modifying mutable data structures — then we can simplify this to say that on any input, the programs should either both loop or both produce the same final value. We will adopt this simplified notion of equivalence in the rest of this recitation.

As examples of programs that we might attempt to prove equivalent, consider the following:

if e1 then e2 else e3

and

if not e1 then e3 else e2

match e1 with x,y -> f x y

and

match e1 with y,x -> f x y

```
let x = e1 in e2
```
and
```
(fun x -> e2) e1
```

List.map g l

and

List.fold_right (fun x a -> g x::a) l []

Structural Induction

In many situations, in order to prove programs equivalent, we will need to do a proof over infinite sets of structures — all natural numbers, lists, or trees. Establishing properties over inifinite sets is challenging because simple proof techniques such as exhaustive case analysis are often not powerful enough. Fortunately, there is a general proof technique called structural induction that can often be used to establish such properties. Structural induction exploits the fact that recursive data types are ultimately built up using applications of constructors — to construct a list, we "cons" the elements onto the "nil" element.

The structural induction principle for a data type can be obtained as follows. An OCaml data type is defined by a number of variants, each of which can wrap a tuple of values. Given such a definition, each non-recursive variant is a base-case, and each recursive variant is an inductive case, with an induction hypothesis for each occurence of a recursive type.

For the list data type

  type 'a list = 
     [] 
   | (::) of 'a * 'a list

the structural induction principle is as follows: given an arbitrary predicate on lists, if P([]) and for all values vh and vt we have that P(vt) implies P(vh::vt), then for all lists t, the predicate P(t) holds.

For the nat data type type nat = Zero | Succ of nat the structural induction principle is as follows: given an arbitrary predicate on lists, if P(Zero) and for all values vn we have that P(vn) implies P(Succ vn), then for all natural numbers n, the predicate P(n) holds.

What is the induction principles for the following data types?

type suit = 
    Clubs
  | Diamonds
  | Hearts 
  | Spades

type arith = 
    Num of int
  | Neg of arith
  | Plus of arith * arith
  | Times of arith * arith

type lambda = 
    Var of string
  | App of lambda * lambda
  | Abs of string * lambda

```
type b = 
    Bottom of b 
```

Substitution Model Proofs

Using the substitution model and structural induction, we can prove many useful equivalences. As an example, we will prove the following equivalence,

let rec map f l = 
  match l with 
    | [] -> []
    | h::t -> (f h)::(map f t) in 
map g l

and

let rec fold_right f l a = 
  match l with 
    | [] -> a 
    | h::t -> f h (fold_right f t a) in 
fold_right (fun x a -> g x::a) l []

using structural induction on lists.

To start, assume that we have generated fresh names map' and fold_right' for map and fold_right respectively, and added the following evaluation rules to the model:

map' --> 
  (fun f -> fun l -> 
     match l with 
       | [] -> []
       | h::t -> (f h)::(map' f t)

and

fold_right' --> 
  (fun f -> fun l -> fun a -> 
     match l with
       | [] -> a
       | h::t -> f h (fold_right' f t a)

Next, we will prove a lemma showing that P(l) holds for all lists l, where P(l) is defined as follows;

P(l) =def=
  (fun f -> fun l -> 
     match l with 
       | [] -> []
       | h::t -> (f h)::(map' f t))
  g l 
=    
  (fun f -> fun l -> fun a -> 
     match l with
       | [] -> a
       | h::t -> (f h)::(fold_right' f t a) 
  g l []

Let g be arbitrary. To use the structural induction principle for lists, we must prove two simpler properties, P([]) and forall vh, vt, P(vt) ==> P(vh::vt). We prove each separately.