Lecture 7: More on the Substitution Model and Structural Induction

In this lecture, we will continue to examine how OCaml programs are evaluated, extending our substitution model to handle recursion.

Recursion

The model presented in the last lecture has one significant weakness: It doesn't explain how recursive functions work. The problem is that a recursive function is in scope within its own body. Mutually recursive functions are also problematic, because each mutually recursive function is in scope within the bodies of all the others.

One way to understand this is that a recursive function can be "unrolled" by substituting the entire function for the name of the function in the body, and the resulting function is equivalent to the original. For example, we can define the factorial function

```let rec fact n = if n = 0 then 1 else n * fact (n - 1)
```

or equivalently,

```let rec fact = fun n -> if n = 0 then 1 else n * fact (n - 1)
```

The `rec` keyword indicates that the `fact` in the body refers to the entire function:

```fun n -> if n = 0 then 1 else n * fact (n - 1)
```

But the definition seems circular. The function `fact` is defined in terms of itself. We can "unroll" the function once by substituting the entire function for `fact` in the body, which would give

```fun n -> if n = 0 then 1 else n *
(fun n -> if n = 0 then 1 else n * fact (n - 1)) (n - 1)
```

and this is equivalent to the original. We can unroll as many times as we like:

```fun n -> if n = 0 then 1 else n *
(fun n -> if n = 0 then 1 else n *
(fun n -> if n = 0 then 1 else n *
(fun n -> if n = 0 then 1 else n *
fact (n - 1))
(n - 1))
(n - 1))
(n - 1)
```

and this is equivalent to the original. However, note that if we unroll finitely many times, no matter how many, there is always a free occurrence of `fact` in the body, so it still seems like we have a circular definition.
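The claim that unrolling preserves meaning can be spot-checked in OCaml. The sketch below is only a sanity check on a handful of inputs, not a proof. The name `fact_unrolled` is ours and does not appear in the lecture.

```ocaml
(* The original recursive definition from the text. *)
let rec fact n = if n = 0 then 1 else n * fact (n - 1)

(* One unrolling: the body of fact, with a copy of the whole function
   substituted for the recursive call. fact still occurs free in the
   innermost copy, exactly as the text observes. *)
let fact_unrolled =
  fun n -> if n = 0 then 1 else n *
    (fun n -> if n = 0 then 1 else n * fact (n - 1)) (n - 1)

(* Compare the two versions on a few sample arguments. *)
let () =
  List.iter
    (fun n -> assert (fact n = fact_unrolled n))
    [0; 1; 2; 5; 10]
```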

But say we could unroll infinitely many times. Then this would give an infinite anonymous function with no free occurrence of `fact` in the body:

```fun n -> if n = 0 then 1 else n *
(fun n -> if n = 0 then 1 else n *
(fun n -> if n = 0 then 1 else n *
(fun n -> if n = 0 then 1 else n *
(...) (n - 1))
(n - 1))
(n - 1))
(n - 1)
```

Let's call this thing F. This is a rather unconventional object, since it is an infinite expression rather than a finite one. However, whatever it is, it does satisfy the equation

F  =  `fun n -> if n = 0 then 1 else n *` F `(n - 1)`

and this is what we are binding to `fact` with the `let rec` declaration. As equational logic allows substitution of equals for equals, this justifies the unrolling operation.

It's probably easier to think of F as an anonymous function that hasn't been infinitely unrolled, but rather contains a pointer to itself that expands out into the same full anonymous function whenever it is used.

When a `let rec f = e in e'` is evaluated, a fresh variable `f'` is generated for the variable `f` (a variable is fresh if it appears nowhere else in the program), along with an equation

```f'  =  e{f'/f}
```

which will typically only be applied as a reduction rule in the left-to-right direction, as

```f'  →  e{f'/f}
```

Then we evaluate `let rec f = e in e'` as if it were `let f = f' in e'`.

The name `f'` acts as the self-pointer described above: it stands for the value of the expression on the right-hand side of its reduction rule. If evaluation ever hits `f'`, the reduction rule is applied.
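One way to get a feel for the fresh name and its rule is to imitate them in OCaml without `let rec`. The sketch below is an assumption-laden analogy, not part of the substitution model: it uses a mutable reference cell, which the model itself does not have, and the names `fact_cell` and `fact_body` are hypothetical.

```ocaml
(* fact_cell plays the role of the fresh name f'. It starts out holding a
   placeholder, because the rule for f' has not been installed yet. *)
let fact_cell : (int -> int) ref =
  ref (fun _ -> failwith "rule not installed yet")

(* This corresponds to e{f'/f}: the body of fact with the recursive name
   replaced by a lookup through the cell. Each lookup !fact_cell mimics
   one application of the rule f' -> e{f'/f}. *)
let fact_body = fun n -> if n = 0 then 1 else n * !fact_cell (n - 1)

(* Install the rule: from now on the cell expands to the body above. *)
let () = fact_cell := fact_body

(* "let rec fact = ... in fact 5" is then treated like
   "let fact = f' in fact 5". *)
let fact = fun n -> !fact_cell n

let () = assert (fact 5 = 120)
```

The function never mentions itself by name. The recursion goes entirely through the cell, much as the recursion in the model goes entirely through `f'`.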

For example, consider this difficult-to-understand code that is similar to the example above:

```let rec f g n =
if n = 1 then g 0
else g 0 + f (fun x -> n) (n - 1)
in f (fun x -> 10) 3
```

Can you predict what the result will be? It is evaluated as follows. If you can follow this then you really understand the substitution model!

We introduce a fresh symbol `f'` for the recursive function bound in the `let rec` expression, along with the reduction rule

```f'  →  fun g -> fun n ->
if n = 1 then g 0
else g 0 + f' (fun x -> n) (n - 1)
```

then evaluate

```let f = f' in f (fun x -> 10) 3
```

The evaluation then proceeds as follows.

```let f = f' in f (fun x -> 10) 3
→  f' (fun x -> 10) 3
→  (fun g -> fun n ->
if n = 1 then g 0
else g 0 + f' (fun x -> n) (n - 1)) (fun x -> 10) 3
→  (fun n ->
if n = 1 then (fun x -> 10) 0
else (fun x -> 10) 0 + f' (fun x -> n) (n - 1)) 3
→  if 3 = 1 then (fun x -> 10) 0
else (fun x -> 10) 0 + f' (fun x -> 3) (3 - 1)
→  if false then (fun x -> 10) 0
else (fun x -> 10) 0 + f' (fun x -> 3) (3 - 1)
→  (fun x -> 10) 0 + f' (fun x -> 3) (3 - 1)
→  10 + f' (fun x -> 3) (3 - 1)
→  10 + (fun g -> fun n -> ...) (fun x -> 3) (3 - 1)
→  10 + (fun n -> ...) 2
→  10 + if 2 = 1 then (fun x -> 3) 0
else (fun x -> 3) 0 + f' (fun x -> 2) (2 - 1)
→  10 + if false then (fun x -> 3) 0
else (fun x -> 3) 0 + f' (fun x -> 2) (2 - 1)
→  10 + (fun x -> 3) 0 + f' (fun x -> 2) (2 - 1)
→  10 + 3 + f' (fun x -> 2) (2 - 1)
→  10 + 3 + (fun g -> fun n -> ...) (fun x -> 2) (2 - 1)
→  10 + 3 + (fun n -> ...) 1
→  10 + 3 + if 1 = 1 then (fun x -> 2) 0 else ...
→  10 + 3 + if true then (fun x -> 2) 0 else ...
→  10 + 3 + (fun x -> 2) 0
→  10 + 3 + 2
→  15
```
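The trace can be checked by running the program. In the sketch below the unused parameter `x` is written `_`, which does not change the meaning, and the names `innermost`, `middle`, and `whole` are ours. They correspond to the three calls to `f'` that appear in the trace.

```ocaml
(* The function from the example, with the unused argument written as _. *)
let rec f g n =
  if n = 1 then g 0
  else g 0 + f (fun _ -> n) (n - 1)

(* The three calls to f' from the trace, innermost first. *)
let innermost = f (fun _ -> 2) 1   (* 2 *)
let middle = f (fun _ -> 3) 2      (* 3 + 2 = 5 *)
let whole = f (fun _ -> 10) 3      (* 10 + 3 + 2 = 15 *)

(* Prints "2 5 15". *)
let () = Printf.printf "%d %d %d\n" innermost middle whole
```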

In general, there might be multiple functions defined in a `let rec`. These are evaluated as follows:

```let rec f1 = fun x1 -> e1
and f2 = fun x2 -> e2
...
and fn = fun xn -> en
in e'

→

e'{f1'/f1, ..., fn'/fn}
(with equations f1' = fun x1 -> e1{f1'/f1,...,fn'/fn}, ...
fn' = fun xn -> en{f1'/f1,...,fn'/fn},
all fi' fresh)
```
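A small instance of this rule, as a sketch: the functions `even` and `odd` below are our own illustration, not taken from the lecture. Under the rule above, evaluation would introduce fresh names `even'` and `odd'`, with `even'` reducing to `fun n -> if n = 0 then true else odd' (n - 1)` and `odd'` reducing to `fun n -> if n = 0 then false else even' (n - 1)`.

```ocaml
(* Each function is in scope in the body of the other, so both must be
   bound by a single let rec ... and ... declaration. These definitions
   are only meant for non-negative n. *)
let rec even n = if n = 0 then true else odd (n - 1)
and odd n = if n = 0 then false else even (n - 1)

let () =
  assert (even 10);
  assert (odd 7);
  assert (not (even 3))
```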

The tricky example revisited

Now we have the tools to return to the tricky example from above. Let's first consider an easier example, where the third parameter is `1` rather than `3` as above.

```let rec evil (f1, f2, n) =
let f x = 10 + n in
if n = 1 then f 0 + f1 0 + f2 0
else evil (f, f1, n-1)
and dummy x = 1000
in evil (dummy, dummy, 1)
```

We introduce a fresh variable `evil'` denoting the recursive function bound to `evil` along with the reduction rule

```evil'  →  fun (f1, f2, n) ->
let f x = 10 + n in
if n = 1 then f 0 + f1 0 + f2 0
else evil' (f, f1, n-1)
```

and the tricky example can be rewritten

```let evil = evil'
and dummy = fun x -> 1000
in evil (dummy, dummy, 1)
```

Now evaluating this expression by substitution,

```let evil = evil'
and dummy = fun x -> 1000
in evil (dummy, dummy, 1)
→  evil' ((fun x -> 1000), (fun x -> 1000), 1)
→  let f x = 10 + 1 in
if 1 = 1 then f 0 + (fun x -> 1000) 0 + (fun x -> 1000) 0
else evil' (f, (fun x -> 1000), 1 - 1)
→  (fun x -> 10 + 1) 0 + (fun x -> 1000) 0 + (fun x -> 1000) 0
→  2011
```

Now if we consider the case where `evil` is called with `n`=2 rather than `n`=1, things get a bit more interesting. Here we will write down just the reduction steps corresponding to the recursive calls to `evil` and the calculation of the final return value.

```let evil = evil'
and dummy = fun x -> 1000
in evil (dummy, dummy, 2)
→  evil' ((fun x -> 1000), (fun x -> 1000), 2)
→  evil' ((fun x -> 10 + 2), (fun x -> 1000), 1)
→  (fun x -> 10 + 1) 0 + (fun x -> 10 + 2) 0 + (fun x -> 1000) 0
→  1023
```

How about when `n`=3?

```let evil = evil'
and dummy = fun x -> 1000
in evil (dummy, dummy, 3)
→  evil' ((fun x -> 1000), (fun x -> 1000), 3)
→  evil' ((fun x -> 10 + 3), (fun x -> 1000), 2)
→  evil' ((fun x -> 10 + 2), (fun x -> 10 + 3), 1)
→  (fun x -> 10 + 1) 0 + (fun x -> 10 + 2) 0 + (fun x -> 10 + 3) 0
→  36
```
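The three results can be reproduced by running the code directly. In this sketch the unused parameter `x` of `f` and `dummy` is written `_`; everything else follows the definition of `evil` given earlier.

```ocaml
let rec evil (f1, f2, n) =
  (* f captures the value of n from this particular call. *)
  let f _ = 10 + n in
  if n = 1 then f 0 + f1 0 + f2 0
  else evil (f, f1, n - 1)
and dummy _ = 1000

(* Prints 2011, 1023 and 36 for n = 1, 2 and 3 respectively. *)
let () =
  List.iter
    (fun n -> Printf.printf "n = %d: %d\n" n (evil (dummy, dummy, n)))
    [1; 2; 3]
```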

Lexical (static) vs. dynamic scoping

Variable names are substituted immediately throughout their scope when a function is applied or a `let` is evaluated. This means that whenever we see an occurrence of a variable, how that variable is bound is immediately clear from the program text: the variable is bound to the innermost binding occurrence in whose scope it occurs. This rule is called lexical (or static) scoping.

Let us apply this to the tricky example from earlier. The key question is what the variable `n` means within the functions `f`, `f1`, `f2`. Even though these variables are all bound to the function `f`, they are bound to versions of the function `f` that occurred in three different scopes, where the variable `n` was bound to 1, 2, and 3 respectively. For example, on the first entry to `evil`, the value 3 is substituted for the variable `n` within the function `f` (which ultimately becomes `f2` on the third application of `evil`).

The most common alternative to lexical scoping is called dynamic scoping. In dynamic scoping, a variable is bound to the most recently bound version of the variable, and function values do not record how their variables such as `n` are bound. For example, in the language Perl, the equivalent of the example code written with dynamically scoped (`local`) variables would print 33 rather than 36: by the time the three functions are called, the most recently bound value of `n` is 1, so each of them returns 11. Dynamic scoping can be confusing because the meaning of a function depends on the context where it is used, not where it was defined. Most modern languages use lexical scoping.
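OCaml itself is lexically scoped, but the dynamically scoped behavior can be roughly simulated by keeping the "most recent binding of `n`" in a single global cell. This is a crude sketch under that assumption. The names `current_n` and `evil_dyn` are ours, and unlike real dynamic scoping the old binding is never restored on return, which does not matter here because the three functions are only called inside the deepest call.

```ocaml
(* Holds the most recently bound value of n. *)
let current_n = ref 0

let rec evil_dyn (f1, f2, n) =
  (* Entering the function rebinds n for everyone. *)
  current_n := n;
  (* f does not capture n; it looks up the current binding when called. *)
  let f _ = 10 + !current_n in
  if n = 1 then f 0 + f1 0 + f2 0
  else evil_dyn (f, f1, n - 1)

let dummy _ = 1000

(* All three functions see n = 1 when they finally run: 11 + 11 + 11. *)
let () = assert (evil_dyn (dummy, dummy, 3) = 33)
```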

Structural Induction

Having a formal model of evaluation allows us to reason precisely about the execution of OCaml programs. However, establishing many of the most interesting program properties involves reasoning about infinite sets. The proof technique of structural induction is a powerful mechanism for carrying out such proofs.

As an example, suppose that we want to prove that the following expression terminates for an arbitrary list value `v`:

```let rec fold_left f a l =
match l with
| [] -> a
| h::t -> fold_left f (f a h) t in
fold_left (+) 0 v
```
The case where `v` is `[]` is trivial, and can be proven easily using the substitution model. However, the case where `v` is `vh::vt` is trickier because it involves a recursive call on `vt`. Intuitively, it is clear that the recursion must eventually "bottom out," because `v` is finite, and the length of the list passed to `fold_left` decreases on each recursive call. But to establish this formally, we need a proof principle.
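The intuition that the recursion bottoms out can be observed, though of course not proven, by counting calls. In the sketch below the counter `calls` is instrumentation we have added; it is not part of the lecture's `fold_left`.

```ocaml
(* Counts how many times fold_left is entered. *)
let calls = ref 0

let rec fold_left f a l =
  incr calls;
  match l with
  | [] -> a
  | h :: t -> fold_left f (f a h) t

let () =
  let v = [1; 2; 3; 4] in
  calls := 0;
  let sum = fold_left (+) 0 v in
  assert (sum = 10);
  (* One call per element, plus the final call on the empty list. *)
  assert (!calls = List.length v + 1)
```

Testing like this only covers the lists we happen to try, which is why a proof principle covering all lists is needed.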

It turns out that each OCaml data type has an associated structural induction principle that can be used to prove properties about arbitrary values of that type. For the list data type

```  type 'a list =
[]
| (::) of 'a * 'a list
```
the structural induction principle is as follows: given an arbitrary predicate P on lists, if P(`[]`) and for all values `vh` and `vt` we have that P(`vt`) implies P(`vh::vt`), then for all lists `t`, the predicate P(`t`) holds.

The structural induction principle for other data types is similar. For example, recall the natural numbers:

```  type nat =
Zero
| Succ of nat
```
The corresponding structural induction principle is as follows: given an arbitrary predicate on natural numbers P, if P(`Zero`) and for all values `m` we have P(`m`) implies P(`Succ m`), then for all values `m` the predicate P(`m`) holds. Generally speaking, given an arbitrary data type, the non-recursive variants become base cases and the recursive variants become inductive cases.
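The same split into base and inductive cases shows up in functions defined by recursion over the type. As an illustrative sketch (the functions `to_int` and `add` are hypothetical examples of ours), each function has one clause per variant, and the clause for `Succ` may use the result for the smaller value, just as the inductive step may use P(`m`).

```ocaml
type nat =
  | Zero
  | Succ of nat

(* Convert a nat to an int. *)
let rec to_int (m : nat) : int =
  match m with
  | Zero -> 0                   (* base case: non-recursive variant *)
  | Succ m' -> 1 + to_int m'    (* inductive case: uses the result for m' *)

(* Addition by recursion on the first argument. *)
let rec add (m : nat) (k : nat) : nat =
  match m with
  | Zero -> k
  | Succ m' -> Succ (add m' k)

let () =
  let two = Succ (Succ Zero) and one = Succ Zero in
  assert (to_int (add two one) = 3)
```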

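Using structural induction, it is easy to prove many properties involving infinite sets of structures. For example, assuming we have added a reduction rule for `fold_left'` as sketched above, we can prove the following property by structural induction on `v`. The property is strengthened to cover an arbitrary integer accumulator `n` rather than just `0`, because the inductive step needs the hypothesis at a different accumulator: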

```P(v) =
for all integer values n,
(fun f -> fun a -> fun l ->
match l with
| [] -> a
| h::t -> fold_left' f (f a h) t)
(+) n v
terminates
```
For the base case, we prove P(`[]`):
```  (fun f -> fun a -> fun l ->
match l with
| [] -> a
| h::t -> fold_left' f (f a h) t)
(+) n []
-->
(fun a -> fun l ->
match l with
| [] -> a
| h::t -> fold_left' (+) ((+) a h) t)
n []
-->
(fun l ->
match l with
| [] -> n
| h::t -> fold_left' (+) ((+) n h) t)
[]
-->
match [] with
| [] -> n
| h::t -> fold_left' (+) ((+) n h) t
-->
n
```
Then for the inductive step, we let `vh` and `vt` be arbitrary values such that P(`vt`) holds, and we prove that P(`vh::vt`) holds:
```  (fun f -> fun a -> fun l ->
match l with
| [] -> a
| h::t -> fold_left' f (f a h) t)
(+) n (vh::vt)
-->
(fun a -> fun l ->
match l with
| [] -> a
| h::t -> fold_left' (+) ((+) a h) t)
n (vh::vt)
-->
(fun l ->
match l with
| [] -> n
| h::t -> fold_left' (+) ((+) n h) t)
(vh::vt)
-->
match vh::vt with
| [] -> n
| h::t -> fold_left' (+) ((+) n h) t
-->
fold_left' (+) ((+) n vh) vt
-->
fold_left' (+) vn' vt
```
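where `vn'` is the value of `(+) n vh`. By the reduction rule for `fold_left'`, this final expression steps to the expression in P(`vt`) with `vn'` in place of `n`, which terminates by the induction hypothesis.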
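The key step of the inductive case, that folding over `vh::vt` from `n` reduces to folding over `vt` from `vn'`, can be spot-checked on concrete values. This is only a sketch with sample values chosen by us; it illustrates the step and does not replace the proof.

```ocaml
let rec fold_left f a l =
  match l with
  | [] -> a
  | h :: t -> fold_left f (f a h) t

let () =
  (* Sample values standing in for the arbitrary n, vh and vt. *)
  let n = 5 and vh = 7 and vt = [1; 2; 3] in
  let vn' = (+) n vh in
  (* Both sides evaluate to the same integer, here 18. *)
  assert (fold_left (+) n (vh :: vt) = fold_left (+) vn' vt)
```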