# Substitution Model
This lab focuses on the dynamic semantics given in the lecture notes on
[a core sub-language of OCaml](core-ocaml.html). These rules extend the
rules you've seen in lecture to support pairs, simple variants, and anonymous
functions.
## Evaluation
Here is an example of evaluating an expression:
```
7+5*2
--> (step * operation)
7+10
--> (step + operation)
17
```
There are two steps in that example, and we've annotated each
step with a parenthetical comment to hint at which evaluation
rule we've used. We stopped evaluating when we reached a value.
We could write that evaluation more tersely as `7+5*2 --> 7+10 --> 17`,
or even more tersely as `7+5*2 -->* 17`, but each of those provides
progressively less detail about the evaluation.
##### Exercise: simple expressions [✭]
Evaluate the following expressions using the small-step substitution model.
Use the "long form" of evaluation that we demonstrated above, in which
you provide a hint as to which rule is applied at each step.
- `(3 + 5) * 2` (2 steps)
- `if 2 + 3 < 4 then 1 + 1 else 2 + 2` (4 steps)
□
Here is an example of evaluating a let expression:
```
let x = 2 in x+x
--> (step let expression)
(x+x){2/x} = 2+2
--> (step + operation)
4
```
Here we've first taken one step, which results in a substitution.
We put parentheses around the `x+x` to be clear that the substitution
operation `{2/x}` applies to both of the `x`'s: if we had instead
written `x+x{2/x}` it might be ambiguous as to whether we wanted to
apply the substitution to the `x` on the left-hand side.
We then applied the substitution. That was not a step of evaluation,
because it was not a use of a `-->` rule. So we wrote it on the same
line. Then we took a second step.
##### Exercise: substitution [✭✭]
Evaluate these expressions, again using the "long form" we just
demonstrated. Be very careful to correctly use the definition of substitution
that is provided in the notes. Double check your answers against utop.
- `let x = 2 + 2 in x + x` (3 steps)
- `let x = 5 in ((let x = 6 in x) + x)` (3 steps)
- `let x = 1 in (let x = x + x in x + x)` (4 steps)
□
## Pairs
In OCaml, a pair of values `(v1,v2)` is a value. OCaml does not specify
whether pairs of expressions are evaluated from left to right or from
right to left. The evaluation rules we gave in our core OCaml semantics
evaluates from left to right. It turns out this is actually different
from what the current implementation of OCaml does.
##### Exercise: pairs [✭✭]
Evaluate these:
- `(3 + 5, 2 + 4)` (2 steps)
- `(fst (3,4), snd (3,4))` (2 steps)
- `((if fst (true, 3) then snd (1,2) else 3), (if snd (3,false) then snd (1,2) else 3))` (5 steps)
□
## Variants
A variant value is either the constructor `Left` or `Right` applied to a value.
##### Exercise: variants [✭✭]
Evaluate these:
- `Left (1+2)` (1 step)
- `match Left 42 with Left x -> x+1 | Right y -> y-1` (2 steps)
- (4 steps)
```
match Right (Left (1+2)) with
Left a -> 1+a
| Right a ->
match a with
Left b -> 2+b
| Right b -> 3+b
```
□
## Functions
Function evaluation in the substitution model is very similar to the evaluation
of `let` expressions. Recall that anonymous functions are values, and do
not evaluate any further.
To evaluate a function application,
we first evaluate the function being applied, then we evaluate
the argument, then we substitute the argument into the body of
the function.
Like with pairs, OCaml does not specify whether the function being applied
is evaluated before or after the argument to which it is being applied.
Our core semantics does specify that the function being applied is
evaluated first.
##### Exercise: application [✭✭]
Evaluate these:
- `(fun x -> 3 + x) 2` (2 steps)
- `let f = (fun x -> x + x) in (f 3) + (f 3)` (6 steps)
- `let f = fun x -> x + x in let x = 1 in let g = fun y -> x + f y in g 3` (7 steps)
- `let f = (fun x -> fun y -> x + y) in let g = f 3 in (g 1) + (f 2 3)` (9 steps)
□
## Lists
In lecture, we claimed that `Left` and `Right` were sufficient to code
up more complicated variants. Let's give an example of that now
by coding up lists.
Suppose we treat list expressions like syntactic sugar in the following way:
* `[]` is syntactic sugar for `Left 0`.
* `e1::e2` is syntactic sugar for `Right (e1,e2)`.
##### Exercise: desugar list [✭]
What is the core OCaml expression to which `[1;2;3]` desugars?
□
##### Exercise: list not empty [✭✭]
Write a core OCaml function `not_empty` that returns `1` if a list
is non-empty and `0` if the list is empty. Use the substitution
model to check that your function behaves properly on these
test cases:
- `not_empty []`
- `not_empty [1]`
□
## Additional exercise: pattern matching [✭✭✭]
In core OCaml, there are only two patterns: `Left x` and `Right x`, where
`x` is a variable name. But in full OCaml, patterns are far more general.
Let's see how far we can generalize patterns in core OCaml.
**Step 1:** Here is a BNF grammar for patterns, and slightly revised BNF grammar
for expressions:
```
p ::= i | (p1,p2) | Left p | Right p | x | _
e ::= ...
| match e with | p1 -> e1 | p2 -> e2 | ... | pn -> en
```
In the revised syntax for `match`, only the very first `|` on the line, immediately
before the keyword `match`, is meta-syntax. The remaining four `|` on the line
are syntax. Note that we require `|` before the first pattern.
**Step 2:** A value `v` matches a pattern `p` if by substituting any variables
or wildcards in `p` with values, we can obtain exactly `v`. For
example:
* `2` matches `x` because `x{2/x}` is `2`.
* `Right(0,Left 0)` matches `Right(x,_)` because `Right(x,_){0/x}{Left 0/_}`
is `Right(0,Left 0)`.
Let's define a new ternary relation called `matches`, guided by those examples:
```
v =~ p // s
```
Pronounce this relation as "`v` matches `p` producing substitutions `s`."
Here, `s` is a sequence of substitutions, such as `{0/x}{Left 3/y}{(1,2)/z}`.
There is just a single rule for this relation:
```
v =~ p // s
if v = p s
```
For example,
```
2 =~ x // {2/x}
because 2 = x{2/x}
```
**Step 3:** To evaluate a match expression:
* Evaluate the expression being matched to a value.
* If that expression matches the first pattern, evaluate the expression
corresponding to that pattern.
* Otherwise, match against the second pattern, the third, etc.
* If none of the patterns matches, evaluation is *stuck*: it cannot
take any more steps.
Using those insights, complete the following evaluation rules
by filling in the places marked with `???`:
```
(* This rule should implement evaluation of e. *)
match e with | p1 -> e1 | p2 -> e2 | ... | pn -> en
--> ???
if ???
(* This rule implements moving past p1 to the next pattern. *)
match v with | p1 -> e1 | p2 -> e2 | ... | pn -> en
--> match v with | p2 -> e2 | ... | pn -> en
if there does not exist an s such that ???
(* This rule implements matching v with p1 then proceeding to evaluate e1. *)
match v with | p1 -> e1 | p2 -> e2 | ... | pn -> en
--> ??? (* something involving e1 *)
if ???
```
Note that we don't need to write the following rule explicitly:
```
match v with | -/->
```
Evaluation will get stuck at that point because none of the three other
rules above will apply.
**Step 4:** Double check your rules by evaluating the following expression:
- `match (1 + 2, 3) with | (1,0) -> 4 | (1,x) -> x | (x,y) -> x + y`
## Recursion
One of the evaluation rules for `let` is
```
let x=v in e --> e{v/x}
```
We could try adapting that to `let rec`:
```
let rec x=v in e --> e{v/x} (* broken *)
```
But that rule doesn't work properly, as we see in the following example:
```
let rec fact = fun x ->
if x <= 1 then 1 else x * (fact (x - 1)) in
fact 3
-->
(fun x -> if x <= 1 then 1 else x * (fact (x - 1)) 3
-->
if 3 <= 1 then 1 else 3 * (fact (3 - 1))
-->
3 * (fact (3 - 1))
-->
3 * (fact 2)
-/->
```
We're now stuck, because we need to evaluate `fact`, but it doesn't step.
In essence, the semantic rule we used "forgot" the function value that should
have been associated with `fact`.
A good way to fix this problem is to introduce a new language construct
for recursion called simply `rec`. (Note that OCaml does not have any
construct that corresponds directly to `rec`.) Formally, we extend the
syntax for expressions as follows:
```
e ::= ...
| rec f -> e
```
and add the following evaluation rule:
```
rec f -> e --> e{(rec f -> e)/f}
```
The intuitive reading of this rule is that when evaluating
`rec f -> e`, we "unfold" `f` in the body of `e`. For example,
here is an infinite loop coded with `rec`:
```
rec f -> f
--> (* step rec *)
f{(rec f -> f)/f}
= (* substitute *)
rec f -> f
--> (* step rec *)
f{(rec f -> f)/f}
...
```
Now we can use `rec` to implement `let rec`. Anywhere
`let rec` appears in a program:
```
let rec f = e1 in e2
```
we *desugar* (i.e., rewrite) it to
```
let f = rec f -> e1 in e2
```
Note that the second occurrence of `f` (inside the `rec`) shadows the
first one. Going back to the `fact` example, its desugared version is
```
let fact = rec fact -> fun x ->
if x <= 1 then 1 else x * (fact (x - 1)) in
fact 3
```
##### Exercise: let rec [✭✭✭]
Evaluate the following expression (17 steps, we think,
though it does get pretty tedious).
You may want to simplify your life by writing "F" in place of
`(rec fact -> fun x -> if x <= 1 then 1 else x * (fact (x-1)))`
```
let rec fact = fun x ->
if x <= 1 then 1 else x * (fact (x - 1)) in
fact 3
```
□