Substitution Model

This lab focuses on the dynamic semantics given in the lecture notes on a core sub-language of OCaml. These rules extend the rules you've seen in lecture to support pairs, simple variants, and anonymous functions.

Evaluation

Here is an example of evaluating an expression:

  7+5*2
-->  (step * operation)
  7+10
-->  (step + operation)
  17

There are two steps in that example, and we've annotated each step with a parenthetical comment to hint at which evaluation rule we've used. We stopped evaluating when we reached a value.

We could write that evaluation more tersely as 7+5*2 --> 7+10 --> 17, or even more tersely as 7+5*2 -->* 17, but each of those provides progressively less detail about the evaluation.

Exercise: simple expressions [✭]

Evaluate the following expressions using the small-step substitution model. Use the "long form" of evaluation that we demonstrated above, in which you provide a hint as to which rule is applied at each step.

(3 + 5) * 2 (2 steps)
if 2 + 3 < 4 then 1 + 1 else 2 + 2 (4 steps)

□

Here is an example of evaluating a let expression:

  let x = 2 in x+x
-->  (step let expression)
  (x+x){2/x}  =  2+2
-->  (step + operation)
  4

Here we've first taken one step, which results in a substitution. We put parentheses around the x+x to be clear that the substitution operation {2/x} applies to both of the x's: if we had instead written x+x{2/x} it might be ambiguous as to whether we wanted to apply the substitution to the x on the left-hand side.

We then applied the substitution. That was not a step of evaluation, because it was not a use of a --> rule. So we wrote it on the same line. Then we took a second step.

Exercise: substitution [✭✭]

Evaluate these expressions, again using the "long form" we just demonstrated. Be very careful to correctly use the definition of substitution that is provided in the notes. Double check your answers against utop.

let x = 2 + 2 in x + x (3 steps)
let x = 5 in ((let x = 6 in x) + x) (3 steps)
let x = 1 in (let x = x + x in x + x) (4 steps)

□

Pairs

In OCaml, a pair of values (v1,v2) is a value. OCaml does not specify whether pairs of expressions are evaluated from left to right or from right to left. The evaluation rules we gave in our core OCaml semantics evaluates from left to right. It turns out this is actually different from what the current implementation of OCaml does.

Exercise: pairs [✭✭]

Evaluate these:

(3 + 5, 2 + 4) (2 steps)
(fst (3,4), snd (3,4)) (2 steps)
((if fst (true, 3) then snd (1,2) else 3), (if snd (3,false) then snd (1,2) else 3)) (5 steps)

□

Variants

A variant value is either the constructor Left or Right applied to a value.

Exercise: variants [✭✭]

Evaluate these:

Left (1+2) (1 step)
match Left 42 with Left x -> x+1 | Right y -> y-1 (2 steps)

(4 steps)

match Right (Left (1+2)) with 
    Left a -> 1+a
  | Right a -> 
      match a with 
          Left b -> 2+b
        | Right b -> 3+b

□

Functions

Function evaluation in the substitution model is very similar to the evaluation of let expressions. Recall that anonymous functions are values, and do not evaluate any further.

To evaluate a function application, we first evaluate the function being applied, then we evaluate the argument, then we substitute the argument into the body of the function.

Like with pairs, OCaml does not specify whether the function being applied is evaluated before or after the argument to which it is being applied. Our core semantics does specify that the function being applied is evaluated first.

Exercise: application [✭✭]

Evaluate these:

(fun x -> 3 + x) 2 (2 steps)
let f = (fun x -> x + x) in (f 3) + (f 3) (6 steps)
let f = fun x -> x + x in let x = 1 in let g = fun y -> x + f y in g 3 (7 steps)
let f = (fun x -> fun y -> x + y) in let g = f 3 in (g 1) + (f 2 3) (9 steps)

□

Lists

In lecture, we claimed that Left and Right were sufficient to code up more complicated variants. Let's give an example of that now by coding up lists.

Suppose we treat list expressions like syntactic sugar in the following way:

[] is syntactic sugar for Left 0.
e1::e2 is syntactic sugar for Right (e1,e2).

Exercise: desugar list [✭]

What is the core OCaml expression to which [1;2;3] desugars?

□

Exercise: list not empty [✭✭]

Write a core OCaml function not_empty that returns 1 if a list is non-empty and 0 if the list is empty. Use the substitution model to check that your function behaves properly on these test cases:

not_empty []
not_empty [1]

□

Additional exercise: pattern matching [✭✭✭]

In core OCaml, there are only two patterns: Left x and Right x, where x is a variable name. But in full OCaml, patterns are far more general. Let's see how far we can generalize patterns in core OCaml.

Step 1: Here is a BNF grammar for patterns, and slightly revised BNF grammar for expressions:

p ::= i | (p1,p2) | Left p | Right p | x | _

e ::= ... 
    | match e with | p1 -> e1 | p2 -> e2 | ... | pn -> en

In the revised syntax for match, only the very first | on the line, immediately before the keyword match, is meta-syntax. The remaining four | on the line are syntax. Note that we require | before the first pattern.

Step 2: A value v matches a pattern p if by substituting any variables or wildcards in p with values, we can obtain exactly v. For example:

2 matches x because x{2/x} is 2.
Right(0,Left 0) matches Right(x,_) because Right(x,_){0/x}{Left 0/_} is Right(0,Left 0).

Let's define a new ternary relation called matches, guided by those examples:

v =~ p // s

Pronounce this relation as "v matches p producing substitutions s."

Here, s is a sequence of substitutions, such as {0/x}{Left 3/y}{(1,2)/z}. There is just a single rule for this relation:

v =~ p // s
  if v = p s

For example,

2 =~ x // {2/x}
  because 2 = x{2/x}

Step 3: To evaluate a match expression:

Evaluate the expression being matched to a value.
If that expression matches the first pattern, evaluate the expression corresponding to that pattern.
Otherwise, match against the second pattern, the third, etc.
If none of the patterns matches, evaluation is stuck: it cannot take any more steps.

Using those insights, complete the following evaluation rules by filling in the places marked with ???:

(* This rule should implement evaluation of e. *)
match e with | p1 -> e1 | p2 -> e2 | ... | pn -> en 
--> ???
  if ???

(* This rule implements moving past p1 to the next pattern. *)
match v with | p1 -> e1 | p2 -> e2 | ... | pn -> en 
--> match v with | p2 -> e2 | ... | pn -> en 
  if there does not exist an s such that ???

(* This rule implements matching v with p1 then proceeding to evaluate e1. *)
match v with | p1 -> e1 | p2 -> e2 | ... | pn -> en 
--> ??? (* something involving e1 *)
  if ???

Note that we don't need to write the following rule explicitly:

match v with |  -/->

Evaluation will get stuck at that point because none of the three other rules above will apply.

Step 4: Double check your rules by evaluating the following expression:

match (1 + 2, 3) with | (1,0) -> 4 | (1,x) -> x | (x,y) -> x + y

Recursion

One of the evaluation rules for let is

let x=v in e --> e{v/x}

We could try adapting that to let rec:

let rec x=v in e --> e{v/x}   (* broken *)

But that rule doesn't work properly, as we see in the following example:

  let rec fact = fun x ->
    if x <= 1 then 1 else x * (fact (x - 1)) in
  fact 3

-->

  (fun x -> if x <= 1 then 1 else x * (fact (x - 1)) 3

-->

  if 3 <= 1 then 1 else 3 * (fact (3 - 1))

--> 

  3 * (fact (3 - 1))

-->

  3 * (fact 2)

-/->

We're now stuck, because we need to evaluate fact, but it doesn't step. In essence, the semantic rule we used "forgot" the function value that should have been associated with fact.

A good way to fix this problem is to introduce a new language construct for recursion called simply rec. (Note that OCaml does not have any construct that corresponds directly to rec.) Formally, we extend the syntax for expressions as follows:

e ::= ...
    | rec f -> e

and add the following evaluation rule:

rec f -> e  -->  e{(rec f -> e)/f}

The intuitive reading of this rule is that when evaluating rec f -> e, we "unfold" f in the body of e. For example, here is an infinite loop coded with rec:

  rec f -> f

-->  (* step rec *)

  f{(rec f -> f)/f}

= (* substitute *)

  rec f -> f

--> (* step rec *)

  f{(rec f -> f)/f}

...

Now we can use rec to implement let rec. Anywhere let rec appears in a program:

let rec f = e1 in e2

we desugar (i.e., rewrite) it to

let f = rec f -> e1 in e2

Note that the second occurrence of f (inside the rec) shadows the first one. Going back to the fact example, its desugared version is

let fact = rec fact -> fun x ->
  if x <= 1 then 1 else x * (fact (x - 1)) in
fact 3

Exercise: let rec [✭✭✭]

Evaluate the following expression (17 steps, we think, though it does get pretty tedious). You may want to simplify your life by writing "F" in place of (rec fact -> fun x -> if x <= 1 then 1 else x * (fact (x-1)))

let rec fact = fun x ->
  if x <= 1 then 1 else x * (fact (x - 1)) in
fact 3

□