Lecture 9:
The Substitution Model

We now have a good intuitive grasp of the execution model of an SML program. Today we are going to introduce a formal set of rules that will make our knowledge precise.

We will use the substitution model to achieve four major goals:

It is important to note that our substitution model only addresses a subset of the purely functional part of SML that we have studied. The model does not constitute a complete description of SML, primarily because it does not handle side-effects (of which, except for printing, we don't yet know much). Still, the substitution model is complex enough for us to achieve the four goals listed above, making it one of the most important tools that we will use in this course.

As an aside, let us note that the study of simplified models is essential in many areas of science and engineering. The fact that Einstein developed a model of physical reality that superseded Newton's, for example, did not invalidate the use of Newton's laws in the vast majority of (mechanical) engineering applications, nor did it make their study less relevant.

Our version of the substitution model will target the following subset of SML:

Expressions:

e ::= c                        (* constants *)
    | id                       (* variables *)
    | (fn id => e)             (* anonymous functions *) 
    | e1(e2)                   (* function applications *)
    | u e                      (* unary operations, ~, not, etc. *)
    | e1 b e2                  (* binary operations, +,*,etc. *)
    | (if e then e1 else e2)   (* if expressions *)
    | let d in e end           (* let expressions *)

Declarations:

d ::= val id = e               (* value declarations *)
    | fun id(id1) = e          (* function declarations *)

Values:

v ::= c                        (* constant values *)
    | (fn id => e)             (* anonymous functions *)

Note: In the next lecture we will add one more expression to the list.

Remember that in BNF notation the ::= operator means "is defined as," and the vertical bar | denotes alternatives. Also, note that for reasons of brevity we dropped all type specifications.

The constructs listed above are all well-known by now, but some of them are not as general as "true" SML allows. In particular, we restricted let expressions to have only one declaration, and functions to have only one argument. These restrictions will simplify our rules without significantly affecting the expressivity of the language that we model.
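One way to make the grammar concrete is to encode it as SML datatypes. The following is only a sketch: the constructor names (Const, Var, Fn, App, and so on) and the helper isValue are illustrative assumptions, not part of the lecture, and operators are represented simply as strings.

```sml
(* A hypothetical encoding of the grammar; all names are assumptions. *)
datatype const = Int of int | Bool of bool

datatype exp =
    Const of const                 (* c *)
  | Var   of string                (* id *)
  | Fn    of string * exp          (* fn id => e *)
  | App   of exp * exp             (* e1(e2) *)
  | Unop  of string * exp          (* u e *)
  | Binop of exp * string * exp    (* e1 b e2 *)
  | If    of exp * exp * exp       (* if e then e1 else e2 *)
  | Let   of decl * exp            (* let d in e end *)
and decl =
    Val of string * exp            (* val id = e *)
  | Fun of string * string * exp   (* fun id(id1) = e *)

(* Values are exactly the expressions built from Const or Fn. *)
fun isValue (Const _) = true
  | isValue (Fn _)    = true
  | isValue _         = false

(* The expression (fn x => x) 5 written as a syntax tree. *)
val example = App (Fn ("x", Var "x"), Const (Int 5))
```

In this encoding a value is just an expression of a restricted shape, which mirrors the fact that constants and anonymous functions appear in both the expression and the value grammar.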

Anonymous functions and constants appear twice in the list, both as expressions and as values. You can understand the meaning of this duality if you analyze what happens when you type, say, the integer constant 5 at the SML prompt. SML treats your input as an expression, a very trivial one in this case, and evaluates it.

When evaluated, the expression is "converted" into a value, that of the number 5, which has an internal representation very different from the character 5 you used to specify the original expression. This internal value could be used in further computations, but because your expression was so simple, there is nothing else to do with it. When SML is done evaluating an expression, it converts the value to a suitable external representation, in this case the character 5. While the input and the output look the same, we should keep in mind the fact that the first is an expression, while the second one is (an external representation of a) value.

A similar issue arises with anonymous functions: if you type an anonymous function at the SML prompt, SML will evaluate it to an internal representation of the function, which is a value in SML, and will print out only a simplified representation of this internal value. For lack of a better notation, we represent the value the same way as the input expression, but we remain aware of the subtle difference between them.

We are now going to provide rules for all the expressions and the declarations above:

Rule #E1 [constants]

eval(c) = c

Constants evaluate to themselves (but keep in mind the discussion above).

Rule #E2 [anonymous functions]

eval(fn id => e) = (fn id => e)

Anonymous functions evaluate to themselves (again, the discussion above indicates that the reality is more subtle than this rule seems to imply).

Rule #E3 [function calls]

eval(e1(e2)) = v'  where
  (0) eval(e1) = (fn id => e)
  (1) eval(e2) = v
  (2) substitute([(id,v)],e) = e'
  (3) eval(e') = v'

This is the first complex rule that we encounter. It states, in essence, that there are four steps involved in a function application (or function call). First, we evaluate e1, which must evaluate to an (anonymous) function. Second, we evaluate the argument e2. Third, we replace the free occurrences of the function's parameter in the function body with the value of the argument. Finally, we evaluate the substituted function body. The notation substitute([(id,v)],e) is equivalent to e{v/id}; we can use them interchangeably.

As this rule shows, the function eval relies on recursive calls on subexpressions of the expression that it was originally called on. This is why the substitution model is also called the recursive substitution model.
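The phrase "free occurrences" in step (2) deserves care: an inner fn that rebinds the same identifier shadows the outer one, so occurrences under it must be left alone. The sketch below illustrates this on a cut-down syntax tree. The datatype and the name subst are assumptions made for illustration, and the code assumes the substituted value v contains no free variables of its own (so no renaming is needed).

```sml
(* A cut-down, hypothetical syntax tree: just enough for rule E3. *)
datatype exp =
    Int of int
  | Var of string
  | Fn  of string * exp
  | App of exp * exp

(* subst (id, v) e replaces the free occurrences of id in e with v.
   Assumption: v is closed, so variable capture cannot occur. *)
fun subst (id, v) e =
  case e of
      Int n        => Int n
    | Var x        => if x = id then v else Var x
    | Fn (x, body) =>
        (* An inner binding of the same name shadows id: occurrences
           in this body are not free, so the function is unchanged. *)
        if x = id then e else Fn (x, subst (id, v) body)
    | App (e1, e2) => App (subst (id, v) e1, subst (id, v) e2)

(* x{5/x} = 5 *)
val s1 = subst ("x", Int 5) (Var "x")

(* (fn x => x){5/x} is unchanged, because x is bound by the fn. *)
val s2 = subst ("x", Int 5) (Fn ("x", Var "x"))

(* (fn y => x y){5/x} = (fn y => 5 y) *)
val s3 = subst ("x", Int 5) (Fn ("y", App (Var "x", Var "y")))
```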

We illustrate the application of this rule below; we use a trivial example because we rely only on rules we have already introduced:

eval((fn x => x) 5)
  step 0: eval(fn x => x) = (fn x => x)
  step 1: eval(5) = 5
  step 2: substitute([(x, 5)], x) = 5
  step 3: eval(5) = 5

We have just proven that if we call the identity function with argument 5, we get 5 back. Do not let yourselves be deceived by the simplicity of this example!

Rule #E4 [unary operators]

eval(u e) = v where
  (0) eval(e) = v'
  (1) v = apply_unop(u,v')

To evaluate a unary operator we first evaluate its argument, then we apply the operator to the resulting value. For brevity, we will not detail function apply_unop any further - we will assume that it just works.

Rule #E5 [binary operators]

eval(e1 b e2) = v where
  (0) eval(e1) = v1
  (1) eval(e2) = v2
  (2) v = apply_binop(b,v1,v2)

To evaluate a binary operator we evaluate its first and second operand (argument, really), in this order, then we apply the binary operator to the resulting values. Again, we leave function apply_binop unspecified.

Rule #E6 [if expressions]

eval(if e then e1 else e2) = v' where
  (0) eval(e) = v
  (1) if v = true then v' = eval(e1)
  (2) if v = false then v' = eval(e2)

To evaluate an if expression, we first evaluate the condition. If the condition evaluates to true, then the expression on the then branch is evaluated; if it evaluates to false, then the expression on the else branch is evaluated. No matter what the condition evaluates to, only one branch of the if expression is evaluated.

Note that the conditional expressions in steps (1) and (2) above are not SML expressions - you should think of them as being conditionals expressed in a basic, simpler language, whose semantics you already know. You can even think of these as being written in English. If you assumed that these were SML if expressions, then no evaluation of if would ever finish, as rule E6 would be recursive with no base case (i.e. with no stopping condition). In effect, the rule would then state that "an if is an if which in turn is an if ..."

If we were designing a language, say a dialect of SML, we could change this rule as follows:

eval(if e then e1 else e2) = v' where
  (0) eval(e) = v
  (1) eval(e1) = v1
  (2) eval(e2) = v2
  (3) if v = true then v' = v1
  (4) if v = false then v' = v2

Can you see what problems this modified rule introduces (think in terms of efficiency and exceptions/program termination)?
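One way to experiment with this question is to imitate the modified rule with an ordinary function. By rule E3 the argument of a call is evaluated before the call proceeds, so a function that receives both branches has them both evaluated up front. The names strictIf, safeDiv and unsafeDiv below are hypothetical, introduced only for this sketch.

```sml
(* strictIf is an ordinary function, so both a and b have already been
   evaluated when its body runs, as in the modified rule. *)
fun strictIf (c, a, b) = if c then a else b

(* Built-in if: only the selected branch is evaluated. *)
fun safeDiv (x, y) = if y = 0 then 0 else x div y

(* With strictIf, x div y is evaluated even when y = 0. *)
fun unsafeDiv (x, y) = strictIf (y = 0, 0, x div y)

val ok = safeDiv (10, 0)

(* Record whether the call raised Div instead of returning a number. *)
val raisedDiv = (unsafeDiv (10, 0); false) handle Div => true
```

Here ok is 0, while raisedDiv is true: the branch that should have been skipped raised an exception. The same effect makes recursive functions loop forever, since the recursive call sitting in one branch would be evaluated even when the base case applies.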

Rule #D1 [val declarations]

eval_decl(val p = e) = S where
  (0) eval(e) = v
  (1) create substitution S = [(p, v)]

This is a special rule that applies to val declarations, and one that produces not a value but a substitution. First, we evaluate the expression whose value will be associated with the identifier that appears in the val declaration. Second, we create a substitution that associates identifier p with the value v of expression e.

Rule #E7 [let expressions]

eval(let d in e end) = v where
  (0) eval_decl(d) = S
  (1) substitute(S,e) = e'
  (2) eval(e') = v

The first step in evaluating a let expression is to evaluate its declaration (we know how to handle val declarations; we will discuss fun declarations next time). As a result of evaluating the declaration, we obtain a substitution; we use this to replace all free occurrences in e of the identifier that appears in the substitution. Finally, we evaluate the result of the substitution.

Let us evaluate the following expression by using the substitution model:

let
  val x = 5
in
  let
    val y = x + 3
  in
    (fn u => x + y * u) 7
  end
end

To save space, we write the entire expression onto one line:

eval(let val x = 5 in let val y = x+3 in (fn u => x + y * u) 7 end end)
|
|  E7.0. eval_decl(val x = 5)
|  |
|  |  D1.0. eval(5)
|  |  |
|  |  |  E1. eval(5) = 5
|  |  |
|  |  |eval(5) = 5
|  |
|  |  D1.1. S = [(x, 5)]
|  |
|  | eval_decl(val x = 5) = [(x, 5)]
|
|  E7.1. substitute([(x, 5)], let val y = x + 3 in (fn u => x + y * u) 7 end)
|        = let val y = 5 + 3 in (fn u => 5 + y * u) 7 end
|
|  E7.2. eval(let val y = 5 + 3 in (fn u => 5 + y * u) 7 end)
|  |
|  |   E7.0. eval_decl(val y = 5 + 3) = ... = [(y, 8)]
|  |
|  |   E7.1. substitute([(y, 8)], (fn u => 5 + y * u) 7)
|  |         = (fn u => 5 + 8 * u) 7
|  |
|  |   E7.2. eval((fn u => 5 + 8 * u) 7)
|  |   |
|  |   |  E3.0. eval(fn u => 5 + 8 * u)
|  |   |  |
|  |   |  |  E2. eval(fn u => 5 + 8 * u) = (fn u => 5 + 8 * u)
|  |   |  |
|  |   |  |eval(fn u => 5 + 8 * u) = (fn u => 5 + 8 * u)
|  |   |
|  |   |  E3.1. eval(7)
|  |   |  |
|  |   |  | E1. eval(7) = 7
|  |   |  |
|  |   |  |eval(7) = 7
|  |   |
|  |   |  E3.2. substitute([(u, 7)], 5 + 8 * u)
|  |   |       = 5 + 8 * 7
|  |   |
|  |   |  E3.3. eval(5 + 8 * 7) = ... = 61
|  |   |
|  |   |eval((fn u => 5 + 8 * u) 7)  = 61
|  |
|  |eval(let val y = 5 + 3 in (fn u => 5 + y * u) 7 end) = 61
|
|eval(let val x = 5 in let val y = x + 3 in (fn u => x + y * u) 7 end end) = 61

As you will note, we have skipped some steps. It is clear that fully worked out applications of the substitution model are tedious to write and time-consuming. In our applications of the substitution model we will focus on the important steps only to keep the size of the resulting writeup manageable.

The derivation above emphasizes the recursive structure of the rules, and it provides the rule number and step number wherever applicable.
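The rules introduced so far are mechanical enough to be transcribed into a small SML program. The following is a minimal sketch under several assumptions: the datatype, the Stuck exception and the names applyUnop and applyBinop are illustrative choices, only a handful of operators are supported, Let carries a single val declaration (fun declarations are omitted, since they are discussed next time), and programs are assumed to be closed.

```sml
datatype exp =
    Int   of int
  | Bool  of bool
  | Var   of string
  | Fn    of string * exp
  | App   of exp * exp
  | Unop  of string * exp
  | Binop of exp * string * exp
  | If    of exp * exp * exp
  | Let   of string * exp * exp   (* let val id = e1 in e2 end *)

exception Stuck of string

(* Replace the free occurrences of id in e with the closed value v. *)
fun subst (id, v) e =
  let
    val s = subst (id, v)
  in
    case e of
        Var x           => if x = id then v else e
      | Fn (x, b)       => if x = id then e else Fn (x, s b)
      | App (a, b)      => App (s a, s b)
      | Unop (u, a)     => Unop (u, s a)
      | Binop (a, b, c) => Binop (s a, b, s c)
      | If (c, a, b)    => If (s c, s a, s b)
      | Let (x, a, b)   => Let (x, s a, if x = id then b else s b)
      | _               => e
  end

fun applyUnop ("~", Int n)    = Int (~n)
  | applyUnop ("not", Bool b) = Bool (not b)
  | applyUnop (name, _)       = raise Stuck ("bad unary operator " ^ name)

fun applyBinop ("+", Int a, Int b) = Int (a + b)
  | applyBinop ("-", Int a, Int b) = Int (a - b)
  | applyBinop ("*", Int a, Int b) = Int (a * b)
  | applyBinop ("<", Int a, Int b) = Bool (a < b)
  | applyBinop (name, _, _)        = raise Stuck ("bad binary operator " ^ name)

fun eval e =
  case e of
      Int _  => e                                   (* E1 *)
    | Bool _ => e                                   (* E1 *)
    | Fn _   => e                                   (* E2 *)
    | App (e1, e2) =>                               (* E3 *)
        (case eval e1 of
             Fn (id, body) => eval (subst (id, eval e2) body)
           | _ => raise Stuck "not a function")
    | Unop (u, e1) => applyUnop (u, eval e1)        (* E4 *)
    | Binop (e1, b, e2) =>                          (* E5 *)
        let
          val v1 = eval e1
          val v2 = eval e2
        in
          applyBinop (b, v1, v2)
        end
    | If (c, e1, e2) =>                             (* E6 *)
        (case eval c of
             Bool true  => eval e1
           | Bool false => eval e2
           | _ => raise Stuck "condition is not a boolean")
    | Let (id, e1, e2) =>                           (* D1, then E7 *)
        eval (subst (id, eval e1) e2)
    | Var x => raise Stuck ("free variable " ^ x)

(* let val x = 5 in let val y = x + 3 in (fn u => x + y * u) 7 end end *)
val program =
  Let ("x", Int 5,
    Let ("y", Binop (Var "x", "+", Int 3),
      App (Fn ("u", Binop (Var "x", "+", Binop (Var "y", "*", Var "u"))),
           Int 7)))

val result = eval program
```

Evaluating program follows the same steps as the derivation above, and result is Int 61. Because the If case evaluates only the selected branch, an expression such as If (Bool true, Int 1, Var "z") evaluates to Int 1 even though z is unbound.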