CS 312 Lecture 5
The Substitution Model of Evaluation

In this lecture, we examine more closely how SML programs evaluate, building a more formal and precise description of the evaluation process. This model of evaluation is based on the basic notion of substitution. First, we need to talk about identifiers, because they are the things we substitute for.


The ability to define new identifiers is central to every high-level programming language. Identifiers are the way that programmers refer to constructs they create; different uses of identifiers correspond to the different abstraction mechanisms provided by the programming language. Let's take a closer look at the way that identifiers are used in ML.

In ML, identifiers may refer to variables, functions (which are really just variables), datatypes, datatype constructors, field names, type names introduced using type, type variables (if preceded by ') and to some other things we haven't seen yet: modules and signatures. SML lets you use the same identifier to refer to several of these things. For example, you can have a type named foo and a variable named foo at the same time. When an expression contains the identifier foo, what it refers to depends on how it is used. In the following code, foo names a type (in the first declaration and in the type annotations), a record field (inside the braces and after #), and two different variables. The second variable, declared on the third line, shadows the first, so the final foo in the body refers to the record.

let type foo = int
    val foo: foo = 2
    val foo: {foo: foo} = {foo = foo}
in
  #foo foo
end

Clearly the ability to use names to refer to different kinds of things at the same time can be abused. Avoid writing code that looks like this example!


There are three different ways that one can use an identifier:

  1. A binding occurrence, which binds the identifier to a particular value or type. For example, in the expression let val x:int = 1 in x end, the first occurrence is a binding occurrence that binds x to 1. Each binding occurrence introduces a new variable, and this new variable has a scope: a part of the program in which uses of that identifier refer to the variable. In this case the scope of the variable x is the body of the let expression.
  2. A bound occurrence is a use of a variable in the scope of a variable binding. For each bound occurrence of a variable, there is a single corresponding binding of that variable. For example, in the expression (fn(x:int)=>x),  the second occurrence of x is a bound occurrence; its corresponding binding occurrence is the first occurrence. At run time this variable will be bound to whatever value is passed to the function when it is invoked.
  3. An unbound or free occurrence is a use of an identifier that does not lie in the scope of any corresponding binding occurrence. For example, in the expression let val y:foo = x+1 in y end, the use of x is an unbound occurrence because there is no containing binding of x. The use of foo is also an unbound occurrence, this time of a type identifier. A legal SML program cannot contain an unbound occurrence of an identifier. However, for the purpose of understanding how SML works, sometimes it is useful to write down syntactically legal fragments of SML programs and talk about the unbound variables that occur in them.

Given an occurrence of an identifier that is not a binding occurrence, there is a simple way to figure out whether it is bound or unbound, and if the former, to which binding occurrence the identifier is bound. An identifier is bound if it is in the scope of a binding occurrence. For ML programs, the scope of a variable can be seen by simply looking at the program text. If the variable lies within the scope of more than one binding occurrence, then one of those bindings shadows the rest. It will be the binding occurrence whose scope most tightly encloses the use of the identifier.
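As a small illustration of these rules, the following sketch (the name result is just for this example) marks each occurrence of x in a nested let. The inner binding shadows the outer one only inside the inner body.

```sml
val result =
  let val x = 1            (* binding occurrence of the outer x *)
  in
    (let val x = x + 1     (* binding occurrence of the inner x; the x on the
                              right-hand side is still bound to the outer x *)
     in x * 10 end)        (* bound occurrence, refers to the inner x (2) *)
    + x                    (* bound occurrence, refers to the outer x (1) *)
  end
(* result = 21 *)
```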

In SML it is possible to figure out just by looking at the program code which occurrence binds each use of a variable. A language with this property is said to have lexical (or static) scoping: the scope of each variable is apparent from the lexical form of the program, without knowing anything about how the program runs. The alternative to lexical scoping is dynamic scoping, in which a given variable occurrence may have different binding occurrences depending on how the program runs. In most modern languages, such as Java or C, variables have lexical scope. However, some languages support dynamic scoping; variables declared with local in Perl and variables in early Lisp dialects are examples. Dynamic scope is harder to implement efficiently, and can lead to unpleasant surprises for programmers because variables don't always mean what they expect.
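Lexical scoping can be observed directly at the SML prompt. In this sketch (the names addX and result are made up for the example), the function body refers to the x that was in scope where the function was written, not to the x that happens to be in scope when it is called.

```sml
val x = 1
fun addX (y: int): int = x + y   (* this x is bound by the declaration above *)
val x = 100                      (* a new variable that shadows the old x *)
val result = addX 10
(* result = 11 under lexical scoping; dynamic scoping would have given 110 *)
```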


The SML prompt lets you type either a term or a declaration that binds a variable to a term. It evaluates the term to produce a value: a term that does not need any further evaluation. We can define values v as a syntactic class too. For now, we can think of values as just being the same as constants, though we'll see there is much more to them.

Running an ML program is just evaluating a term. What happens when we evaluate a term? In an imperative (non-functional) language like Java, we sometimes imagine that there is an idea of a "current statement" that is executing. This isn't a very good model for ML; it is better to think of ML programs as being evaluated in the same way that you would evaluate a mathematical expression. For example, if you see an expression like (1+2)*3, you know that you first evaluate the subexpression 1+2, getting a new expression 3*3. Then you evaluate 3*3. ML evaluation works the same way. At each point in time, the ML evaluator takes the left-most expression that is not a value and rewrites (or reduces) it to some simpler expression. Eventually the whole expression is a value and then evaluation stops: the program is done. Or maybe the expression never reduces to a value, in which case you have an infinite loop.

SML has a bunch of built-in rules for rewriting terms that go well beyond simple arithmetic. For example, consider the if expression. It has two important rewrite rules:

if true then e1 else e2   -->  e1
if false then e1 else e2  -->  e2

If the evaluator runs into an if expression, the first thing it does is try to reduce the conditional expression to either true or false. Then it can apply one of the two rules here.

For example, consider the term if 2=3 then "hello" else "good" ^ "bye". This term evaluates as follows:

if 2=3 then "hello" else "good" ^ "bye"
--> if false then "hello" else "good" ^ "bye"
--> "good" ^ "bye"
--> "goodbye"

Notice that the term "good"^"bye" isn't evaluated to produce the string value "goodbye" until the if term is removed. This is because if is lazy about evaluating its then and else clauses. If it weren't lazy, it wouldn't work very well: both branches would be evaluated even though only one result is needed.
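The laziness of if can be checked with a branch that would fail if it were ever evaluated. In this sketch (the names a and b are chosen for the example), 1 div 0 would raise the Div exception, but it sits in the branch that is discarded by the rewrite rule.

```sml
(* The else branch is discarded before it is evaluated, so Div is never raised. *)
val a = if true then 1 else 1 div 0
(* a = 1 *)

(* The example from the text: only the else branch is evaluated. *)
val b = if 2 = 3 then "hello" else "good" ^ "bye"
(* b = "goodbye" *)
```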

Evaluating the let term

The rewrite rule for the let expression introduces a new issue: how to deal with the bound variable. In the substitution model, the bound variable is replaced with the value that it is bound to. Evaluation of the let works by first evaluating all of the bindings. Then those bindings are substituted into the body of the let  expression (the expression in between in...end). For example, here is an evaluation using let :

let val x = 1+4 in x*3 end  -->  let val x = 5 in x*3 end  -->  5*3  -->  15

Notice that the variable x is only substituted once there is a value (5) to substitute. That is, SML eagerly evaluates the binding for the variable. Most languages (e.g., Java) work this way. However, in a lazy language like Haskell, the term 1+4 would be substituted for x instead. This could make a difference if evaluating the term could create an exception, side effect, or an infinite loop.
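Eager evaluation of bindings is visible when the bound term fails. In the following sketch (the names eager and delayed are made up for the example), the first let raises Div even though x is never used in the body. The second wraps the division in an anonymous function, which is already a value, so nothing is evaluated until the function is called.

```sml
(* The binding is evaluated before the body, so Div is raised and handled. *)
val eager = (let val x = 1 div 0 in 7 end) handle Div => ~1
(* eager = ~1 *)

(* An anonymous function is a value, so the division inside it never runs. *)
val delayed = let val x = fn () => 1 div 0 in 7 end
(* delayed = 7 *)
```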

Therefore, we can write the rule for rewriting let roughly as follows:

let val x:t = v in e end  -->  e (with occurrences of x replaced by v)

Remember that we use e to stand for an arbitrary expression (term), x to stand for an arbitrary identifier. We use v to stand for a value -- that is, a fully evaluated term. By writing v in the rule, we indicate that the rewriting rule for let cannot be used until the term bound to x is fully evaluated. Values can be constants, datatype constructors applied to values, tuples of values, or anonymous function expressions. In fact, we can write a grammar for values:

v ::= c  |  X(v)  |  (v1,...,vn)  |  fn x:t => e


When we wrote “with occurrences of x replaced by v”, above, we missed an important but subtle issue. The term e may contain occurrences of x whose binding occurrence is not this binding x:t = v. It doesn't make sense to substitute v for these occurrences. For example, consider evaluation of the expression:

let val x:int = 1
    fun f(x:int) = x
    val y:int = x+1
in
  fn(a:string) => x*2
end

The next step of evaluation replaces the occurrences of x in the declaration of y and in the body of the let with 1, because these occurrences have the first declaration as their binding occurrence. Notice that the two occurrences of x inside the function f, which are respectively a binding and a bound occurrence, are not replaced. Thus, the result of this rewriting step should be:

let fun f(x:int) = x
    val y:int = 1+1
in
  fn(a:string) => 1*2
end

Let's write the substitution e{v/x} to mean the expression e with all unbound occurrences of x replaced by the value v. Then we can restate the rule for evaluating let more simply:

let val x:t = v in e end  -->  e{v/x}

This works because any unbound occurrence of x in e must be bound by exactly this declaration val x:t = v. Here are some examples of substitution:

x{2/x}  =  2
x{2/y}  =  x
(fn(y:int)=>x) {"hi"/x}  =  (fn(y:int)=>"hi")
f(x) { fn(y:int)=>y / f } =  (fn(y:int)=>y)(x)
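The substitution operator can be sketched as an SML function over a toy expression type. Everything here is an assumption made for illustration: the datatype exp, its constructors, and the function subst are hypothetical and cover only a tiny fragment of SML. The sketch also assumes that v has no unbound variables of its own, which holds when v is the value of a complete program fragment.

```sml
(* A hypothetical, tiny expression language (not the real SML syntax tree). *)
datatype exp =
    Num of int                 (* integer constant *)
  | Str of string              (* string constant *)
  | Var of string              (* use of an identifier *)
  | Fn of string * exp         (* fn x => e *)
  | App of exp * exp           (* e1(e2) *)

(* subst (v, x) e computes e{v/x}: e with the unbound occurrences of x
   replaced by v. Assumes v itself contains no unbound variables. *)
fun subst (v: exp, x: string) (e: exp): exp =
  case e of
    Num _ => e
  | Str _ => e
  | Var y => if y = x then v else e
  | Fn (y, body) =>
      (* If the function rebinds x, occurrences of x in its body are bound
         by the parameter, so they are left alone. *)
      if y = x then e else Fn (y, subst (v, x) body)
  | App (f, a) => App (subst (v, x) f, subst (v, x) a)

(* The four examples from the text, plus one with shadowing. *)
val ex1 = subst (Num 2, "x") (Var "x")                       (* Num 2 *)
val ex2 = subst (Num 2, "y") (Var "x")                       (* Var "x" *)
val ex3 = subst (Str "hi", "x") (Fn ("y", Var "x"))          (* Fn ("y", Str "hi") *)
val ex4 = subst (Fn ("y", Var "y"), "f") (App (Var "f", Var "x"))
val ex5 = subst (Num 1, "x") (Fn ("x", Var "x"))             (* unchanged *)
```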

One of the features that sets ML apart is the ability to write complex patterns containing binding occurrences of variables. Pattern matching in ML causes these variables to be bound in such a way that the pattern matches the supplied value. This can be a very concise and convenient way of binding variables. We can generalize the notation used above by writing e{v/p} to mean the expression e with all unbound occurrences of variables appearing in the pattern p replaced by the values obtained when p is matched against v. Using this notation, we can express the let rule simply:

let val p = v in e end  -->  e{v/p}
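Two small examples of the pattern form of the rule, with names (sum, name, and the record fields) invented for illustration. In the first, the right-hand side is first reduced to the value (3, 10); matching it against (a, b) substitutes 3 for a and 10 for b in the body.

```sml
(* (1 + 2, 10) --> (3, 10), then (a + b){(3, 10)/(a, b)} = 3 + 10 *)
val sum = let val (a, b) = (1 + 2, 10) in a + b end
(* sum = 13 *)

(* A record pattern binds f and l to the corresponding field values. *)
val name =
  let val {first = f, last = l} = {first = "Ada", last = "Lovelace"}
  in f ^ " " ^ l end
(* name = "Ada Lovelace" *)
```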

What if a let expression introduces multiple declarations? Such an expression is identical in effect to a series of nested let expressions. Thus, we can use the following rewrite that pulls out the first declaration so the rules above apply.

let d1...dn in e end
  -->  let d1 in let d2...dn in e end end
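For instance, the two expressions in this sketch (the names flat and nested are made up) are related by exactly that rewrite, and both evaluate to the same value.

```sml
(* Two declarations in one let ... *)
val flat = let val x = 2 val y = x + 1 in x * y end

(* ... and the nested form produced by pulling out the first declaration. *)
val nested = let val x = 2 in let val y = x + 1 in x * y end end
(* flat = 6 and nested = 6 *)
```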

Evaluating functions

Function calls are the most interesting case. When a function is called, SML does a similar substitution: it substitutes the values passed as arguments into the body of the function. Suppose we define a function abs as follows:

fun abs(r: real):real =
  if r < 0.0 then ~r else r

We would like the evaluation of abs(2.0+1.0) to proceed roughly as follows:

abs(2.0+1.0)  -->  abs(3.0)  -->  if 3.0 < 0.0 then ~3.0 else 3.0
  -->  if false then ~3.0 else 3.0  -->  3.0

In fact, the fun keyword is really just syntactic sugar for binding an anonymous function. So when we evaluate the declaration of abs above, we are really binding the identifier abs to the value fn (r: real) => if r < 0.0 then ~r else r.
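The two forms can be written side by side. The names abs1 and abs2 are chosen here only to keep both definitions in scope at once; since real is not an equality type in SML, the results are compared with Real.== from the Basis Library.

```sml
(* Declared with fun ... *)
fun abs1 (r: real): real =
  if r < 0.0 then ~r else r

(* ... and as a variable bound to an anonymous function. *)
val abs2 = fn (r: real) => if r < 0.0 then ~r else r

val same = Real.== (abs1 (~2.5), abs2 (~2.5))
(* same = true; both calls return 2.5 *)
```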

Therefore, the evaluation of a function call proceeds as in the following example:

let fun abs(r: real):real =
      if r < 0.0 then ~r else r
in
  abs(2.0 + 1.0)
end
--> (fn (r: real) => if r < 0.0 then ~r else r)(2.0 + 1.0)
    (* replace occurrences of abs in let body with anonymous function *)
--> (fn (r: real) => if r < 0.0 then ~r else r)(3.0)
--> if 3.0 < 0.0 then ~3.0 else 3.0
    (* replace occurrences of r in function body with argument 3.0 *)
--> if false then ~3.0 else 3.0
--> 3.0

We can use the substitution operator to give a more precise rule for what happens when a function is called:

(fn (p) => e)(v)  -->  e{v/p}
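To see the let rule and the function-call rule working together, here is a minimal sketch of a rewriting evaluator for a toy language. It is not how SML is implemented; the datatype exp and the functions isValue, subst, step and eval are all hypothetical names, patterns are restricted to single variables, and substituted values are assumed to have no unbound variables.

```sml
datatype exp =
    Num of int
  | Var of string
  | Add of exp * exp
  | Fn of string * exp            (* fn x => e *)
  | App of exp * exp              (* e1(e2) *)
  | Let of string * exp * exp     (* let val x = e1 in e2 end *)

(* Values: constants and anonymous functions. *)
fun isValue (Num _) = true
  | isValue (Fn _) = true
  | isValue _ = false

(* subst (v, x) e is e{v/x}; occurrences rebound inside e are left alone. *)
fun subst (v, x) e =
  case e of
    Num _ => e
  | Var y => if y = x then v else e
  | Add (a, b) => Add (subst (v, x) a, subst (v, x) b)
  | Fn (y, b) => if y = x then e else Fn (y, subst (v, x) b)
  | App (f, a) => App (subst (v, x) f, subst (v, x) a)
  | Let (y, e1, e2) =>
      Let (y, subst (v, x) e1, if y = x then e2 else subst (v, x) e2)

exception Stuck   (* raised when no rewrite rule applies *)

(* One rewriting step on the leftmost subexpression that is not a value. *)
fun step e =
  case e of
    Add (Num a, Num b) => Num (a + b)
  | Add (a, b) => if isValue a then Add (a, step b) else Add (step a, b)
  | Let (x, e1, e2) =>
      if isValue e1 then subst (e1, x) e2      (* let val x = v in e end --> e{v/x} *)
      else Let (x, step e1, e2)
  | App (Fn (x, body), a) =>
      if isValue a then subst (a, x) body      (* (fn x => e)(v) --> e{v/x} *)
      else App (Fn (x, body), step a)
  | App (f, a) => if isValue f then raise Stuck else App (step f, a)
  | _ => raise Stuck

(* Keep rewriting until the whole expression is a value. *)
fun eval e = if isValue e then e else eval (step e)

(* (fn r => r + r)(2 + 1) --> (fn r => r + r)(3) --> 3 + 3 --> 6 *)
val call = eval (App (Fn ("r", Add (Var "r", Var "r")), Add (Num 2, Num 1)))
(* let val x = 1 + 4 in x + x end --> let val x = 5 in x + x end --> 5 + 5 --> 10 *)
val bind = eval (Let ("x", Add (Num 1, Num 4), Add (Var "x", Var "x")))
```

The argument of a call is stepped until it is a value before the substitution happens, which mirrors the eager evaluation described for let.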

Some caveats

This is a model for how SML evaluates. The truth is that SML terms are compiled into machine code that executes much more efficiently than rewriting would. But that is much more complex to explain, and not that important for our purposes. The goal here is to allow us as programmers to understand what the program is going to do. We can do that much more clearly in terms of term rewriting than by thinking about the machine code, or for that matter in terms of the transistors in the computer and the electrons inside them. This evaluation model is an abstraction that hides complexity you don't need to know about. Understanding how programs execute in terms of these lower levels of abstraction is the topic of other courses, like CS 314 and CS 412.

Some aspects of the model should not be taken too literally. For example, you might think that function calls take longer if an argument variable appears many times in the body of the function. It might seem that calls to function f are faster if it is defined as fun f(x) = x*3 rather than as fun f(x) = x+x+x because the latter requires three substitutions. Actually the time required to pass arguments to a function typically depends only on the number of arguments. Chances are the second definition (x+x+x) is at least as fast as the first (x*3).

The model as given also has one significant weakness: it doesn't explain how recursive functions work. The problem is that a function is in scope within its own body. Mutually recursive functions are also problematic, because each mutually recursive function is in scope within the bodies of all the others.

A way to understand this is that a recursive function is an infinite unfolding of the original definition. For example, in a function for computing factorials,

fun fact(n:int):int = if n = 0 then 1 else n*fact(n-1)

We can think of fact as being bound to an infinite anonymous function:

fn n => if n = 0 then 1 else
    n*(fn n => if n = 0 then 1 else
        n*(fn n => if n = 0 then 1 else
            ...

It's probably easiest to think of it not as an anonymous function that has been infinitely unrolled like this, but as one that contains a pointer to itself, which expands out into the same full anonymous function whenever it is used.
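SML has a declaration form that makes this self-reference explicit: val rec binds an identifier to an anonymous function whose body may mention that same identifier. A recursive fun declaration behaves like the corresponding val rec. The sketch below uses the name fact' only to avoid clashing with fact from the text.

```sml
(* The occurrence of fact' in the body refers back to the function being
   defined, playing the role of the "pointer to itself". *)
val rec fact' = fn (n: int) => if n = 0 then 1 else n * fact' (n - 1)

val f5 = fact' 5
(* f5 = 120 *)
```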

For a fuller treatment of recursive definitions, take CS 411 or 611.