The Substitution Model

Note: A more complete version of this handout here.
Expressions: e ::=  	  
  c 					(* constants *)
| id 					(* variables *)
| (fn (id:t) => e)			(* anonymous functions *) 
| e1(e2) 				(* function applications *)
| u e					(* unary operations, ~, not, etc. *)
| e1 b e2				(* binary operations, +,*,etc. *)
| (if e then e1 else e2) 		(* if expressions *)
| let d in e end			(* let expressions *)
| (fun id(id1:t1):t2 = e)		(* recursive functions *)
Declarations: d ::=
  val id = e				(* value declarations *)
| fun id(id1:t1):t2 = e			(* function declarations *)
Values: v ::=
  c					(* constant values *)
| (fn (id:t):t' => e)			(* anonymous functions *)

Rule #E1 [constants]:  constants evaluate to themselves

eval(c) = c

Rule #E2 [functions]:  anonymous functions evaluate to themselves

eval(fn (id:t) => e) = (fn (id:t) => e)

Rule #E3 [function calls]:  to evaluate e1(e2),  evaluate e1 to a function (fn (id:t) => e), then evaluate e2 to a value v, then substitute v for the formal parameter id within the body of the function e to yield an expression e'.  Finally, evaluate e' to a value v'.  The result is v'.

eval(e1(e2)) = v'  where
  (0) eval(e1) = (fn (id:t) => e)
  (1) eval(e2) = v
  (2) substitute([(id,v)],e) = e'
  (3) eval(e') = v'

Rule #E4 [unary ops]: to evaluate u e where u is a unary operation such as not or ~, evaluate e to a value v', then perform the appropriate unary operation on the value v'  to get the result v.

eval(u e) = v where
  (0) eval(e) = v'
  (1) v = apply_unop(u,v')

Rule #E5 [binary ops]:  to evaluate e1 b e2 where b is a binary operation such as +, *, -, etc. evaluate e1 to a value v1, then evaluate e2 to a value v2, then perform the appropriate operation on the values v1 and v2 to get the result v.

eval(e1 b e2) = v where
  (0) eval(e1) = v1
  (1) eval(e2) = v2
  (2) v = apply_binop(b,v1,v2)

Rule #E6 [if]: to evaluate (if e then e1  else e2), evaluate e to a value v.  Then depending on the (boolean) value of v, the value is either the result of evaluating e1 or e2.

eval(if e then e1 else e2) = v' where

(0) eval(e) = v
(1) if v = true then v' = eval(e1)
(2) if v = false then v' = eval(e2)

Rule #E7 [let]: to evaluate let d in e end, evaluate the declaration d to get a substitution S.  Perform the substitution S on e yielding a new expression e'.  Then evaluate e' to get the final answer.

eval(let d in e end) = v where
  (0) eval_decl(d) = S
  (1) substitute(S,e) = e'
  (2) eval(e') = v

Rule #E8 [fun expressions]: to evaluate a recursive function expression (fun f(x:t):t' = e), we "unroll" the function once.  In other words, we return the anonymous function (fn (x:t) => e') where e' is the result of substituting (fun f(x:t):t' = e) for the identifier f within the body e. 

eval(fun f(x:t):t' = e) = (fn (x:t):t' => e') where
  substitute([f,(fun f(x:t):t' = e)],e) = e'. 

Rule #D1[val declarations]:  to evaluate a declaration val id = e, evaluate e to a value v and match v against the identifier id to yield a substitution S.  The substitution S is the result of the declaration.

eval_decl(val p = e) = S where
  (0) eval(e) = v
  (1) match(v,p) = S

Rule #D2[fun declarations]: to evaluate a function declaration fun f(x:t):t' =e, return the substitution that maps f to the function expression fun f(x:t):t'=e.

eval_decl(fun f(x:t):t' = e) = [(f,fun f(x:t):t'=e)]


Recall that a substitution is a finite map from identifiers (variables) to expressions.  We represent the substitution as a list of identifiers and expressions (i.e., [(x1,e1),(x2,e2),...,(xn,en)]).  We can perform a substitution S = [(x1,e1),(x2,e2),...,(xn,en)] on an expression or declaration by first substituting e1 for x1, then e2 for x2, ..., then xn for en.  That is, substitute([],e) = e and substitute((x,e')::rest,e) = substitute(rest, subst(x,e',e)), where we define subst as below.  The definition of subst needs an auxiliary function vars(p) which returns the set of variables that occur in a pattern.

subst(x,e,c) = c
subst(x,e,y) = e when x = y 
subst(x,e,y) = y when x <> y
subst(x,e,(fn (y:t) => e')) = (fn (y:t) => subst(x,e,e')) when x <> y
subst(x,e,(fn (y:t) => e')) = (fn (y:t) => e') when x = y
subst(x,e,e1(e2)) = (subst(x,e,e1))(subst(x,e,e2))
subst(x,e,u e') = u (subst(x,e,e'))
subst(x,e,e1 b e2) = (subst(x,e,e1)) b (subst(x,e,e2))
subst(x,e,(e1,...,en)) = (subst(x,e,e1),...,subst(x,e,en))
subst(x,e,#i e') = #i (subst(x,e,e'))
subst(x,e,{lab1=e1,...,labn=en}) = {lab1=subst(x,e,e1),...,labn=subst(x,e,en)}
subst(x,e,#lab e') = #lab (subst(x,e,e'))
subst(x,e,Id) = Id
subst(x,e,Id(e')) = Id(subst(x,e,e')
subst(x,e,if e1 then e2 else e3) = 
  if subst(x,e,e1) then subst(x,e,e2) else subst(x,e,e3))
subst(x,e,let val p = e1 in e2 end) = 
  let val p = subst(x,e,e1) in e2' end
    where     e2' = e2 if x is in vars(p)
    otherwise e2' = subst(x,e,e2)
subst(x,e,let fun f(y:t):t'=e1 in e2 end) = 
  let fun f(y:t):t'= e1' in e2' end
    where     e1' = e1 if x = f or x = y
    otherwise e1' = subst(x,e,e1) and
    where     e2' = e2 if x = f
    otherwise e2' = subst(x,e,e2) 
subst(x,e,fun f(y:t):t'=e1) = fun f(y:t):t'=e1'
    where     e1' = e1 if x = f or x = y
    otherwise e1' = subst(x,e,e1)

Intuitively we push the substitution down, replacing all free occurrences of x (i.e., uses of x) with the new expression e. Note that we stop pushing the substitution down when we encounter a binding occurrence of the same variable (i.e., a new definition of x that shadows the old one that we're substituting for.)  For instance, in (fn (x:t) => e'), x is a binding occurrence and shadows all older definitions of x.  Similarly, in (fun f(x:t):t' = e), both f and x are binding occurrences, and in a pattern, a variable is a binding occurrence.  All other occurrences of variables are said to be free occurrences.  


In general, we evaluate expressions in a left-to-right, inner-most to outer-most fashion.  

eval takes an expression and yields a value.

eval_decl takes a declaration and yields a substitution.

match takes an identified and a pattern and yields a substitution.

A substitution is a mapping from identifiers to expressions.  We represent the substitution as a list of identifiers and expressions here.

SML doesn't really have recursive function expressions, only recursive function declarations.  We need recursive function expressions to explain how things evaluate (not to write code.)

Most of the other SML expression forms are derived forms.  For instance:

   let d1 d2 ... dn in e end ::= 
     let d1 in let d2 in ... in let dn in e end ... end end

Thus let-expressions with more than one declaration can be treated as a sequence of nested let's with one declaration.  It's possible to also encode mutually-recursive functions, but we won't deal with this for now.

The substitution model presented here does not take into account:

However, it covers a relatively large portion of the language and is extremely simple.

CS312 home  2002 Cornell University Computer Science