CS 312 Lecture 16
Environment Model of Evaluation

Before we introduced mutable data, the substitution model was a good way of reasoning about the values produced by SML programs.  The substitution model is based on the notion of substituting equivalent expressions for one another, much like simplification of algebraic expressions.  With mutable data, however, we can no longer depend on a given expression always having the same value.  For instance, f(3) might return a different value each time it is called, say by keeping local state, adding its argument to an internal state variable, and returning the result.  We therefore introduce the environment model to represent the evaluation of expressions that involve mutable data.
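As a minimal sketch of the kind of function described above (the names total, first and second are illustrative and not part of the lecture), the following f keeps a hidden ref cell and adds its argument to it on every call, so two textually identical calls produce different values:

```sml
(* total is private state shared by every call to f *)
local
  val total = ref 0
in
  (* add n to the hidden state and return the updated state *)
  fun f (n: int) : int = (total := !total + n; !total)
end

val first = f 3    (* state goes from 0 to 3 *)
val second = f 3   (* state goes from 3 to 6 *)
```

Under the substitution model we would expect first and second to be equal; here they are 3 and 6.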

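There are three key constructs in the environment model, all of which have to do with determining the values of variable names (identifiers):

  1. A binding associates an identifier with a value.
  2. A frame is a set of bindings, together with a pointer to a parent frame (the frame TOP has no parent).
  3. An environment is a sequence of frames, starting at some frame and following parent pointers up to TOP.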

In modeling the execution of an SML program, the evaluation of each expression is done with respect to some particular environment, which governs the bindings of the identifiers in that expression.  The binding of an identifier in a given environment is determined by finding the first frame in the environment (sequence of frames) that specifies a binding for that identifier.  That is, starting with a frame, if it has no binding for the name then the parent frame is checked, and so on up to TOP. As soon as a binding is found its value is the value of the identifier.  If no frame contains a binding then the variable is unbound.
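The lookup rule can be sketched in SML itself.  The representation below is an assumption made only for illustration: a frame is a list of (name, value) pairs restricted to int values, and an environment is a list of frames with the innermost frame first and TOP last.  The names lookupInFrame, lookup, top and env are hypothetical.

```sml
(* Assumed representation: a frame is a list of bindings, and an
   environment lists its frames innermost first, ending with TOP. *)
type frame = (string * int) list
type env = frame list

(* Search a single frame for a binding of name. *)
fun lookupInFrame (name: string, fr: frame) : int option =
  case fr of
    [] => NONE
  | (n, v) :: rest => if n = name then SOME v else lookupInFrame (name, rest)

(* Check the first frame; if it has no binding, continue with its parents.
   NONE means the identifier is unbound. *)
fun lookup (name: string, e: env) : int option =
  case e of
    [] => NONE
  | fr :: parents =>
      (case lookupInFrame (name, fr) of
         SOME v => SOME v
       | NONE => lookup (name, parents))

val top : frame = [("x", 1), ("z", 7)]
val env : env = [[("y", 2)], [("x", 3)], top]

val foundX = lookup ("x", env)   (* SOME 3: the nearer binding of x wins *)
val foundZ = lookup ("z", env)   (* SOME 7: found only in TOP *)
val foundW = lookup ("w", env)   (* NONE: unbound *)
```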

There is always one current or active environment, which is the environment corresponding to the expression that is currently being evaluated.  There are generally many environments, of which the current environment is just one.  These environments form a tree structure, with the frame TOP as the root of the tree.  TOP contains the bindings for all the built-in names that are accessible in the top-level read-eval-print loop (such as foldl, +, ...).

A new frame is created whenever a function is applied (called) and whenever a let expression is evaluated.  We will first consider let expressions.  We limit ourselves to let expressions that bind a single identifier.  A let expression binding more than one identifier is expanded into a nested set of let expressions, one per identifier, as discussed in recitation.

To evaluate the expression let val x = e1 in e2 end:

  1. Evaluate e1 in the current environment.
  2. Extend the environment with a new frame that binds x to the value of e1.
  3. Evaluate e2 in this extended environment.
  4. Restore the environment that was current in step 1 (removing the new frame from step 2)
  5. Return the value of e2
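A small sketch of steps 2 through 4 (the name result is illustrative): the inner let adds a frame that binds x to 4, and once its body has been evaluated that frame is no longer part of the current environment, so the final x refers to the outer binding again.

```sml
val result =
  let
    val x = 10
  in
    (* the inner let evaluates x + 1 in a frame where x is 4 *)
    (let val x = 4 in x + 1 end)
    (* after the inner let the environment is restored, so x is 10 here *)
    + x
  end
```

Here result is 5 + 10 = 15.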

Consider the following simple example:

let val x = (4, ref 3) in #1 x end

We will denote the TOP level environment by a double box.  Since this expression is being evaluated at top level, the new frame in step 2 has TOP as its parent.  That frame binds x to the pair of 4 and a reference to 3.  Then at the time that e2 is evaluated, the current environment is the one specified by this new frame:

Thus the value of #1 x is 4 in this environment.  Once that expression is evaluated, the current environment returns to being the parent of that environment (TOP in this case).

Applying (or calling) a function also creates a new environment, again by adding a frame to an existing environment.  However, unlike a let expression, which extends the environment where the let expression is evaluated (the current environment), function application extends the environment in which the function was defined (not the current environment where the call is happening).  Thus we need to represent a function object in a manner that enables us to keep track of the environment where the function was defined.  This representation is commonly referred to as a closure, which is composed of the function text (the parameters and the code) together with a pointer to the environment that was current at the time the function was defined.

Consider the following simple example,

let
  val x = 3
  val f = fn y: int => x
in
 f
end

As noted above, this is equivalent to two nested let expressions:

let
  val x = 3
in
  let
    val f = fn y: int => x
  in
   f
  end
end

We know from above that each of these let expressions creates a frame that extends the current environment.  The definition of the function f creates a closure that points to the current environment where the function was defined, namely the environment where the expression e1 of the inner let was evaluated (TOP extended by a single frame that binds x to 3). This is illustrated below:

A new environment is created when a function is applied, not when it is defined.  The rule for function application is:

To evaluate a function application e1(e2):

  1. Evaluate e1 in the current environment; it must produce a closure as a value.
  2. Evaluate e2 in the current environment
  3. Save the current environment
  4. Retrieve the environment from the closure
  5. Extend the retrieved environment with a frame that binds the formal argument to the value of e2
  6. Evaluate the function body in the extended environment
  7. Restore the environment saved in step 3
  8. Return the result
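The let rule and the application rule can be sketched together as a tiny evaluator.  This is an illustration under stated assumptions, not the way SML itself is implemented: the types exp and value, the function eval, and the constructor names are all hypothetical; values are limited to integers and closures; and every frame holds a single binding, so an environment is simply a list of bindings with the innermost first.  Because eval receives the environment as an argument, saving and restoring the current environment (steps 3 and 7) happens implicitly when a recursive call to eval returns.

```sml
datatype exp =
    Num of int
  | Var of string
  | Fn of string * exp            (* fn param => body *)
  | App of exp * exp              (* e1 (e2) *)
  | Let of string * exp * exp     (* let val x = e1 in e2 end *)

datatype value =
    IntV of int
  | Closure of string * exp * env (* parameter, body, defining environment *)
withtype env = (string * value) list

exception Unbound of string
exception NotAFunction

(* The first binding found, searching from the innermost frame, wins. *)
fun lookup (name, []) = raise Unbound name
  | lookup (name, (n, v) :: parent) =
      if n = name then v else lookup (name, parent)

fun eval (Num n, _) = IntV n
  | eval (Var x, env) = lookup (x, env)
    (* defining a function only records the current environment *)
  | eval (Fn (param, body), env) = Closure (param, body, env)
    (* let extends the current environment *)
  | eval (Let (x, e1, e2), env) =
      let val v1 = eval (e1, env)
      in eval (e2, (x, v1) :: env) end
    (* application extends the environment stored in the closure *)
  | eval (App (e1, e2), env) =
      (case eval (e1, env) of
         Closure (param, body, defEnv) =>
           let val arg = eval (e2, env)
           in eval (body, (param, arg) :: defEnv) end
       | _ => raise NotAFunction)

(* let val x = 3 in let val f = fn y => x in
     let val x = 5 in f x end end end *)
val program =
  Let ("x", Num 3,
    Let ("f", Fn ("y", Var "x"),
      Let ("x", Num 5, App (Var "f", Var "x"))))

val result = eval (program, [])
```

The result is IntV 3: the argument x is looked up in the calling environment and is 5, but the body of f is evaluated in the closure's environment, where x is 3.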

Let's turn to a simple function application example, which slightly extends the previous let example by binding x a second time.  The code then applies the function f to x.  Note that there are now two variables named x, and the environment model lookup rule causes the inner declaration of x to shadow, or hide, the outer one in the body of the let.  The closure for f, however, still points to the environment in which x is 3, so f x evaluates to 3 even though the argument passed is 5.

let
  val x = 3
  val f = fn y: int => x
  val x = 5
in
 f x
end

The following diagram illustrates the environment at the point that the function body is evaluated (step 6):

Recursive functions are handled a bit differently.  Note that in the previous example the identifier f is not bound in the environment where the body of the function is evaluated (only y and the binding of x to 3 are part of that environment).  Thus if we tried to make a recursive call to f, looking up the identifier f would result in an error.  Recursive functions are defined using fun (or val rec).  In this case, the closure points to the frame where the new identifier is being bound, rather than to the parent of that frame.

Consider the following definition of the recursive function fact:

let
  fun fact(n:int) = if n = 0 then 1
                    else n * fact(n-1)
in
 fact 3
end

This creates a frame that binds the name fact to a closure, where the closure points to that same frame (rather than to the parent frame, as for the anonymous function considered above).  Then the body of the let expression is evaluated.  Each recursive call to fact creates a frame, and each of these frames points to the frame where the identifier fact is bound, not to the frame of the calling invocation, because a frame's parent is the environment where the function was defined.  There is still a flow of control, however: intermediate values need to be passed back through the recursive computation.  One way to keep track of this is to use dotted arrows to note which environment is in effect once a function application returns a value.
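For comparison, here is a sketch of the same function written with val rec, which the notes mention as the alternative to fun (the names factRec and six are illustrative).  The rec keyword is what makes the closure point to the frame that binds the function's own name; without it, the occurrence of the name inside the body would be looked up in the enclosing environment and, at top level, would be reported as unbound.

```sml
(* rec puts factRec in scope inside its own body *)
val rec factRec = fn (n: int) =>
  if n = 0 then 1 else n * factRec (n - 1)

val six = factRec 3   (* 3 * 2 * 1 * 1 *)
```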

Contrast the above use of true recursion with the following recursive function that uses refs to access the appropriate closure (function object):

let
  val x = ref (fn x: int => 1)
  val fact = fn (n:int) =>
                if n = 0 then 1
                else n * (!x)(n-1)
  val () = x := fact
in
 fact 3
end

The third val in the let is used for the effect of the := operator, not for its value.  The two closures (function objects) created here result from anonymous functions; that is, they are analogous to the initial case considered above, where the closure points to the parent frame, not to the frame where the name is defined.  Work through this example to check that you understand how it computes the same result as the more standard definition of factorial above.
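One way to try it is to observe the function both before and after the assignment.  The sketch below restates the example at top level with illustrative names (cell, factViaRef, beforePatch and afterPatch are not from the lecture).  Before the assignment, the ref still holds the placeholder function, so the "recursive" call returns 1 immediately; after the assignment, the call through the ref reaches factViaRef itself.

```sml
(* placeholder function stored in the ref cell *)
val cell = ref (fn (_: int) => 1)

(* not recursive by name: the "recursive" call goes through the ref *)
val factViaRef = fn (n: int) =>
  if n = 0 then 1 else n * (!cell) (n - 1)

val beforePatch = factViaRef 3   (* 3 * placeholder(2) = 3 * 1 *)

val () = cell := factViaRef      (* now the ref holds factViaRef *)

val afterPatch = factViaRef 3    (* 3 * 2 * 1 * 1 *)
```

Here beforePatch is 3 and afterPatch is 6, which matches fact 3 from the earlier definition.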