Environments and Closures

Consider our stack-based rewriting rules for the lambda calculus:

  e ::= i | x | \x.e | e1 e2
  v ::= i | \x.e
  F ::= [] e | v [] | 
  S ::= nil | F::S

  1. (F::S, i) => (S, F[i])

  2. (F::S, \x.e) => (S, F[\x.e])

  3. (S, v e) => ((v [])::S, e)          (e2 not a value)

  4. (S, e1 e2) => (([] e2)::S, e1)      (e1 not a value)

  5. (S, (\x.e) v) => (S, e[v/x])

Note that for a well-typed program (i.e., closed program), we
never have to deal with variables.  This is because via rule
5, we're always substituting a value for a variable.  Recall
that the rules for substitution look like this:

  i[v/x] = i
  x[v/x] = v
  y[v/y] = y   (y != x)
  (e1 e2)[v/x] = (e1[v/x]) (e2[v/x])
  (\x.e)[v/x] = \x.e
  (\y.e)[v/x] = \y.(e[v/x])

So that last rule is pretty expensive.  It requires that we construct
a whole new expression that is the same as e, except that we've
substituted v for x.  It seems silly to do this substitution and
construct a whole new program.  Wouldn't it be better to use some sort
of data structure to associate values with variables and then, if we
encounter a variable, just look up the associated value?  That is,
wouldn't it be better to just perform the substitution lazily?  

The answer is "yes!" but you have to be careful.  The naive approach,
where we have a single top-level store that maps variables to values
does not work out.  Let's see why.  First, let's change our configurations
so they are of the form (S, E, e) where as before, S is a stack, 
and e is an expression to be evaluated, and here, E is an environment---
a mapping from variables to values.  

  E in Env : Var -> Value

Naively, we might think that we could use the following rewriting
rules:

  0'. (S, E, x) => (S, E, E(x))
  
  1'. (F::S, E, i) => (S, E, F[i])

  2'. (F::S, E, \x.e) => (S, E, F[\x.e])

  3'. (S, E, v e) => ((v [])::S, E, e)          (e2 not a value)

  4'. (S, E, e1 e2) => (([] e2)::S, E, e1)      (e1 not a value)

  5'. (S, E, (\x.e) v) => (S, E[x->v], e)

The only differences with the previous rules are (a) we're passing
around an environment, (b) when we apply a function, we extend the
environment to map the formal variable to the actual value argument,
and (c) when we encounter a variable, we look up its value in the
environment.

Looks fine, no?  But consider this expression:

  (\x.((\z.x) ((\x.x) 3))) 2

If you use the first set of rules, this reduces to 2:

  (nil, (\x.((\z.x) ((\x.x) 3))) 2) =>    (via rule 5 and defn of substitution)

  (nil, (\z.2) ((\x.x) 3)) =>             (via rule 3)

  ((\z.2 [])::nil, (\x.x) 3) =>           (via rule 5 and defn of substitution)

  ((\z.2 [])::nil, 3) =>                  (via rule 1)

  (nil, (\z.2) 3) =>                      (via rule 5 and defn of substitution)

  (nil, 2)

If you use the second set of rules, this reduces to 3:

  (nil, {}, (\x.((\z.x) ((\x.x) 3))) 2) =>  (via rule 5')

  (nil, [x->2], (\z.x) ((\x.x) 3)) =>       (via rule 3')

  ((\z.x [])::nil, [x->2], (\x.x) 3) =>     (via rule 5')   ***

  ((\z.x [])::nil, [x->3], x) =>            (via rule 0')

  ((\z.x [])::nil, [x->3], 3) =>            (via rule 1')

  (nil, [x->3], (\z.x) 3) =>                (via rule 5')

  (nil, [x->3,z->3], x) =>                  (via rule 0')

  (nil, [x->3,z->3], 3)

The line marked with *** indicates where we seem to have gone wrong.
The problem appears to be that when entering a nested expression,
we overwrote the environment for the outer expression.  It's true
that within (\x.x) 3 we want to map x to 3.  But we don't want to
map the other x that corresponds to 2 to this value.  

You might think that the solution is to save and restore the environment
on the stack.  That is, define stacks to be a list of pairs of a frame 
and an environment:

  S ::= nil | (F,E)::S

and then change rules as follows:

  0''. (S, E, x) => (S, E, E(x))
  
  1''. ((F,E)::S, E', i) => (S, E, F[i])

  2''. ((F,E)::S, E', \x.e) => (S, E, F[\x.e])

  3''. (S, E, v e) => ((v [],E)::S, E, e)          (e2 not a value)

  4''. (S, E, e1 e2) => (([] e2,E)::S, E, e1)      (e1 not a value)

  5''. (S, E, (\x.e) v) => (S, E[x->v], e)

Now our example seems to go through:

  (nil, {}, (\x.((\z.x) ((\x.x) 3))) 2) =>  (via rule 5'')

  (nil, [x->2], (\z.x) ((\x.x) 3)) =>       (via rule 3')

  ((\z.x [],[x->2])::nil, [x->2], (\x.x) 3) =>     (via rule 5'')

  ((\z.x [],[x->2])::nil, [x->3], x) =>            (via rule 0'')

  ((\z.x [],[x->2])::nil, [x->3], 3) =>            (via rule 1'')

  (nil, [x->2], (\z.x) 3) =>                (via rule 5'')

  (nil, [x->2,z->3], x) =>                  (via rule 0'')

  (nil, [x->2,z->3], 2)

So, that seems to fix the problem and it is better than the previous
approach, but we still have a problem:

  let x = 2 in 
  let f = \z.x in
  let x = 3 in
  f 0

which is equivalent to:

  (\x.(\f.(\x.f 0) 3) (\z.x)) 2

This program should return 2 (try it out for yourself) but using the
rules above, it returns 3:

  (nil, {}, (\x.(\f.(\x.f 0) 3) (\z.x)) 2) => 

  (nil, [x->2], (\f.(\x.f 0) 3) (\z.x)) =>     ***

  (nil, [x->2,f->\z.x], (\x.f 0) 3) => 

  (nil, [f->\z.x,x->3], f 0) => 

  (([] 0,[f->\z.x,x->3])::nil, [f->\z.x,x->3], f) => 

  (([] 0,[f->\z.x,x->3])::nil, [f->\z.x,x->3], \z.x) => 

  (nil, [f->\z.x,x->3], (\z.x) 0) => 

  (nil, [f->\z.x,x->3,z->0], x) => 

  (nil, [f->\z.x,x->3,z->0], 3)

The problem here occurs when we enter f->\z.x in the environment.
At that point in time, x maps to 2, so applying f to some argument
should always return 2.  But later on, we overwrite the binding for
x to map it to 3.  

What's going on here?  To see the problem, note that what
we're really trying to do is take a configuration:

  (S, E, e)

and force it to evaluate the same way as:

  (S, subst(E,e))

where subst(E,e) is a multi-substitution:

  subst(E,i) = i
  subst(E,x) = E(x)
  subst(E,e1 e2) = (subst(E,e1)) (subst(E,e2))
  subst(E,\x.e) = \v.subst(E[x->v],e)

Now, if we look at what is happening, we see that when we get to a
lambda-expression, we're forgetting to push in the substitution!  So,
when we call the function, we've got the caller's environment lying
around instead of the substitution that was in effect when the
lambda was evaluated.  

The solution is to make it possible so that, if we ever call
a function, we evaluate the body of the function in *its*
environment instead of the current environment.  That means
that we need to save the function's environment, along with
the lambda-expression, so that the environment will be available
when we call the function.

The pair of a lambda expression and an environment is called
a *closure* (because the environment provides bindings for the
free variables of the lambda expression so it closes off the
expression.)  

To achieve this, we need to change the definition of values
to accomodate closures:

  v ::= i | <E,\x.e>

and we need to allow closures to show up in expressions:

  e ::= ... | <E,\x.e>

Note that lambda-expressions are no longer values -- rather,
we have to pair up a lambda with its environment to create
a closure value.

The rewriting rules become:

  (S, E, \x.e) => (S, E, <E,\x.e>)

  (S, E, x) => (S, E, E(x))

  ((F,E)::S, E', v) => (S, E, F[v])

  (S, E, v e) => ((v [],E)::S, E, e)          (e2 not a value)

  (S, E, e1 e2) => (([] e2,E)::S, E, e1)      (e1 not a value)

  (S, E', <E,\x.e> v) => (S, E[x->v], e)
  
Notice that in the first rule, a lambda expression evaluates
to a closure consisting of the lambda and the current environment.
In the last rule, when we apply a closure to a value, we use
the closure's environment, not the current environment, while
evaluating the body of the expression.