Allocation

We've been constructing lower- and lower-level models for our core
functional language.  First, we introduced a stack (to make control state
explicit), which allowed us to easily talk about features such as
exceptions and continuations.  Then, we introduced closures, which allowed
us to avoid the costly steps of substituting values for variables.

Now we want to model the fact that in reality, "large" values that do not
fit into a machine word (e.g., a pair of integers or a closure) are
typically allocated in memory.  Modelling allocation is important to give
accurate space and time bounds for real implementations.  For instance,
when you call a function such as:

   fun f(x) =
     let val x1 = (x,x)
         val x2 = (x1,x1)
     in
       (x2,x2)
     end

with the value 42, in our current model you'll end up with a result that
looks like:

   (((42,42),(42,42)),((42,42),(42,42)))

In general, a function of the form:

   fun f(x) =
     let val x1 = (x,x)
         val x2 = (x1,x1)
         ...
         val xn = (xn-1,xn-1)
     in
       (xn,xn)
     end

would seem to produce a value whose size is exponential in n (i.e., 2^n).
In reality, implementations will share common values, so we'll end up with
something whose size is proportional to n.  But our model does not take
this into account.  To model sharing, we need to model allocation.

What we will do now is construct a slightly lower-level abstract machine
where each large value (e.g., a tuple or closure) is allocated in a
"heap", and we pass around a pointer to that heap-allocated value.  In
addition to heap-allocating data structures (such as tuples, records,
lists, etc.), we will need to allocate closures and environments.  (They
have to go somewhere!)  We could represent environments as a fancy data
structure such as a balanced binary tree mapping variables to values, but
most implementations don't do this.
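To see the sharing concretely, here is a tiny sketch (in Python rather than the ML-like source language, purely to illustrate the pointer structure an implementation would build):

```python
def build(x, n):
    # Mimics f above: n rounds of "val xi = (x(i-1), x(i-1))".  In a
    # model with sharing, each round allocates just ONE new pair whose
    # two fields point at the SAME previous value, so n rounds allocate
    # only n pairs.
    v = x
    for _ in range(n):
        v = (v, v)  # both components reference the same object
    return v

v = build(42, 3)
# The two components are literally the same heap object, so only 3 pairs
# were allocated, even though printing flattens the sharing and makes
# the value look exponential in size:
assert v[0] is v[1]
print(v)  # (((42, 42), (42, 42)), ((42, 42), (42, 42)))
```

Printing traverses the structure as a tree, which is exactly why the result "looks like" the exponential value shown above even though only n cells exist.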
Instead, they get rid of variables through a simple translation step, and
look up the value of a variable using an integer offset into some sort of
data structure (e.g., a vector/array or a linked list).  So, we will work
with a core language that is in DeBruijn notation (discussed in class)
and looks like this:

   locations     L in Loc
   expressions   e ::= env | v | <e1,e2> | #i e | \e | e1 e2
   small values  v ::= i | L
   heap values   h ::= <v1,v2> | <\e,v>
   heaps         H : Loc -> HeapValue
   frames        F ::= <[],e> | <v,[]> | #i [] | [] e | v []
   stacks        S ::= nil | (F,L)::S

Expressions have only one "variable", called env, which returns the
current value of the environment.  We represent an environment as a
linked list of values, where we use pairs to represent the head and tail
of the list respectively.  Closures consist of code and an environment
(which is just a value).  Heaps are mappings from locations to heap
values.  A heap value is either a pair of small values, or a closure.

A normal lambda-calculus expression can be compiled down to our
restricted expression language by keeping track of a mapping f from
variables to the expressions that access them, as follows:

   Compile : SourceExp -> (Var -> Exp) -> Exp

   Compile [x] f       = #1 (f(x))
   Compile [e1 e2] f   = (Compile[e1] f)(Compile[e2] f)
   Compile [\x.e] f    = let f'(y) = if y = x then env else #2(f(y))
                         in \(Compile[e]f') end
   Compile [<e1,e2>] f = <Compile[e1]f, Compile[e2]f>
   Compile [#i e] f    = #i (Compile[e]f)

Note that the inner-most lambda-bound variable gets mapped to #1 env (the
first component of the environment list) and that other variables get
mapped to #1 (#2 ... (#2 env)), where the number of #2's is proportional
to the lexical nesting depth of the variable's use relative to its
binding occurrence.  Assuming expressions are closed, we can start off
the translation with an empty map for f.  For example:

   Compile [\x.\y.<x,y>] {}
     = \(Compile[\y.<x,y>]{x -> env})
     = \\(Compile[<x,y>]{x -> #2 env, y -> env})
     = \\<#1(#2 env),#1 env>

Now, the rewriting rules for our machine are as follows:

   1. (H,(F,L)::S, L', v) => (H, S, L, F[v])

   2.
      (H,S,L,<v1,v2>) => (H[L'-><v1,v2>],S,L,L')   (L' not in Dom(H))

   3. (H,S,L,\e) => (H[L'-><\e,L>],S,L,L')         (L' not in Dom(H))

   4. (H,S,L,#i L') => (H, S, L, vi)               iff H(L') = <v1,v2>
                                                   and i = 1 or 2

   5. (H,S,L,L1 v) => (H[L2-><v,v'>], S, L2, e)    iff H(L1) = <\e,v'>
                                                   (L2 not in Dom(H))

   6. (H,S,L,<v,e>) => (H,(<v,[]>,L)::S,L,e)       (e not a small value)

   7. (H,S,L,v e) => (H,(v [],L)::S,L,e)           (e not a small value)

   8. (H,S,L,<e1,e2>) => (H,(<[],e2>,L)::S,L,e1)   (e1 not a small value)

   9. (H,S,L,e1 e2) => (H,([] e2,L)::S,L,e1)       (e1 not a small value)

  10. (H,S,L,#i e) => (H,(#i [],L)::S,L,e)         (e not a small value)

The first rule is our usual rule for popping a stack frame when we're
done evaluating a sub-expression.  The next two rules deal with heap
values: they allocate the heap value in the heap at some fresh location
L' and return L' as the result of the computation.  Note that in rule 3,
the closure captures the current environment L.

Rule 4 is our new rule for projection.  Instead of operating on the
heap value directly, we are given a pointer L' which should be bound to a
pair in the heap H.  That is, H(L') should be a pair <v1,v2>, and we
simply return the appropriate component.

Rule 5, for applying a closure to an argument, is similar, except that we
also have to make sure to run the body of the closure within the
environment of the closure.  Note that we also have to extend the
environment of the closure so that the first component is the argument to
the function: the extended environment <v,v'> is allocated at a fresh
location L2, which becomes the current environment while the body e runs.

The remaining rules (6-10) just push the right kind of stack frame onto
the stack (saving the current environment) and move on to evaluate a
sub-expression.

To run an expression e, we simply start it in the configuration:

   ({},nil,0,e)

where {} is the empty heap, nil is the empty stack, 0 represents the
empty environment list (any integer would do), and e is the expression to
be evaluated.  The terminal configurations are of the form:

   (H, nil, L, v)

and the resulting value is v.  (As an exercise, you might want to work
out the result of evaluating a simple expression.)

--------------------------------------------------------------------

In the rules above, you can see that the heap never shrinks -- it only
grows.
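The Compile translation can be transcribed almost directly as code. The sketch below is Python; the tagged-tuple encoding of source and target terms is my own choice, not something fixed by the notes.

```python
# Source terms:  ('var', x) | ('lam', x, e) | ('app', e1, e2)
#              | ('pair', e1, e2) | ('proj', i, e)
# Target terms:  'env' | ('proj', i, e) | ('pair', e1, e2)
#              | ('lam', e) | ('app', e1, e2)

def compile_(e, f):
    """f maps each variable to the target expression that reaches the
    cons cell of the environment list holding that variable."""
    tag = e[0]
    if tag == 'var':
        return ('proj', 1, f[e[1]])          # x  ~>  #1 (f(x))
    if tag == 'app':
        return ('app', compile_(e[1], f), compile_(e[2], f))
    if tag == 'pair':
        return ('pair', compile_(e[1], f), compile_(e[2], f))
    if tag == 'proj':
        return ('proj', e[1], compile_(e[2], f))
    if tag == 'lam':
        # Under the new binder, x lives at the head of env; every other
        # variable moves one #2 (tail step) further out.
        f2 = {y: ('proj', 2, path) for (y, path) in f.items()}
        f2[e[1]] = 'env'
        return ('lam', compile_(e[2], f2))
    raise ValueError(tag)

# The worked example: Compile[\x.\y.<x,y>] {} = \\<#1(#2 env), #1 env>
src = ('lam', 'x', ('lam', 'y', ('pair', ('var', 'x'), ('var', 'y'))))
print(compile_(src, {}))
```

Running this on `src` produces the nested term corresponding to `\\<#1(#2 env),#1 env>`, matching the hand computation above.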
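The rewriting rules themselves can be realized as a small interpreter. In the Python sketch below, the encodings (locations as `('loc', n)` tuples, heap values as tagged tuples, a counter supplying fresh locations) are my own assumptions, not part of the notes:

```python
def is_value(e):
    # Small values: integers i, or locations L encoded as ('loc', n).
    return isinstance(e, int) or (isinstance(e, tuple) and e[0] == 'loc')

def run(e):
    heap, stack, env, fresh = {}, [], 0, 0   # start config ({}, nil, 0, e)

    def alloc(h):
        nonlocal fresh
        loc = ('loc', fresh)                 # a location not in Dom(H)
        fresh += 1
        heap[loc] = h
        return loc

    while True:
        if is_value(e):
            if not stack:                    # terminal: (H, nil, L, v)
                return heap, e
            frame, env = stack.pop()         # rule 1: restore env, plug v
            kind = frame[0]
            if kind == 'pairL':   e = ('pair', e, frame[1])   # <[],e2>
            elif kind == 'pairR': e = ('pair', frame[1], e)   # <v,[]>
            elif kind == 'appL':  e = ('app', e, frame[1])    # [] e2
            elif kind == 'appR':  e = ('app', frame[1], e)    # v []
            else:                 e = ('proj', frame[1], e)   # #i []
            continue
        if e == 'env':
            e = env                                        # read env register
        elif e[0] == 'pair' and is_value(e[1]) and is_value(e[2]):
            e = alloc(('pair', e[1], e[2]))                # rule 2
        elif e[0] == 'lam':
            e = alloc(('clos', e[1], env))                 # rule 3
        elif e[0] == 'proj' and is_value(e[2]):
            e = heap[e[2]][e[1]]                           # rule 4
        elif e[0] == 'app' and is_value(e[1]) and is_value(e[2]):
            _, body, cenv = heap[e[1]]                     # rule 5
            env = alloc(('pair', e[2], cenv))              # extend closure env
            e = body
        elif e[0] == 'pair' and is_value(e[1]):
            stack.append((('pairR', e[1]), env)); e = e[2] # rule 6
        elif e[0] == 'app' and is_value(e[1]):
            stack.append((('appR', e[1]), env)); e = e[2]  # rule 7
        elif e[0] == 'pair':
            stack.append((('pairL', e[2]), env)); e = e[1] # rule 8
        elif e[0] == 'app':
            stack.append((('appL', e[2]), env)); e = e[1]  # rule 9
        else:
            stack.append((('proj', e[1]), env)); e = e[2]  # rule 10

# \x.\y.<x,y> in compiled form, applied to 1 and then 2:
pairfn = ('lam', ('lam', ('pair', ('proj', 1, ('proj', 2, 'env')),
                          ('proj', 1, 'env'))))
final_heap, result = run(('app', ('app', pairfn, 1), 2))
print(final_heap[result])  # ('pair', 1, 2)
```

The result register holds only a pointer; the pair itself lives in the heap, which is exactly how sharing arises.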
In a real implementation, we have to periodically reclaim unused
locations in the heap.  In general, we could reclaim any location that
does not cause the abstract machine to get stuck.  For instance, if
location L is bound to the pair <3,4> and we never dereference L by
performing a #1 or a #2 on it, then we can safely recycle L to hold some
other value.  Unfortunately, determining whether it's safe to recycle a
location is generally undecidable.  Instead, we must figure out some safe
approximation.  In a type-safe world, if a location isn't reachable from
the current expression, stack, or environment by following the pointer
graph, then it turns out that the program can never use that location in
a way that would cause the program to get stuck.

We can formalize this as follows.  First, let us define the free
locations of a configuration (H,S,L,e) as:

   FL(H,S,L,e) = ({L} U FL(e) U FL(S) U FL(H)) \ Dom(H)

Here, FL(e) is the set of locations that occur within e, and similarly,
FL(S) is the set of locations that occur within S.  The set FL(H) is
defined by:

   FL(H) = union over all L in Dom(H) of FL(H(L))

That is, it's the set of locations that show up in the heap values.  We
say a configuration (H,S,L,e) is closed if its set of free locations is
empty.

An important theorem is the following "GC" theorem: if (H,S,L,e) is
closed, then for any extension H' to the heap, (H+H',S,L,e) behaves the
same as (H,S,L,e).  That is, if (H+H',S,L,e) evaluates to the integer 3,
then (H,S,L,e) also evaluates to 3.

So, the process of garbage collection can be specified abstractly as
follows: given a configuration (H,S,L,e), break the heap into two pieces
H1 and H2 such that (H1,S,L,e) remains closed; then you can safely
recycle the locations in H2.  Note that H1 will have to contain the
environment L, any location reachable from S or e, and any location
reachable from those locations (and so on) in order to remain closed.
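This specification (transitively compute the free locations from the roots, then split H into H1 and H2) can be sketched directly, assuming locations are encoded as `('loc', n)` tuples, the heap as a Python dict, and expressions/frames as nested tuples (my encoding, not the notes'):

```python
def locs(x):
    """FL(x): every location ('loc', n) occurring anywhere inside x."""
    if isinstance(x, tuple):
        if len(x) == 2 and x[0] == 'loc' and isinstance(x[1], int):
            return {x}
        out = set()
        for part in x:
            out |= locs(part)
        return out
    return set()                       # ints and tags contain no locations

def split_heap(heap, stack, env, e):
    """Split H into (H1, H2) so that (H1, S, L, e) remains closed."""
    pending = locs(e) | locs(env) | locs(tuple(stack))   # the roots
    live = set()
    while pending:                     # follow the pointer graph
        loc = pending.pop()
        if loc not in live:
            live.add(loc)
            pending |= locs(heap[loc])  # locations H1's values mention
    h1 = {l: h for l, h in heap.items() if l in live}
    h2 = {l: h for l, h in heap.items() if l not in live}
    return h1, h2

heap = {('loc', 0): ('pair', 3, 4),
        ('loc', 1): ('pair', ('loc', 0), 5),
        ('loc', 2): ('pair', 7, 8)}    # never dereferenced below
h1, h2 = split_heap(heap, [], ('loc', 1), 0)
print(sorted(h2))  # [('loc', 2)] -- safe to recycle
```

Here location 2 is unreachable from the roots, so it lands in H2, while location 0 survives because the environment at location 1 points to it.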
There are many algorithms (e.g., mark/sweep collection, copying
collection, reference counting) that implement this specification; each
can be seen as a particular strategy for choosing the split into H1 and
H2.
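As a taste of how one such instance fits the specification, here is a toy reference-counting sketch (Python; the encoding and names are mine). Since this language has no assignment, a heap cell can only point to cells allocated before it, so the pointer graph is acyclic and a count reaching zero exactly identifies a cell that belongs in H2; with mutation, cycles would make this only an approximation.

```python
def dec_ref(heap, counts, loc):
    # Drop one reference to loc.  When its count hits zero the cell is
    # unreachable, so free it and drop its references to children too.
    counts[loc] -= 1
    if counts[loc] == 0:
        cell = heap.pop(loc)           # recycle the location
        del counts[loc]
        for child in cell:
            if isinstance(child, tuple) and child and child[0] == 'loc':
                dec_ref(heap, counts, child)

heap = {('loc', 0): ('pair', 3, 4),
        ('loc', 1): ('pair', ('loc', 0), 5)}
counts = {('loc', 0): 1, ('loc', 1): 1}   # one root pointer to loc 1
dec_ref(heap, counts, ('loc', 1))          # the root goes away...
print(heap)  # {} -- both cells reclaimed, cascading through loc 0
```

Dropping the last root reference to location 1 frees it, which in turn drops the last reference to location 0.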