Problem Set 5
Mini-ML Compiler

Issued: November 7
Partner Deadline: Novemver 15, 1:00am
Due: December 5, 1:00am


Please take the time to read this document carefully before starting the assignment (note the partner deadline up top). Much of what you need to do is understand the code that we've provided. You actually don't have to write much code.

You can find the latest source code for this problem set here. Since there are quite a few files making up the problem set, the source code is provided as a zip file. The zip file contain subdirectories and you will have to make sure the directory structure is correct so that sml can find all the files.

For this problem set, you may work with one (and only one) partner. Sign up with your partner as soon as possible. If you wish to work alone, please request permission from Prof. Zabih as soon as possible.

Quick Link to the Exercises.


Clarifications and Breaking News:

Updated: November 28

Added alternate G2 garbage collection algorithm. See garbage collection section.



Updated: November 3, 12:00AM

Announcement: As with previous problem sets, code that does not compile or compiles with warnings is likely to receive an automatic 0. You are not allowed to modify any signatures, and in general you are only allowed to modify pieces of code that we explicitly ask you to modify (adding helper functions is OK).


Overview

For this problem set, you will be working on a type checker, compiler, and abstract machine for implementing Mini-ML, a subset of SML. Unzip the code into a directory, change to that directory, fire up SML and type CM.make(); This should compile and load the code for the Mini-ML interactive system. You start up the Mini-ML interpreter by typing "MiniML.interpreter();", which will allow you to evaluate a subset of SML declarations and expressions. Some Mini-ML interpreter directives:

You should play around with the interpreter to learn what it can and cannot do. The syntax is very close to that of SML, with a few idiosyncrasies that you should get used to.

This language implementation consists of a number of components:

  1. a parser, which reads text from a file or the console and produces Mini-ML abstract syntax.
  2. a type checker, which type-checks the code.
  3. a compiler, which compiles the Mini-ML code down to a low-level language called Lambda.
  4. an optimizer which attempts to simplify the resulting Lambda expression.
  5. an interpreter for Lambda expressions which is based on the environment model.

In this problem set you will (1) fix some bugs and finish the type checker and (2) write a garbage collector for the interpreter

A major goal of this problem set is to have you work on a relatively large hunk of code written by someone else. Your first task is to read and play with the code a lot. Try to understand what it is doing, There will be many mysterious pieces; figuring out how the system is supposed to work is going to be the hardest part of the problem set. Go to office hours. Post on the newsgroup. Ask questions. (But since part of this PS is debugging, you must not discuss potential bugs in the newsgroup; email cs312@cs.cornell.edu instead.)

Another goal is to get you to think hard about how you effectively test a large piece of software. Finding all of the bugs in a program is really hard. What strategy will you use to find the bugs in the Mini-ML implementation?

Finally, we want to teach you a little something about how compilers and interpreters really work. Although this problem set is seriously scaled down, it is based on the same principles that underlie most modern language implementations. By understanding it, you will have a fairly realistic idea as to what really happens when you run an ML program. Good luck!


The Mini-ML Language

You can almost tell what Mini-ML supports by simply looking at the abstract syntax for the language (see absyn.sml.) It supports a few primitive operations (+, *, -, =, and ^), a few unary operations (~ and not), a few forms of constants (ints, reals, strings, and characters), variables, functions, applications, tuples, records, data constructors, case expressions, let expressions, recursive functions, and refs.

Not everything is supported in Mini-ML. For instance, comparison operations (<, >) are restricted to integers and reals and equality is defined only on integers, strings chars and refs. There are only a couple of built-in datatypes, notably:

  datatype bool = false | true
  datatype intlist = Nil | Cons of int*intlist
  datatype intoption = NONE | SOME of int

Mini-ML has no support for polymorphism and no support for type-inference. Rather, you have to write types on function arguments. (See the example .mml code.)

The patterns in Mini-ML are fairly primitive for example, they don't support the "as" patterns of full SML. In addition, functions can't pattern-match directly — you need to use an explicit case expression or a val-declaration to do any pattern matching. Finally, the type checker does not check that patterns are exhaustive and non-overlapping. This is not an easy algorithm to code up, so we didn't do it and nor do we expect you to do it.

Otherwise, Mini-ML is fairly complete. When you go to look for bugs, keep in mind that many things are intentionally omitted from Mini-ML. Omissions, such as the inability to compare two booleans or the lack of exhaustiveness checking in the type checker, are intentional and should not be considered bugs. If you aren't sure whether or not something is a bug, ask a consultant or send email to the course staff.

There may be bugs in the parser or the compiler, but we don't intend for you to find or fix them. If you do find a bug outside of the type checker, then please let us know as soon as possible. The bugs we intend for you to find are all in the type checker.

You may find that you need to use extra parentheses in expressions to make sure Mini-ML parses the expression correctly.


Type Checking

The parser returns an Abstract Syntax Tree (AST) which represents the code you've fed to the compiler. The type checker makes a pass over the AST, making sure that every expression is of the correct type. For example, the concatenation operation must have both arguments of type string . Sometimes the compiler needs to know type information to do its job. In these cases, we embed the necessary information into the AST before passing it along to the compiler.

We have provided a largely non-functional type checker for you, in the file type-check.sml. You will spend most of your time looking in here, so it would be a good idea to familiarize yourself with the code that is already written and how it works. The tcheck is the meat of this file. It walks down the tree, calling other helper functions as necessary and building a typing context with information that it uses during the walk. You will have to code a significant portion of this function. same_type does just what you'd expect it to do. The opArgType functions simply return the types you might find in the signature for that operator. Make sure you understand why these functions are needed in the code.

This type checker is built on the rules described in the SML type checking handout. You have seen most of these rules already, and they will be necessary to complete and debug the type checker.

The type checker can be thought of as an interpreter that builds the type of every subtree in the AST using information stored in the typing context. You need not be concerned with the details of the type context but should understand what it does and how it works.

NOTE: The AST returned by the parser may contain incomplete types. These are types that the parser could not be sure were valid at parse-time. A function, lookup_types, in type-check.sml converts these incomplete types to complete ones. Make sure that you call this function on all the types in the input AST in the code that you write.


MiniML Error Messages

You might see the following kinds of error messages in the MiniML interpreter:

  1. PARSE ERROR: ...

    This means that your code is not a valid MiniML syntax. There might be several reasons for that:

    If you want to find out whether some particular language feature (string patterns, pattern matching in function expressions, etc) is supposed to be supported by MiniML, take a look at absyn.sml . Is there a way to represent such expression using the MiniML datatype? If yes, then it's probably supported. If no, then obviously there is no way it could be supported.

  2. STATIC ERROR: ...

    This means that the code did not pass the typechecker. If you think that it ought to pass it, then it is probably a bug in typechecker.

  3. RUNTIME ERROR: ...

    This means that the code have typechecked and compiled without problems, but failed to execute properly (essentially, it crashed the interpreter). This can only happen if there is a bug in the system! Specifically, this could be caused by:

    There is one exception to the above — even a well-typed code may produce "RUNTIME ERROR: Match problem " since MiniML does not attempt to check patterns for completeness.

  4. Once you start modifying the code, you might also see some exceptions thrown if some invariants are broken by your code.


Down to the Metal

The following explains the compilation to Lambda, the Virtual Machine, various optimizers and the rest of the back-end. You may not need to know many of the details, but a general understanding will be required for exercise 2. Much of the VM is built on the environment model, which we will cover in class.


The Lambda Language

Rather than interpret Mini-ML abstract syntax directly, our implementation first compiles the code to a much simpler language called Lambda, which is based on the lambda-calculus.

The Lambda language (see lambda-sig.sml and lambda.sml) is a lot simpler than Mini-ML. In particular, Lambda provides no direct support for characters, strings, records, datatypes, refs, or pattern matching. Instead, these features must be represented using the primitives that Lambda provides. For instance, we can represent characters as integers, records as tuples, strings as tuples of integers, refs as 1-tuples, and datatypes as either integers or tuples. We can also compile pattern matching down to primitive if-then-else tests. Thus, Lambda is to Mini-ML as the Java Virtual Machine Language (JVML) is to Java. Indeed, Lambda is only a rather small step away from real machine code.

Start up the Mini-ML interactive system and start playing around. You'll find you can type in Mini-ML expressions and see how they are compiled to Lambda. Pay particular attention to how datatype constructors and pattern matching are implemented. Try out a few examples and see what happens.

Lambda also has one very strange feature — there are no variables! Instead, we represent variables as integers. The integers correspond to the position in the dynamic environment where we can find the value of the variable. The inner-most variable is always represented as "0", the next inner-most variable is "1", the next is "2" and so on. This approach to representing variables is known as deBruijn indices as it was invented by the Dutch logician N. G. deBruijn. For example, the Mini-ML expression:

fn (x:int) => (fn (y:int) => x+y)

is represented at the Lambda level as follows:

Fn_l(Fn_l(Binop_l(Var_l(1),Plus_l,Var_l(0))))

Notice that in the expression "x+y", x is represented as variable "1" because it is the outer-most variable, whereas y is represented as variable "0" because it is the inner-most variable. Also notice that Lambda functions do not have variables as they do in Mini-ML. This is because within the function, the "name" is 0 (unless we cross over into a nested function, in which case it becomes 1.)

deBruijn indices can make interpreters much faster because we don't have to keep a dictionary mapping variable names (strings) to offsets in the environment at run time. Rather, we can go ahead and figure out the offsets at compile time and then look up variable values using this offset at run time. On the other hand, it's really hard to read code with deBruijn indices. So, you'll find that the pretty-printer for Lambda that we've provided puts variable names back in (though those variable names may not be the same as the ones you used at the source level.)


The Compiler and Optimizer

The file compile.sml defines a compiler that maps Mini-ML expressions to Lambda expressions and the file lambda-opt.sml defines an optimizer that attempts to simplify Lambda expressions. You should study this code carefully to try and understand how it works. Most of the translation is fairly straightforward, but a few things are tricky. In particular, the compilation of pattern matching is fairly complicated. If you don't understand how something works, try it out in the interpreter and see what happens.

If you turn off the optimizer by typing ":o" at the Mini-ML top-level, you will notice that the code produced by the compiler is pretty ugly and inefficient. For instance, if you type in the Mini-ML expression:

  fn (x:int*int) => let val (w,z) = x in w+z end

you'll get the following Lambda expression as the result of compilation:

  (fn a => ((fn b => (fn c => (fn d => (fn e => (e + d)) (#2 c)) (#1 c)) b) a))

This code can be simplified considerably to something like:

  fn a => (#1 a) + (#2 a)

The optimizer works by simplifying expressions of the form ((fn x => e1) e2) into e1 with e2 substituted for x. Of course, it's not always safe to do this simplification. A simplification is safe if it doesn't change the semantics of the program in an observable way and if it doesn't take longer for the code to run and it doesn't take more space for the code to run. Here are some examples where such an optimization is not generally correct:


The Abstract Machine

The interpreter for Lambda is actually an abstract machine. The machine consists of a definition for primitive values (value.sml), a memory and registers (memory.sml), an operand stack (stack.sml), and the control logic for interpreting Lambda operations (lambda-interp.sml). The intention is to model what really goes on in machines, but to abstract from some specific details, such as what the machine code looks like.


The Abstract Machine Values

The values of the abstract machine are all "word-sized" values. By this, we mean that each value is meant to fit into a machine-level word (32-bits on the Intel x86 architecture.) The definition of values is as follows:

  datatype value =
    Int_v of int     (* an int or a char *)
  | Real_v of real   (* a real *)
  | Ptr_v of int     (* a pointer -- some index into the memory *)
  | Tag_v of int     (* a tag for the beginning of an object -- the int
          * is the number of words in the object (not including
          * the tag.) *)
  | Forward_v of int      (* a forwarding pointer -- used only by the gc *)
  | Code_v of Lambda.lexp (* "code" -- in practice, this would be the address
               * of the machine code for a function *)

Int_v values are used to represent integers and characters. Real_v values are used to represent real numbers. (Note that on a real 32-bit machine, SML real values take up 64 bits and so they do not fit into a single machine word---we're ignoring this fact for now.) Ptr_v values are used to represent pointers to memory-allocated values. Objects that are larger than a word, such as a tuple of 3 integers, are placed in memory and the address (i.e., a pointer to) these values is used as the representation.

The Code_v value holds a Lambda expression. In a real machine, this would also be a pointer to a hunk of machine code. But to avoid this level of detail, we simply represent the code as a Lambda expression.

The Tag_v and Forward_v values are used by the garbage collector. The Tag_v value is used in memory to record how big (in words) a given object is. So, for instance, a tuple (1,2,5) is allocated in memory with a Tag_v of 3 (because the tuple has three components), followed by an Int_v(1), then Int_v(2), and then Int_v(5). If the tag for the tuple is at address 42 in the memory array, then we use Ptr_v(42) to represent the tuple. Right now the system has no garbage collector. You will have to implement one in Exercise #2. For now, you do not have to worry about Forward_v values.

It should be noted that on a real machine, we don't have data constructors (e.g., Int_v, Real_v, Ptr_v, Code_v, etc.) to distinguish integers, reals, code, etc. Rather, we reserve some number of bits to encode this information. For instance, SML/NJ uses the least significant bit so that it can distinguish pointers from integers. That's why integers in SML/NJ only have 31-bits worth of precision.


The Memory and Registers

The memory for the abstract machine is straightforward: it's simply an array of values (initially all Int_v(0)). You'll find the function print_memory to be a very useful utility for examining the contents of memory when you work on your garbage collector. (Also, the Mini-ML top-level command ":d" dumps the contents of memory.)

In addition to the memory, our abstract machine includes a few registers. These are just references that are used by the interpreter. The register startptr is the position within memory where we should start allocation. The register allocptr is the next location in memory that we should use when we go to allocate something. The register limitptr is the last location that we can use when we're allocating something. Thus, to allocate an object such as (1,2,5), we would check that the current value (!limitptr)-(!allocptr) is at least 4 (one word for 1,2, and 5 respectively, plus one word for the tag.) If there is no space, then we must initiate a garbage collection try to reclaim space. Right now, this will just raise an exception and terminate. See the function malloc in lambda-interp.sml for details on how we allocate objects in memory.

There is one more register used by the evaluator, the dynamic_env. This is a value reference that corresponds to the dynamic environment of the environment evaluation model. In essence, dynamic_env is a linked list that contains all of the values for the variables that are in scope in the current expression being evaluated. Note that we represent the empty environment by the value Int_v(0), and a non-empty environment as a pointer to a 2-tuple, where the first component of the tuple is the value of some variable, and the second component is an environment. Note also that this tuple is allocated in the memory of the abstract machine.

To find the value of variable i, we simply walk down the linked list until we get to the ith value. See the function lookup in lambda-interp.sml for details.


The Operand Stack

The operand stack for the abstract machine is similar to the memory. However, the operand stack is used for temporary storage during evaluation. For instance, when we evaluate a compound expression such as "(3+4)+(5+3)", we first evaluate "(3+4)" to the value "7" and then push this on the operand stack. We then evaluate "(5+3)" to "8", pop the "7" off the stack, and then add the two primitive values to get the result "15". The stackptr register is used to keep track of the next available location on the stack.

Why do we save intermediate values on the stack? There are two reasons: First, in a real machine, these intermediate values would typically be saved in registers. However, there are only a small number of registers on any given machine (8 on the x86), and so for deeply nested expressions (or for recursive functions), we would have to write some of the registers to memory. Second, when we do a garbage collection, the collector needs to be able to find all of the pointers that the evaluator is currently using. By pushing these values on the operand stack, we have a convenient place to store them in case a collection occurs.

In a real machine, the operand stack contains additional control information for implementing procedures. In particular, on a real machine, when we do a procedure call, we push the return address on the operand stack so that we know what code to execute upon completion of the procedure. In our abstract machine, all of the control details are handled at the meta-level by SML/NJ. Thus, we don't need to push and pop return addresses.

Also in a real machine, we have only one memory. Usually, the code for a program is placed first in memory, followed by the data segment (our memory array) and then the stack segment (our stack array). Of course, not every program gets to use all of memory, but a real operating system with support for virtual memory gives each program the illusion that it has full access to all of memory. You'll learn more about memory segments and virtual memory in CS314 and CS414.


The Interpreter

The file lambda-interp.sml contains the bulk of the code for the Lambda interpreter. It starts with a bunch of utilities for allocating and initializing objects in memory, for reading and writing locations in memory, and for looking up variable values using the dynamic environment. The main function, eval, takes in a Lambda expression and returns a value.

Most of the code is fairly straightforward once you get the hang of the environment model. We'll just mention a few of the interesting cases here: For variables, we simply lookup the variable in the dynamic environment. (Remember that the variable is represented as a deBruijn index which tells us the position in the dynamic_env list where to find the value.) For tuples, we evaluate the subexpressions (saving the intermediate results on the stack), and then allocate and initialize a block of memory with the resulting values. We then return the a pointer to the block of memory as the result of the tuple.

For functions, we must create a closure object. Recall that in the environment model, we've delayed performing any substitutions of values for variables. Rather, the values for variables are looked up as they are needed during evaluation. This saves us from crawling over the code multiple times (once for substitution and once for evaluation.) However, when we encounter a function, we stop evaluating, so we must remember to continue substituting if the function is ever applied. A closure is a promise to continue the substitution. It consists of the code for the body of the function and the current dynamic environment (because this is the environment that we would've substituted if we had continued evaluation.)

When we apply a closure, we must extract the code and the environment of the closure. We add the argument to the environment and then evaluate the code of the closure in this new extended environment. After we get the result, we must restore the old environment.

The only other interesting case in the interpreter is for Letrec_l(e1,e2). Here, we're defining a recursive function (e1) and then evaluating e2. To evaluate the recursive function, we must create a closure. However, the closure needs to be self-referential. In particular, the first value in the closure's environment should be the closure itself to ensure that, whenever we do a recursive function call, we're calling the same function. We achieve this by creating a cyclic data structure. In particular, we first create a closure with a bogus environment. We then add that closure to the current dynamic environment. We then destructively modify the environment of the closure to be the new, extended environment. As a result, the closure refers to the new extended environment, which includes the closure as the first value. After we've built the cyclic closure, we evaluate the body of the letrec (that is, e2) in the extended environment. This allows e2 to call the recursive function. After we get the result of e2, we restore the old dynamic environment and return the result.

If the Letrec_l construct confuses you, go back and study the environment model handout carefully, and type in some recursive functions and see what happens. It may help if you insert some debugging print statements in the evaluator to see exactly what's going on.


Garbage Collection

Any virtual machine needs a system for memory management. Our Lambda evaluator is equipped with memory allocation functions and a garbage collector. There are many kinds of garbage collectors, including Copying, Mark and Sweep, Generational, Incremental, Mostly-Copying, etc.  Often, a good implementation will combine many of these techniques to achieve good performance.  You can learn about these techniques in a number of places---perhaps the best place to start is the Online GC FAQ.

For our purposes we will present two techniques: copying and generational. For the second exercise, you will implement a garbage collector using a combination of these two techniques.


Copying Garbage Collection

The goal of a garbage collector is to automatically discover and reclaim fragments of memory that will no longer be used by the computation. It turns out that during evaluation of a high-level program, we allocate lots of little objects that are only in use for short periods of time and can be effectively recycled.  

Most garbage collectors are based on the idea of reclaiming whole objects that are no longer reachable from a root set.  In the case of our interpreter, we only access memory objects through the stack and through pointers.  So any object that isn't reachable from the stack, following the pointers contained within other objects, can be safely reclaimed by a garbage collector.  

At an abstract level, all a copying collector does is start from a set of roots (in our case, the operand stack), and traverse all of the reachable memory-allocated objects, copying them from one half of memory into the other half.  The area of memory that we copy from is called old space and the area of memory that we copy to is called new space.  When we copy the reachable data, we compact it so that it is in a contiguous chunk.  So, in effect, we squeeze out the holes in memory that the garbage data occupied.  After the copy and compaction, we end up with a compacted copy of the data in new space data and a (hopefully) large, contiguous area of memory in new space in which we can quickly and easily allocate new objects.  The next time we do garbage collection, the roles of old space and new space will be reversed. 

For example, suppose memory looks something like this, where the colored boxes represent different objects, and the thin black box in the middle represents the half-way point in memory.

Obj 1

Obj 2

Obj 3

Obj 4

Obj 5

 

At this point, we've filled up half of memory and so we initiate a collection.  Old space is on the left and new space on the right.  Suppose further that only the red and light-blue boxes (objects 2 and 4) are reachable from the stack.  After copying and compacting, we would have a picture like this:

Obj 1 Obj 2 Obj 3 Obj 4 Obj 5   Obj 2' Obj 4'

Notice that we copied the live data (the red and light-blue objects) into new space, but left the unreachable data in the first half.  Now we can "throw away" the first half of memory (this doesn't really require any work):

  Obj 2 Obj 4

After copying the data into new space, we restart the computation where it left off.  The computation continues allocating objects, but this time allocates them in the other half of memory (i.e., new space).  The fact that we compacted the data makes it easy for the interpreter to allocate objects, because it has a large, contiguous hunk of free memory.  So, for instance, we might allocate a few more objects:

  Obj 2 Obj 4 Obj 6 Obj 7 Obj 8

When the new space fills up and we are ready to do another collection, we flip our notions of new and old.  Now old space is on the right and new space on the left.  Suppose now that the light-blue (Obj 4), yellow (Obj 6), and grey (Obj 8) boxes are the reachable live objects.  We copy them into the other half of memory and compact them, throwing away the old data:

Obj 4 Obj 6 Obj 8  

What happens if we do a copy but there's no extra space left over?  Typically, the garbage collector will ask the operating system for more memory.  If the OS says that there's no more available (virtual) memory, then the collector throws up its hands and terminates the whole program. 


Implementation Details

It is surprisingly simple to build a copying garbage collector. 

First, we need one more register scanptr, which will be used as an index into memory.  The scanptr initially contains the base address of the new space (i.e., the address of the first word where we will  copy objects.)  We also set the allocptr to the base address of the new space.  We will use the allocptr to remember where to allocate objects as we copy them from old space to new space. The purpose of the scanptr will be made clear below.

Second, starting from an array of roots (in our case, the values pushed on the operand stack), we examine each root to see whether or not it's a pointer.  If so, we copy the object that the pointer references from old space to new space.  We can figure out how big the object is, because it always starts with a Tag_v value specifying its length (but see below.)  As we copy the object, we place it where the allocptr currently is, and then increment the allocptr by the appropriate amount so that it points to the next available spot in new space.

After we copy the object, we need to update the array of roots so that it points to the new copy of the object.  We also need to leave a forwarding pointer in the object's previous location in old space indicating that the object has already been moved and where it was moved to, in case we run into this particular object again while traversing the graph of reachable objects in old space.  (This prevents us from going into an infinite loop if there are cycles in the graph, and there typically will be cycles if we use recursive functions.)  We can do this by overwriting the Tag_v value on the old copy of the object with a forwarding pointer, represented in our abstract machine by Forward_v(i) values, where i is the address in new space of the new copy of the object.

Before trying to copy any object, we should check first to see if the object has already been forwarded.  If so, then we should update the root with the address of the copied object that we find in the forwarding pointer.

So, we need a procedure forward which, when given an address a, (1) checks to see if the object at a has been forwarded or not -- if so, forward immediately returns the address of the forwarded object, (2) otherwise, copies the object from old space to new space, incrementing allocptr appropriately, (3)  overwrites the old objects Tag_v with a forwarding pointer to the new copy, (4) returns the address of the new copy of the object.

Processing the roots then becomes a simple loop which simply checks to see of a root value is a pointer, and if so, calls forward, and updates the root array with the new address of the object.

After processing the roots, you've managed to copy all of the data that are immediately reachable from the roots.  However, we must also process all of the data that are reachable from these objects (and then all the data reachable from those objects, and so on.)  This is the purpose of the scanptr register. 

After forwarding the root objects, we must then examine each of their components to see if there are any pointers.  Any pointers in those objects must also be forwarded from old space to new space.  The process of examining a copied object for pointers and forwarding those objects is called scanning the object. 

The scanptr register is used to keep track of which objects have been forwarded but not yet scanned.  In particular, the scanptr starts off as the same address as the allocptr.  After forwarding all of the roots, we start scanning the object pointed to by the scanptr.  This will be the first object that we copied during the root processing.  After scanning this object, we increment the scan pointer so that it points to the next object to be scanned.  The scan pointer thus moves only from left to right.

During scanning, we need to look at each value in the object.  If the value is a pointer, it should be forwardedThis may cause the allocptr to move.  But because the newly copied object comes after the object being scanned, we will eventually scan it and any objects that it references will also be copied into new space.

The whole process stops when the scanptr catches up to the allocptr, for then all reachable objects have been successfully forwarded from old space to new space.  At this point, we need to reset the limitptr and the startptr appropriately.  We also need to check that there is enough free space for the computation to make progress.  If not, then an OutOfMemory exception should be raised.  (Otherwise, the interpreter will go into an infinite loop trying to garbage collect forever.)

Note that we are effectively using a queue to keep track of those objects in the graph that have been forwarded but not yet scanned.  We insert items at the end of the queue (where allocptr points) and remove objects to be scanned from the front of the queue (where scanptr points).  It's possible to use a different data structure such as a stack to keep track of those objects that have been copied but not yet scanned, but doing so usually requires an extra hunk of memory. 

Note that by using a queue in the graph traversal, the copying collection effectively does a breadth-first traversal of the data.  If we used a stack instead of a queue, the traversal would be depth-first.  

You should note that although this is a fairly simple implementation of garbage collection, it uses many of the same concepts used in more complicated methods.


Generational Garbage Collection

One major problem with a copying garbage collector is that it has to look at the entire memory every time a garbage collection must happen. Objects in memory have an important property of temporal persistance. Objects that have been in memory for a long time will likely continue to stay in memory, whereas objects that have recently entered memory will likely be discarded very soon.

To exploit this principle, we can build what is known as a generational garbage collector. Objects will initially be allocated to a chunk of memory called the first generation, or G1. When G1 becomes full, we copy the live objects into another block of memory called the second generation, or G2, and free up the entire G1.

When G2 becomes full, we continue the process to copy the live objects into G3. This continues for some number of generations n. At this nth generation, Gn, we use some other method such as copying garbage collection to reclaim memory space.

NOTE: There is one problem with the generational approach. If an object in an older generation points to an object from a newer generation, when we garbage collect the newer generation, we will corrupt the older object. This can be fixed by keeping track of pointers across generations. Fortunately for us, we will restrict our garbage collector to work only for the functional set of MiniML. This means MiniML code with refs in it will not work under this garbage collector. We will leave it as a thought exercise to see why cross generational pointers are not an issue without side effects.

Of course, you can still use refs in your SML code to garbage collect. You are not required to deal with refs in MiniML when constructing the garbage collector, even though refs implemented in the type checker and compiler.


Combining the Two Techniques

For this problem set you will write a garbage collector with two generations, G1 and G2. The first generation (G1) contains the most recently created objects. When G1 is completely filled, its live objects graduate into the second generation (G2). G2 is implemented as a simple copying garbage collector. That is, G2 has two halves, G2A and G2B. When one half fills up, the garbage collector copies the live data into the other half.

The layout of memory is thus

G1

G2A

G2B

(You may assume that the memory size is divisible by 3.)
G2A and G2B act exactly as explained in copying garbage collection. If we need to garbage collect G2, we copy all the live data into the other half and begin to use the other half. If there still is not sufficient space to move all live objects into G2, we should throw an OutOfMemory exception.

Efficiency: Ensure that you do not examine all of G2 when looking for live objects in G1; doing so defeats the purpose of a generational garbage collector. Again, because we are ignoring MiniML refs in this assignment, objects in G2 DO NOT reference objects in G1.

UPDATE: there is another acceptable solution that may prove easier to implement. When you garbage collect G1, you garbage collect G1 assuming that there is enough space in G2 to put the live objects of G1 in. If this is turns out to be wrong, break out and then start a garbage collection of G1 and the current active portion of G2 into the other half of G2. You can use the stack as your initial list of roots, however, remember that you may now have Forward_v's pointing at Forward_v's in this case.


Exercise #1 — Testing and Finishing the Type Checker

You are working for CS312, Corp. Your colleague Bob was working on a type checker when one day, he went beserk and quit. Your boss has given you the responsibility of finishing Bob's type checker. This will include both writing code to complete the type checker as well as debugging existing code. You will find the complete ML type-checking handout linked here to be invaluable.

You must complete the function tcheck in type-check.sml. This will involve reading through and understanding the code that Bob has already written. Once you understand this code, you can begin finishing it and testing it. This is no easy task. As you will see, Bob was not a fan of commenting code and often wrote obscure lines. He also used a different style of ML that we have not used in this course, namely that of pattern matching in the function arguments. An oracle has provided a few comments to point you in the right direction as to where you need to code, but it is largely up to you to figure out what to do in this part of the assignment.

The oracle has also told you that Bob wrote at least three bugs in the type checker. You will need to find and fix as many bugs in the type checker (or elsewhere!) as you can. You will get credit for each bug that you find. In looking for the bugs, you want to come up with a testing strategy. Note that your testing should not only find the bugs in the existing code, but should also be comprehensive enough to make sure your type checker is fully functioning.

NOTE: We know of no bugs in the code specific to type checking patterns. There are comments in type-check.sml which point out which code this is. It is still a good idea, however, to test this code because it may rely on other code that has bugs in it.

If you are not sure whether or not something Bob wrote is a bug, see a consultant or send mail to cs312@cs.cornell.edu. Please do not post questions about bugs to the newsgroup.

  1. In the file doc.txt, give us the following: Please put the bugs in the same order in which the corresponding code goes in type-check.sml
  2. In type-check.sml, give us the fixed code. Make sure you clearly comment the changes you make to type-check.sml. Remember that because bug fixes tend to introduce new bugs, a good bug fix affects the code as little as possible while fixing the bug.

Keep in mind that we will grade your writing partly on how clear and concise it is. Similarly with your test cases and your bug fixes. Think simple, small, clear.


Exercise #2 — Garbage Collection

As Bob's co-workers might have guessed from the state of his desk and work area, garbage collection was not a priority for him. He had been putting off the garbage collector. It is now a task for you.

Your boss wants you to implement a garbage collector with two generations in the manner described above. Implement the garbage collector in gc.sml

To do this you will need to look at (but not change) memory.sml


Fun Karma Exercise

Redesign your generational garbage collector so it works in the presence of refs. This will involve constructing some sort of table of cross-generational pointers, specifically pointers from old generations to newer generations. In this case, you can probably make a simple linked list of all the pointers that go from G2 to G1.


What to Hand In

You should submit the following 3 files: