Bartlett's Algorithm

This is my best attempt at explaining Bartlett's Mostly Copying collection. A few points still elude my understanding. Hopefully these will be clarified with time.
The algorithm I am describing is the one used by Bartlett to deal with conservative objects within the heap. His algorithm in the case where all objects in the heap are known is straightforward, and much more efficient than the following. Furthermore he has published results regarding refinements such as generations, and using a mark-compact phase ocassionally to reduce fragmentation.

Bartlett breaks his heap up into uniformly sized pages. Associated with each page are two boolean variables: promoted, newSpace. Promoted is set to true whenever its page contains an object that cannot be copied because it is referenced by a conservative pointer. NewSpace is set to true whenever the page is in the new Space(commonly referred to as ToSpace). At the start of GC all pages have both of these variables set to false indicating that none of them are promoted and all reside in oldSpace(i.e. fromSpace). Another important detail to be aware of is, that in this approach all conservative objects reside outside the heap, and are thus not garbage collected.

The algorithm uses a Cheney style queue to do the copying. We will refer to it throughout as the Queue. Pages on the Queue are in newSpace, and are not promoted. Recall that enqueueing on a Cheney queue involves copying the object. We break up the algorithm into the following phases:

Phase I Scan the stack and static area for roots. Promote all pages containing roots. Then Enqueue all the objects on the promoted pages on the Queue, leaving forwarding pointers behind.
Phase II Perform copying collecting on the Queue, leaving forwarding pointers for the enqueued objects. But do not update the pointers within the objects. In other words all objects in the Queue point into fromSpace even after they have been scanned. Whenever a pointer outside the heap(i.e. a conservative object) is found, promote all pages pointed to by that object.
Phase III Rescan the Queue replacing all the pointers into unpromoted, pages in fromSpace with the forwarding pointers. Pointers into promoted pages are not affected. In this phase collect a linked list of all the promoted pages.
Phase IV Restore all the objects back onto promoted pages by tracing their forwarding pointers, and copying them back from newSpace. GC is now complete. Any page not promoted or in newSpace is free. The only task remaining to be done is reset the variables for all the pages that have been promoted or are in newSpace.

Below is a graphical representation of how the algorithm proceeds through the phases. The arrows show representative pointers. The colors have the following associations:
Initially
Phase I
Phase II
Phase III
Phase IV

Analysis

The above algorithm does two passes over the live data, plus work proportional to the number of pages pointed to by conservative objects(including the stack). However it requires very few data-structures, in particular it does not use any space in the header words. This makes the algorithm easier to modify for a BiBOP approach.

References

  1. Joel F. Bartlett. Mostly-Copying Garbage Collection Picks Up Generations and C++ Technical Report WRL Technical Note TN-12, Digital Equipment Corporation Western Research Laboratory, October 1989.
  2. Joel F. Bartlett. Compacting Garbage Collection with Ambiguous Roots Technical Report WRL Research Report 88/2, Digital Equipment Coprotation Western Research Laboratory, February 1988.
  3. Richard Jones and Rafael Lins. Garbage Collection: Algorithms for Automatic Dynamic Memory Management John Wiley & Sons, 1996. (ISBN 0 471 94148 4).