MCC 1.0

A Mostly-Copying Garbage Collector

Frederick Smith and Greg Morrisett

Description

MCC is the conservative collector for C used in the TIL/C compiler. It runs under Solaris on a 32-bit Sparc processor when compiled with the GNU C compiler (gcc 2.7).

MCC implements a variant of Bartlett's mostly-copying collection algorithm. The new element in our algorithm is a means of dealing with untyped objects in the heap efficiently. (See the references if this paragraph made no sense.)
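
For readers new to the idea, the following is a minimal sketch of the central decision in a mostly-copying collector. It is written for illustration only and is not taken from the MCC sources. The heap is divided into pages. Any page that an ambiguous root (for example, a stack word) might point into is pinned and stays in place, while objects on all other pages may be copied. The names `toy_heap`, `toy_page_of`, `toy_pin_ambiguous_roots`, and `toy_may_copy`, as well as the page size and page count, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u  /* hypothetical page size in bytes */
#define TOY_NUM_PAGES 8u     /* hypothetical heap size in pages */

typedef struct {
    uintptr_t base;                       /* address of the first heap page */
    unsigned char pinned[TOY_NUM_PAGES];  /* 1 if the page must stay in place */
} toy_heap;

/* Return the index of the page containing addr, or -1 if addr is outside the heap. */
static int toy_page_of(const toy_heap *h, uintptr_t addr)
{
    uintptr_t off;
    if (addr < h->base) return -1;
    off = addr - h->base;
    if (off >= (uintptr_t)TOY_PAGE_SIZE * TOY_NUM_PAGES) return -1;
    return (int)(off / TOY_PAGE_SIZE);
}

/* Pin every page that some ambiguous root might point into.
   Returns the number of pages newly pinned. */
static size_t toy_pin_ambiguous_roots(toy_heap *h, const uintptr_t *roots, size_t n)
{
    size_t i, newly_pinned = 0;
    assert(h != NULL && (roots != NULL || n == 0));
    for (i = 0; i < n; i++) {
        int p = toy_page_of(h, roots[i]);
        if (p >= 0 && !h->pinned[p]) {
            h->pinned[p] = 1;
            newly_pinned++;
        }
    }
    return newly_pinned;
}

/* An object may be moved only if it lies in the heap on an unpinned page. */
static int toy_may_copy(const toy_heap *h, uintptr_t obj)
{
    int p = toy_page_of(h, obj);
    return p >= 0 && !h->pinned[p];
}
```

Pinning whole pages is what lets the collector tolerate roots it cannot positively identify as pointers: nothing an ambiguous word might refer to is ever moved, so that word never needs to be updated.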

This software is not easy to use! Rather, it is intended as a reference for those who need to solve the many problems that naturally arise in implementing a garbage collector, and in particular a mostly-copying collector. Of course, we would be delighted if you chose to use the code as is in a project.

Please let us know if you find this code useful, or have any questions about the implementation.

This code cannot be used commercially because Bartlett (and Digital Equipment Corporation) own a patent (US Patent #251,554) on Mostly-Copying Collection. However, non-commercial uses are fine if you include the notice given below.

Status/Features

To the best of our knowledge, MCC has no bugs in the standard configuration (non-generational, non-blacklisting, fully-inlined). The generational code works (compiles our benchmarks), but the policies have not been tuned. We have observed no benefit from using generations, because the cost of stack scanning becomes prohibitively large when there are many minor collections. A smarter quasi-pointer detection algorithm would make a big difference.
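
To illustrate why stack scanning cost grows with the number of collections, here is a small sketch of a conservative scan, assuming a single contiguous heap and word-aligned objects. It is not MCC's actual detection code. Every word in the scanned range is tested on every collection, so the work is proportional to the stack depth each time. The names `toy_bounds`, `toy_is_quasi_pointer`, and `toy_scan_range` are hypothetical, and the alignment test is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a single contiguous heap. */
typedef struct {
    uintptr_t lo;  /* first heap address */
    uintptr_t hi;  /* one past the last heap address */
} toy_bounds;

/* A word is treated as a quasi-pointer if it is word-aligned (an assumption
   of this sketch) and its value falls inside the heap. */
static int toy_is_quasi_pointer(const toy_bounds *b, uintptr_t w)
{
    return w >= b->lo && w < b->hi && (w % sizeof(uintptr_t)) == 0;
}

/* Scan the words in [start, end). Store up to cap candidates in out and
   return the total number of candidates found, which may exceed cap. */
static size_t toy_scan_range(const toy_bounds *b,
                             const uintptr_t *start, const uintptr_t *end,
                             uintptr_t *out, size_t cap)
{
    const uintptr_t *p;
    size_t found = 0;
    assert(b != NULL && start <= end);
    for (p = start; p < end; p++) {
        if (toy_is_quasi_pointer(b, *p)) {
            if (found < cap) out[found] = *p;
            found++;
        }
    }
    return found;
}
```

A cheaper or more selective test inside the loop reduces the per-word cost, which is where a smarter detection algorithm would pay off.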

Blacklisting and bitmasks on pages have not been maintained since they provided no performance improvement. They will not work without modification!

In normal use, the collector is compiled so that the inner loop is "fully" inlined. This yields a tremendous performance improvement. If this is not desirable, you need to select the appropriate targets from the makefile and uncomment some prototypes. You should get no warnings when compiling with -Wall.

References

  1. Frederick Smith and Greg Morrisett. Comparing Mostly-Copying and Mark-Sweep Conservative Collection.
    To appear in the International Symposium on Memory Management, 1998. (PostScript)
    This is our latest paper on MCC. It presents a high-level overview of the algorithm and a careful analysis of our performance results. There are some very surprising effects related to the cache.
  2. Frederick Smith and Greg Morrisett. Mostly-Copying Collection: A Viable Alternative to Conservative Mark-Sweep. Technical Report TR97-1644, Cornell University, August 12, 1997.
    This was our original paper on MCC. It contains a detailed description of the implementation. Unfortunately some of the analyses are off because at the time we did not have access to good timers.
  3. The Boehm-Demers-Weiser Collector
    This collector is the best absolutely conservative collector we could find. It is the baseline against which we report our performance results. Anyone interested in conservative collection should know about it. Boehm has many links to papers and related GC sites that you might want to check out as well.
  4. CMM - The Customizable Memory Manager by Giuseppe Attardi and Tito Flagella.
    CMM is an incredibly cool memory manager for C++ that allows the user to select different collection strategies for different portions of the heap. The default collection strategy is mostly-copying collection. This site contains source code as well as several papers giving the algorithms, and some performance numbers for CMM.
  5. Bartlett's Mostly-Copying Garbage Collector
    You can find all of Bartlett's papers and implementations at this site. This represents the original work on mostly-copying collection.
  6. Paul Wilson's garbage collection ftp archive and GC survey (~379Kb).
    If you are not familiar with garbage collection, you have to read Paul Wilson's survey. There are other good texts, but this is the place to start.

Files

Header Files        Description
header.h            Describes the header word format. Contains macros for constructing and analyzing header words.
dataStructures.h    Describes the types for all global data structures, including pages, heap blocks, and gcInfo (our one big internal data structure).
macros.h            Describes many macros used throughout the system, including some user macros.
prototypes.h        Describes the prototypes for all functions that are machine-independent and unrelated to debugging.
machine.h           Describes machine (and OS) specific routines and constants.
object.h            Describes macros for performing standard tasks on objects, such as pinning, forwarding, and extracting the header word.
pageHeader.h        Describes macros for manipulating the page header to determine the current status of a page.
debug.h             Describes all functions used for debugging and some debugging macros.

C Files             Description
collect.c           The function that performs the actual garbage collection. It should not be invoked by the user.
conservative.c      All the conservative aspects of collection, including quasi-pointer detection and stack scanning.
debug.c             Debugging code. Can be used to time various stages of collection.
deque.c             The only abstract data type we use in the collector. This deque is used as both a stack and a queue. It is used for the mark-stack and for the list of untyped objects in the heap (cObjects).
external.c          These are the functions that the user should call. There are substitutes for malloc, and routines for allocating large objects. gcAllocSpace should be used if inlining the allocation routines.
header.c            Currently unused. Contains a function to compute object sizes that has since been replaced by a macro. May still be useful during debugging.
heap.c              These functions manage the heap at a gross level. They allocate large heap blocks (1/2 MB or 1 MB, configurable) and free them.
implicitQueue.c     The implicit queue refers to the Cheney queue.
inline.c            This file is just a hack to manually inline the inner loop.
machine.c           Contains machine and OS specific code to find the beginning and end of the stack, static area, etc.
object.c            Routines for processing an object during collection.
page.c              Routines to get pages from, and return pages to, the collector's internal memory management.
parseHeader.c       A simple test function that can be used to analyze a header word of an object. This file should be compiled separately and is a debugging aid only. It is not needed by the collector.
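
As an illustration of how one deque can serve as both a stack and a queue, as described for deque.c above, here is a minimal fixed-capacity sketch built on a ring buffer. It is not the code in deque.c; the type `toy_deque`, its functions, and the capacity are hypothetical, and a real collector would need a growable structure.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DEQUE_CAP 16u  /* hypothetical fixed capacity */

typedef struct {
    void *items[TOY_DEQUE_CAP];
    size_t head;   /* index of the front (oldest) element */
    size_t count;  /* number of stored elements */
} toy_deque;

static void toy_deque_init(toy_deque *d)
{
    d->head = 0;
    d->count = 0;
}

/* Add x at the back. Returns 1 on success, 0 if the deque is full. */
static int toy_push_back(toy_deque *d, void *x)
{
    if (d->count == TOY_DEQUE_CAP) return 0;
    d->items[(d->head + d->count) % TOY_DEQUE_CAP] = x;
    d->count++;
    return 1;
}

/* Stack use: remove and return the most recently pushed element. */
static void *toy_pop_back(toy_deque *d)
{
    assert(d->count > 0);
    d->count--;
    return d->items[(d->head + d->count) % TOY_DEQUE_CAP];
}

/* Queue use: remove and return the oldest element. */
static void *toy_pop_front(toy_deque *d)
{
    void *x;
    assert(d->count > 0);
    x = d->items[d->head];
    d->head = (d->head + 1) % TOY_DEQUE_CAP;
    d->count--;
    return x;
}
```

Pairing `toy_push_back` with `toy_pop_back` gives last-in, first-out behavior suitable for a mark stack, while pairing it with `toy_pop_front` gives first-in, first-out behavior suitable for a work list processed in arrival order.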

Copyright Notice

(c) Frederick Smith, Greg Morrisett. October 1998, all rights reserved.

Copyright 1990-1993 Digital Equipment Corporation
All Rights Reserved

Permission to use, copy, and modify this software and its documentation is hereby granted only under the following terms and conditions. Both the above copyright notice and this permission notice must appear in all copies of the software, derivative works or modified versions, and any portions thereof, and both notices must appear in supporting documentation.

Users of this software agree to the terms and conditions set forth herein, and hereby grant back to Digital a non-exclusive, unrestricted, royalty-free right and license under any changes, enhancements or extensions made to the core functions of the software, including but not limited to those affording compatibility with other hardware or software environments, but excluding applications which incorporate this software. Users further agree to use their best efforts to return to Digital any such changes, enhancements or extensions that they make and inform Digital of noteworthy uses of this software. Correspondence should be provided to Digital at:

Director of Licensing Western Research Laboratory
Digital Equipment Corporation
250 University Avenue
Palo Alto, California 94301

This software may be distributed (but not offered for sale or transferred for compensation) to third parties, provided such third parties agree to abide by the terms and conditions of this notice.

THE SOFTWARE IS PROVIDED "AS IS" AND DIGITAL EQUIPMENT CORP. DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL DIGITAL EQUIPMENT CORPORATION BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


Last updated October 12, 1998.