Open and Extensible Compiler Implementations
(DRAFT)

1. Overview

This note discusses a bunch of papers that I have read that I feel are relevant to way that we ``package'' the compiler technology that our group produces. In particular, I think that material discussed here is relevant to development of,

The Bernoulli sparse compiler,
Data-shackling,
Source-to-source Matlab transformations, and
An Open/MP-like compiler.

We will start by introducing the concept of ``Open Implementations''. We will suggest that one of the key questions in designing an open implementation is deciding how much language or compiler support is required to support a particular open implementation interface. The bulk of the note will be used to present systems and approaches that use varying degrees of language and compiler support. Where possible, systems with a numerical bent are highlighted.

2. Open Implementations

The core theme of these papers is that of ``Open Implementations''. An open implementation is a system that gives some measure of control over implementation decisions to the user. An open sparse matrix library implementation might, for instance, allow the user to determine the storage format used for the matrices.

The term, open implementation, has coined by Kiczales's group at PARC. The group's Open Implementation home page can be found at [26].

2.1 OIAD

The first questions to be asked in designing an open implementation are,

Over what should the user be given control, and
How much control should the user be given.

[14] discusses design guidelines for ``open implementations''. This table, adapted from one at the end of the paper, summaries the design guidelines presented in the paper,

Interface Style How Strategy is Selected Tradeoffs When appropriate

Style A - No implementation strategy control interface Module can do whatever it wants. Can be adaptive. Same as Black-Box Abstraction One implementation strategy is enough. Or, module can adequately pick the strategy.

Style B - Client provides declarative information about its use. ``sequential file scan.'' Module matches against user's usage. Usage does not uniquely specify implementation. Patterns are hidden; why is the user to know which usage patterns will result in which implementation? When the module can do an adequate job with the declarative usage description.

Style C - Client specifies the implementation strategy. ``LRU cache management.'' User selects strategy. User may pick the wrong strategy. Fixed number of strategies. There are few candidate implementation strategies, but it is difficult to choose among them automatically

Style D - Client provides the implementation strategy. ``Here a function pointer for the policy.'' User provides the strategy. User has complete control, but the module might be very difficult to engineer. The user may find implementing the strategies difficult. It is not feasible for the module to implement alll strategies.

[22] presents a methodology for developing the interfaces and designs for the sorts of open implementations designed in [14].

2.2 Levels of Language Support

The next question to ask is,

How much language or compiler support is required to implement desired interface?

3. Little or no language support

Some interfaces require little or no language support above and beyond that provided by the target language. Below are some systems that fall into this category.

3.1 Sparse BLAS

The Sparse BLAS ([12] provides three levels of interface. The highest, or user, level interface specifies a smaller number of matrix kernels that can be passed a parameter stored in a number of different sparse matrix formats. This consistutes a style C interface.

3.2 PETSc

PETSc ([6], [5], [28]) allows the user to select matrix storage formats, preconditioners, solver algorithms, etc. from a smaller list of options (style C interface). This is done completely within ANSI C. This system could be extended to allow user-provided implementations (style D interface), but for some reason, the developers have chosen not to do so.

3.3 Runtime MOP

The Common Lisp Object System (CLOS) ([4]) implements an open object system on top of Common Lisp. The system is open in the sense that the user has access to a metaobject protocol (MOP) with which to change such things as object layout, dispatch mechanism, and inheritance behavior. Although many Common Lisp systems provide extensive support for CLOS, this is not required; the core of CLOS can be built directly on Common Lisp ([13]).

In the case of CLOS, the user's code and meta-code is executed as the same time, at run-time. These is not the case with all MOP's, as we will see in Section 8.

4. Generic Programming and Templates

Another approarch to implementing open implementations is by using techniques that fall under the umbrella of ``generic programming''. Language constructs for generic programming are found in many languages For example, ``templates'' in C++ are such a construct.

Template constructs (C++ templates in particular) require some additional syntax to the core language and often require relatively sophisticated compiler optimization to obtain good performance. However, these optimizations are generally specific to the language constructs and not to the particular problems in which they are used.

Good introductions to generic programming and templates can be found in any good C++ book. This section discusses approaches above and beyond their usual use.

4.1 Mixin Layers

A mixin class is an abstract class that is intended to be combined with a concrete class in order to enrich the concrete class's behavior. For instance, an abstract class the counted the number of times the number of times that the insert method was invoked would be considered a mixin class. It could be combined with a concrete list class in order to produce a list class that counted the number of items inserted.

([34]) suggests that there are situations in which stand-alone mixin classes are not enough: coupled mixin classes are required. In order to maintain the consistency between these classes, and in order to simplify the instantiation of these classes, the authors propose the use of meta-mixin classes, or mixin layers.

Here is how it might be done in C++: Suppose that we have several container classes, of which BINTREE is one example,

template<class Entry> class BINTREE {
    class Node { ... }
    class Container { ... insert and delete methods ...}
}

Here is a mixin layer for adding time-stamping to one of these container classes:

template<class Super> class TIMESTAMP : public Super {
    class Node : public Super::Node { ... }
    class Container : public Super::Container {
        ... override insert and delete methods ... }
}

These classes can be instantiated as,

    x = new TIMESTAMP< BINTREE< int > >

4.2 Views

In a generic algorithm, there are two types: the abstract type, which is used within the generic algorithm, and the application type, which is used outside of the generic algorithm. This paper addresses the problem of mapping between the two.

Conceptually, this can be done by thinking of the abstract type as a record with a set of basis variables. These variables form a basis, in the sense that, a change to one do not effect any of the others; the variables are ortogonal. Associated with each basis variable are accessor and mutator methods. The mapping problem can be restated as, how can such a record be constructed that maps a particular application type onto a given abstract type.

If one were to use this approach in an object oriented language, the application and abstract types would be implemented using distinct classes, which mapping between the two would be implemented as a "wrapper" or "ambassador" class that was a subclass of the abstract class. The programmer would have to write this class themselves, and there would be some amount of overhead associated with the translation at runtime.

Novak proposes two things ([23]):

1.: a GUI tool for quickly constructing the "view" between the two types. He discusses such a tool, which is targetted for a numerical domain. The user specifies equational constraints between the basis variables of the two classes, and the tool uses Mathematica to automatically derive a map between the basis variables.
2.: compiler support for reducing the overhead of the mapping. It's not clear to me how these is fundamentally different from, say, templates in C++.

Clusters ([24]) are to views what mixin layers ([34]) are to mixins.

4.3 TNT

In the Template Numerical Toolkit (TNT) ([31] [32]), dense and sparse vectors and arrays are built on top of the STL. Several canonical representations are provided for each type. For instance, various implementations of sparse vectors are built on top of the STL's list<>, map<>, and vector<> templates.

The usual operators are defined. Algorithms for direct methods and iterative methods are provided.

It is suggested that parallelism be handled in one of the following manners,

HPC++ pragmas.
Parallel versions of the STL.
OS threads and function points.
via parallel BLAS kernels.
distributed vectors in a SPMD setting.

4.4 BLITZ++

The BLITZ++ numerical library ([7]) uses very sophisticated C++ template code in order to obtain high-performance implemenations. These codes do not require additions to ANSI C++, but they do require sophisticated template optimizations in the base C++ compiler.

In [40], the author builds upon the material in [39] and [38] and shows various examples of how templates and inlining in C++ can be used to write programs that are succint and yet are optimized to the point that they competitive with hand-optimized F77 and F90 codes.

[41] introduces the Array templates of Blitz++. Expression templates are used to build efficient loops from high-level array notation (e.g., a = b + c). Many kinds of array and vector notations are provided. Loop transformations are done, but it is not shown how they are done.

5. Language Support - Aspect-Oriented Programming

Some interfaces require extensions to the base language or special purpose optimizations be performed by the base language compiler. Traditionally, a library writer in this situation had only two options: abandon the effort or design an implement a new language system (i.e., compiler, optimizations, debugger, etc.). Neither choice is very attractive.

Recently a third alternative has been proposed: extend the existing base-language compiler. In order to assist the library writer in implementing this third choice, two bodies of research have been pursued. The first is a vocabulary and methodology for incorporating language and compiler extensions into interface designs; the second is a body of work on compiler archicture for the open compiler implementer to draw upon. We will address the former in this section and the later in the subsequent sections.

Kiczales's group at PARC has developed the idea of Aspect-Oriented Programming (AOP) to discuss the sorts of language and compiler extensions that are sometimes required. The Aspect-Oriented Programming home page at Xerox PARC is at [2].

Traditionally, a programmer first decomposes an application into loosely coupled components (e.g., modules, objects). The more loosely couple the components, the better design. The application, then, consists of, ([15]),

(i): a component language,
(ii): a compiler (or interpreter), and
(iii): a program written in the component language that implements the application.

It has be observed, however, that a better design does not necessarily lend itself to an efficient implementation. A good example of this is sparse matrix computations. An interface for a generic sparse matrix is fairly simple to design and implement, but will likely result in a lousy implementation. On the one hand, the library writer needs to implement a storage format, but, on the other hand, the properties that determine which format is appropriate are not available to the library writer. In the AOP vocabulary, the decision of how to store a sparse matrix cross-cuts the interface.

The key idea behind AOP is that solutions to these cross-cutting concerns can often be abstracted from the specific problem in much the same way that the components are. In the AOP vocabularly, these cross-cutting solutions are called aspects. An application, in this framework, might consist of,

(i.a): a component language,
(i.b): one or more aspect languages,
(ii): an Aspect Weaver (TM)¹ for combining the programms written in the various languages,
(iii.a): a component program, and
(iii.b): one or more aspect programs

Compiler optimizations that we know and love, like loop transformations, message aggregation, storage layout optimization, are all aspects, albeit, very general and sophisticated ones. What the AOP people are suggesting is that we consider extensions and optimizations that may not be general purpose at all. That is, the programmer may be able to identify problem- or domain-specific extensions or optimizations that can be packaged as aspects.

An introduction to AOP concepts can be found in [15]. AOP ideas applied to sparse matrix computations are discussed in [16], [30], [29].

Here are several questions that come to mind about designing an AOP system,

Should the component and aspect languages be the same? In CLOS, they are (both Common Lisp). In the Bernoulli Sparse Compiler, they are not (BMF + ML).
What sort of aspect can be specified? In CLOS, many different aspects of the underlying implementation can be changed by the user. In the Bernoulli Sparse Compiler, aspect programs can be provided for only one language feature.
Who should write the aspect programs?
- The user? In this case, an open compiler implementation is required. The component and aspect language should probably be the same, so that the user does not have to master two different languages.
- Someone supporting users in a specific domain? A library writer, perhaps. In this case, the compiler needs to be open, but the component and aspect languages do not have to be the same.
- The compiler writer? The compiler writer can do anything that they want in this case.

6. High-level language constructions

Some instances of AOP systems are systems that provide language constructs that allow the programmer to write modules in a manner that separate the use of data structures from their implementations.

These systems provide open implementations of data structures, but they themselves are not open systems. That is, these systems do not allow the user to provide additional language or compiler extensions.

6.1 The Bernoulli Sparse Compiler

Yes, the Bernoulli Sparse Compiler is an AOP system, and we didn't even realize it. The component language is BMF, the aspect exposed to the user is sparse matrix storage format implementation (i.e., black-boxes), and the aspect language is O'Caml.

6.2 DiSTil

The DiSTiL ([33]) system provides two sets of extensions to C:

1.: Extensions for declaring data structures via abstract specifications. The system is then responsible for generating efficient implementations of the data structures from these specifications.
2.: New operators for accessing and manipulating these data structures. Some of these operators leverage relational database ideas (ie, accessing via a selection operator, cursors).

6.3 Adaptive programming

One of the key ideas of adaptive programming is to design ``structure-shy'' programs.

For instance, suppose that design a class heirarchy for the AST's to be used by a compiler. Traditionally, if the compiler writer wants to compute ``the set of expressions in the program that reference A'', then they need to write code that will first access the program object of the class Program, and then access objects of the Procedure class, then access objects of the Statement class, and then access objects of the Expression class. While each of these classes may be relatively self-contained, the code which uses them is not: it hard-codes the entire class hierarchy whenever the AST is travered!

An alterative is to add a traverse construct to the component language that allows the programmer to more or less directly say

traverse from Program p to Expression e do
    if e.accesses("A") then
        result.add(e);
    end if
end traverse

and allow the compiler to figure out how the class hierarchy needs to be traversed in order to implement this. The advantage of using the traverse construct is that the class heirarchy can be changed (e.g., Procedure becomes Method) without having to change this code fragment at all.

A CACM article on adapative programming can be found at [20], and the home page for the Demeter research project can be found at [19].

[9] shows how the ``traversal'' construct from adaptive programming can be implemented in an open compiler using these two mechanisms,

a mechanism for adding syntactic extensions to the component programming language.
a compile-time metaobject protocal for examining a program's class graph.

7. Transformation systems

If taken to its logical conclusion, it seems that an AOP system should allow the user to specify their own aspects. This requires an open compiler implementation that accepts user language and compiler extensions. In this section we discuss compiler systems that provide a transformational framework and provide a mechanism for user-provided program transformations.

7.1 Polya

*todo*:

7.2 Intentional Programming

Intentional Programming ([1]) is basically a program transformation system with the following additions,

The primitive constructs of the base programming language are expressed as ``intentions'', and are not treated differently than user created intentions.
The order in which transformations are applied is user-specified and deterministic. This is the big different with other transformation systems.
A editor and browser is provided.

The idea is that the user adds their own domain specific intentions to the intentions already in the system, and then turns the transformation crank to get the final code.

8. Compile-time MOP

A different approach to designing an open compiler is to expose the compiler's data structures to the user. These data structures are often implemented in an object oriented manner, and the user changes the compiler's default behaviour by creating new classes, specializing methods, etc. This approach can be characterized as a compile-time metaobject protocol (MOP).

The compile-time MOP's discussed in the literature seem to built upon either the program's parse tree, or the class graph infered from the program.

8.1 Engler's hack to `lcc`

The author ([10]) takes lcc as a base implementation. He layered a slightly cleaned up IR on top of lcc's IR, and layered other high-level interfaces on top of this. A user can then implement compiler transformations and optimizations that use these informations. He then set up lcc so that the user can specify modules on the command line that will be dynamically linked and invoked by the compiler.

The author give numerous applications of this really simple hack:

Adding type-aware function (e.g., Pascal's writeln) to C.
Adding code to check all system call result values.
Various optimizations of struct declarations to optimize for space or speed (i.e., avoid unaligned fields).
Using type information to optimize calls to memcpy.
Message aggregation for RPC calls.

The author claims that this took less than a man-month's worth of work.

Strictly speaking, this is not a ``MOP'' design, but it has a similar flavor to many of the other compile-time MOP systems, and it demonstrates the simpilicity and flexibility of this approach.

8.2 Open C++

A compile-time MOP for C++ ([35], [27]).

The system is implemented as a preprocessor that reads Open C++ and writes C++. The meta-classes are the classes of the AST's used by the preprocessor. The user's meta-code is linked into the preprocessor prior to running. The preprocessor simply invokes ``Compiler'' methods on each of the meta-objects (i.e., AST objects) until the code stabilizes.

The paper describes how lexical extensions can implemented for C++.

The examples given of use of this system are,

Adding persistent objects
Generating run-time meta-objects that describe the class hierarchy.
Optimizations of arithmatic operations of a Matrix class.
Determining object layout
Determinging class implementation (e.g., Set("size<=1000;sorted")).

8.3 PARC/MIT Open Scheme compilers

Sequential: [18]. Parallel: [21]

9. Miscellaneous Compiler Architecure

In this section, we discuss work on compiler architecture from systems that are not targetted as being open or extensible. We discuss them here because they might be appropriate for an open compiler implementation.

9.1 FLINT

Strongly-typed intermediate form ([36]).

The FLINT system has multiple front-ends, a single monolithic middle-end, and multiple back-ends. The middle-end performs the usual optimizations. Adding a new language front-end might (probably will) require augmenting the FLINT IR with new types and operators and updating the middle-end optimizations for the new types and operators. The system is designed to be expressive, but not extensible.

9.2 Zephyr

This system ([43]) takes an abstract description of an AST type and produces code that in some language (e.g, C++, Java, ML) implements,

1.: the constructs and destructors of the type
2.: pickling and unpickling routines for a common binary data file format.

It's powerful enough to handle the AST types in a real system (eg, SUIF).

9.3 JTS

JTS ([8]) provides two basic tools -

Jak: Language extensions to Java that provide (a) quasiquote/unquote capabilities for generating AST's of Java programs, and (b) hygenic renaming of temporary variables in the generated code.

Bali: Essentially a parser generator. Takes as input a grammer with an associated class hierarchy for the AST, and generates a preprocessor for that grammer. Language extension can be done by composing Bali grammers.

9.4 `cmcc`

This group ([3]) assumes a single source language. Thus, their intermediate form is fixed. They provide frameworks (ie, sets of templates) for the implementation of

data-flow analysis
machine code generation
instruction scheduling
register allocation

9.5 Vortex

*todo*:

([11])

9.6 SUIF 2

AST's in SUIF 2 ([37]) are constructed by declaring subclasses of abstract AST classes. The hope is that analysis and optimization routines can be written to use only the methods of the abstract classes. That way, someone using SUIF 2 can add new programming constructs to the AST without having to change the analysis and optimization routines.

Bibliography

1: William Aitken, Brian Dickens, Paul Kwiatkowski, Oege de Moor, David Richter, and Charles Simonyi.
Transformation in intentional programming, September 19, 1997.
http://www.research.microsoft.com/ip/overview/TrafoInIP.ps.
Abstract: Intentional programming is a new paradigm in software engineering that allows programming languages to be implemented in a highly extensible manner. In particular, the programmer can specify new abstractions that are specific to his problem domain, while simultaneously recording any domain specific optimizations that may apply to such new abstractions. This paper describes a system that implements intentional programming, focusing on the facilities for program transformation. The key difference with other approaches lies in the way the order of transformation is controlled: great emphasis is placed on specifying that order in a compositional fashion, so that transformations are easily re-used.

Keyword: program transformation, domain-specific languages, optimization, design composition, extensible programming languages
2: Aspect-oriented programming.
http://www.parc.xerox.com/spl/projects/aop/.
May 2, 1998.
3: Ali-Reza Adl-Tabatabai, Thomas Gross, and Guei-Yuan Lueh.
Code reuse in an optimizing compiler.
In OOPSLA '96 [25], pages 51-68.
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/ali/www/oopsla96.ps.
Abstract: This paper describes how the cmcc compiler reuses code -- both internally (reuse between different modules) and externally (reuse between versions for different target machines). The key to reuse are the application frameworks developed for global data-flow analysis, code generation, instruction scheduling, and register allocation. The code produced by cmcc is as good as the code produced by the native compilers for the MIPS and SPARC, although significantly less resources have been spent on cmcc (overall, about 6 man years by 2.5 persons). cmcc is implemented in C++, which allows for a compact expression of the frameworks as class hierarchies. The results support the claim that suitable frameworks facilitate reuse and thereby significantly improve developer effectiveness.
4: Daniel G. Bobrow, Linda G. DeMichiel, Richard P. Gabriel, Sonya E. Keene, Gregor Kiczales, and David A. Moon.
Common lisp object system specification.
SIGPLAN Notices, 23, September 1988.
Keyword: CLOS, metaobject protocol, Lisp
5: Satish Balay, William Groopp, Lois Curfman McInnes, and Barry Smith.
PETSc 2.0 users manual.
Technical Report ANL-95/11 - Revision 2.0.15, Mathematics and Computer Science Division, Argonne National Laboratory, 1996.
6: Satish Balay, William D. Gropp, Lois Curfman McInnes, and Barry F. Smith.
Efficient management of parallelism in object-oriented numerical software libaries.
In E. Arge, A.M. Bruaset, and H.P. Langtangen, editors, Modern Software Tools in Scientific Computing. Birkhauser Press, 1997.
*todo*: http://???
Abstract: Parallel numerical software based on the message-passing model is enormously complicated. This paper introduces a set of techniques to manage the complexity, while maintaining high efficiency and ease of use. The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the paralleism, a a high-quality set of numerical software libraries. In fact, the programming model used by PETSc is also the most appropriate for NUMA shared-memory machines, since they require the same careful attention to memory hierarchies as do distributed-memory machines. Thus, the concepts discussed are appropriate for all scalable computing systems. The PETSc libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability

Keyword: PETSc, numerical libraries, parallel libraries, sparse computations
7: Blitz++ home page.
http://monet.uwaterloo.ca/blitz/.
May 2, 1998.
8: Don Batory, Bernie Lofaso, and Yannis Smaragdakis.
JTS: Tools for implementing domain-specific languages.
In Accepted to the 5th International Conference on Software Reuse, Victoria, British Columbia, June 1998.
ftp://ftp.cs.utexas.edu/pub/predator/jts.ps.
Abstract: The Jakarta Tool Suite (JTS) aims to reduce substantially the cost of generator development by providing domain independent tools for creating domain-specific languages and component-based generators called GenVoca generators. JTS is a set of precompiler-compiler tools for extending industrial programming languages (e.g., Java) with domain-specific constructs. JTS is itself a GenVoca generator, where precompilers for JTS-extended languages are constructed from components.

Keyword: Java, extensible compilers, domain-specific languages, macros
1: Cristina Videira Lopesand Karl J. Lieberherr.
Ap/s++: Case-study of a mop for purposes of software evolution.
a revise version of this paper to appear in the proceedings of Reflection '96.
ftp://www.ccs.neu.edu/pub/people/crista/papers/reflection96.ps.
Abstract: We study a recent programming paradigm known as Adaptive Programming (AP) as an ideal candidate for a metaobject protocal (MOP) for object-oriented programming languages; we call it the AP^MOP. The major benefit of the AP^MOP is to provide a mechanism for writing base-level programs in a structure-shy manner. Doing so, the prograams are more robust to changes in the structural aspects of the applications. We describe AP/S++, an implementation of the AP^MOP using the Scheme-based, object-oriented language S++. AP/S++ is a compile-time MOP and has no negative effects on the run-time performance of programs. The constributions of this paper are: (i) to show a new application for reflection; (ii) to clearly identify the abstraction boundaries of AP; and (iii) to propose an implementation of the AP^MOP that can easily be reproduced in many object-oriented programming languages.

Keyword: adaptive programming, metaobject protocals
10: Dawson R. Engler.
Incorporating application semantics and control into compilation.
In USENIX DSL '97 [42].
http://www.pdos.lcs.mit.edu/ engler/magik.ps.
Abstract: Programmers have traditionally been passive users of compilers, rather than active exploiters of their transformational abilities. This paper presents MAGIK, a system that allows programmers to easily and modularly incorporate application-specific extensions into the compilation process. The MAGIK system gives programmers two significant capabilities. First, it provides mechanisms that implementors can use to incorporate application semantics into compilation, thereby enabling both optimizations and semantic checking impossible by other means. Second, since extensions are invoked during the translation from source to machine code, code transformations (such as software fault isolation [14]) can be performed with full access to the symbol and data flow information available to the compiler proper, allowing them both to exploit source semantics and to have their transformations (automatically) optimized as any other code.

Keyword: Magik, compiler intrastructure, open compilers, lcc
11: Jeffrey Dean, Greg DeFouw, David Grove, Vass Litvinov, and Craig Chambers.
Vortex: An optimizing compiler for object-oriented languages.
In OOPSLA '96 [25], pages 83-100.
ftp://ftp.cs.washington.edu/homes/chambers/hybrid.ps.Z.
Abstract: Previously, techniques such as class hierarchy analysis and profile-guided receiver class prediction have been demonstrated to greatly improve the performance of applications written in pure object-oriented languages, but the degree to which these results are transferable to applications written in hybrid languages has been unclear. In part to answer this question, we have developed the Vortex compiler infrastructure, a language-independent optimizing compiler for object-oriented languages, with front-ends for Cecil, C++, Java, and Modula-3. In this paper, we describe the Vortex compiler's intermediate language, internal structure, and optimization suite, and then we report the results of experiments assessing the effectiveness of different combinations of optimizations on sizable applications across these four languages. We characterize the benchmark programs in terms of a collection of static and dynamic metrics, intended to quantify aspects of the "object-oriented- ness" of a program.

Keyword: vortex, compiler infrastructure, compiling object-oriented languages
12: BLAS Techinal Forum.
Sparse BLAS library: Lite and toolkit level specifications, January 1997.
editted by Roldan Pozoand Micheal A. Heroux and Karin A. Remington.
http://math.nist.gov/spblas/blast.ps.
Keyword: sparse blas
13: Gregor Kiczales, Jim des Rivières, and Daniel G. Bobrow.
The Art of the Metaobject Protocol.
MIT Press, Cambridge, MA, 1991.
Keyword: metaobject protocol, CLOS, Lisp, object-oriented languages
14: Gregor Kiczales, John Lamping, Cristina Videira Lopes, Chris Maeda, Anurag Medhekar, and Gail Murphy.
Open implementation design guidelines, 1997.
http://www.parc.xerox.com/spl/projects/oi-at-parc/ourpapers/oidg.ps.gz.
Abstract: Designing reusable software modules can be extremely difficult. The design must be balanced between being general enough to address the needs of a wide range of clients and being focused enough to truly satisfy the requirements of each specific client. One area where it can be particularly difficult to strike this balance is in the implementation strategies, tuned for a wide range of clients, aren't necessily optimal for each specific client -- this is especially an issue for modules that are intended to be reusable and yet provide high-performance. An examination of existing software systems shows that an increasingly important technique for handling this problem is to design the module's interface in such a way that the client can assist of participate in the selection of the module's implementation strategy. We call this approach open implemention. When designing the interface to a module that allows its clients some control over its implementation strategy, it is important to retain, as much as possible, the advantages of traditional closed implementation modules. This paper explores issues in the design of interfaces to open implementation modules. We identify key design choices, and present guidelines for deciding which choices are likely to work best in particular situations.

Keyword: open implementations, software design, software reuse
15: Gregor Kiczales, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Videira Lopes, Jean-Marc Loingtier, and John Irwin.
Aspect-oriented programming.
Technical Report SPL97-008 P9710042, Xerox Palo Alto Research Center, February 1997.
http://www.parc.xerox.com/spl/projects/aop/tr-aop.htm.
Abstract: We have found many programming problems for which neither procedural nor object-oriented programming techniques are sufficient to clearly capture some of the important design decisions the program must implement. This causes the implementation of those design decisions to be scattered throughout the code, resulting in "tangled" code that is excessively difficult to develop and maintain. We present an analysis of why it is that such design decisions have been so difficult to clearly capture in actual code. We call the issues these decisions address ASPECTS, and say that the reason they have been hard to capture is that they CROSS-CUT the system's basic functionality. We present the basis for a new programming technique, called aspect-oriented programming, that makes it possible to clearly express programs involving such aspects, including appropriate isolation, composition and reuse of the aspect code. The discussion is rooted in systems we have built using aspect-oriented programming. Object-oriented programming has been presented as a technology that can fundamentally aid software engineering, because the underlying object model provides a better fit with real domain problems. But we have found many programming problems where OOP techniques are not sufficient to clearly capture all the important design decisions the program must implement. Instead, it seems that there are some programming problems that fit neither the OOP approach nor the procedural approach it replaces.

Keyword: aspect-oriented programming, open compilers
16: John Irwin, Jean-Marc Loingtier, John Gilbert, Gregor Kiczales, John Lamping, Anurag Mendhekar, and Tatiana Shpeisman.
Aspect-oriented programming of sparse matrix code.
Technical Report SPL97-007 P9710045, Xerox Palo Alto Research Center, February 1997.
http://www.parc.xerox.com/spl/projects/aop/tr-aml.htm.
Abstract: The expressiveness conferred by high-level and object-oriented languages is often impaired by concerns that cross-cut the basic functionality. Execution time, data representation, and numerical stability are three such concerns that are of great interest to numerical analysts. Using aspect-oriented programming we have created AML, a system for sparse matrix computation that deals with these concerns separately and explicitly while preserving the expressiveness of the original functional language. The resulting code maintains the efficiency of highly tuned low-level code, yet is ten times shorter.

Keyword: sparse compiling, aspect-oriented programming, open compiling
17: IMSA '92 Proceedings (Workshop on Reflection and Meta-level Architectures), 1992.
18: John Lamping, Gregor Kiczales, Luis H. Rodriguez, and Erik Ruf.
An archutecture for an open compiler.
In IMSA '92 [17].
http://www.parc.xerox.com/spl/groups/eca/pubs/complete.html.
Abstract: This is a progress report on an experiment to build a compile-time metaobject protocol for Scheme. The compilation setting raises issues not present in runtime oriented MOP's, due to the complexity of the domain and the coupling between different parts. To address the complexity of the domain, we have developed a structure that decomposes the description of an implementation into a combination of many small, partially interacting choices. To address the coupling, we have developed a decision making process that allows implementation choices to be made by a collaboration between user interventions and default decision making

Keyword: Scheme, open implementation
19: Karl J. Lieberherr.
Demeter's www home.
http://www.ccs.neu.edu/research/demeter/.
May 2, 1998.
Keyword: adaptive programming
20: Karl J. Lieberherr, Ignacio Silva-Lepe, and Cun Xiao.
Adapative object-oriented programming using graph-based customization.
Communications of the ACM, 37(5):94-101, may 1994.
Keyword: adapative programming, Demeter
21: Luis H. Rodriguez.
A study on the virability of a production-quality metaobject protocol-based statically parallelizing compiler.
In IMSA '92 [17].
Abstract: A compiler that automatically chooses a program's parallelization if often unable to choose either the best one or the particular one that a programmer has in mind. This has led to systems that provide the programmer with explicit control over a program's parallelization, for example, via compiler pragmas. The pragma approach is like the metaobject protocol (MOP) approach in that pragmas provide control over what would otherwise be hidden aspects of an implementation. However, it differs because the set of pragmas is fixed, thereby limiting the amount of control provided. We investigated whether it was possible to increase the amount of control using the full MOP approach. We were in fact successful, but the resulting MOP differs from previous ones in that it is present at compile-time rather than at run-time. In this paper, we compare the MOP approach with other approaches, and discuss what is needed in order to produce a production-quality MOP-based statically parallelizing compiler.

Keyword: metaobject protocol, open implementation, parallelizing compiler
22: Chris Maeda, Arthur Lee, Gail Murphy, and Gregor Kiczales.
Open implementation analysis and design, 1997.
http://www.parc.xerox.com/spl/projects/oi-at-parc/ourpapers/oiad.ps.gz.
Abstract: This paper describes a methodology for designing Open Implementations - software modules that can adapt or change their internals to accommodate the needs of different clients. Analysis techniques are used for capturing domain knowledge, user requirements, and domain properties that influence the module's eventual implementation. Design techniques are used for determining and refining the interfaces by which clients control the modules implementation strategies. The methodology has evolved over the past two uears in several pilot projects.

Keyword: software design methodology, software reuse, open implementation, framework, metaobject protocal
23: Jr. Novak, Gordon S.
Creation of views for reuse of software with different data representations.
IEEE Transactions on Software Engineering, 21(5):993-1005, December 1995.
http://www.cs.utexas.edu/users/novak/tose95.html.
Abstract: Software reuse is inhibited by the many different ways in which equivalent data can be represented. We describe methods by which views can be constructed semi-automatically to describe how application data types correspond to the abstract types that are used in numerical generic algorithms. Given such views, specialized versions of generic algorithms that operate directly on the application data can be produced by compilation. This enables reuse of the generic algoriths for an application with minimal effort. Graphical user interfaces allow views to be specified easily and rapidly. Algorithms are presented for deriving, by symbolic algebra, equations that related the variables used in the application data to the variables needed for the generic algorithms. Arbitrary application data structures are allowed. Units of measurement are converted as needed. These techniques allow reuse of a single version of a generic algorithm for a variety of possible data representations and programming languages. These techniques can also be applied in data conversion and in object-oriented, functional, and transformational programming.

Keyword: view, program transformation, generic algorithm, software reuse, data conversion, visual programming, symbolic algebra, abstract data type
24: Jr. Novak, Gordon S.
Software reuse by specialization of generic procedures through views.
IEEE Transactions on Software Engineering, 23(7), July 1997.
http://www.cs.utexas.edu/users/novak/tose97.html.
Abstract: A generic procedure can be specialized, by compilation through views, to operate directly on concrete data. A view is a computational mapping that describes how a concrete type implements an abstract type. Clusters of related views are needed for specialization of generic procedures that involve several types or several views of a single type. A user interface that reasons about relationshipts between concrete types and abstract types allows view clusters to be created easily. These techniques allow rapid specialization of generic procedures for applications.

Keyword: software reuse, view, generic algorithm, generic procedure, algorithm specialization, partial evaluation, direction-manipulation editor, abstract data type
25: Eleventh Annual Conference on Object-Oriented Programming Systems, Languages and Applications, San Jose, CA, October 6-10, 1996.
26: Open implementation home page.
http://www.parc.xerox.com/spl/projects/oi/.
May 2, 1998.
27: Openc++ home page.
http://www-masuda.is.s.u-tokyo.ac.jp/openc++.html.
May 3, 1998.
28: Petsc - the portable, extensible toolkit for scientific computation.
http://www.mcs.anl.gov/petsc/.
May 5, 1998.
29: Bill Pugh and Tatiana Shpeisman.
Compiling efficient sparse matrix code: LU factorization.
white paper, November 18, 1996.
30: William Pugh and Tatiana Shpeisman.
Generation of the efficient code for sparse matrix computations.
submitted to ???, February 1, 1998.
Abstract: Developing computational codes that compute efficiently with sparse matrices is a difficult and error-prone process that requires a lot of specialized knowledge from the programmer. The resulting codes are very complicated due to the specialized data structures used to store nonzero elements of the matrices. Automatic generation of sparse code from the corresponding dense version would simplify the programmer's task, provided that a compiler-generated code is fast enough to be used instead of a hand-written code. Generating efficient sparse matrix code is complicated, as the performance depends on many factors. We propose a new Sparse Intermediate Program Representation (SIPR) that spearates the issue of maintaining complicated data structures from the actual matrix computations to be performed. Cost analysis of SIPR allows for the prediction of the program efficiency, and provides a solid basis for choosing between many possible sparse implementations. The SIPR framework allows the use of techniques that are frequently used in the hand-written code but previously were not considered for compiler-generated codes due to their complexity.

Keyword: spare compilation
31: Roldan Pozo.
Template numerical toolkit for linear algebra: High performance programming with c++ and the standard template library.
In Environments and Tools for Parallel Scientific Computing III, Faverages de la Tour, France, August 21-23, 1996.
http://math.nist.gov/tnt/tnt.ps.gz.
Abstract: We present a new C++ library design for linear algebra computations on high performance architectures. The Template Numerical Toolkit (TNT) for Linear Algebra is a successor to the Lapack++, Sparselib++, and IML++ packages, providing support for direct and iterative solvers. Its goal is to formally integrate these ideas into a generic algorithm library supporting user-defined data types and data neutrality. The design of the core library utilizes components from the C++ Standard Template Library (STL) and the basic parallel extensions defined in HPC++.

Keyword: TNT, sparse libraries
32: Roldan Pozo.
A template numerical toolkit for linear algebra.
http://math.nist.gov/tnt/.
May 4, 1998.
33: Yannis Smaragdakis and Don Batory.
DiSTiL: A transformation library for data structures.
In USENIX DSL '97 [42].
ftp://ftp.cs.utexas.edu/pub/predator/distil.ps.
Abstract: DiSTiL is a software generator that implements a declarative domain-specific language (DSL) for container data structures. DiSTiL is a represenative of a new approach to domain-specific language implementation. Instead of being the usual one-of-akind stand-alone compiler, DiSTiL is an extenion library for the Intentional Programming (IP) ttransformation system (currently under development by Microsoft Research). DiSTiL relies on several reusable, general-purpose infrastructure tools offered by IP that substantially simplify DSL implementation.

Keyword: domain-specific language, intentional programming, container classes, automatically generated data structures
34: Yannis Smaragdakis and Don Batory.
Implementing layered designs with mixin layers.
In ECOOP '98, July 1998.
ftp://ftp.cs.utexas.edu/pub/predator/templates.ps.
Abstract: Mixin layers are a technique for implementing layered object-oriented designs (e.g. collaboration-based designs). Mixin layers are similar to abstract subclasses (mixin classes) but scaled to a multiple-class granularity. We describe mixin layers from a programming language viewpoint, discuss checking the consistency of a mixin layer composition, and analyze the language support issues involved.

Keyword: mixins, object-oriented programming, C++, CLOS
35: Shigeru Chiba.
A metaobject protocol for c++.
In Tenth Annual Conference on Object-Oriented Programming Systems, Languages and Applications, pages 285-299, Austin, TX, October 1995.
http://www-masuda.is.s.u-tokyo.ac.jp/publications/chiba-oopsla95.ps.gz.
Abstract: This paper presents a metaobject protocol (MOP) for C++. This MOP was designed to bring the power of meta-programming to C++ programmers. It avoids penalties on runtime performance by adopting a new meta-architecture in which the metaobjects control the compilation of programs instead of being active during program execution. This allows the MOP to be used to implement libraries of efficient, transparent language extensions.

Keyword: metaobject protocol, C++, open implementation
36: Zhong Shao.
Typed common intermediate format.
In USENIX DSL '97 [42].
http://flint.cs.yale.edu/flint/publications/tcif.html.
Abstract: Application languages are very effective in solving specific software problems. Unfortunately, they pose great challenges to reliable and efficient implementations. In fact, most existing application languages are slow, interpreted, and have poor interoperability with general-purpose languages. This paper presents a framework for building a high quality systems environment for multiple advanced languages. Our key innovation is the use of a common typed intermediate language, named FLINT, to model the semantics and interactions of various language-specific features. FLINT is based on a predicative variant of the Girard-Reynolds polymorphic calculus F-omega, extended with a very rich set of primitive types and functions. FLINT provides a common compiler infrastructure that can be quickly adapted to generate compilers for new general-purpose and domain-specific languages. With its single unified type system, FLINT serves as a great platform for reasoning about cross-language inter-operations. FLINT types act as a glue to connect language features that complicate interoperability, such as mixed data representations, multiple function calling conventions, and different memory management protocols. In addition, because all runtime representations are determined by FLINT types, languages compiled under FLINT can share the same system-wide garbage collector and foreign function call interface.

Keyword: flint, intermediate forms, compiler infrastructure
37: The SUIF 2.0 compiler system.
SUIF Compiler System.
http://www-suif.stanford.edu/suif/suif2.0/index.html.
May 2, 1998.
38: Todd Veldhuizen.
Expression templates.
C++ Report, 7(5):26-31, June 1995.
http://monet.uwaterloo.ca/ tveldhui/papers/Expression-Templates/exprtmp l.html.
Abstract: Expression Templates is a new C++ technique for passing expressions as function arguments. The expression can be inlined into the function body, which results in faster and more convenient code thatn C-style callback functions. This technique can be used to evaluate vector and matrix expressions in a single pass without temporaries. In preliminary benchmark results, one compiler evaluates vector expressions at 95-99.5% efficieny of hand-coded C using this technique (for long vectors). The speed is 2-15 times that of a conventional C++ vector class.

Keyword: C++, templates, metaprogramming
39: Todd Veldhuizen.
Using c++ template metaprograms.
C++ Report, 7(4):36-43, May 1995.
Reprinted in the book C++ Gems edited by Stanley Lippman.
http://monet.uwaterloo.ca/ tveldhui/papers/Template-Metaprograms/meta-a rt.html.
Abstract: No abstract.

Keyword: C++, templates, metaprogramming
40: Todd Veldhuizen.
Will c++ be faster than fortran.
In ISCOPE '97, 1997.
http://monet.uwaterloo.ca/ tveldhui/papers/iscope97.ps.
Abstract: After years of being dismissed as too slow for scientific computing, C++ has caught up with Fortran and appears ready to give it stuff competition. We survey the reasons for the historically poor performance of C++ (pairwise expression evaluation, the abstraction penalty, aliasing ambiguities) and explain how these problems have been resolved. C++ can be faster than Fortran for some applications, due to template techniques (such as expression templates and template metaprograms) which permit optimizations beyond the ability of current Fortran compilers.

Keyword: Blitz++, C++, templates, metaprogramming
41: Todd Veldhuizen.
The blitz++ array model.
In ISCOPE '98, 1998.
to appear.
http://monet.uwaterloo.ca/blitz/papers/iscope98.ps.
Abstract: This paper describes Blitz++ arrays, which offer functionality and effciency competitive with Fortran 90, but without any language extensions. One of the key ideas behind Blitz++ is moving features out of compilers and into libraries. Blitz++ handles parsing and analysis of array expressions on its own, and performs optimizations (particularly loop transformations) which have until now been the responsibility of compilers. This is done using the expression template technique. Blitz++ provides many features unavailable in Fortran 90, such as arbitrary transpose operations, array renaming, tensor notation, scans, partial and block reductions. Blitz++ arrays achieve a median performance of 95-98% that of Fortran over 23 loop kernels on the Cray T3E.

Keyword: Blitz++, C++, templates, metaprogramming
42: 1997 Usenix Conference on Domain-Specific Languages, Santa Barbara, CA, October 1997.
http://www.usenix.org/publications/library/proceedings/dsl97/.
43: Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Christopher S. Serra.
The zephyr abstract syntax description language.
In USENIX DSL '97 [42].
http://www.cs.princeton.edu/zephyr/ASDL/docs/slides/ztalk-dsk97.ps.
Abstract: The Zephyr Abstract Syntax Description Language (ASDL) describes the abstract syntax of compiler intermediate representations (IRs) and other tree-like data structures. Just as the lexical and syntactic structures of programming languages are describe with regular expressions and context free grammars, ASDL provides a concise notation for describing the abstract syntax of programming languages. Tools can convert ASDL descriptions into the appropriate data-structure definitions and functions to convert the data-structures to or from a standard flattened representation. This makes it easier to build compiler components that interoperate. Although ASDL lacks subtyping and inheritance, it is able is able to describe the Stanford University Intermediate Format (SUIF) compiler IR, originally implemented in C++. We have built a tool that converts ASDL info C, C++ Java, and ML data-structure definitions and conversion functions. We have also built a graphical browser-editor of ASDL data structures. ASDL shares features found in many network interface description languages (IDLs), algebraic data types, and languages such as ASN.1 and SGML. Compared to other alternatives ASDL is simple and powerful. This document describes ASDL in detail and presents an initial evaluation of ASDL.

Keyword: Zephyr, compiler infrastructure, abstract syntax trees

Footnotes

... (TM)¹: Yes, (TM).

Paul Stodghill
1998-10-23

Interface Style	How Strategy is Selected	Tradeoffs	When appropriate
Style A - No implementation strategy control interface	Module can do whatever it wants. Can be adaptive.	Same as Black-Box Abstraction	One implementation strategy is enough. Or, module can adequately pick the strategy.
Style B - Client provides declarative information about its use. ``sequential file scan.''	Module matches against user's usage.	Usage does not uniquely specify implementation. Patterns are hidden; why is the user to know which usage patterns will result in which implementation?	When the module can do an adequate job with the declarative usage description.
Style C - Client specifies the implementation strategy. ``LRU cache management.''	User selects strategy.	User may pick the wrong strategy. Fixed number of strategies.	There are few candidate implementation strategies, but it is difficult to choose among them automatically
Style D - Client provides the implementation strategy. ``Here a function pointer for the policy.''	User provides the strategy.	User has complete control, but the module might be very difficult to engineer. The user may find implementing the strategies difficult.	It is not feasible for the module to implement alll strategies.

Open and Extensible Compiler Implementations (DRAFT)

8. Compile-time MOP

Footnotes

Open and Extensible Compiler Implementations
(DRAFT)