A Common Bernoulli Compiler Framework
(DRAFT - October 23, 1998)
Paul Stodghill
This document summarizes some of the issues and ideas that were raised in
our meeting on May 13. The goal is to come up with a strategy and
implementation plan for developing a common framework for compiler
development within the group.
As I see it, our priorities in this effort, in rapidly decreasing order of
importance, should be,
- 1. The framework should make it easier for us to mix and match compiler
modules that have been developed within the group.
- 2. The framework should make it easier for us to import foreign modules
into our systems.
- 5. The framework should not impede the process of ``delivering'' a
compiler system to users outside of our group.
- 10. Certain components of the framework should be solid pieces of
engineering that can be marketed in their own right (e.g., as a
competitor to the NCI).
- 100. The framework may end up providing a vehicle for experimenting
with ``open compiling'' and ``aspect-oriented programming'' ideas.
I strongly believe that one of the first problems to be tackled in
developing a common compiler framework is determining how modules
will communicate with one another. I do not mean specifying the interfaces
between modules; I mean specifying the underlying technology that will be
used to pass data between modules.
Some of the functionality that must be provided by such a substrate is,
- Naming. How does one module written in one language reference
another module written in a second language?
- Invocation. How does one module invoke another?
- Data representation. How can module parameters and results be encoded
in a language independent manner?
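The three functions listed above can be illustrated in miniature. The following is only a sketch, not the BOCI design: the `Registry` class, the module name `"opt.constfold"`, the toy `fold_constants` module, and the use of JSON as the language-independent encoding are all assumptions made for this example.

```python
import json


class Registry:
    """Hypothetical substrate: maps language-neutral names to callables."""

    def __init__(self):
        self._modules = {}

    def register(self, name, func):
        # Naming: a module is known by a string, not by a language-level symbol.
        self._modules[name] = func

    def invoke(self, name, encoded_args):
        # Invocation: look the module up by name and call it.
        # Data representation: arguments and results cross the boundary
        # as JSON text (an assumption for this sketch).
        args = json.loads(encoded_args)
        result = self._modules[name](args)
        return json.dumps(result)


def fold_constants(expr):
    """Toy module: fold ["+", a, b] when both operands reduce to integers."""
    if isinstance(expr, list) and len(expr) == 3 and expr[0] == "+":
        left, right = fold_constants(expr[1]), fold_constants(expr[2])
        if isinstance(left, int) and isinstance(right, int):
            return left + right
        return ["+", left, right]
    return expr


registry = Registry()
registry.register("opt.constfold", fold_constants)
reply = registry.invoke("opt.constfold", json.dumps(["+", 1, ["+", 2, 3]]))
```

The caller only ever sees a name and encoded text, so in principle the module behind the name could be written in any of the group's languages.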
As was pointed out at the meeting, there are at least two general
approaches to this problem:
- Modules are programs. Data is passed via the file system.
- Modules coexist in a single executable. Data must be translated from
one language representation to another.
These two general approaches are not mutually exclusive. In particular, the
first (one module per executable) can be built on top of the second
(multiple modules per executable). The opposite is not true.
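To see why the layering works in this direction, consider the following sketch, in which a module that lives inside an executable is wrapped so that it reads its input from a file and writes its result to another. The names `count_nodes` and `run_as_program`, and the JSON file format, are hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


def count_nodes(tree):
    """Toy in-process module: count the nodes of a nested-list tree."""
    if isinstance(tree, list):
        return 1 + sum(count_nodes(child) for child in tree)
    return 1


def run_as_program(module, in_path, out_path):
    """Wrap an in-process module so that it communicates through files."""
    with open(in_path) as src:
        data = json.load(src)
    result = module(data)
    with open(out_path, "w") as dst:
        json.dump(result, dst)


# The same module, called directly within one executable...
direct = count_nodes(["a", ["b", "c"]])

# ...and called as if it were a separate program, via the file system.
with tempfile.TemporaryDirectory() as workdir:
    in_path = os.path.join(workdir, "in.json")
    out_path = os.path.join(workdir, "out.json")
    with open(in_path, "w") as f:
        json.dump(["a", ["b", "c"]], f)
    run_as_program(count_nodes, in_path, out_path)
    with open(out_path) as f:
        via_files = json.load(f)
```

The wrapper is a few lines of generic code that does not depend on what the module computes, which is the sense in which the file-based approach sits on top of the in-process one.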
I intend to take the code that currently exists within the Bernoulli
compiler, generalize it and separate it out into a separate package,
tentatively called the Bernoulli Open Compiler Infrastructure (BOCI). I
intend to implement both the ``modules as programs'' and ``single
executable'' approaches into this system and to provide interoperability
between the implementation languages that are currently used within the
group (C, C++, O'Caml, Java).
My intention is to make the BOCI simple enough to use that others in the
group will find it worth using. The net gain for the people that use the
BOCI will be instant low-level interoperability with other BOCI codes.
Suppose that one of us (call him the developer) develops a module that
performs a particular optimization, say constant propagation (CP), on top of
the BOCI. Later, another of us (call him the client), working on a
different compiler system, realizes that he needs a module to perform CP.
The client would like to use the developer's CP module, but what if the
client cannot easily convert their program representation to the one
specified by the developer's interface?
Here are some approaches that might be taken:
- The client can negotiate with the developer to change the interface.
Perhaps the developer can come up with a more general abstraction of the
CP problem, and use that as the basis for an improved interface.
- If the developer has used an OO language for the CP module, perhaps
the client can use inheritance, and so on, to extend the developer's CP
implementation to meet their needs.
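As an illustration of the first option, a CP module could be written against a small abstract view of a program rather than against one concrete representation, leaving each client to supply an adapter. Everything in the sketch below is hypothetical: the `StraightLineProgram` interface, the tuple-based client representation, and the restriction to straight-line code are assumptions made to keep the example short.

```python
import operator
from abc import ABC, abstractmethod

OPS = {"+": operator.add, "*": operator.mul}


class StraightLineProgram(ABC):
    """Hypothetical minimal view of a program that the CP module needs."""

    @abstractmethod
    def assignments(self):
        """Yield (target, op, operands) in program order.

        op is a callable; each operand is a variable name (str) or an int.
        """


def propagate_constants(program):
    """Developer's module: return {variable: value} for known constants."""
    known = {}
    for target, op, operands in program.assignments():
        # Replace variable names by their constant value when one is known.
        values = [known.get(x, x) if isinstance(x, str) else x
                  for x in operands]
        if all(isinstance(v, int) for v in values):
            known[target] = op(*values)
        else:
            known.pop(target, None)  # target is no longer a known constant
    return known


class TupleProgram(StraightLineProgram):
    """Client-side adapter for a hypothetical tuple-based representation."""

    def __init__(self, stmts):
        self._stmts = stmts

    def assignments(self):
        for stmt in self._stmts:
            if len(stmt) == 2:      # ("x", 2) means x = 2
                target, value = stmt
                yield target, (lambda v: v), [value]
            else:                   # ("y", "x", "+", 3) means y = x + 3
                target, left, symbol, right = stmt
                yield target, OPS[symbol], [left, right]


client_program = TupleProgram([
    ("x", 2),
    ("y", "x", "+", 3),
    ("z", "y", "*", "n"),   # n is unknown, so z is not a constant
    ("w", "y", "*", "x"),
])
constants = propagate_constants(client_program)
```

Here the developer's code never sees the client's tuples; the adapter is the only part the client has to write.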
Unfortunately, there is no general solution to this problem. As a last
resort, the client can decide to implement their own CP module.
There is no magic here. We will just have to start packaging and using
modules and learn from our mistakes.
Here are my plans for the BSCT,
- Implement the BOCI described above.
- Carefully carve the existing BSCT implementation into modules that
are built on top of the BOCI.
- Reconstitute the interfaces of these modules so that they use
abstractions appropriate for their domain. This is in contrast to what
is done now, which is to use the BMF AST as the interface for all
modules.
- Shop around for dense optimizations that can be used in place of, or in
conjunction with, the existing dense optimizations.
- Shop around for a different input language for the Bernoulli Sparse
Compiler (BSC). C? Fortran? Who knows... Once I have selected a
front-end, I will port it to the BOCI so that it will be available for
other group members to use.
The net gain from this plan will be that,
- Compared with the current implementation, which is a single monolithic
system, the new BSCT will truly be a toolkit of loosely coupled modules
that can be used within other compiler systems.
- By selecting a better input language, the BSC will be much more usable
and accessible.
Suppose that Vijay discovers that he needs certain dense optimizations to
be performed by his compiler.
Perhaps the dense optimizations that I have found to use in my
reconstituted BSCT meet his needs. If he ports the Matlab front-end to
BOCI, then he can use those optimizations (and I can use his front-end).
Perhaps these optimizations don't meet his needs because the interface for
program representation is wrong. In this case, there are several options,
- He and I can work together to reformulate the interface of the
existing dense optimizations so that it will meet both of our needs.
- He and I can develop an interface that works for both of us, and
then we develop an implementation for that interface.
The net gains in this scenario are as follows,
- By porting the Matlab front-end to BOCI, Vijay has made it easily
accessible to everyone else in the group.
- By sharing a single implementation of the dense optimizations, he and
I split the work of maintaining it. By pooling our resources and
spreading the work, we will each end up with a set of dense optimizations
that are more complete and robust than would have been the case if we had
written our own individual implementations.
Here is what I see as being required of us to establish a common framework
for compiler development:
- We need to establish some low level mechanisms for module
interoperability. I propose (and will implement) the BOCI to meet this
need.
- Each of us needs to make a commitment to develop and maintain modules
that will be used by other group members.
- Each of us needs to make a commitment to, whenever possible, use the
modules that have been developed within the group.
Once the low level interoperability mechanisms are sorted out, the
difficult work lies in specifying and then iteratively improving the
interfaces of the modules that we develop.
Paul Stodghill
1998-10-23