

Current Research
My research centers on developing compiler technology for high-performance
systems. My recent work falls broadly into three categories: generic
programming with sparse matrices, memory hierarchy optimizations, and
program analysis. Currently I am considering numerical applications;
however, many of the techniques are applicable in other areas such as
databases and multimedia.
Generic programming for sparse matrix computations:
The main focus of my dissertation is a novel generic programming system
for simplifying the job of writing computational science codes that use
sparse matrices. What distinguishes this system from traditional generic
programming systems is that it separates the API used for writing generic
programs from the API used to describe data structures. Restructuring compiler
technology is used to "instantiate" generic programs into efficient sparse
matrix programs.

APIs: No single API is appropriate for generic programming with
sparse matrices. A high-level API hides the details of compressed formats
from the compiler, resulting in poor performance, while a low-level API is
not suitable for writing generic programs. In the Bernoulli generic
programming system, generic programs are dense matrix programs; i.e., the
programmer API is that of a random-access data structure. A low-level API
describes sparse matrices as indexed-sequential-access data structures.
The design of the two APIs is explained in detail in this
paper. More examples of compressed formats are available here.

Restructuring compiler: Efficient sparse matrix code has the following
properties:

It is organized as a data-centric computation that enumerates the
nonzero elements of sparse matrices and performs computations with these
elements as they are enumerated.

It enumerates multiple sparse matrices together. There are a number of
ways of performing such common enumerations, which are closely related
to join strategies in database systems such as merge-join and hash-join.

It preserves the semantics of the original dense matrix code. The technology
used for restructuring imperfectly-nested codes is similar to the one developed
for locality enhancement.
The restructuring compiler technology for instantiating generic programs
into efficient sparse matrix programs is described in this
paper.

C++ implementation: The Bernoulli Generic Matrix Library (BGML)
is the C++ implementation of the sparse matrix abstraction. It serves as
the "glue" between the APIs and the restructuring compiler. This
paper discusses the implementation techniques used to get the most
performance from the BGML.
Exploiting memory hierarchies:
The memory systems of modern computers are organized as a hierarchy
in which the latency of memory accesses increases by roughly an order of
magnitude from one level of the hierarchy to the next. Therefore, a program
runs well only if most of its data accesses are satisfied by the faster
levels of the memory hierarchy.

Locality optimizations reduce
memory latency by restructuring codes to enhance locality of reference.
Restructuring of perfectly-nested loops (where all assignment statements
are contained in the innermost loop) is well understood. In practice, however,
most loop nests are imperfectly-nested.
This
paper develops a theory of imperfectly-nested loop transformations
for locality enhancement, and
this paper
presents a systematic approach to tiling imperfectly-nested loops.

Prefetching is a technique for tolerating memory latency by moving
data close to the processor before it is actually needed. While interning
at HP Labs Cambridge, I studied the data cache performance of SpecInt95
and other integer benchmarks on an embedded VLIW processor with a very complex
memory hierarchy. The results of this study and the potential for prefetching
are discussed here.
Program analysis:
Before locality optimizations can be performed, the source
program must be analyzed to ensure that the proposed transformation preserves
the semantics of the program. The most commonly used analysis technique
is dependence analysis. Whenever that analysis is inexact, the compiler
must be conservative and assume that restructuring is illegal. Fractal
symbolic analysis is a new analysis technique that restructuring
compilers can use to verify the legality of program transformations. It combines the
power of symbolic program analysis with the tractability of dependence
analysis, and it permits restructuring of far
more complex codes than current technology can handle.
Other Work
Watermarking MPEG / JPEG
is a very interesting project I worked on for my Multimedia class at Cornell.
The project involved the design, implementation, and testing of a JPEG
codec with built-in digital watermarking. The watermarking was based on
the idea of using the round-off error during the quantization stage of
DCT compression, and did not require any changes to the standard JPEG decoder.
Error-correcting codes were used to improve the robustness of the watermark.
The method was demonstrated to work very well against common signal processing
and geometric distortion attacks.

Error-correcting codes: For my M.S.
thesis at Sofia University, I studied the covering radii of binary
linear error-correcting codes. I computed the exact minimal covering radii
t[n,k] for all self-complementary codes of dimension k < 7, as well as
lower and upper bounds on t[n,k] for self-complementary codes of length
n < 65. I also developed an efficient algorithm for computing the weight
distribution of coset leaders of binary linear codes with repeated coordinates.
