
Our research has led to several algorithmic breakthroughs in the area of program analysis. These algorithms are being used widely in industry and academia.
APT: A Data Structure for Optimal Control Dependence Computation
Keshav Pingali and Gianfranco Bilardi,
SIGPLAN '95: Conference on Programming Language Design and Implementation,
June 1995, San Diego, CA.
The Program Structure Tree: Computing Control Regions in Linear Time
Richard Johnson, David Pearson and Keshav Pingali,
SIGPLAN '94: Conference on Programming Language Design and Implementation,
June 1994, Orlando, FL.
Dependence Based Program Analysis
Richard Johnson and Keshav Pingali,
SIGPLAN '93: Conference on Programming Language Design and Implementation,
June 1993, Albuquerque, NM.
Dependence Flow Graphs: An Algebraic Approach to Program Dependencies
Keshav Pingali, Micah Beck, Richard Johnson, Mayan Moudgill, Paul Stodghill
POPL '91: 18th Symposium on Principles of Programming Languages,
January 1991, Orlando, FL.
Optimal Control Dependence and the Roman Chariots Problem
Journal paper in preparation
From Control Flow to dataflow
Micah Beck, Richard Johnson, Keshav Pingali
Journal of Parallel and Distributed Computing,
Volume 12, 1991.
Available as Cornell CS TR89-1050
Efficient Program Analysis Using Dependence Flow Graphs
Richard Johnson,
November 1994, PhD Thesis, Cornell University.
Available as TR94-1464
Many numerical routines like matrix multiplication and matrix vector product contain nested loops which can be permuted arbitrarily without changing the output. However, the performance of the different permutations can vary quite dramatically. For example, on a parallel machine, it is desirable to move parallel loops into the outermost position in loop nests.
Loop permutation must usually be combined with other loop transformations like skewing, reversal and scaling for maximum advantage. We have invented a loop transformation framework based on integer non-singular matrices. This framework synthesizes desirable sequences of loop transformations to enahnce parallelism and locality of reference. These tools are implemented in the LAMBDA loop transformation toolkit.
A paper on LAMBDA won the best paper award at ASPLOS V. LAMBDA has been adopted by a major workstation manufacturer as the loop transformation technology for its entire compiler product line.
Access Normalization: Loop Restructuring for NUMA Compilers
Wei Li and Keshav Pingali
ASPLOS V: 5th International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS V)
September 1992, Boston, MA
An
extended version appeared in ACM Transactions on Computer Systems,
November 1993
A Singular Loop Transformation Framework Based on Nonsingular Matrices
Wei Li and Keshav Pingali
International Journal of Parallel Processing,
Volume 22, Number 2, April 1994.
Compiling for NUMA Parallel Machines
Wei Li,
November 1994, PhD Thesis, Cornell University.
Available as TR94-1469
Data and computation alignment is an important part of compiling sequential programs for architectures with non-uniform memory access. In HPF, alignment directives are provided by the programmer. We have developed a framework for computing alignment directives automatically. In this framework, the alignment problem is reduced to the standard linear algebra problem of finding a basis for the null space of a matrix, which can be solved using standard techniques for integer Gaussian elimination. We also solve the problem of replicating read-only data to eliminate communication. Our matrix-based approach leads to algorithms which are simpler and faster than existing algorithms for the alignment problem.
Solving Alignment Using Elementary Linear Algebra
David Bau, Vladimir Kotlyar, Induprakas Kodukula, Keshav Pingali,Paul Stodghill
Proceedings of the 7th Annual Workshop on Languages and Compilers for
Parallel Computers (LCPC),
Springer-Verlag Lecture Notes in Computer Science Number ??,
August 1994, Ithaca, NY.
The conjugate gradient (CG) method is a popular Krylov space method for solving systems of linear equations of the form Ax = b, where A is a symmetric positive-definite matrix. In this paper, we show how restructuring compiler technology can be applied to parallelize the CG program, exploiting the sparsity of A to reduce computation and communication requirements. On the IBM SP-2, the performance of our compiled code is comparable to that of handwritten library code.
Automatic Parallelization of the Conjugate Gradient Algorithm
Vladimir Kotlyar, Keshav Pingali,Paul Stodghill
Proceedings of the 8th Annual Workshop on Languages and Compilers for
Parallel Computers (LCPC),
Springer-Verlag Lecture Notes in Computer Science Number ??,
August 1995, Columbus, OH.
We have developed a technique to software pipeline a loop for minimal register pressure without sacrificing the loop's minimum execution time. This novel bidirectional slack-scheduling method has been implemented in a Fortran compiler and tested on many scientific benchmarks. The empirical results---when measured against an absolute lower bound on execution time, and against a novel schedule-independent absolute lower bound on register pressure---indicate near-optimal performance.
Lifetime-Sensitive Modulo Scheduling
Richard Huff
SIGPLAN '93: Conference on Programming Language Design and Implementation,
June 1993, Albuquerque, NM.
We have designed a novel mechanism that implements register renaming, dynamic speculation and precise interrupts. Renaming of registers is performed during the instruction fetch stage instead of the decode stage, and the mechanism is designed to operate in parallel with the tag match logic used by most cache designs. It is estimated that the critical path of the mechanism requires approximately the same number of logic levels as the tag match logic, and therefore should not impact cycle time.
Register Renaming and Dynamic Speculation: an Alternative Approach
Mayan Moudgill, Keshav Pingali, Stamatis Vassiliadis
Proceedings of the 26th International Symposium on Microarchitecture,
December 1993, Austin, TX.
Implementing and Exploiting Static Speculation on Multiple Instruction Issue Processors
Mayan Moudgill,
August 1994, PhD Thesis, Cornell University.
Available as Cornell CS TR94-1440
We have developed a new technique for compiled zero delay logic simulation which partitions the circuit into fanout free regions (FFRs), transforms each region into a linear sized BDD (binary decision diagram), and converts each BDD into executable code. Our approach gets some of the advantages of oblivious as well as demand-driven evaluation. On standard benchmarks, we observed a performance improvement of upto 67%.
Fast Compiled Logic Simulation Using Linear BDDs
Sudeep Gupta and Keshav Pingali,
June 1995, TR95-1522, Cornell University.
The addition of logic variables to functional languages gives the programmer novel and powerful tools such as incremental definition of data structures through constraint intersection. A number of such `hybrid' languages, like FGL + LV, Id and Qute, have been implemented and are in active use. Pure functional and logic programming languages can be given elegant abstract semantics as functions and relations over values. We use the notion of closure operators on a Scott domain to give elegant semantics to such languages. For first-order languages, we obtain a full abstraction result.
A Fully Abstract Semantics for a Functional Langauge with Logic Variables,
Radha Jagadeesan,Keshav Pingali,Prakash Panagaden
ACM Transactions on Programming Languages and Systems (TOPLAS),
Volume 13, Number 4, October 1991.
Available as Cornell CS TR89-969.
Lazy Evaluation and the Logic Variable
Keshav Pingali
Book Chapter, Research Topics in Functional Languages,
Edited by David Turner, Addison-Wesley Publishers.
Available as Cornell CS TR87-877.
Accumulators: New Logic Variable Abstractions for Functional Languages
Keshav Pingali, Kattamuri Ekanadham
Theoretical Computer Science,
Volume 81, 1991.
Available as Cornell CS TR88-952.
I-structures: Data Structures for Parallel Computing
Arvind, Rishiyur Nikhil, Keshav Pingali
ACM Transactions on Programming Languages and Systems,
Volume 11, Number 4, October 1989.
Available as Cornell CS TR87-810.
Abstract Semantics for a Higher-Order Functional Language with Logic Variables
Radha Jagadeesan and Keshav Pingali
POPL 92: 19th Symposium on Principles of Programming Languages,
January 1992, Albuquerque, NM.
Available as Cornell CS TR91-1220.
Investigations into Abstractions and Concurrency
Radha Jagadeesan
August 1991, PhD Thesis, Cornell University.
Available as Cornell CS TR91-1231.