Research
My research focuses on the areas of programming languages and compiler construction. I have interest in the design and use of program analyses for program understanding, verification, and transformation. I have done some work on the following topics:
- Compile-Time Memory Management
- Static Memory-Leak Detection for C programs
- Shape Analysis
- Building Lightweight Method Summaries
Compile-Time Memory Management
My main line of research has been studying the use of static analyses to provide compile-time automatic memory management in Java and to detect memory errors in Java programs extended with explicit deallocation. In particular, I have worked in the following two projects:
Jfree: Compile-time deallocation of individual objects
This work focuses on extending Java with explicit deallocation via free statements. The idea is to detect, via static analysis, when the last reference to an object is removed. At that point we can safely reclaim the object by inserting a free statement.
We have developed two systems that use this approach to compile-time memory management:
Jreg: Support for region-based memory management in Java
- Our first approach used a powerful shape analysis technique(CR:ISMM06). This technique essentially models a static reference-counting garbage collector: we model what are the references to an object and detect when references to the object are lost. At that point we can reclaim the object. Moreover, we can deallocate non-recursive structures by feeding the analysis results to a new iteration of the analysis. Overall, the technique is powerful and can scale to the SPECjvm98 benchmarks. On average it can reclaim about 54% of the memory in these programs. But,
- It's a whole-program analysis, which makes the technique expensive, and
- It can't deallocate recursive structures.
- Our second approach was to develop a tool that identifies unique variables and fields in programs(CR:ISMM07). When the unique reference is dead, it's value becomes unreachable and can be reclaimed. This approach was successful in that:
- It can reclaim recursive structures when fields in the structures are unique. This is done by adding "destructors" to Java classes.
- It avoided a whole-program analysis by using precomputed method summaries (CR:CC07).
- It can achieve similar savings than the previous approach, but its modular design improved the performance by an order of magnitude.
We provide mechanisms to support for region-based memory management in Java. The system groups objects with similar lifetimes together into regions, and deallocates all the objects in a region at once. This project has focused on:
More details about both Jreg and Jfree, as well as software releases, can be found on the F-Rex project web-page.
- The use of static analyses and transformations to automatically identify regions and object lifetimes (CR:ISMM04).
- Bytecode extensions for regions and memory de-allocation.
- Extensions to have run-time support for regions in virtual machines.
- Verifying memory safety of region-annotated bytecodes (CR:BYTE05).
Static Memory-Leak Detection for C programs
I have recently worked on FastCheck (CPR:PLDI07), a static analysis tool for detecting memory leaks in C programs. FastCheck tracks the flow of values from allocation points to deallocation points using a sparse representation of the program. Our sparse representation consists of a guarded value-flow graph. The graph captures def-use relations and value flows via program assignments. Edges in the graph are annotated with guards that describe branch conditions in the program. The value-flow represented by an edge can occur when the branches indicated by the edge's guard are taken.The memory leak analysis is reduced to a reachability problem over the guarded value-flow graph. Given an allocation, we first find if the allocated value flows to any free statement. If so, we use the guards to reason about the branches that must be taken for a deallocation to happen. With the aid of a SAT solver, FastCheck can determine when a combination of branch decisions leads to no deallocation and produces a memory leak. The sparse program representation allows FastCheck to focus on the portions of the program relevant to the leak problem. This makes FastCheck efficient in practice, and allows it to report concise error messages. We have run FastCheck on the SPEC 2000 benchmarks and on two open-source applications, bash and sshd, and found around 60 memory leaks. FastCheck prioritizes the warnings it generates and achieves a false positive rate below 20% for these programs.
More details and software releases will be available directly at the FastCheck webpage.
Shape Analysis
Maintaining Structural Invariants in Shape Analysis with Local ReasoningWe study a shape analysis algorithm using local reasoning to analyze heap structures with structural invariants, such as doubly-linked lists (CR:VMCAI07). Our algorithm abstracts and analyzes one single heap cell at a time. In order to maintain the structural invariants, the analysis uses a local heap abstraction that models the sub-heap consisting of one cell and its immediate neighbors. The proposed algorithm can successfully analyze standard doubly-linked list manipulations. We include a soundness proof in a longer version of our paper (CR:TR06).
Lightweight Method Summaries
When program analyses reason about the potential effects of invoking methods, they typically choose one of the following two approaches: either perform an expensive inter-procedural analysis; or use a worst-case approximation of the possible method effects. The former affects the scalability and modularity of the analysis, whereas the latter can affect its precision. A middle ground is to summarize method behavior using method effect summaries, thus avoiding the costly inter-procedural computations or imprecise assumptions, at the expense of requiring method summaries to be provided from an external source.We have built a k-limited escape and effect analysis system that allows us to generate lightweight method summaries(CR:CC07). We have been able to use our technique to generate summaries for the entire GNU Classpath libraries (browse these summaries in our web interface). In our experience the computed summaries can help analysis clients to approximate the effects of methods calls and avoid worst-case assumptions.