One of last year's CS612 projects (described here) was implement a method described by Chatterjee, et al. for exactly counting the number of cache misses that would result from executing the loop nest. This method works by analyzing a loop nest and constructing Presberger formula, a subset of the first order logic, whose solutions correspond to cache misses that will occur during execution. These formulae are simplified using Omega and then the solutions are counted, possibly using a package called PolyLib.
The problem that we found with this method was that, when applied to 2- or 4-way set associative caches, the amount of time required to count the solutions for a single loop nest became quite large. The students who did the project came to the conclusion that it was not practical to implement this method in a restructuring compiler.
However, we do not always need to know the exact number of cache misses in order to do program transformations. When we reasoned about cache locality in class, we used reuse distance to predict cache behavior. The larger the reuse distance between two reference, the more likely it is that the line will no longer be in cache when it accessed a second time.
"Reuse distance" is a simpler idea than "cache miss." In particular, in order compute the reuse distance between two reference, we do not need to know the size of the cache or its associativity. The only cache parameter required is the line size. By eliminating cache size and associativity from the calculation, it is possible that computing reuse distance exactly is more feasible.
The goals of the project is to develop a method for using Presberger formulae for exactly computing the reuse distance between references in a simple loop nest.