Tiling is one of the most important locality enhancement techniques for loop-nests since it permits the exploitation of data reuse in multiple loops in a loop-nest.
An important parameter for tiling is the size of the tiles. If there are n loops, then there are n parameters T1, T2, ... , Tn to be determined one for each loop. These parameters affect the performance of the resulting code. Tiling with these parameters set to 1, for example, is equivalent to the original code with no tiling. Reuse improves as the tile-size increases, as long as the data used in the tile resides in the cache. Tiling with very large tile-sizes also does not have improve the performance as the data will not fit into cache. Clearly there is an optimal tile-size depending on the characteristics of the cache being used and the actual code being tiled.
There has been some research in this area which attempts to determine the optimal tile-size. The fundamental idea in these attempts is to ensure that the data used by a tile fits into cache. If we assume a fully-associative cache with an optimal replacement policy, this requires that the amount of data touched by a tile should be less than the cache size. We would want the largest tile size that satisfies this condition. This condition can be used to determine bounds on the tile-size. Unfortunately, caches are not fully associative. As a result, conflicts arise when data used in the same tile map to the same cache locations. Certain forms of conflicts (due to self-interference, that is, interference due to data brought in by the same array reference) can be eliminated by making sure the tile-size chosen avoids interference.
Recent research in our group has shown how to tile imperfectly-nested loops in a similar manner by embedding the statements in a fully permutable iteration space. Once this embedding has been done, the code resembles a perfectly nested loop-nest and can be tiled.
The goals of the project are the following :-
You should look at matrix multiplication, Cholesky and Jacobi for this project. These codes are in the first tech-report.