To illustrate the functionality of this new version, there are three examples, all in the BCC project. These are DependenceAnalysis, TestTile and TestUnimodularTransform. They are passes p3, p4 and p5 respectively in BCC.cs. You can comment out two of them and leave only one to see how each one behaves. The line to comment out is pX.execute(ref e), where X is 3, 4 or 5.

It is easiest if you experiment with the following input:

------------------------------
const int SIZE = 500;

int main(int argc, char* argv[])
{
    int i, j, k;
    for(i=0; i<=SIZE-1; ++i)
        for(j=0; j<=SIZE-1; ++j)
            for(k=0; k<=SIZE-1; ++k)
                C[i][j] += A[i][k] * B[k][j];
    return 1;
}
------------------------------

The result from DependenceAnalysis.cs is:

Parsing (C:\user\BC\BCC\bin\Debug\test4.c)... done!
Ambiguity resolutions: 4 (1 Syntactic + 3 Semantic)
PASS: NodeInitializer... Done!
PASS: ArrayFixup... Done!
PASS: SymbolTableBuilder... Done!
PASS: DependenceAnalysis...
Function: main
  WAW between C[i][j] and C[i][j]: (0, 0, +)
  RAR between A[i][k] and A[i][k]: (0, +, 0)
  RAR between B[k][j] and B[k][j]: (+, 0, 0)
Done!
PASS: Simplifier... Done!
PASS: NodeDeinitializer... Done!
Time: 00:00:01.8626784

The result from TestTile.cs is:

const int SIZE = 500;

int main(int argc, char * argv[])
{
    int i;
    int j;
    int k;
    {
        int _i;
        int _j;
        int _k;
        for (_i = 0; _i <= SIZE - 1; _i += 10)
            for (_j = 0; _j <= SIZE - 1; _j += 10)
                for (_k = 0; _k <= SIZE - 1; _k += 10)
                    for (i = _i; i <= min(_i + 10, SIZE - 1); i += 1)
                        for (j = _j; j <= min(_j + 10, SIZE - 1); j += 1)
                            for (k = _k; k <= min(_k + 10, SIZE - 1); k += 1)
                                C[i][j] += A[i][k] * B[k][j];
    }
    return 1;
}

and finally the result from TestUnimodularTransform.cs is:

const int SIZE = 500;

int main(int argc, char * argv[])
{
    int i;
    int j;
    int k;
    {
        int _i;
        int _j;
        int _k;
        for (_i = 0; _i <= -1 + SIZE; _i += 1)
            for (_j = 0; _j <= -1 + SIZE; _j += 1)
                for (_k = 2 * _i; _k <= -1 + 2 * _i + SIZE; _k += 1)
                {
                    i = -2 * _i + _k;
                    j = _j;
                    k = _i;
                    C[i][j] += A[i][k] * B[k][j];
                }
    }
    return 1;
}

As you might have already guessed, these three examples do the following:

1. DependenceAnalysis.cs is a model solution to HW2. It also has reuse vectors. You can use this for HW3.
2. TestTile.cs finds a loopnest inside a function and tiles it with square tiles of size 10.
3. TestUnimodularTransform.cs finds a 3-deep loopnest inside a function and transforms it with the unimodular matrix [[1 0 0][0 1 0][2 0 1]].

As you can tell from the source of 2 and 3, they are just examples, not general purpose routines. They are not guaranteed to work on anything but the input described above. They should serve as an example of what you might need to do as part of HW3.

There are a bunch of interesting changes to BCCK with a lot of new functionality:

1. Omega is handled much better
   a. You can walk the DNF
   b. You can use && and || in formulas, so no excessive parenthesizing is needed anymore...
   c. You can reuse the variables you define... (a favourite trouble in HW2).
   d. A lot of useful relations are implemented for you
   e. Translation of native AST expressions to/from Omega expressions is implemented
2. Passes work a bit differently.
   a. Inside VisitXXX the first parameter is of type AbstractNode and is passed by reference. This lets you change the node you are visiting to something completely different.
   b. A simplifier pass has been implemented to optimize expressions like 0*x, 1*x, x+0 and so on. These are generated a lot in unimodular transforms...
3. Integer Linear Programming Stuff is implemented
   a. A very powerful Matrix class
   b. Null spaces, Kernels, ColumnEchelonForm, ...
   c. ...

You can also run DependenceAnalysis.cs on the example input for HW2:

void test1 (int N)
{
    int i, j, k;
    for(i=0; i<=N-1; ++i)
        for(j=0; j<=N-1; ++j)
            for(k=0; k<=N-1; ++k)
                A[i][k] = B[i][j] * C[j][k];
}

void test2 (int N)
{
    int i, j;
    for(i=0; i<=N-1; ++i)
        for(j=i; j<=N-1; ++j)
            A[j] = A[2*i+1];
}

void test3 (int N)
{
    int i, j, k;
    for(i=0; i<=N-1; ++i)
        for(j=0; j<=i-1; ++j)
            for(k=i; k<=N-1; ++k)
                A[i][j] = A[i][j-4] * A[k][i-1];
}

void test4 (int N)
{
    int i;
    for(i=0; i<=100-1; ++i)
        x[i+100] = 2*x[i];
}

void test5 (int N)
{
    int i;
    for(i=0; i<=100-1; ++i)
        x[i+100] = 2*x[i+1];
}

The results are:

Parsing (C:\user\BC\BCC\bin\Debug\test2.c)... done!
Ambiguity resolutions: 20 (5 Syntactic + 15 Semantic)
PASS: NodeInitializer... Done!
PASS: ArrayFixup... Done!
PASS: SymbolTableBuilder... Done!
PASS: DependenceAnalysis...
Function: test1
  WAW between A[i][k] and A[i][k]: (0, +, 0)
  RAR between B[i][j] and B[i][j]: (0, 0, +)
  RAR between C[j][k] and C[j][k]: (+, 0, 0)
Function: test2
  WAW between A[j] and A[j]: (+, 0)
  WAR between A[j] and A[2 * i + 1]: (+, *) (0, *)
  RAW between A[2 * i + 1] and A[j]: (+, *) (0, *)
  RAR between A[2 * i + 1] and A[2 * i + 1]: (0, +)
Function: test3
  WAW between A[i][j] and A[i][j]: (0, 0, +)
  WAR between A[i][j] and A[i][j - 4]: (0, 4, *)
  WAR between A[i][j] and A[k][i - 1]: (0, 0, 0)
  RAW between A[i][j - 4] and A[i][j]:
  RAR between A[i][j - 4] and A[i][j - 4]: (0, 0, +)
  RAR between A[i][j - 4] and A[k][i - 1]:
  RAW between A[k][i - 1] and A[i][j]: (0, +, *) (0, 0, *) (+, *, *)
  RAR between A[k][i - 1] and A[i][j - 4]: (+, *, *)
  RAR between A[k][i - 1] and A[k][i - 1]: (0, +, 0)
Function: test4
  WAW between x[i + 100] and x[i + 100]:
  WAR between x[i + 100] and x[i]:
  RAW between x[i] and x[i + 100]:
  RAR between x[i] and x[i]:
Function: test5
  WAW between x[i + 100] and x[i + 100]:
  WAR between x[i + 100] and x[i + 1]: (99)
  RAW between x[i + 1] and x[i + 100]:
  RAR between x[i + 1] and x[i + 1]:
Done!
PASS: Simplifier... Done!
PASS: NodeDeinitializer... Done!
Time: 00:00:02.7139024

Last but VERY IMPORTANT:

All the tools and examples so far require your For loops to be Normalized. A normalized for loop satisfies the following conditions:

it looks like:

    for ([v] = [lb]; [v] <= [ub]; [v]++)
        body

where
    [v] is a namedexpression (an identifier)
    [lb] and [ub] are affine expressions

Note that some of the inputs provided with the problem statements do not meet this requirement. Change your loops so that they do!

Happy programming!
Kamen
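
P.S. The loop normalization described above can be sketched in plain C as follows. This is only an illustration and is not part of BCC: the Trace type, trace_push, trace_equal and the four loop functions are hypothetical names made up for this sketch. It shows two common rewrites, assuming the loop body does not modify the index: a strict bound (i < n) becomes an inclusive one (i <= n - 1), and a counting-down loop gets a fresh counter t that counts up, with the old index recomputed as an affine function of t.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_MAX 64

/* Hypothetical helper type: records loop index values in visiting order. */
typedef struct {
    int values[TRACE_MAX];
    size_t count;
} Trace;

void trace_push(Trace *t, int v)
{
    assert(t->count < TRACE_MAX);
    t->values[t->count++] = v;
}

/* Returns 1 if both traces hold the same values in the same order. */
int trace_equal(const Trace *a, const Trace *b)
{
    size_t n;
    if (a->count != b->count)
        return 0;
    for (n = 0; n < a->count; n++)
        if (a->values[n] != b->values[n])
            return 0;
    return 1;
}

/* Not normalized: strict upper bound (i < n). */
void strict_original(int n, Trace *out)
{
    int i;
    out->count = 0;
    for (i = 0; i < n; i++)
        trace_push(out, i);
}

/* Normalized: i < n rewritten as i <= n - 1. */
void strict_normalized(int n, Trace *out)
{
    int i;
    out->count = 0;
    for (i = 0; i <= n - 1; i++)
        trace_push(out, i);
}

/* Not normalized: the index counts down. */
void reversed_original(int n, Trace *out)
{
    int i;
    out->count = 0;
    for (i = n - 1; i >= 0; i--)
        trace_push(out, i);
}

/* Normalized: a fresh counter t counts up from 0 to n - 1,
   and the original index is recovered as i = n - 1 - t. */
void reversed_normalized(int n, Trace *out)
{
    int i, t;
    out->count = 0;
    for (t = 0; t <= n - 1; t++) {
        i = n - 1 - t;
        trace_push(out, i);
    }
}
```

Calling the original and the normalized version of each pair with the same n and comparing the two traces with trace_equal is a quick way to convince yourself that a rewrite preserves the iteration order before you feed the loop to the passes.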