CS 4120: Introduction to Compilers
Fall 2011

Changes:

Programming Assignment 6: Optimization

Due: Wednesday, November 23

The goal of this assignment is to improve the quality of your code by implementing high-quality register allocation and various optimizations in your compiler. In addition to code, your compiler will generate CFG diagrams showing the effects of various optimizations. You will also submit test programs that show off the effectiveness of your optimizer. Using various benchmark programs, we will compare the performance of your compiler against the compilers produced by other groups, and the compiler that has the most effective optimizations will earn a bonus.

Required optimizations

For this assignment, there are three optimizations that you are required to implement:

Register allocation (reg)
We expect you to implement register allocation with move coalescing. You must use a live variables analysis, enabling reuse of the same register for multiple variables. You will need to handle spilling, and you are required to implement move coalescing. Move coalescing will also make it easier to debug other optimizations because it eliminates unnecessary temporaries.
Unreachable code elimination (uce)
We expect unreachable code to be eliminated. This should work in conjunction with constant folding (controlled by the Ocf flag), and, if you implement it, constant propagation. One option is to do this using conditional constant propagation.
Common subexpression elimination (cse).
This is a global optimization, so it will require an available expressions analysis.

For each optimization, you should think carefully about what program representation it is best done on. We suggest working through some example programs to convince yourself that you have the right analysis and program transformations worked out before doing implementation.

Additional optimizations

Groups taking 4121 are required to implement at least one of the following optimizations; groups taking 5121 are required to implement at least two:

If there is some other optimization you would like to do in place of the optimizations on this list, consult the course staff.

Control-flow graphs

To be able to optimize code successfully, your compiler must be able to construct control-flow graphs for IR, and it must be able to flatten CFGs back into inline code for code generation purposes.

The compiler must also be able to generate displayable versions of CFGs in dot format, allowing easily visualization with Graphviz or other tools that accept this format. You will find this capability useful for debugging your optimizations, so get it working early on!

Interface

Your driver must support at least the following command-line interface:

java -jar <YOUR_JAR> <OPTIONS> <SOURCE_FILE>

Options

+O<opt>
Enable optimization <opt>. If this kind of option is used, all other optimizations are off by default unless otherwise enabled. The following optimization names will be standard, though your compiler probably will not implement all or even most of them:
reg
Register allocation
cse
Common subexpression elimination
dce
Dead code elimination
uce
Unreachable code elimination
cf
Constant folding
cp
Constant propagation
copy
Copy propagation
alg
Algebraic optimizations (identities and reassociation)
vn
Local value numbering
licm
Loop-invariant code motion
mc
Move coalescing
pre
Partial redundancy elimination
lu
Loop unrolling
inl
Inlining
-O
Disable all optimizations.
-O<opt>
Disable optimization <opt>. Other optimizations are on by default, unless a +O option is used.
-o OUTPUT_FILE
Specify the file to write the output to. If this option is not specified, and the file name SOURCE_FILE ends with .i9, replace that with .s. If SOURCE_FILE has a different extension, append .s instead.
--dump_ast <phase>
Print the abstract syntax tree for the program, as in previous assignments. The argument <phase> specifies at what point in program transformation the AST should be printed. The phases initial and final, must be understood, where final indicates after all optimizations (if any) are complete. You may add additional phases if you wish, but you should document them.
--dump_ir <phase>
Print the IR for the program, as in previous assignments. The argument The argument <phase> works in the same way as for --dump_ast.
--dump_cfg <phase>
Print out the control-flow graph for the program in dot format. The argument <phase> works in the same way as for the previous two options.

As in previous assignments, your driver should accept the file name of an Iota9 source file. If the file is invalid Iota (syntactic or semantic), you should report a helpful error message including the position of the invalid code and a description of the problem. If the program is valid, you should compile it, and output the assembly to a file, with the name determined as given above in the description of the -o option.

For example, java -jar acm22_ahj5_mo85.jar +Oreg hello.i9 should output the assembly code the compiler produces for the program hello.i9 into hello.s, using register allocation but no other optimizations.

You may add additional flags and options as long as you support these requirements.

Evaluating performance

Correctness is still essential, but now we also care about performance. We will evaluate the effectiveness of your compilers on a variety of small benchmark programs. The group whose compiler generates the best code will earn a bonus. Compilers that generate broken code will be penalized. The real bragging rights will be earned after PA7, when we will have a final performance contest with a wider variety of benchmark programs.

Submitting benchmarks

For fun and good karma, you are encouraged to submit programs that you think are good performance benchmarks.

Each benchmark test case will be a valid Xi source file. It should conform to the format restrictions described in the previous programming assignment description for functional test cases.

We encourage each group to submit up to three short benchmark programs. Designing these benchmarks early is likely to help you develop more effective optimizations. We suggest you aim for benchmarks that take between 1 and 5 seconds unoptimized. Very short time intervals are hard to measure reliably; and long-running benchmarks will make your performance testing runs too time-consuming. Where possible, your benchmarks should also produce verifiable results, and ideally, self-verify.

If we do use your benchmarks to evaluate compiler projects, we are likely to change them, so don't try to specialize your compilers to recognize your own benchmarks.

Building on PA5

As before, you are building upon your work from Programming Assignment 5. The protocol is the same as in prior assignments: you are required to develop and implement tests that expose any problems with your implementation, and then fix the problems.

Submission instructions

As in previous assignments, please ensure that all your Java code lives in a package containing the NetId of at least one of your group members. Your code should compile without errors or warnings. Your code should not contain any superfluous print or debug statements.

Submit the following: