# Lesson 3: Local Analysis & Optimization

## Gist

### Intro & Simple DCE Passes

Contrasting local vs. global vs. interprocedural analysis.

Then, our first optimizations!

• Definition of dead code elimination (DCE).
• Globally unused instructions.
• Derive an algorithm for deleting them.
• Iterating to convergence.
• Then implement it.
• Locally killed instructions.
• The limits of local reasoning: why can’t we do this globally?
• Derive an algorithm for removing them.
• Then implement that too.
• Let’s try our new optimization pass out on the Bril benchmarks.
• When working on your implementations, try them on `simple.bril`, `reassign.bril`, and other examples in the DCE test directory to see if they actually work.

• First, combine all your implementations into one command somehow (including iterating to convergence). Try out your pass—something like this:

``````\$ bril2json < bench.bril | python3 tdce.py | bril2txt
``````
• Next, try using `wc` to check static code size differences:

``````\$ bril2json < bench.bril | wc -l
\$ bril2json < bench.bril | python3 tdce.py | wc -l
``````
• Then profiling to measure dynamic instruction count:

``````\$ bril2json < bench.bril | brili -p 5
\$ bril2json < bench.bril | python3 tdce.py | brili -p 5
``````

You can see my implementation in `tdce.py` in the examples directory in the Bril repository.

### Local Value Numbering

• Local value numbering.
• Consider the common thread between dead code elimination (DCE), copy propagation, and common subexpression elimination. In some compilers classes/textbooks, these are all individual optimizations.
• Value numbering is a general framework for understanding & optimizing computations.
• If you can deeply understand the mystical metaphysics of value numbering, you will have gotten most of what you need to get out of this part of 6120.
• Extending LVN.
• LVN can subsume constant folding, copy propagation, and algebraic identities. You will need to extend it with language semantics.
• Write complete pseudocode for the base LVN algorithm, and work out where the “extension points” need to be to capture those optimizations.

Here’s the pseudocode from the video:

``````table = mapping from value tuples to canonical variables,
with each row numbered
var2num = mapping from variable names to their current
value numbers (i.e., rows in table)

for instr in block:
value = (instr.op, var2num[instr.args[0]], ...)

if value in table:
# The value has been computed before; reuse it.
num, var = table[value]
replace instr with copy of var

else:
# A newly computed value.
num = fresh value number

dest = instr.dest
if instr will be overwritten later:
dest = fresh variable name
instr.dest = dest
else:
dest = instr.dest

table[value] = num, dest

for a in instr.args:
replace a with table[var2num[a]].var

var2num[instr.dest] = num
``````

You can see my implementation in `lvn.py` in the examples directory in the Bril repository. But seriously, don’t be tempted! You want to write your implementation without looking at mine!

As part of your tasks for this lesson, you will implement your first two optimizations. The two main things you want your optimizations to do are:

1. Not break programs.
2. Make programs faster, most of the time.

As with every task in this class, part of the work is checking that you have done what you set out to do—in this case, that your optimizations do those two things. Think carefully about how to make a convincing case for each of those criteria.

One tempting methodology might be to handwrite a few small test-case Bril programs (or, worse, borrow the woefully inadequate ones sitting around in the Bril git repository), run them through your optimizations, and look at them to check whether they look right. This does not amount to convincing evidence (maybe you can think of a few specific reasons why).

While there are many ways to be convincing, a pretty good way might be to run your optimization on every single available Bril benchmark, systematically check that it still produces the right output for at least one input, and collect aggregate statistics about some metric you’re interested in. This is a nice way to check for unexpected behavior in programs that you didn’t carefully write yourself to test the cases you’re thinking of.

If this is the route you choose, you can do it however you like, I have made a simple tool that you can consider using, called Brench. Brench is not very fancy; it does three things:

1. It makes it easy to run a long list of inputs through several different commands. (For example, you can run a long list of Bril benchmarks through an “interpret” command and an “optimize-and-then-interpret” command.)
2. It checks that all the commands agree on their output. (So, in our use case, it checks that optimizing the benchmark doesn’t change its output when interpreted.)
3. It can collect a statistic from each command for comparison. (Like the number of dynamic instructions the interpreter executed, which is a pretty good metric for standard optimizations.)

Those three things are probably what you want to do to make a convincing case for an optimization’s correctness and effectiveness, whether or not you use Brench. It’s there if you want it, but feel free to go your own way!