Lesson 11: Dynamic Compilers

Gist

Motivation

So far, the compilers you have written in 6120 have been ahead-of-time (AOT) or static compilers: they compile the code to some other form in one phase, and then execution happens later—and has nothing to do with the compiler. Just-in-time (JIT) or dynamic compilers mix together compilation and execution: they translate your code when it's time to run it, so they are intimately involved in program execution.

Why would you want a dynamic compiler? There are two main reasons:

Here are some semi-related terminology rants:

Anatomy of a JIT

Here's what you need to make a basic JIT:

Refining things further, there are two main strategies for JIT compilation: method JITs, which compile one function (method) at a time much as an ahead-of-time compiler would, just lazily; and tracing JITs, which record and compile the particular paths that hot executions actually take through the code.

One final dimension in the JIT design space is tiering. You can think of the interpreter and the compiler in the basic recipe above as different optimization tiers: one that's fast to get started with but slow to execute, and one that's slower to start up but makes actual execution go much faster. To further mitigate the trade-off between startup (compilation) time and steady-state (execution) performance, modern JITs often use many more than two tiers: an interpreter and a whole suite of different compilers, from quick and dirty to slow and awesome. For example, see this fairly old blog post from the JavaScriptCore developers that breaks down the three tiers they had in 2014 and motivated the addition of their fourth tier.
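
To make the tiering idea concrete, here's a toy sketch in Python of counter-based promotion between two tiers. Everything in it (the threshold and the interpret/compile_function hooks) is invented for illustration, and real JITs use far more sophisticated heuristics:

HOT_THRESHOLD = 1000   # made-up tuning knob

call_counts = {}       # function name -> times called so far
compiled = {}          # function name -> faster compiled implementation

def call(name, args, interpret, compile_function):
    """Dispatch a call to the best tier currently available for this function."""
    if name in compiled:
        return compiled[name](*args)                # fast tier
    call_counts[name] = call_counts.get(name, 0) + 1
    if call_counts[name] >= HOT_THRESHOLD:
        # Promote: pay the compilation cost once, benefit on every later call.
        compiled[name] = compile_function(name)
    return interpret(name, args)                    # slow tier, available immediately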

A Bit More on Tracing

As an example, let's try tracing this program:

function main(x) {
  y = x + 1;
  if (x < 100) {
    z = f(y);
  } else {
    z = g(y);
  }
  return z;
}

function f(a) {
  return a - 1;
}

Don't worry about g; we won't use it in this walkthrough.

Imagine that we have decided that the main function is hot enough to warrant compiling. The general JIT idea is that we'll execute it once and emit some code that optimizes for the case that future executions will be similar to the one we observe. In a tracing JIT specifically, the plan is to produce code for the single path through the CFG that the execution takes for that given input, rather than trying to compile the entire function.
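
Before walking through the example by hand, here's a rough sketch of what "execute it once and record what happens" could look like. This is invented Python over a simplified instruction format, not the actual Bril interpreter: ordinary instructions get executed and copied into the trace, while branches get executed but recorded as guards on whichever direction this particular run takes:

def record_trace(instrs, env, execute):
    """Interpret a simplified instruction list while recording a trace.

    Hypothetical sketch: `instrs` is a list of dicts like
    {"op": "add", "dest": "y", "args": ["x", "one"]}, `env` maps variable
    names to values, and `execute` runs one instruction against `env`.
    """
    trace = []
    pc = 0
    while pc < len(instrs):
        instr = instrs[pc]
        if instr["op"] == "br":
            cond = env[instr["args"][0]]
            # Don't record the branch itself; record a guard that future runs
            # take the same side that this run takes.
            trace.append({"op": "guard", "args": instr["args"], "expected": cond})
            pc = instr["then"] if cond else instr["else"]   # made-up target encoding
        elif instr["op"] == "ret":
            trace.append(instr)
            break
        else:
            execute(instr, env)    # run the instruction for real...
            trace.append(instr)    # ...and copy it into the trace
            pc += 1
    return trace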

So say that we run main(42). Then as we trace execution, we will first run:

y = x + 1;

And we start building our trace by emitting exactly the same code. We next need to trace the condition, x < 100. Because the condition is true in this execution, we want to emit a trace that assumes that it will be true—i.e., it only includes the code for the "then" branch of this if. But we also want our emitted trace to check that this assumption still holds. To do this, we'll emit a special guard statement in our trace that "bails out" to deoptimization if the condition does not hold. So our trace after this step looks like:

y = x + 1;
guard(x < 100);

We can continue tracing in the appropriate branch of the if. Let's try an intraprocedural style first—we can just copy and paste the code as we execute it, appending it to the end of the trace:

y = x + 1;
guard(x < 100);
z = f(y);
return z;

This completes the trace from the entry of main. In a proper JIT, we could then compile this straight-line (modulo guards) code to machine instructions. Whenever execution reaches the top of main in the future, we can jump to this code instead of interpreting the function. If the guard fails, we will need to deoptimize—returning to slower, correct execution.
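
For a sense of what that dispatch looks like, here's a hedged Python stand-in for the compiled trace and the fallback logic. The GuardFailure exception and the interpret_main hook are inventions for illustration, not how any particular JIT spells it:

class GuardFailure(Exception):
    """Signals that one of the trace's guards did not hold for this input."""

def f(a):
    return a - 1

def compiled_main_trace(x):
    # Rough stand-in for machine code compiled from the trace above.
    y = x + 1
    if not (x < 100):
        raise GuardFailure()   # guard(x < 100)
    z = f(y)
    return z

def run_main(x, interpret_main):
    """Prefer the compiled trace; deoptimize to the interpreter if a guard fails."""
    try:
        return compiled_main_trace(x)   # fast path recorded from a previous run
    except GuardFailure:
        return interpret_main(x)        # slow but always-correct fallback

Restarting main from the top in the interpreter is only acceptable here because nothing before the guard has externally visible side effects; a real deoptimization mechanism has to reconstruct the interpreter's state at the exact point where the guard failed.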

In this trace, we "stepped over" the invocation of f—but an even more convenient thing to do is to trace interprocedurally, following the control flow through the call. That would produce an even simpler trace like this:

y = x + 1;
guard(x < 100);
a = y;
z = a - 1;
return z;
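
The a = y copy is how the tracer handles the call itself: instead of recording the call instruction, it records copies that bind f's parameter a to the argument y, then keeps tracing inside f's body. In the same invented Python format as the earlier sketch, the call case might look like this (a real tracer would also rename variables so caller and callee names can't collide):

def trace_call(call_instr, callee, trace):
    """Record the effect of a call instead of the call itself (hypothetical sketch)."""
    for param, arg in zip(callee["params"], call_instr["args"]):
        # Bind each parameter to its argument with a plain copy, e.g. a = y.
        trace.append({"op": "id", "dest": param, "args": [arg]})
    return callee["instrs"]   # continue tracing inside the callee's body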

The nice thing about tracing is that it produces straight-line code that can be very amenable to optimization. This code doesn't modify variables (it happens to be in SSA form), for example, so it's safe to reorder things—importantly, we can put the guard up front to fail fast:

guard(x < 100);
y = x + 1;
a = y;
z = a - 1;
return z;

Then a simple LVN pass with canonicalization can clean things up quite a bit:

guard(x < 100);
return x;

So in the case where this trace applies, i.e., x is less than 100, we can eliminate code that would be necessary in a more straightforward compilation.

Tasks

The task is to implement a trace-based speculative optimizer for Bril. You'll implement the same concept as in a tracing JIT, but in a profile-guided AOT setting: profiling, transformation, and execution will be distinct phases. The idea is to implement the "heavy lifting" for a trace-based JIT without needing all the scaffolding that a complete JIT requires, such as on-stack replacement.

Concretely, there are two main phases: a profiling phase, where you modify the reference interpreter to observe an execution and record a trace, and a transformation phase, where you splice the (optimized) trace back into the program using the speculation extension.

Start by reading the documentation for the speculation extension (and watch the video!). That should give you an idea of what's required to augment a program with a speculative execution of an extracted trace. Then make a plan for how you'll hack the interpreter to produce one of those traces.
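
To give a flavor of the transformation phase, here's one possible shape for the trace-splicing pass as Python over Bril's JSON form. The trace file format, the label name, and the assumption that the trace covers main's whole execution up to its return are all illustrative choices, not requirements of the task:

import json
import sys

def stitch_trace(func, trace, ret_var):
    """Prepend a speculative copy of `trace` to a Bril function in JSON form.

    Assumptions: `trace` is a list of Bril instructions whose branches were
    already replaced with `guard` instructions that jump to "trace_bail",
    it contains no `ret`, and `ret_var` names the variable holding the
    traced return value.
    """
    stitched = [{"op": "speculate"}]
    stitched += trace
    stitched += [
        {"op": "commit"},
        {"op": "ret", "args": [ret_var]},   # speculation succeeded
        {"label": "trace_bail"},            # failed guards land here
    ]
    stitched += func["instrs"]              # original body: the always-correct fallback
    func["instrs"] = stitched

if __name__ == "__main__":
    prog = json.load(sys.stdin)             # the Bril program, as JSON
    with open(sys.argv[1]) as fh:
        profile = json.load(fh)             # invented format for the recorded trace
    for func in prog["functions"]:
        if func["name"] == "main":
            stitch_trace(func, profile["trace"], profile["ret_var"])
    json.dump(prog, sys.stdout, indent=2)

The appealing property of this layout is that a failed guard discards everything since the speculate and jumps to trace_bail, where the untouched original body takes over, so the transformed program still behaves correctly on inputs the trace does not cover.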

Here's a recipe: