Lesson 14: Fast Compilers
Gist
- The motivation for fast compilers.
- The motivation in dynamic compilers is probably self-evident.
- In AOT compilers, I personally think compiler performance matters in two regimes:
- For small projects, it affects programmer productivity via the compile-edit-run cycle. (Can you compare how you felt while programming in, say, Scala or C++ versus Go or OCaml?)
- For large projects, compilation time can be a huge resource expenditure. If it takes multiple hours to rebuild your main product, can you think of “business” reasons why a company might care about compiler efficiency?
- The ingredients in compiler performance.
- Asymptotic complexity of the algorithms.
- Host-language efficiency taxes.
- Complexity of the source language’s grammar (people love to focus on this one).
- Data structures.
- Some evidence-free generalities about compiler performance.
- The terrible thing is that compilers tend to have a flat profile: there is no single performance bottleneck. Making a compiler go faster usually requires a huge number of small optimizations.
- See, for example, Nicholas Nethercote’s series of blog posts titled “How to speed up the Rust compiler in…”, which illustrate some of these small wins in a production compiler.
- This is related to why compilers are so often single-threaded, and why there are almost no (but not zero) GPU-accelerated compilers.
- It is therefore a good idea to proactively focus on performance at the design stage (even though premature optimization is famously "the root of all evil").
- Let’s focus on data structures.
- One reason: It’s a cross-cutting way to address performance across a flat profile.
- We will learn about one extremely simple trick that you can apply in many ways. It is definitely not the only ingredient in a fast compiler, nor is it even clearly the most important one. But it is probably an important one.
- Proceed to this blog post about data structure flattening.
- Construct a simple interpreter, and try it out. (A minimal sketch appears after this list.)
- Introduce a flat representation, and adjust the `interp` function accordingly. (The second sketch after this list shows one way.)
- Discuss:
- Why should this change be good for performance?
- Are there any implications for ergonomics?
- A little performance evaluation.
- A random program generator, so we can skip I/O. (A small generator is sketched after this list.)
- I endorse Hyperfine for quick-and-dirty measurements like this: `hyperfine -w1 -r3 './target/release/flatcalc gen_interp 1234'`
- I endorse samply as a simple way to see where the time goes: `samply record --no-open -- ./target/profiling/flatcalc gen_interp 1234`
- Use my `bench.sh` to reproduce a full set of results.
- Consider a different kind of interpreter that exploits the flat representation (the last sketch after this list is one version). Reflect on how we have essentially reinvented the idea of a bytecode interpreter.
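Here is a minimal sketch of the first step: a conventional, pointer-chasing AST for a tiny arithmetic language and a recursive `interp` over it. The `Expr` and `interp` names are illustrative; this is not the actual flatcalc code from the blog post.

```rust
// A tiny expression language with the usual Box-based, pointer-chasing AST.
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// A straightforward recursive interpreter.
fn interp(e: &Expr) -> i64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Add(l, r) => interp(l) + interp(r),
        Expr::Mul(l, r) => interp(l) * interp(r),
    }
}

fn main() {
    // Build (1 + 2) * 3 and evaluate it.
    let e = Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Num(1)), Box::new(Expr::Num(2)))),
        Box::new(Expr::Num(3)),
    );
    println!("{}", interp(&e)); // prints 9
}
```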
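The flattened version replaces the `Box` pointers with small integer indices into a single `Vec` that owns every expression. This is a sketch under the same illustrative naming; the blog post's actual types may differ.

```rust
// Flattened AST: one Vec owns all expressions; children are u32 indices.
#[derive(Copy, Clone)]
struct ExprRef(u32);

enum Expr {
    Num(i64),
    Add(ExprRef, ExprRef),
    Mul(ExprRef, ExprRef),
}

struct Pool {
    exprs: Vec<Expr>,
}

impl Pool {
    // Append an expression and return its index.
    fn add(&mut self, e: Expr) -> ExprRef {
        self.exprs.push(e);
        ExprRef((self.exprs.len() - 1) as u32)
    }
    fn get(&self, r: ExprRef) -> &Expr {
        &self.exprs[r.0 as usize]
    }
}

// `interp` now takes the pool alongside a reference into it.
fn interp(pool: &Pool, r: ExprRef) -> i64 {
    match pool.get(r) {
        Expr::Num(n) => *n,
        Expr::Add(l, r) => interp(pool, *l) + interp(pool, *r),
        Expr::Mul(l, r) => interp(pool, *l) * interp(pool, *r),
    }
}

fn main() {
    let mut pool = Pool { exprs: Vec::new() };
    // Build (1 + 2) * 3 bottom-up: children get added before their parents.
    let one = pool.add(Expr::Num(1));
    let two = pool.add(Expr::Num(2));
    let sum = pool.add(Expr::Add(one, two));
    let three = pool.add(Expr::Num(3));
    let prod = pool.add(Expr::Mul(sum, three));
    println!("{}", interp(&pool, prod)); // prints 9
}
```

Allocations become cheap `Vec` pushes, references shrink from 64-bit pointers to 32-bit indices, and the whole tree sits contiguously in memory, which is friendlier to the cache.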
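For the performance evaluation, a random generator can build expressions directly into the pool so the benchmark involves no parsing or other I/O. This sketch reuses the `Pool`, `Expr`, and `ExprRef` types above and a tiny xorshift PRNG to stay dependency-free; the seed and depth in the usage comment are arbitrary.

```rust
// Minimal xorshift64 PRNG: just enough randomness for generating test inputs.
struct Rng(u64);

impl Rng {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

// Generate a random expression of bounded depth, straight into the pool.
fn gen(pool: &mut Pool, rng: &mut Rng, depth: u32) -> ExprRef {
    if depth == 0 || rng.next() % 4 == 0 {
        pool.add(Expr::Num((rng.next() % 100) as i64))
    } else {
        let l = gen(pool, rng, depth - 1);
        let r = gen(pool, rng, depth - 1);
        if rng.next() % 2 == 0 {
            pool.add(Expr::Add(l, r))
        } else {
            pool.add(Expr::Mul(l, r))
        }
    }
}

// Usage: let root = gen(&mut pool, &mut Rng(1234), 20);
```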
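Finally, here is a sketch of an interpreter that exploits the flat layout directly, again reusing the types above. Because `Pool::add` pushes children before their parents, a single forward pass over the `Vec` can compute a value for every expression with no recursion at all; at that point the `Vec` of expressions behaves a lot like a sequence of bytecode instructions. (One caveat: this pass also evaluates expressions that are not reachable from the root.)

```rust
// Evaluate every expression in one pass, in index order.
// Relies on the invariant that children appear before their parents.
fn flat_interp(pool: &Pool, root: ExprRef) -> i64 {
    let mut vals = vec![0i64; pool.exprs.len()];
    for (i, e) in pool.exprs.iter().enumerate() {
        vals[i] = match e {
            Expr::Num(n) => *n,
            Expr::Add(l, r) => vals[l.0 as usize] + vals[r.0 as usize],
            Expr::Mul(l, r) => vals[l.0 as usize] * vals[r.0 as usize],
        };
    }
    vals[root.0 as usize]
}
```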
Tasks
There are no tasks for this lesson. Good luck on your course project!
This is the last lesson of CS 6120. If you’re taking the self-guided version of the course, please fill out this feedback form. I would love to know who has taken the class—and I want your opinions about how to make it better!