SELF love: The origins of dynamic compilers
Background
In the late 1980s, dynamic languages were admired for their expressive power yet dismissed as impractical for performance-critical work. The fastest Smalltalk runtimes of the day still ran an order of magnitude slower than optimized C, leaving object-oriented enthusiasts to choose between elegance and efficiency. It is in this landscape that Chambers, Ungar, and Lee presented their implementation of SELF—a language that eliminates classes entirely, treats every state access as a message send, and relies on prototypes for inheritance. Their 1989 OOPSLA paper, An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes, is not a language manifesto so much as a bold engineering claim: with the right virtual-machine architecture, these radical semantics need not carry a performance tax.
Highlights
The authors begin by discussing the downsides of straightforward implementations of SELF. First, the prototype object model, for all its flexibility, drastically increases the storage a SELF program requires, since without classes every object must carry its own slot information. Second, message passing imposes a large runtime penalty, because finding a matching object slot can be very costly under SELF’s multiple inheritance. The authors attack these twin costs of space and time with a set of mutually reinforcing techniques, each sketched in code after the list:
- Maps. Maps provide implicit “hidden classes” for prototypes. Instead of storing slot names in every object, the VM factors layout and constant slots into a shared map. A Cartesian point shrinks from ten machine words to three, matching the compactness of class-based objects without reviving classes.
- Segregated heap for faster scanning. Byte arrays live in their own region, so the garbage collector can sweep ordinary objects word by word without pausing to distinguish raw bytes from tagged pointers. This allowed SELF to scan memory at nearly double the rate of the fastest Smalltalk-80 implementation.
- Customized compilation. At first send, SELF compiles a receiver-specific native method, locking in the receiver’s map and propagating that static knowledge through the body of the method.
- Message splitting and static type anticipation. When control-flow merges blur types, the compiler clones hot call-sites, keeping a “likely” branch where the receiver’s type is known and leaving a slower fallback branch for rarer cases. The same mechanism lets the compiler anticipate that “+” or “<” will probably see integers and inline those primitives aggressively.
- Zero-friction tooling. Each compiled method records its dependents, so a slot update invalidates only the affected machine code, leaving the rest of the system untouched. Rich metadata lets the debugger reconstruct inlined frames, so developers see a clean, source-level stack even after heavy optimization.
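To make the maps idea concrete, here is a minimal TypeScript sketch of how a VM might factor shared layout and constant slots out of individual objects. The names (ObjectMap, VMObject, getSlot) are illustrative stand-ins, not SELF’s actual data structures:

```typescript
// Minimal sketch of "maps": clones share one map, so each object stores
// only a map pointer plus its mutable field values.

interface SlotDescriptor {
  name: string;
  offset: number; // index into the object's field array
}

interface ObjectMap {
  slots: SlotDescriptor[];            // layout shared by every clone
  constants: Record<string, unknown>; // constant slots stored once, here
}

interface VMObject {
  map: ObjectMap;    // one word: pointer to the shared map
  fields: unknown[]; // mutable slot values only
}

// Slot lookup: consult the shared map, then index per-object storage.
function getSlot(obj: VMObject, name: string): unknown {
  const desc = obj.map.slots.find(s => s.name === name);
  if (desc !== undefined) return obj.fields[desc.offset];
  return obj.map.constants[name]; // constants need no per-object copy
}

// Cloning shares the map and copies only the mutable fields, which is
// why a point can shrink from ~10 machine words to ~3.
function clone(proto: VMObject): VMObject {
  return { map: proto.map, fields: proto.fields.slice() };
}

// Two points, one map:
const pointMap: ObjectMap = {
  slots: [{ name: 'x', offset: 0 }, { name: 'y', offset: 1 }],
  constants: { parent: 'traits point' },
};
const p1: VMObject = { map: pointMap, fields: [3, 4] };
const p2 = clone(p1);
```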
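The payoff of segregating byte arrays can be sketched as a scan loop. The two-bit tagging scheme below is hypothetical, not SELF’s actual encoding:

```typescript
// With raw byte data quarantined in its own region, every word in the
// ordinary-object region is a tagged value, so the scanner is one tight
// loop with no per-object checks for untagged bytes.
const TAG_MASK = 0b11;
const POINTER_TAG = 0b00; // assume low bits 00 mark a heap pointer

function scanObjectRegion(
  words: Int32Array,
  visitPointer: (addr: number) => void,
): void {
  for (const w of words) {
    if ((w & TAG_MASK) === POINTER_TAG) visitPointer(w);
  }
}
```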
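What customized compilation plus message splitting might emit for a send like x + y can be sketched as follows; mapOf, IntegerMap, and genericSend are stand-ins for the compiled code’s inlined guards and the VM’s slow lookup path, not real SELF runtime calls:

```typescript
// Hypothetical map tags for illustration.
const IntegerMap = Symbol('IntegerMap');
const OtherMap = Symbol('OtherMap');

function mapOf(obj: unknown): symbol {
  return typeof obj === 'number' && Number.isInteger(obj) ? IntegerMap : OtherMap;
}

// Slow path: a full, generic dynamic message send.
function genericSend(recv: any, msg: string, arg: unknown): unknown {
  return recv[msg](arg);
}

// The split call-site for `x + y`.
function addSplit(x: unknown, y: unknown): unknown {
  if (mapOf(x) === IntegerMap && mapOf(y) === IntegerMap) {
    // Likely branch: the compiler anticipated integers and inlined the
    // "+" primitive, eliminating the message send entirely.
    return (x as number) + (y as number);
  }
  // Unlikely branch: rarer receiver types fall back to the full send.
  return genericSend(x, '+', y);
}
```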
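Finally, a toy model of dependency-list invalidation; Slot and CompiledMethod are hypothetical names for illustration:

```typescript
interface CompiledMethod {
  name: string;
  valid: boolean; // invalid methods are recompiled on their next call
}

class Slot {
  private dependents: CompiledMethod[] = [];

  constructor(private value: unknown) {}

  // Recorded during compilation when a method bakes in this slot's value.
  recordDependent(m: CompiledMethod): void {
    this.dependents.push(m);
  }

  read(): unknown {
    return this.value;
  }

  write(v: unknown): void {
    this.value = v;
    // Invalidate only the machine code that depended on the old value,
    // leaving the rest of the system's compiled code untouched.
    for (const m of this.dependents) m.valid = false;
    this.dependents = [];
  }
}
```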
Taken together, these ideas deliver a dynamic language that is only four to five times slower than C on the same hardware — a leap forward at the time — and lay the conceptual groundwork for the hidden-class JITs that now power JavaScript, Ruby, and more.
Analysis
In the more than 30 years since this paper was published, there have been significant advances in the world of fast, efficient JIT compilers. Many of these developments stem directly from the groundwork laid in this paper, giving rise to the JIT compilers we know today, including Java’s. After this paper was published, the described VM’s technology evolved into the Strongtalk engine which, after Sun bought Animorphic Systems in 1997, became the core of Java’s HotSpot VM. HotSpot still identifies “hot” call-sites, speculatively inlines them, and rolls back if reality diverges — precisely the speculative compilation playbook SELF introduced.
Beyond the HotSpot VM, many other patterns in modern JIT compilers can be traced back to this paper.
The idea of utilizing maps to provide hidden classes lives on inside every modern JavaScript engine. V8, SpiderMonkey, and Chakra assign each object a lightweight shape (often literally named Map) so that property look-ups devolve to a pointer compare plus a constant offset. Chrome’s V8 documentation even cites maps as the cornerstone of its object model.
This paper’s monomorphic inline cache cut the cost of a send to a single indirect jump. A few years later, the same group extended the idea to polymorphic inline caches, now standard in V8, HotSpot, PyPy, and Graal. The core intuition—let the first few executions specialize the call-site—remains a workhorse optimization.
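As a rough illustration (not any engine’s real implementation), a polymorphic inline cache can be modeled as a per-call-site list of (shape, target) pairs that the first few executions populate:

```typescript
type Shape = symbol;
type Method = (recv: any, ...args: unknown[]) => unknown;

interface PICEntry { shape: Shape; target: Method; }

class InlineCache {
  private entries: PICEntry[] = [];

  constructor(private slowLookup: (shape: Shape) => Method) {}

  send(recv: { shape: Shape }, ...args: unknown[]): unknown {
    // Fast path: a handful of shape compares, one per case seen so far.
    for (const e of this.entries) {
      if (e.shape === recv.shape) return e.target(recv, ...args);
    }
    // Miss: do the full lookup once, then extend the cache so the next
    // send from a receiver with this shape hits the fast path.
    const target = this.slowLookup(recv.shape);
    this.entries.push({ shape: recv.shape, target });
    return target(recv, ...args);
  }
}
```

A real engine compiles the compare chain into machine code at the call-site, but the intuition is the same: the call-site specializes itself to the shapes it actually sees.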
Other ideas still in use today include: speculative compilation, as profile-guided inlining with cheap de-optimization fuels every tiered JIT pipeline; code invalidation via dependency lists, which enables the edit-refresh cycles developers now take for granted; and object-shape sharing, which is critical for memory footprint as programs continue to grow (Chrome routinely juggles millions of JS objects, and sharing one descriptor per shape tames that footprint).
However, that is not to say that modern JIT compilers are perfect. Indeed, many of the downsides of SELF’s implementation continue to be issues today. Most notable are the extra startup latency and overhead that JIT compilers incur. Customized machine code multiplies binary size, thereby increasing startup time. On mobile, cold-start budgets are measured in tens of milliseconds, prompting projects (such as V8’s Sparkplug baseline compiler) to curb SELF-style mega-methods. Furthermore, the rich PC-to-bytecode maps that make de-optimization and on-the-fly debugging possible also swell metadata, a footprint incurred even for users who never open DevTools. Finally, prototype mutation remains fertile ground for attacks such as prototype pollution; static languages dodge whole classes of such issues by construction.
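For concreteness, here is a minimal prototype-pollution example of the kind the paragraph alludes to; any code that writes through attacker-controlled keys can hit it:

```typescript
// Writing through a user-controlled key can reach Object.prototype and
// silently add properties to every object in the program.
const obj: any = {};
const userKey = '__proto__';      // attacker-controlled key
obj[userKey]['isAdmin'] = true;   // actually writes to Object.prototype
console.log(({} as any).isAdmin); // true: every object is now "admin"
```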
Looking towards the future
The era of web development has been characterized by a constant tug-of-war between static and dynamic typing. During the 2000s and 2010s, developers embraced the “let the JIT sort it out” mindset and overwhelmingly wrote JavaScript. Today, many developers are switching to TypeScript, which layers a static type system on top of JavaScript. Yet despite this surface-level switch in language, at runtime those .ts files are compiled to .js files and still funnel into the same hidden-class JS JIT compilers that SELF inspired. The major languages of the 2020s therefore inhabit a middle ground: dynamic under the hood, statically flavored at the surface. That compromise echoes the paper’s own wager that runtime type knowledge, harnessed smartly, can rival compile-time guarantees.
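A small sketch of that pipeline: TypeScript’s annotations exist only at compile time, so after tsc erases them the JIT sees ordinary dynamic JavaScript (the function below is a made-up example):

```typescript
// TypeScript source: the annotation constrains callers at compile time only.
function magnitude(p: { x: number; y: number }): number {
  return Math.hypot(p.x, p.y);
}

// After `tsc`, the emitted JavaScript is simply:
//   function magnitude(p) { return Math.hypot(p.x, p.y); }
// The engine still discovers p's hidden class at runtime, much as SELF
// did with maps.
```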
Recently, ARM’s big.LITTLE cores, WebAssembly’s rise, and the carbon cost of always-on optimization have reopened the question of whether speculation is worth its wattage. Yet the conceptual toolkit SELF gifted us — inline caches, speculative inlining, object shapes, and more — keeps resurfacing. Though today’s dynamic languages and JIT compilers look very different, the DNA is still recognizably SELF.
Discussion Questions:
- Section 6 shows SELF’s compiler emitting scope descriptions and a bidirectional PC↔byte-code map so the debugger can rebuild inlined stack frames and update them after hot re-compilation. This metadata adds memory overhead and tooling complexity—problems today’s HotSpot safepoints and V8’s de-opt paths still wrestle with. Is heavyweight JIT-aware debugging worth the complexity, or is line-by-line interpretation enough?
- One of the main benefits of SELF is that it enables a lot more flexibility and expressiveness. However, the prototype-based system also enables new security bugs, such as prototype pollution, and exhibits generally slower performance than static languages (as evidenced by this paper). Where should language designers draw the line between prototype flexibility and performance/security? When would a prototype language be worth these tradeoffs?
- SELF’s compiler emits a new machine-code body for each receiver map, but the evaluation focuses only on execution time. Modern systems hit instruction-cache limits (Android dex, WebAssembly bundle sizes). Customized compilation duplicates code; when does code-size blow-up outweigh speed? Have processors’ larger caches and better branch predictors made code bloat less worrisome?
- The authors argue that, with aggressive JITing, “runtime type information is just as good as static type information” for performance. Since 1989, we’ve seen gradual-typing hybrids (TypeScript, Python mypy), powerful inference (Rust, Swift), and optional dynamic escape hatches in C#. Will the equilibrium keep favoring mixed static/dynamic systems, or will one paradigm “win” the way static scoping did?