[bracket]: Bridging Bril and Racket Educational Compilers

by Joseph Maheshe December 16, 2025

Code: https://github.com/maheshejs/cs6120pr

What was the goal?

The goal of this project was to connect two educationally-focused intermediate languages: Racket’s exprs-lang-v9 subset from UBC’s CPSC411 (Compiler Construction) course and Bril, the Big Red Intermediate Language. While Bril already has TypeScript and Rust frontends, these support only limited subsets of their source languages. Racket’s exprs-lang-v9, in contrast, is a more expressive subset that includes first-class procedures, making it an ideal candidate for generating interesting Bril benchmarks. The project aimed to compile exprs-lang-v9 programs—which normally compile all the way down to x64 machine code into Bril programs using the dynamic type extension.

What did I do?

Design Overview

My compiler takes programs in proc-imp-cmf-lang-v7 (an intermediate representation in the CPSC411 compiler pipeline) and lowers them to Bril JSON. This particular IR was chosen because it sits at the sweet spot after all high-level abstractions have been lowered but before architecture-specific concerns like register allocation.

Choosing this intermediate language was itself a significant design decision. Initially, I considered starting very close to assembly since Bril is close to assembly. However, x64 and lower-level languages in the CPSC411 pipeline allow copying a label to a location (treating code as data), which Bril does not support. This would have required designing an abstract interpreter to uncover labels in locations and abstract physical locations to abstract locations, plus extensive instruction patching. That turned out to be cumbersome.

Instead, I chose proc-imp-cmf-lang-v7, which perfectly satisfied my requirements:

It already uses abstract locations (before register allocation and spillage)
It hasn’t yet transformed calls into low-level calling conventions
Its predicates can be easily transformed to basic blocks with appropriate labels
Calls can be directly translated to Bril’s function calls

Here’s the grammar for proc-imp-cmf-lang-v7:

proc-imp-cmf-lang-v7 : grammar?

p         ::= (module (define label (lambda (aloc ...) entry)) ... entry)
entry     ::= tail
pred      ::= (relop opand opand) | (true) | (false) | (not pred)
          |   (begin effect ... pred) | (if pred pred pred)
tail      ::= value | (call triv opand ...) 
          |   (begin effect ... tail) | (if pred tail tail)
value     ::= triv | (binop opand opand) | (call triv opand ...)
effect    ::= (set! aloc value) | (begin effect ... effect)
          |   (if pred effect effect)
opand     ::= int64 | aloc
triv      ::= opand | label
binop     ::= * | + | - | bitwise-and | bitwise-ior 
          |   bitwise-xor | arithmetic-shift-right
relop     ::= < | <= | = | >= | > | !=
aloc      ::= aloc?
label     ::= label?
int64     ::= int64?

The programs being compiled originate from exprs-lang-v7, a much higher-level source language with dynamic types and first-class data. Here’s its grammar:

exprs-lang-v7 : grammar?

p         ::= (module (define x (lambda (x ...) value)) ... value)
value     ::= triv | (let ([x value] ...) value)
          |   (if value value value) | (call value value ...)
triv      ::= x | fixnum | #t | #f | empty 
          |   (void) | (error uint8) | ascii-char-literal
x         ::= name? | prim-f
prim-f    ::= binop | unop
binop     ::= * | + | - | eq? | < | <= | > | >=
unop      ::= fixnum? | boolean? | empty? | void?
          |   ascii-char? | error? | not
uint8     ::= uint8?
ascii-char-literal ::= ascii-char-literal?
fixnum    ::= int61?

The key design challenge was bridging the semantic gap between these educational IRs. Racket’s exprs-lang-v7 uses a dynamically-typed, tagged object representation where values are ptrs (64-bit words with tag bits encoding type information), while Bril uses explicit any types with runtime type checking.

Implementation: The `bracket` Pass

The compiler is implemented as a single pass called bracket that performs a complete translation from proc-imp-cmf-lang-v7 to Bril JSON.

Tagged Object Representation Translation

The source language uses a sophisticated tagging scheme where:

Fixnums use 3 least-significant bits as a primary tag (#b000), with the remaining 61 bits holding the actual integer value shifted left by 3
Non-fixnum immediates use a 3-bit primary tag (#b110) plus 5 additional bits for secondary tags

This representation enables efficient operations: for example, (* 8 n) directly represents fixnum n, allowing addition and subtraction without untagging since 8x + 8y = 8(x + y).

Since Bril’s core instruction set doesn’t include bitwise operations like bitwise-and, bitwise-ior, bitwise-xor, and arithmetic-shift-right, I used GenAI to generate a standard library that implements these operations using only Bril’s integer arithmetic and boolean logic:

@bit(x: int): int - Extracts the least significant bit
@bitand(a: int, b: int): int - Implements bitwise AND through loops
@bitor(a: int, b: int): int - Implements bitwise OR
@bitxor(a: int, b: int): int - Implements bitwise XOR
@ashr(x: int, n: int): int - Implements arithmetic shift right with sign extension

I implemented the logic in bracket to recognize when a binop is a bitwise operation and generate the appropriate calls to these standard library functions. These functions are prepended to every compiled program.

Control Flow Translation

The source IR uses predicates (relops, boolean combinators) that must be converted to Bril’s branch instructions. My compile-pred function:

Translates relops directly to Bril comparison operations
Handles != by compiling as (not (= op1 op2))
Transforms not, (true), and (false) appropriately
Threads effects through predicate computation
Handles nested if expressions by creating fresh labels

Function Calls and Tail Calls

The compiler distinguishes between tail calls and non-tail calls:

Tail calls in main untag the result (divide by 8) before printing. Since print is not supported in exprs-lang-v7 and its pipelined IRs, the translation replaces main’s ret instructions with print to output the result in Bril.
Tail calls in other functions emit a ret instruction
Non-tail calls store results in fresh temporaries
All arguments are evaluated to temporaries before calling

Module Structure

The source program consists of a module with multiple function definitions plus a main entry expression. I compile this by:

Creating a Bril function for each defined function with any-typed parameters
Creating a special main function that compiles the module’s entry expression
Prepending the standard library functions to the output JSON

Technical Details

The compiler uses helper functions for:

Converting abstract locations and labels to Bril variable/function names
Generating unique temporary variable and label names
Creating Bril instruction JSON structures
Handling operands (integers need const instructions, variables are used directly)
Accumulating instructions through emit closures for straightforward linear translation

Were you successful?

I evaluated success through correctness and expressiveness, with analysis of translation overhead.

Correctness Evaluation

I reimplemented 10 core Bril benchmarks in exprs-lang-v7 and thoroughly tested bracket’s translation by comparing interpreter results:

ackermann, delannoy, fitsinside, loopfact, recfact
catalan, fib_recursive, gcd, mccarthy91, triangle

All 10 benchmarks produced correct outputs matching the original Bril versions.

Lines of Code Comparison

Benchmark	SLOC (exprs-lang-v7)	TLOC (Bril)*	Original LOC	Overhead
`ackermann`	10	155	31	5.0×
`catalan`	16	309	33	9.4×
`delannoy`	11	160	63	2.5×
`fib_recursive`	11	182	39	4.7×
`fitsinside`	8	129	15	8.6×
`gcd`	13	149	40	3.7×
`loopfact`	12	181	27	6.7×
`mccarthy91`	8	179	22	8.1×
`recfact`	8	181	31	5.8×
`triangle`	8	134	20	6.7×

*Excluding standard library (132 LOC)

Dynamic Instruction Count Comparison

Benchmark	Dynamic Instructions (Bril)	Original Dynamic Instructions	Overhead
`ackermann`	100,136,893	1,464,231	68.4×
`catalan`	40,068,566	659,378	60.8×
`delannoy`	242,519,048	5,748,752	42.2×
`fib_recursive`	132,182	2,693	49.1×
`fitsinside`	1,242	10	124.2×
`gcd`	3,407	46	74.1×
`loopfact`	9,988	116	86.1×
`mccarthy91`	191,429	1,385	138.2×
`recfact`	8,265	104	79.5×
`triangle`	31,684	188	168.5×

Average overhead: 89.1× dynamic instructions, 6.2× static lines of code

Expressiveness of exprs-lang-v7

exprs-lang-v7 is significantly more expressive than Bril for writing these benchmarks. Most could be written in under 15 lines. For example, factorial using recursion:

(module
    (define fact
      (lambda (n)
        (if (call <= n 1)
          1
          (call * (call fact (call - n 1)) n))))
    (let ([arg.x 8])
      (call fact arg.x)))

This concise implementation demonstrates the expressive power of the Racket subset with first-class functions, natural recursion, and nested expressions.

Translation Overhead Analysis

However, this expressiveness comes at significant cost. The factorial example translates to approximately 180 lines of Bril code (not counting the standard library) and executes 8,265 dynamic instructions compared to 104 dynamic instructions in the original hand-written Bril benchmark.

This dramatic 79× overhead comes primarily from:

Tagged object representation: Every integer operation requires tag checking, tagging, and untagging
Dynamic type checking: Safe primitive operations perform runtime type checks
Bitwise operation simulation: Each bitwise operation expands to a loop with 15-20+ instructions

This demonstrates the fundamental tradeoff: exprs-lang-v7’s high-level abstractions and dynamic typing make writing correct programs much easier, but translation to Bril’s lower-level model introduces substantial overhead. The generated code is correct but far from optimal.

Limitations and Scope Decisions

First-Class Procedures and Closures

I had originally planned to include 5 additional benchmarks using higher-order functions and continuation-passing style. However, fully implementing exprs-lang-v9 with closure conventions proved infeasible because:

Bril doesn’t support labels-as-values
Supporting true closures would require defunctionalization, converting higher-order functions into first-order data structures plus a dispatcher, a compiler technique that I learned during the project
This transformation seemed complex enough to warrant its own project

Heap Allocation and Structured Data

exprs-lang-v9 builds on exprs-lang-v8, which introduces structured data (pairs, vectors) and heap allocation. While the primitives map fairly directly to Bril’s memory extension, there are subtle differences:

exprs-lang-v8 doesn’t free memory; manual memory management in Bril would be required
exprs-lang-v8 allows byte-level offsets; Bril’s ptradd increments by word size (8 bytes)
These differences would require additional translation logic and memory management infrastructure

These limitations meant higher-order and structured-data benchmarks remained out of scope.

Quality of Lowering

Despite the overhead, the generated code has good structural properties:

Control flow is direct with no unnecessary jumps
Variable usage is straightforward (one definition per abstract location)
The call graph mirrors the source program exactly
The main inefficiency is bitwise operations; with native Bril support, generated code would be much more efficient

Conclusion

This project successfully bridges two educational compiler infrastructures, enabling Racket’s subset to generate Bril benchmarks. The implementation handles complete translation from proc-imp-cmf-lang-v7 to Bril JSON, including integration with a GenAI-generated standard library for bitwise operations.

While there are significant limitations—particularly around first-class procedures, closures, and the 79× overhead from tag operations—the compiler produces correct code for a substantial subset of programs. The 10 benchmark translations demonstrate that the core infrastructure works reliably.

The project reveals important insights about semantic gaps between high-level, dynamically-typed educational languages and lower-level IRs. The dramatic overhead suggests that either:

Bril should consider adding native bitwise operations as a core extension
Compilers targeting Bril from dynamic languages should consider alternative representation strategies

Future Work

Several extensions would make this compiler more powerful:

Defunctionalization for closures to support true first-class procedures
Memory extension support for heap-allocated structures (pairs, vectors)
Optimization during lowering to avoid unnecessary tag operations when types can be inferred statically
Native bitwise operations in Bril core to eliminate standard library overhead
Better calling conventions using Bril’s speculative extension for optimistic tag assumptions

The code demonstrates that educational IRs can interoperate successfully, even across very different semantic models. Racket’s pedagogical compiler infrastructure can serve as a rich source of interesting, expressive Bril programsthough the translation overhead suggests this approach is best suited for correctness testing and educational purposes rather than performance benchmarking.

GenAI Usage: I used Gemini for the standard library.

The CS 6120 Course Blog