[bracket]: Bridging Bril and Racket Educational Compilers
Code: https://github.com/maheshejs/cs6120pr
What was the goal?
The goal of this project was to connect two educationally-focused intermediate languages: Racket’s exprs-lang-v9 subset from UBC’s CPSC411 (Compiler Construction) course and Bril, the Big Red Intermediate Language. While Bril already has TypeScript and Rust frontends, these support only limited subsets of their source languages. Racket’s exprs-lang-v9, in contrast, is a more expressive subset that includes first-class procedures, making it an ideal candidate for generating interesting Bril benchmarks. The project aimed to compile exprs-lang-v9 programs—which normally compile all the way down to x64 machine code into Bril programs using the dynamic type extension.
What did I do?
Design Overview
My compiler takes programs in proc-imp-cmf-lang-v7 (an intermediate representation in the CPSC411 compiler pipeline) and lowers them to Bril JSON. This particular IR was chosen because it sits at the sweet spot after all high-level abstractions have been lowered but before architecture-specific concerns like register allocation.
Choosing this intermediate language was itself a significant design decision. Initially, I considered starting very close to assembly since Bril is close to assembly. However, x64 and lower-level languages in the CPSC411 pipeline allow copying a label to a location (treating code as data), which Bril does not support. This would have required designing an abstract interpreter to uncover labels in locations and abstract physical locations to abstract locations, plus extensive instruction patching. That turned out to be cumbersome.
Instead, I chose proc-imp-cmf-lang-v7, which perfectly satisfied my requirements:
- It already uses abstract locations (before register allocation and spillage)
- It hasn’t yet transformed calls into low-level calling conventions
- Its predicates can be easily transformed to basic blocks with appropriate labels
- Calls can be directly translated to Bril’s function calls
Here’s the grammar for proc-imp-cmf-lang-v7:
proc-imp-cmf-lang-v7 : grammar?
p ::= (module (define label (lambda (aloc ...) entry)) ... entry)
entry ::= tail
pred ::= (relop opand opand) | (true) | (false) | (not pred)
| (begin effect ... pred) | (if pred pred pred)
tail ::= value | (call triv opand ...)
| (begin effect ... tail) | (if pred tail tail)
value ::= triv | (binop opand opand) | (call triv opand ...)
effect ::= (set! aloc value) | (begin effect ... effect)
| (if pred effect effect)
opand ::= int64 | aloc
triv ::= opand | label
binop ::= * | + | - | bitwise-and | bitwise-ior
| bitwise-xor | arithmetic-shift-right
relop ::= < | <= | = | >= | > | !=
aloc ::= aloc?
label ::= label?
int64 ::= int64?
The programs being compiled originate from exprs-lang-v7, a much higher-level source language with dynamic types and first-class data. Here’s its grammar:
exprs-lang-v7 : grammar?
p ::= (module (define x (lambda (x ...) value)) ... value)
value ::= triv | (let ([x value] ...) value)
| (if value value value) | (call value value ...)
triv ::= x | fixnum | #t | #f | empty
| (void) | (error uint8) | ascii-char-literal
x ::= name? | prim-f
prim-f ::= binop | unop
binop ::= * | + | - | eq? | < | <= | > | >=
unop ::= fixnum? | boolean? | empty? | void?
| ascii-char? | error? | not
uint8 ::= uint8?
ascii-char-literal ::= ascii-char-literal?
fixnum ::= int61?
The key design challenge was bridging the semantic gap between these educational IRs. Racket’s exprs-lang-v7 uses a dynamically-typed, tagged object representation where values are ptrs (64-bit words with tag bits encoding type information), while Bril uses explicit any types with runtime type checking.
Implementation: The bracket Pass
The compiler is implemented as a single pass called bracket that performs a complete translation from proc-imp-cmf-lang-v7 to Bril JSON.
Tagged Object Representation Translation
The source language uses a sophisticated tagging scheme where:
- Fixnums use 3 least-significant bits as a primary tag (
#b000), with the remaining 61 bits holding the actual integer value shifted left by 3 - Non-fixnum immediates use a 3-bit primary tag (
#b110) plus 5 additional bits for secondary tags
This representation enables efficient operations: for example, (* 8 n) directly represents fixnum n, allowing addition and subtraction without untagging since 8x + 8y = 8(x + y).
Since Bril’s core instruction set doesn’t include bitwise operations like bitwise-and, bitwise-ior, bitwise-xor, and arithmetic-shift-right, I used GenAI to generate a standard library that implements these operations using only Bril’s integer arithmetic and boolean logic:
@bit(x: int): int- Extracts the least significant bit@bitand(a: int, b: int): int- Implements bitwise AND through loops@bitor(a: int, b: int): int- Implements bitwise OR@bitxor(a: int, b: int): int- Implements bitwise XOR@ashr(x: int, n: int): int- Implements arithmetic shift right with sign extension
I implemented the logic in bracket to recognize when a binop is a bitwise operation and generate the appropriate calls to these standard library functions. These functions are prepended to every compiled program.
Control Flow Translation
The source IR uses predicates (relops, boolean combinators) that must be converted to Bril’s branch instructions. My compile-pred function:
- Translates relops directly to Bril comparison operations
- Handles
!=by compiling as(not (= op1 op2)) - Transforms
not,(true), and(false)appropriately - Threads effects through predicate computation
- Handles nested
ifexpressions by creating fresh labels
Function Calls and Tail Calls
The compiler distinguishes between tail calls and non-tail calls:
- Tail calls in
mainuntag the result (divide by 8) before printing. Sinceprintis not supported in exprs-lang-v7 and its pipelined IRs, the translation replacesmain’sretinstructions withprintto output the result in Bril. - Tail calls in other functions emit a
retinstruction - Non-tail calls store results in fresh temporaries
- All arguments are evaluated to temporaries before calling
Module Structure
The source program consists of a module with multiple function definitions plus a main entry expression. I compile this by:
- Creating a Bril function for each defined function with
any-typed parameters - Creating a special
mainfunction that compiles the module’s entry expression - Prepending the standard library functions to the output JSON
Technical Details
The compiler uses helper functions for:
- Converting abstract locations and labels to Bril variable/function names
- Generating unique temporary variable and label names
- Creating Bril instruction JSON structures
- Handling operands (integers need const instructions, variables are used directly)
- Accumulating instructions through emit closures for straightforward linear translation
Were you successful?
I evaluated success through correctness and expressiveness, with analysis of translation overhead.
Correctness Evaluation
I reimplemented 10 core Bril benchmarks in exprs-lang-v7 and thoroughly tested bracket’s translation by comparing interpreter results:
ackermann,delannoy,fitsinside,loopfact,recfactcatalan,fib_recursive,gcd,mccarthy91,triangle
All 10 benchmarks produced correct outputs matching the original Bril versions.
Lines of Code Comparison
| Benchmark | SLOC (exprs-lang-v7) | TLOC (Bril)* | Original LOC | Overhead |
|---|---|---|---|---|
ackermann | 10 | 155 | 31 | 5.0× |
catalan | 16 | 309 | 33 | 9.4× |
delannoy | 11 | 160 | 63 | 2.5× |
fib_recursive | 11 | 182 | 39 | 4.7× |
fitsinside | 8 | 129 | 15 | 8.6× |
gcd | 13 | 149 | 40 | 3.7× |
loopfact | 12 | 181 | 27 | 6.7× |
mccarthy91 | 8 | 179 | 22 | 8.1× |
recfact | 8 | 181 | 31 | 5.8× |
triangle | 8 | 134 | 20 | 6.7× |
*Excluding standard library (132 LOC)
Dynamic Instruction Count Comparison
| Benchmark | Dynamic Instructions (Bril) | Original Dynamic Instructions | Overhead |
|---|---|---|---|
ackermann | 100,136,893 | 1,464,231 | 68.4× |
catalan | 40,068,566 | 659,378 | 60.8× |
delannoy | 242,519,048 | 5,748,752 | 42.2× |
fib_recursive | 132,182 | 2,693 | 49.1× |
fitsinside | 1,242 | 10 | 124.2× |
gcd | 3,407 | 46 | 74.1× |
loopfact | 9,988 | 116 | 86.1× |
mccarthy91 | 191,429 | 1,385 | 138.2× |
recfact | 8,265 | 104 | 79.5× |
triangle | 31,684 | 188 | 168.5× |
Average overhead: 89.1× dynamic instructions, 6.2× static lines of code
Expressiveness of exprs-lang-v7
exprs-lang-v7 is significantly more expressive than Bril for writing these benchmarks. Most could be written in under 15 lines. For example, factorial using recursion:
(module
(define fact
(lambda (n)
(if (call <= n 1)
1
(call * (call fact (call - n 1)) n))))
(let ([arg.x 8])
(call fact arg.x)))
This concise implementation demonstrates the expressive power of the Racket subset with first-class functions, natural recursion, and nested expressions.
Translation Overhead Analysis
However, this expressiveness comes at significant cost. The factorial example translates to approximately 180 lines of Bril code (not counting the standard library) and executes 8,265 dynamic instructions compared to 104 dynamic instructions in the original hand-written Bril benchmark.
This dramatic 79× overhead comes primarily from:
- Tagged object representation: Every integer operation requires tag checking, tagging, and untagging
- Dynamic type checking: Safe primitive operations perform runtime type checks
- Bitwise operation simulation: Each bitwise operation expands to a loop with 15-20+ instructions
This demonstrates the fundamental tradeoff: exprs-lang-v7’s high-level abstractions and dynamic typing make writing correct programs much easier, but translation to Bril’s lower-level model introduces substantial overhead. The generated code is correct but far from optimal.
Limitations and Scope Decisions
First-Class Procedures and Closures
I had originally planned to include 5 additional benchmarks using higher-order functions and continuation-passing style. However, fully implementing exprs-lang-v9 with closure conventions proved infeasible because:
- Bril doesn’t support labels-as-values
- Supporting true closures would require defunctionalization, converting higher-order functions into first-order data structures plus a dispatcher, a compiler technique that I learned during the project
- This transformation seemed complex enough to warrant its own project
Heap Allocation and Structured Data
exprs-lang-v9 builds on exprs-lang-v8, which introduces structured data (pairs, vectors) and heap allocation. While the primitives map fairly directly to Bril’s memory extension, there are subtle differences:
- exprs-lang-v8 doesn’t free memory; manual memory management in Bril would be required
- exprs-lang-v8 allows byte-level offsets; Bril’s
ptraddincrements by word size (8 bytes) - These differences would require additional translation logic and memory management infrastructure
These limitations meant higher-order and structured-data benchmarks remained out of scope.
Quality of Lowering
Despite the overhead, the generated code has good structural properties:
- Control flow is direct with no unnecessary jumps
- Variable usage is straightforward (one definition per abstract location)
- The call graph mirrors the source program exactly
- The main inefficiency is bitwise operations; with native Bril support, generated code would be much more efficient
Conclusion
This project successfully bridges two educational compiler infrastructures, enabling Racket’s subset to generate Bril benchmarks. The implementation handles complete translation from proc-imp-cmf-lang-v7 to Bril JSON, including integration with a GenAI-generated standard library for bitwise operations.
While there are significant limitations—particularly around first-class procedures, closures, and the 79× overhead from tag operations—the compiler produces correct code for a substantial subset of programs. The 10 benchmark translations demonstrate that the core infrastructure works reliably.
The project reveals important insights about semantic gaps between high-level, dynamically-typed educational languages and lower-level IRs. The dramatic overhead suggests that either:
- Bril should consider adding native bitwise operations as a core extension
- Compilers targeting Bril from dynamic languages should consider alternative representation strategies
Future Work
Several extensions would make this compiler more powerful:
- Defunctionalization for closures to support true first-class procedures
- Memory extension support for heap-allocated structures (pairs, vectors)
- Optimization during lowering to avoid unnecessary tag operations when types can be inferred statically
- Native bitwise operations in Bril core to eliminate standard library overhead
- Better calling conventions using Bril’s speculative extension for optimistic tag assumptions
The code demonstrates that educational IRs can interoperate successfully, even across very different semantic models. Racket’s pedagogical compiler infrastructure can serve as a rich source of interesting, expressive Bril programsthough the translation overhead suggests this approach is best suited for correctness testing and educational purposes rather than performance benchmarking.
GenAI Usage: I used Gemini for the standard library.