The CS 6120 Course Blog

[bracket]: Bridging Bril and Racket Educational Compilers

by Joseph Maheshe

Code: https://github.com/maheshejs/cs6120pr

What was the goal?

The goal of this project was to connect two educationally-focused intermediate languages: Racket’s exprs-lang-v9 subset from UBC’s CPSC411 (Compiler Construction) course and Bril, the Big Red Intermediate Language. While Bril already has TypeScript and Rust frontends, these support only limited subsets of their source languages. Racket’s exprs-lang-v9, in contrast, is a more expressive subset that includes first-class procedures, making it an ideal candidate for generating interesting Bril benchmarks. The project aimed to compile exprs-lang-v9 programs—which normally compile all the way down to x64 machine code into Bril programs using the dynamic type extension.

What did I do?

Design Overview

My compiler takes programs in proc-imp-cmf-lang-v7 (an intermediate representation in the CPSC411 compiler pipeline) and lowers them to Bril JSON. This particular IR was chosen because it sits at the sweet spot after all high-level abstractions have been lowered but before architecture-specific concerns like register allocation.

Choosing this intermediate language was itself a significant design decision. Initially, I considered starting very close to assembly since Bril is close to assembly. However, x64 and lower-level languages in the CPSC411 pipeline allow copying a label to a location (treating code as data), which Bril does not support. This would have required designing an abstract interpreter to uncover labels in locations and abstract physical locations to abstract locations, plus extensive instruction patching. That turned out to be cumbersome.

Instead, I chose proc-imp-cmf-lang-v7, which perfectly satisfied my requirements:

Here’s the grammar for proc-imp-cmf-lang-v7:

proc-imp-cmf-lang-v7 : grammar?

p         ::= (module (define label (lambda (aloc ...) entry)) ... entry)
entry     ::= tail
pred      ::= (relop opand opand) | (true) | (false) | (not pred)
          |   (begin effect ... pred) | (if pred pred pred)
tail      ::= value | (call triv opand ...) 
          |   (begin effect ... tail) | (if pred tail tail)
value     ::= triv | (binop opand opand) | (call triv opand ...)
effect    ::= (set! aloc value) | (begin effect ... effect)
          |   (if pred effect effect)
opand     ::= int64 | aloc
triv      ::= opand | label
binop     ::= * | + | - | bitwise-and | bitwise-ior 
          |   bitwise-xor | arithmetic-shift-right
relop     ::= < | <= | = | >= | > | !=
aloc      ::= aloc?
label     ::= label?
int64     ::= int64?

The programs being compiled originate from exprs-lang-v7, a much higher-level source language with dynamic types and first-class data. Here’s its grammar:

exprs-lang-v7 : grammar?

p         ::= (module (define x (lambda (x ...) value)) ... value)
value     ::= triv | (let ([x value] ...) value)
          |   (if value value value) | (call value value ...)
triv      ::= x | fixnum | #t | #f | empty 
          |   (void) | (error uint8) | ascii-char-literal
x         ::= name? | prim-f
prim-f    ::= binop | unop
binop     ::= * | + | - | eq? | < | <= | > | >=
unop      ::= fixnum? | boolean? | empty? | void?
          |   ascii-char? | error? | not
uint8     ::= uint8?
ascii-char-literal ::= ascii-char-literal?
fixnum    ::= int61?

The key design challenge was bridging the semantic gap between these educational IRs. Racket’s exprs-lang-v7 uses a dynamically-typed, tagged object representation where values are ptrs (64-bit words with tag bits encoding type information), while Bril uses explicit any types with runtime type checking.

Implementation: The bracket Pass

The compiler is implemented as a single pass called bracket that performs a complete translation from proc-imp-cmf-lang-v7 to Bril JSON.

Tagged Object Representation Translation

The source language uses a sophisticated tagging scheme where:

This representation enables efficient operations: for example, (* 8 n) directly represents fixnum n, allowing addition and subtraction without untagging since 8x + 8y = 8(x + y).

Since Bril’s core instruction set doesn’t include bitwise operations like bitwise-and, bitwise-ior, bitwise-xor, and arithmetic-shift-right, I used GenAI to generate a standard library that implements these operations using only Bril’s integer arithmetic and boolean logic:

I implemented the logic in bracket to recognize when a binop is a bitwise operation and generate the appropriate calls to these standard library functions. These functions are prepended to every compiled program.

Control Flow Translation

The source IR uses predicates (relops, boolean combinators) that must be converted to Bril’s branch instructions. My compile-pred function:

Function Calls and Tail Calls

The compiler distinguishes between tail calls and non-tail calls:

Module Structure

The source program consists of a module with multiple function definitions plus a main entry expression. I compile this by:

Technical Details

The compiler uses helper functions for:

Were you successful?

I evaluated success through correctness and expressiveness, with analysis of translation overhead.

Correctness Evaluation

I reimplemented 10 core Bril benchmarks in exprs-lang-v7 and thoroughly tested bracket’s translation by comparing interpreter results:

All 10 benchmarks produced correct outputs matching the original Bril versions.

Lines of Code Comparison

BenchmarkSLOC (exprs-lang-v7)TLOC (Bril)*Original LOCOverhead
ackermann10155315.0×
catalan16309339.4×
delannoy11160632.5×
fib_recursive11182394.7×
fitsinside8129158.6×
gcd13149403.7×
loopfact12181276.7×
mccarthy918179228.1×
recfact8181315.8×
triangle8134206.7×

*Excluding standard library (132 LOC)

Dynamic Instruction Count Comparison

BenchmarkDynamic Instructions (Bril)Original Dynamic InstructionsOverhead
ackermann100,136,8931,464,23168.4×
catalan40,068,566659,37860.8×
delannoy242,519,0485,748,75242.2×
fib_recursive132,1822,69349.1×
fitsinside1,24210124.2×
gcd3,4074674.1×
loopfact9,98811686.1×
mccarthy91191,4291,385138.2×
recfact8,26510479.5×
triangle31,684188168.5×

Average overhead: 89.1× dynamic instructions, 6.2× static lines of code

Expressiveness of exprs-lang-v7

exprs-lang-v7 is significantly more expressive than Bril for writing these benchmarks. Most could be written in under 15 lines. For example, factorial using recursion:

(module
    (define fact
      (lambda (n)
        (if (call <= n 1)
          1
          (call * (call fact (call - n 1)) n))))
    (let ([arg.x 8])
      (call fact arg.x)))

This concise implementation demonstrates the expressive power of the Racket subset with first-class functions, natural recursion, and nested expressions.

Translation Overhead Analysis

However, this expressiveness comes at significant cost. The factorial example translates to approximately 180 lines of Bril code (not counting the standard library) and executes 8,265 dynamic instructions compared to 104 dynamic instructions in the original hand-written Bril benchmark.

This dramatic 79× overhead comes primarily from:

  1. Tagged object representation: Every integer operation requires tag checking, tagging, and untagging
  2. Dynamic type checking: Safe primitive operations perform runtime type checks
  3. Bitwise operation simulation: Each bitwise operation expands to a loop with 15-20+ instructions

This demonstrates the fundamental tradeoff: exprs-lang-v7’s high-level abstractions and dynamic typing make writing correct programs much easier, but translation to Bril’s lower-level model introduces substantial overhead. The generated code is correct but far from optimal.

Limitations and Scope Decisions

First-Class Procedures and Closures

I had originally planned to include 5 additional benchmarks using higher-order functions and continuation-passing style. However, fully implementing exprs-lang-v9 with closure conventions proved infeasible because:

Heap Allocation and Structured Data

exprs-lang-v9 builds on exprs-lang-v8, which introduces structured data (pairs, vectors) and heap allocation. While the primitives map fairly directly to Bril’s memory extension, there are subtle differences:

These limitations meant higher-order and structured-data benchmarks remained out of scope.

Quality of Lowering

Despite the overhead, the generated code has good structural properties:

Conclusion

This project successfully bridges two educational compiler infrastructures, enabling Racket’s subset to generate Bril benchmarks. The implementation handles complete translation from proc-imp-cmf-lang-v7 to Bril JSON, including integration with a GenAI-generated standard library for bitwise operations.

While there are significant limitations—particularly around first-class procedures, closures, and the 79× overhead from tag operations—the compiler produces correct code for a substantial subset of programs. The 10 benchmark translations demonstrate that the core infrastructure works reliably.

The project reveals important insights about semantic gaps between high-level, dynamically-typed educational languages and lower-level IRs. The dramatic overhead suggests that either:

  1. Bril should consider adding native bitwise operations as a core extension
  2. Compilers targeting Bril from dynamic languages should consider alternative representation strategies

Future Work

Several extensions would make this compiler more powerful:

The code demonstrates that educational IRs can interoperate successfully, even across very different semantic models. Racket’s pedagogical compiler infrastructure can serve as a rich source of interesting, expressive Bril programsthough the translation overhead suggests this approach is best suited for correctness testing and educational purposes rather than performance benchmarking.

GenAI Usage: I used Gemini for the standard library.