The CS 6120 Course Blog

It's About (wasm)Time: A Bril to Wasm Translator

by Annabel Baniak, Michael Xing, Serena Duncan

Project Goal

Our goal was to build a compiler from Bril to WebAssembly. To do this, we collaborated with another group consisting of Gerardo Teruel and Dev Patel, who worked on the “Brilooped” project; this project took Bril, a CFG-based language with unstructured control flow, and changed it to a structured control flow language by creating a relooper for Bril. This means that all the branch and jump statements in Bril were replaced by while/continue/break statements. After agreeing on a shared syntax for the output of their project as the input of ours, we were able to take their project as finished and compile the Brilooped syntax to WebAssembly, which also uses structured control flow.

Implementation

We built a tool to compile Bril programs into wasm, and then executed the resulting program using Wasmtime. This allowed us to test our output WebAssembly programs for correctness and performance against the original equivalent Bril programs.

The bulk of the project was matching across the input JSON Brilooped instruction types, and converting them linearly into WebAssembly instructions. For the most part, we used a simple system of storing all variables in WebAssembly locals and shuttling them onto and off the stack only when needed for each instruction. This may not be as efficient as a more sophisticated register allocation algorithm, but it allowed us to guarantee correctness with high confidence.

We anticipated that the most difficult instructions to translate would be those involving control flow– namely, if, while, break, and continue, as these would have to be translated while keeping in mind the context of the rest of the program; for example, how many nested while/if statements we were breaking out of. Luckily, we were able to convert Bril if into the WebAssembly if instruction directly, simply changing the syntax as otherwise the instructions act statically the same. For while commands, there was a slight wrinkle to address, which was the fact that WebAssembly only allows code to jump to the top of a loop, and not out of one. This made implementing break difficult. To solve this problem, we wrapped an outer WebAssembly block with an inner loop, since block constructs can be arbitrarily exited from. Thus, any continue statements could jump back to the top of the inner loop, while break could then jump out of the outer block. To implement the proper semantics of nested loops, we maintained a stack of label names at compile-time, one label for each nested loop, so when we needed to break multiple layers back, we could simply pop an appropriate number of entries off our stack to find the proper jump target.

Printing

One thing to note is that we used Gemini to generate a runtime library, which provides functions to print base values– ints and bools– to the terminal. We could then statically link this into our output WebAssembly program that we generated. We made this choice because this process of printing WebAssembly values was highly nontrivial but also extremely non-interesting to the aims of the project, namely the translation between Bril (in Brilooped form) and WebAssembly. However, we found that using Gemini was nearly as much of a struggle as writing the code ourselves; it took nearly 10 back and forth interactions to get something that was even compiled.

Challenges

In general, we found that the earliest parts of the project were the most difficult. At first, we had to decide what language to work in, as the three of us had been using different languages for our 6120 homework assignments all semester long; Serena using Rust, Annabel using Python, and Michael using TypeScript. Eventually, we decided TypeScript would be the most natural choice for this project due to its robust type system and ease of learning. However, it was still difficult as Serena and Annabel had limited experience using this language.

Crucially, no one on our team had substantial prior experience with WebAssembly itself. This made it difficult at the very start of the project to know how to even begin– we went crawling through the WebAssembly documentation, but unfortunately most of the examples and documentation we found online did not match the kind of thing we needed from the documentation; the general use case for WebAssembly is compiling into it using someone else’s compiler program, so much of the documentation is geared toward this use-case rather than encoding in WebAssembly itself. Much of the first half of the time we spent working on the project was simply sitting with the WebAssembly documentation, trying to understand the way the system worked (we spent, for example, a good amount of time pursuing an implementation to emit the bytecode format of WebAssembly code rather than the text format, before deciding not to continue with this direction, nullifying a good deal of work).

Another challenge was working with code created by another group in the class, who were working toward the same deadline we were. Throughout the course of their Brilooped project, they naturally ran into adjustments to the Brilooped format, which then cascaded into affecting our project as well; as the two were developing concurrently, we did not have a well of extensive Briloop documentation to refer back to when developing our programs. However, we did have the advantage of being able to talk to Dev and Gerardo in real time. We could, and did, ask for help directly whenever we ran into trouble, and they were really really helpful in helping us get our input from Briloop working. We’re incredibly grateful for their help throughout this process.

One more challenging aspect of the project was, for the limited capacity in which we used Gemini, wrangling the LLM to produce actually correct (or even executable) code; though it still probably saved us the time it would take to do the uninteresting bit ourselves, it was certainly not an effortless “get out of jail free” card.

Analysis

Testing methodology

We tested our code thoroughly on small examples as we implemented core functionality. This allowed us to catch errors before our code became too complex to fix. After completing our implementation, we tested on the Bril benchmarks core test suite. To do this, we first took Briloop core benchmark translations, provided by the Briloop group, and translated them into JSON files using their Briloop2JSON command. After, we ran the files through our code to output the WebAssembly translation, then ran this translation using Wasmtime. We compared the output of our WebAssembly files to the output from the original Bril files. With this method, we were able to confirm correctness, in the sense that the WebAssembly output program behaved identically on our test cases to the original Bril programs (1 test per benchmark).

Note that our WebAssembly runtime prints all arguments one per line, instead of multiple arguments on the same line. Thus, our test cases ignored whitespace differences.

Performance

Fortunately (and perhaps unsurprisingly), the WebAssembly executed significantly faster in Wasmtime than the original Bril interpreter could run Bril files. Using a script that runs all of the core benchmarks in sequence using the provided arguments, we benchmarked performance using Hyperfine on a Windows machine with an AMD Ryzen 9 6900HS processor with Deno 2.3.1 and Wasmtime 34.0.0. The original Bril interpreter took a mean of 3.883 seconds with a standard deviation of 0.106 seconds to run through the entire folder. The WebAssembly versions took only 1.043 seconds with a standard deviation of 0.044 seconds. This is an observed speed increase of over 73.1%.

Notes and Future Work

There is some future work to explore. For example, we currently do not support Bril programs that have variables change types after declaration and during execution; this kind of instruction is available in both Bril and WebAssembly syntax.

Also, to replace our Gemini-generated runtime library that we used to print test cases, we could, instead of writing print functions directly in WebAssembly S-expressions, write them in some other language and then compile them to wasm, using Clang or some similar process. This would be a perhaps more efficient way to complete this step of execution.

As we noted above, our WebAssembly output prints arguments each on their own line, rather than on the same line. To make our project work with arguments printed on the same line, using the method of compiling our printing functions using Clang or something similar, we could simply take in a command like ‘print a b c’ and compile it to print commands which thread in spaces between each argument.

We currently only support core Bril instructions. Implementing floats proved to be rather tricky as we would have needed to write our own float printing function in WebAssembly. Since the WebAssembly documentation is rather sparse for how to actually code in the language, we decided to leave this as work for the future. For memory, the way WebAssembly handles memory is very different from how Bril handles it. Support for memory would require digging much further into the WebAssembly (and Bril) internals than we had time for, so this was also left for the future.