A5: RML Interpreter
In this assignment, you will build an interpreter for an OCaml-like language called RML (Robot Meta-Language).
What you’ll do: You will implement the evaluation function for the interpreter of RML and implement some simple programs using the asynchronous features of RML.
- Implement an interpreter for a non-trivial programming language.
- Gain experience with concurrent programming using promises.
Table of contents:
- RML: Overview
- Step 1: Form a Partnership on CMS
- Step 2: Get Started
- Step 3: Implement an Evaluator!
- Step 4: (Optional) Concurrency
- Testing and Debugging
- Hints and Common Errors
RML is a simple programming language that combines functional, imperative, and concurrent features. The langauge is described in detail below. We have provided most of the front-end as well as a simple type-checker to help you develop RML code and debug your interpreter more effectively.
The RML interpreter has the following phases:
- Lexing and Parsing
lexer.mll: Translates an RML program into a stream of tokens
parser.mly: Translates a stream of tokens into an abstract syntax tree (AST).
- Static Analysis
types.ml: Types and common error messages
checker.ml: Type checking function
ast_factory.ml: Functions for building AST values
ast.ml: Types representing expressions (an AST), patterns, definitions, etc.
eval.ml: Big-step evaluation function
promise.ml: Helper functions for evaluating asynchronous expressions.
In addition to the type-checker, we have implemented a basic
pre-processor for RML files. Writing
include "<path>" at the top of
an RML file introduces the definitions from the file at
the namespace of the current file. Includes are transitive, so if
c.rml, then the
c.rml will be in the scope of
dependencies will cause the preprocessor to raise an exception. The
path mentioned in an
include must either be a filename in the
current directory or a relative path to a filename. For example, a
"../../folder1/folder2/file.rml" indicates a file that
can be found by navigating two folders up, and then down into two
different sub-folders. Includes are only permitted at the top of the
file, before any definitions.
As an example, given files,
// library.rml let x = 0 in
// include.rml include "library.rml" let () = println x
the preprocessor will turn the above code into the program:
let x = 0 let () = println x
The RML Language
A complete list of all RML expressions is given below. The formal definition of the big-step operational semantics can be found here. However, we recommend reading the description here first to gain an intuitive understanding of the language.
The concrete syntax of RML allows grouping expressions using
parentheses, as well as
begin .. end, similar to OCaml. Comments in
RML are similar to C and Java: use
// for a single-line comments and
/* .. */ for multi-line comments.
|clos||Function closures. Closures can be understood as a triple of the form
||Empty list value|
||Non-empty list values|
|prom||Promises (result of async operations, as described below)|
|loc||Locations (result of
|hand||Robot handles (result of
||String literals (note: use double quotes only; RML does not support escapes)|
||Variables, which are evaluated by looking up their binding in an environment|
||Pairs, which evaluate to
||Empty list literal|
||Non-empty lists, which evaluates to
||Unary operations, which evaluates to
||Binary operations, which evaluates to
||Conditionals, which evaluate
||Sequences, which evaluates to
||Let expressions, which evaluate to
||Recursive functions, which evaluates to
||Anonymous functions, which evaluate to a closure
||Function applications, which evaluate to
||Pattern matching, which evaluates to
||Integer negation, which evaluates to
||Boolean complement, which evaluates to
||Arithmetic operators (Note the
||Comparisons on integers|
||Equality and disequality on integers, strings, and booleans.|
||Boolean operators, which behave as in OCaml, including short-circuiting|
||String concatenation operator|
||Function pipe-lining operator, which behaves the same as
||Wilcard pattern, which matches any value
||Constant pattern, where
||Variable patterns, which matches any value
||Pair patterns, which matches a pair
||Empty list pattern, which matches
||Non-empty list pattern, which matches
||References, which evaluate to
||Dereferences, which evaluates to
||Assignment, which evaluates to
ast.ml provided in the release code do not
have any explicit references to state (unlike the full version of the
formal semantics, which threads a store
sigma through the
evaluation). We suggest using OCaml’s imperative features to implement
the semantics of imperative expressions.
You will be required to implement asynchronous expressions as part of good scope. The implementations of these expression should all be straightforward using helper functions provided in
||The handle of the current robot. This allows robots to carry their own handles and pass their handles to robots they spawn.|
||Returns, which evaluates to a promise that is resolved to
||Awaits, which evaluates to a promise
||Binds, which evaluates to a promise
||Sends, which evaluates to unit and sends message
||Receives, which evaluates to a promise
||Spawns, which evaluates to a new handle
Async Implementation Hints. All of the heavy-lifting for
implementing the evaluation of these expressions has been abstracted
into a helper module
Promise. Think of
Promise as an async
Lwt, that provides a datatype
representing a promise and all necessary operations on promises.
Note that in order to aviod circular modular dependencies, the
self are all implemented
eval.ml rather than
Caution: Although under the hood,
Promise is implemented with
Lwt library, you never need to use Lwt methods or the Lwt.t
type ever in this assignment. In fact, we purposely make it
impossible for you to use
Lwt in the evaluator. (There is an
ominously named function in
Promise that converts a
Lwt.t’. We expose that method because it’s needed by other
helper modules, but it will complicate your code if you use the
converter. For your protection, we name the method with a warning so
you don’t accidentally use it.)
You are responsible for implementing the following built-in functions,
when you implement function application. This will involve extending
the definition of
eval.ml. Thus, if the user
explicitly rebinds the name of some built-in function to a different,
that binding should take precedence.
RML supports several forms of syntactic sugar. You do not need to do anything to implement these cases; they are desugared by the lexer and parser for you into one of the cases above. We mention this purely for your reference as you are coding in RML.
|Syntactic Sugar||Desugars to:|
Step 1: Form a Partnership on CMS
As with A4, We recommend that you complete this assignment with a partner, but it is not required. However, having a partner is not needed to complete the assignment. It is definitely doable by one person. If you worked with a partner for A4, you should have a conversation to discuss which aspects of your partnership are working well and which aspects could be improved. We expect that most partners will continue, but you are also free to split up and find a new partner or go solo.
→ If you do have a partner, you must read the Partner Policy. ←
The deadline to form a CMS partnership is Monday, April 27, at 11:59 pm. There will be a short grace period. Then the assignment will close for a few hours. On Tuesday, April 28th, we will re-open the assignment, but you will no longer be able to form new partnerships. Instead, you will have to email one of your section TAs, Cc’ing your intended partner. The TA can then manually put the two of you into a CMS partnership. However, there will be a penalty for late partner formation: -5 points on Tuesday and -10 points on Wednesday. The penalty applies to both partners (no exceptions). Starting on Thursday, April 30th at 12:01 am, no new partnerships may be formed.
Why do we have this early partner formation deadline, and the penalties? Again, it’s to make sure you are working with your partner on the entire assignment, as required by the Partner Policy, rather than joining forces part way through.
If you want to split up with your partner before the final assignment
deadline, you must acknowledge their influence on your work in the
Authors interface. You may form a new partership, subject to the
deadlines and penalties stated above.
Step 2: Get Started
Download the release code. There are many files, and you will need to read many of them before the assignment is over, but you don’t have to do so yet. Nevertheless, here is an overview to the files:
promise.mli: This module provides helper functions that will be useful in your implementation, especially if you attempt to implement asynchronous expressions. You should familiarize yourself with each of the functions in
promise.mlibefore starting work on the concurrency features of RML.
eval.ml: (You should change this file.) This module provides the implementation of the evaluator. It currently has four main unimplemented functions along with several unimplemented helper functions and stubs for the types
env. You should fully implement these functions. You are also free to add any extra helper functions that you find useful.
lexer.mll: This module implements a lexer that converts RML programs from a character stream into a token stream.
parser.mly: This module implements a parser that converts the token stream for an RML program into an AST.
ast.ml: (You should change this file.) This module defines the types for ASTs. Currently, most of these types are stubbed out as
unit. You should decide how to design the AST and provide implementions for these types.
ast_factory.mli: (You should change this file.) This module provides “factory” methods that the parser and type checker can use to build ASTs. As you choose the types of the ASTs, you will need to fill in the functions in this module.
test.ml: (You may change this file.) This module provides tests for
eval.ml. You may find it useful to develop a test suite, however you do not need to include it in your submission.
Makefile: The Makefile, which provides a number of targets as described below.
rml/examples: This directory contains examples of RML code that you may look at in order to get a feel for how the language works. You do not need to understand these functions or modify them in any way, although your final implementation should be able to run them without crashing.
rml/async: (You might change these files.) This directory contains the asynchronous functions you will implement in the optional exercises.
rml/test: (You might change these files.) This empty directory is provided as a “scratch space” for writing your own RML test programs.
Create a new git repository for this assignment. Make sure the repo is private. Add the release code to your repo, referring back to the instructions in A1 if you need help with that. Make sure you unzip and copy the files correctly, including hidden dotfiles, as described in the A1 handout.
If you are using a partner, grant them access to the repo:
- Go to the repo’s webpage on the COECIS Github.
- Click “Settings”; on the settings page click “Collaborators”, and type your
partner’s NetID into the search box. Click “Add collaborator”.
- Your partner should now click on the GitHub icon on the top left in their own browser. They should see the repo you just created listed on the left as a repo to which they have access.
In the release code, there is a makefile provided with the usual targets:
make build: compile
make test: compile and run the test suite
make check: check the types and names of the files
make repl: compile and run an interactive interpreter shell for RML
make run: compile and and prompt for a file on which to run the interpreter
make clean: delete bytecode files
make zip: create the
.zipfile for final submission
As usual, you may not change the names and types of the files in the
provided modules. The only files your should change are
test.ml. You should
follow the instructions carefully to make sure you do not change any
of our code.
This assignment will be autograded, so your submission must pass
check. Now that you’re used to working with
.mli files, we have
omitted the comments about what you are allowed to change. Any names
and types that we provide in an interface may not be changed, but you
may of course add new declarations and definitions. If you’re ever in
doubt about whether a change is permitted or not, just run
check: it will tell you whether you have changed the interface in a
Step 3: Implement an Evaluator!
Implement an evaluator for RML by filling in the missing code in
test.ml. You are free to
design your own types to represent ASTs in
ast.ml. However, you must
fill in the code in
ast_factory.ml so the parser can construct them.
Satisfactory scope: This level of scope corresponds roughly to the language features we covered in the textbook, lecture, and recitation; you are of course free to re-use any of that code we gave you.
Constants: integers, strings, Booleans.
Three basic operators: the addition operators on integers (e.g.,
1+1), and the short-circuit Boolean operators (e.g.,
true && false,
false || false). You can leave the rest of the binary operators, as well as conversions to integers and Booleans from other types, for Excellent scope.
Let expressions and definitions. This will require you to get environments working. Don’t worry about recursion or back-patching yet; leave that for Excellent scope, and just do non-recursive
Anonymous functions and function application with variable patterns. This will require you to get closures working.
Good scope: This level of scope involves adding mutability to the Satisfactory scope. It involves new features that we didn’t cover in class. The challenge here, therefore, is to use the formal semantics to guide your implementation.
Data types, pairs, lists, etc.
Sequences (i.e., semicolon).
References. This is the first feature that will cause you to modify state in a non-trivial way.
Excellent scope: As always, you should regard the Excellent scope as a chance to dig deeper, rather than something mandatory.
- All remaining features including other operators, full support for
pattern matching, recursive definitions,
- All remaining features including other operators, full support for pattern matching, recursive definitions,
The autograder test suite is engineered to do a good job of getting you partial credit on any language features you implement, as long as you follow the above order. But it’s not always possible to test each language feature completely in isolation, and of course we also need to test features in conjunction with one another to see whether they work together properly.
You should build this one language feature at a time. For language features that are unfamiliar or complex, you should consult the formal semantics carefully.
Implementation strategy. Here is a model work-flow you can follow to implement the interpreter.
- Pick a language feature to implement such as boolean literals.
- Implement a couple test cases for the feature in
- Extend the AST type in
ast.mlwith a constructor for that feature–e.g.
Bool of boolin
- Implement the factory function in
let make_bool b = Bool b
- If necessary, extend the
eval.mlwith a constructor. Also, make sure that your implementation of
string_of_valuehandles the new value correctly.
- Implement the big-step rule for the new language feature from the formal semantics.
- Fully develop your test suite with more comprehensive test cases in
test.ml. You can also try out the new feature in the REPL.
You will need to do this for patterns, expressions, and definitions. A few notes on each:
Patterns will be a bit more challenging since it will be very different from anything we have done so far. Notice that patterns are not just used in pattern matching, but also in place of normal variables in
funexpressions. This means that patterns need to be at least partially supported in order to get some of the basic language features to work. We recommend getting basic variable patterns to work first, and then adding more in later once you get to pattern matching. Also, the
bind_patternfunction returns an option. This is because a value may not match a pattern in a
matchexpression. In theory, such a mismatch could occur in a
letexpression or a function application. However, our type-checker rejects such programs so you don’t have to handle this case.
Expressions are the main part of the interpreter. Recursion and concurrency are intended to be an extra challenge, so you should save those for last.
Definitions will be the shortest part, and should be very similar to
let recexpressions. However, you will not be able to run your interpreter on interesting examples in the REPL or using
make rununtil definitions are at least partially done, so this should be filled in relatively early. We recommend doing so around the same time you implement
Step 4: (Optional) Concurrency
After your interpreter is complete, you may find it fun to use the language you have implemented to gain some experience implementing basic concurrency tasks using promises as optional exercises. Caution: do not attempt this section until you have completed your interpreter and are reasonably certain it is correct.
Excercise 1: The Promised Reference
In this exercise, you will use RML’s promises to simulate the behavior of an RML reference. You implementation should not make use of the imperative feature of RML, only the concurrency.
Since the type system of RML does not yet support polymorphism, we will implement this only for integer references. We have provided the implementation of the following function:
val promise_reference : int -> (string * int) handle -> unit promise
The following function is meant to be the reference simulator.
promise_reference v h is a function that simulates the
behavior of of a reference where
v is the value currently stored in
the reference and
h is the handle on which the reference expects to
receive dereference and assignment commands.
If you examine our implementation, you will see that it receives and
responds to messages on the input handle. A
deref message causes the
promise reference to send its current value back to the handle, while
assign message causes the promise reference to change its current
value by recursively calling itself with the new value.
Given this implementation, your job is to implement three functions
that act as an interface for
promise_reference. These functions
correspond to the
ref keyword, the
! unary operator, and the
val ref_async : int -> (string * int) handle
val assign_async : (string * int) handle -> int -> unit
val deref_async : (string * int) handle -> int promise
ref_async, you will have to come up with the proper way to spawn
an instance of
promise_reference and return the resulting handle.
assign_async, you will simply need to make use of
communicate to the instance that it should change its value. For
deref_async, you will need to make use of
communicate with the instance and return a promise to the value the
The release code for this excercise can be found in
rml/async/prom-ref. This directory contains the file
promise_reference as above. It also contains
deref.rml. You will implement the three
functions above in the respective files. Finally,
the four functions and runs a basic program to test that your
implementations are working. Full specifications of all these
components can be found in the files.
Exercise 2: Human-In-The-Middle
Two students, Mike and John, are working together on A5 for their favorite CS class. Due to unforseen circumstances, they have to work remotely. To make matters worse, there is a bug in the network they are using so that they can’t communicate directly. Mike and John must send messages to each other with the help of their favorite TA, Timmy.
The files for this exercise are in
rml/async/routing. To start, take
a look at
main.rml, which does some simple setup work before
returning the unit. It spawns robots for Mike, John, and Timmy. You
will notice that Mike and John do not have each other’s handles, so
direct communication is impossible. However, they each have the handle
for Timmy, and Timmy has both handles.
We have implemented simple functions for Mike and John so that they
are able to have a brief conversation about their project as long as
Timmy is helping them. Check out these implementations in
john.rml. Right now, however, Timmy is busy and cannot help out
yet. You can see in
timmy.rml that he is not doing anything useful.
Looking at these functions, try checking your understanding by
guessing what will happen when you run
Your task is to re-implement
timmy.ml so that when
run, John and Mike have the following interaction:
"[John] Timmy, can you send this to Mike? - first message from john" "[Mike] John says: first message from john" "[Mike] Timmy, can you send this to John? - first message from mike" "[John] Mike says: first message from mike" "[John] Timmy, can you send this to Mike? - second message from john" "[Mike] John says: second message from john" "[Mike] Timmy, can you send this to John? - second message from mike" "[John] Mike says: second message from mike"
Testing and Debugging
In order to make testing easier, we have provided you with an
interactive shell that interprets RML expressions and definitions. Run
make repl to start. The shell is similar to utop in that it does not
stop taking your input until you enter
;; at the end of a line. As
in utop, you may input RML expressions or definitions.
We have also provided you with the
make run directive, which
compiles your code and then prompts you for a file
Finally, you are required to develop a test suite for
submit it as part of the final
.zip file. We have provided you with
some starter code and a few example test cases in
Hints and Common Errors
Here are some tips for your implementations:
- Although the formal semantics mentions state (Delta and sigma), you are not responsible for keeping track of state. Your implementation should let OCaml update the state automatically using OCaml’s references and
- RML does not allow n-place tuples. However, you should be able to encode them using nested pairs.
- Type annotations are required almost everywhere in RML
selfvalue in RML must usually be stored in a variable with a type annotation in order for the type-checker to accept your program.
Here are some common bugs in RML to watch out for:
- RML’s match statements require the
endkeyword after the last case. Don’t forget it!
- While OCaml match statements allow you to omit the
|in the first case, RML does not.
- In OCaml, if you put an expression on the left-hand side of a sequential composition
;that does not have type
unit, you get a warning. In RML, it is a type error. You can define an
ignorefunction and use it to explicitly discard the expression on the left-hand side of the composition.
await p = e1 in e2, remember that
e2must be a promise.
- Unlike OCaml, your program cannot end with a semi-colon.
- 25 points: submitted and compiles
- 50 points: satisfactory scope
- 20 points: good scope
- 5 points: excellent scope
The only functions that the autograder test suite will directly call
Main.interp_expr. So feel free to test
other functions all you want, but to receive points you must ensure
Main.interp_expr are also working. The
latter is the function that the provided (although minimal) test suite
already tests. However, you will likely want to test it further.
We will not assess coding standards or testing on this assignment. We trust that you will use what you have learned in the rest of the semester to your advantage.
Each of the language features in Good scope will be worth about the same number of points, and the same is true for the five features in Excellent scope. So please don’t feel as though you have to implement all the features: it’s perfectly fine to stop early. If all you do is the Satisfactory scope, you will still have learned a lot about interpreters.
Make sure your team’s NetIDs are in
authors.mli, and set the
variable at the end of
make zip to construct the ZIP file you need to submit on CMS.
→ DO NOT use your operating system’s graphical file browser to construct the ZIP file. ←
Any mal-constructed ZIP files will receive a penalty of 15 points. Do not try to
construct the ZIP yourself by hand. Use only the provided
make zip command.
Otherwise, your ZIP will potentially contain the wrong files, and you will lose
points—possibly more than just the 15 points mentioned earlier, if you are
missing essential files.
Ensure that your solution passes
make finalcheck. Submit your zipfile on
CMS. Double-check before the deadline that you have submitted the
intended version of your file.
Congratulations! Your robot says “beep-bop.”
Acknowledgment: RML was inspired by an old CS 3110 assignment developed by Andrew Myers. Development of the current version was led by Timmy Zhu and Samwise Parkinson, with lots of help from the 2019sp and 2020sp course staff.