A5: RML Interpreter

robot

In this assignment, you will build an interpreter for an OCaml-like language called RML (Robot Meta-Language).

What you’ll do: You will implement the evaluation function for the interpreter of RML and implement some simple programs using the asynchronous features of RML.

Other Resources

Objectives

Table of contents:

RML: Overview

RML is a simple programming language that combines functional, imperative, and concurrent features. The langauge is described in detail below. We have provided most of the front-end as well as a simple type-checker to help you develop RML code and debug your interpreter more effectively.

The Interpreter

The RML interpreter has the following phases:

  1. Lexing and Parsing
    • lexer.mll: Translates an RML program into a stream of tokens
    • parser.mly: Translates a stream of tokens into an abstract syntax tree (AST).
  2. Static Analysis
    • types.ml: Types and common error messages
    • checker.ml: Type checking function
    • ast_factory.ml: Functions for building AST values
  3. Evaluation
    • ast.ml: Types representing expressions (an AST), patterns, definitions, etc.
    • eval.ml: Big-step evaluation function
    • promise.ml : Helper functions for evaluating asynchronous expressions.

Preprocessing

In addition to the type-checker, we have implemented a basic pre-processor for RML files. Writing include "<path>" at the top of an RML file introduces the definitions from the file at <path> into the namespace of the current file. Includes are transitive, so if a.rml includes b.rml, and b.rml includes c.rml, then the definitions in c.rml will be in the scope of a.rml. Cyclic dependencies will cause the preprocessor to raise an exception. The path mentioned in an include must either be a filename in the current directory or a relative path to a filename. For example, a relative path "../../folder1/folder2/file.rml" indicates a file that can be found by navigating two folders up, and then down into two different sub-folders. Includes are only permitted at the top of the file, before any definitions.

As an example, given files,

// library.rml
let x = 0 in

and

// include.rml
include "library.rml"
let () = println x

the preprocessor will turn the above code into the program:

let x = 0
let () = println x

The RML Language

A complete list of all RML expressions is given below. The formal definition of the big-step operational semantics can be found here. However, we recommend reading the description here first to gain an intuitive understanding of the language.

The concrete syntax of RML allows grouping expressions using parentheses, as well as begin .. end, similar to OCaml. Comments in RML are similar to C and Java: use // for a single-line comments and /* .. */ for multi-line comments.

Values

Value Semantics
() Unit value
n Integer values
s String values
true and false Boolean values
clos Function closures. Closures can be understood as a triple of the form cl(p,e,env_cl) that encodes a function that matches pattern p to its argument to obtain bindings, updates the defining environment env_cl with these bindings, and finally evaluates e in the resulting environment.
(v1, v2) Pair values
[] Empty list value
v1 :: v2 Non-empty list values
prom Promises (result of async operations, as described below)
loc Locations (result of ref operations, as described below)
hand Robot handles (result of spawn operations, as described below)

Expressions (Functional)

Expression Semantics
() Unit literal
n Integer literal
s String literals (note: use double quotes only; RML does not support escapes)
true and false Boolean literals
x Variables, which are evaluated by looking up their binding in an environment
(e1, e2) Pairs, which evaluate to (v1, v2) if e1 evaluates to v1 and e2 evaluates to v2 (in that order)
[] Empty list literal
e1 :: e2 Non-empty lists, which evaluates to v1 :: v2 if e1 evaluates to v1 and e2 evaluates to v2 (in that order)
uop e Unary operations, which evaluates to v if e1 evaluates to a value v1 and uop v1 is v
e1 bop e2 Binary operations, which evaluates to v if e1 evaluates to v1 and e2 evaluates to v2 (in that order) and bop v1 v2 is v
if e1 then e2 else e3 Conditionals, which evaluate v if e1 evaluates to true and e2 evaluates to v or if e1 evaluates to false and e3 evaluates to v
e1; e2 Sequences, which evaluates to v if e1 evaluates to () (possibly causing side effects) and e2 evalautes to v (in that order)
let (p : t) = e1 in e2 Let expressions, which evaluate to v if e1 evaluates to v1 and evaluating e2 in the environment extended with the bindings obtained by matching p with v1.
let rec (f : t) = fun (p : t) -> e1 in e2 Recursive functions, which evaluates to v if e2 evaluates to v in the environment extended so f maps to the closure clos = cl(p, e1, env_cl) Note that because e1 may refer to f, the environment env_cl must also be updated to include the binding f => clos. (Hint: because env_cl needs to contain f, a good implementation approach may be to use references to first set the closure environment to the defining environment and then later re-assign it to be updated with the new binding f => clos.) Note: the parser desugars let rec f p = e1 in e2 into let rec f = fun p -> e1 in e2.
fun (p : t) -> e Anonymous functions, which evaluate to a closure cl(p,e,env_cl) containing the pattern, body, and current environment (i.e., RML uses lexical scope)
e1 e2 Function applications, which evaluate to v if e1 evaluates to a closure cl(p,e,env_cl), e2 evaluates to a value v2, and evaluating e in env_cl extended with the bindings obtained by matching p with v2 evaluates to v.
match e0 with | p1 -> e1 | ... | pn -> en end Pattern matching, which evaluates to v if e0 evaluates to v0, pj is the first pattern that matches v0, and evaluating the corresponding body ej in the environment extended with the bindings obtained by matching pj with v0 evaluates to v. (Note the keyword end to denote end of match statements). If v does not match any pattern pj, the interpreter should raise InexhaustivePatterns.
Unary Operators
Operator Semantics
-e Integer negation, which evaluates to -n if e evaluates to n.
not e Boolean complement, which evaluates to true if e evaluates to false, and vice versa.
Binary Operators
Operator Semantics
e1 + e2, e1 - e2, e1 * e2, e1 / e2, e1 % e2 Arithmetic operators (Note the % is the “mod” operator)
e1 < e2, e1 > e2, e1 <= e2, e1 >= e2 Comparisons on integers
e1 = e2, e1 <> e2 Equality and disequality on integers, strings, and booleans.
e1 && e2, e1 || e2 Boolean operators, which behave as in OCaml, including short-circuiting
e1 ^ e2 String concatenation operator
e1 |> e2 Function pipe-lining operator, which behaves the same as e2 e1.
Patterns
Pattern Semantics
_ Wilcard pattern, which matches any value v and produces no bindings
c Constant pattern, where c is a unit, integer, boolean, or string literal, which matches only itself and produces no bindings
x Variable patterns, which matches any value v and produces the binding x => v
(p1, p2) Pair patterns, which matches a pair (v1,v2) if p1 matches v1 producing bindings b1 and p2 matches v2 producing bindings b2, and produces the bindings in both b1 and b2. (Note that if a variable is bound in both b1 and b2, the second binding takes priority).
[] Empty list pattern, which matches [] and produces no bindings
p1 :: p2 Non-empty list pattern, which matches v1::v2 if p1 matches v1 producing bindings b1 and p2 matches v2 producing bindings b2, and produces bindings b1 and b2. (Again, if a variable is bound in both b1 and b2, the second binding takes priority.)

Expressions (Imperative)

Expression Semantics
ref e References, which evaluate to loc if e evaluates to v and loc is a newly created location that points to v
!e Dereferences, which evaluates to v if e evalutes to a location loc that points to v
e1 := e2 Assignment, which evaluates to () and updates loc to point to v if e1 evalulates to loc and e2 evaluates to v

The files eval.ml and ast.ml provided in the release code do not have any explicit references to state (unlike the full version of the formal semantics, which threads a store sigma through the evaluation). We suggest using OCaml’s imperative features to implement the semantics of imperative expressions.

Expressions (Asynchronous)

You will be required to implement asynchronous expressions as part of good scope. The implementations of these expression should all be straightforward using helper functions provided in promise.mli and eval.ml.

Expression Semantics
self The handle of the current robot. This allows robots to carry their own handles and pass their handles to robots they spawn.
return e Returns, which evaluates to a promise that is resolved to v if e evaluates to v.
await p = e1 in e2 Awaits, which evaluates to a promise prom if e1 evaluates to a promise prom1 and when prom1 resolves to v1, v1 is bound to p and e2 is evaluated under the new bindings to prom2, and finally when prom2 eventually resolves to v2 then prom also resolves to v2. (Note the similarity between await p = e1 in e2 and OCaml’s Lwt.bind e1 (fun p -> e2).)
e1 >>= e2 Binds, which evaluates to a promise prom if e1 evaluates to a promise prom1 and e2 evaluates to a closure f, and when prom1 resolves to v1, the closure for f is applied with argument v1, resulting in a promise prom2, and finally, when prom2 eventually resolves to v2, then prom also resolves to v2. The type-checker will fail if e1 or e2 do not have the appropriate types. Note again the similarity between e1 >>= e2 and Lwt.bind.
send e1 to e2 Sends, which evaluates to unit and sends message v to handle hand if e1 evaluates to v and e2 evaluates to hand.
recv e1 Receives, which evaluates to a promise prom that resovles to the message sent to the current robot from hand if e1 evaluates to hand.
spawn e1 with e2 Spawns, which evaluates to a new handle hand if e1 evaluates to function closure f, e2 evaluates to a value v, and a new robot is created running f v at handle hand.

Async Implementation Hints. All of the heavy-lifting for implementing the evaluation of these expressions has been abstracted into a helper module Promise. Think of Promise as an async library, like Lwt, that provides a datatype Promise.t representing a promise and all necessary operations on promises. Note that in order to aviod circular modular dependencies, the primitives for spawn, recv, send, self are all implemented in eval.ml rather than promise.ml.

Caution: Although under the hood, Promise is implemented with the Lwt library, you never need to use Lwt methods or the Lwt.t type ever in this assignment. In fact, we purposely make it impossible for you to use Lwt in the evaluator. (There is an ominously named function in Promise that converts a Promise.t to an Lwt.t’. We expose that method because it’s needed by other helper modules, but it will complicate your code if you use the converter. For your protection, we name the method with a warning so you don’t accidentally use it.)

Built-in Functions

You are responsible for implementing the following built-in functions, when you implement function application. This will involve extending the definition of initial_env in eval.ml. Thus, if the user explicitly rebinds the name of some built-in function to a different, that binding should take precedence.

Function name Semantics
print e print e evaluates e to v, v is converted to string s via Eval.string_of_value, then s is printed and a unit is returned.
println e println e evaluates e to v, v is converted to string s via Eval.string_of_value, then s followed by a newline character is printed and a unit is returned.

Syntactic Sugar

RML supports several forms of syntactic sugar. You do not need to do anything to implement these cases; they are desugared by the lexer and parser for you into one of the cases above. We mention this purely for your reference as you are coding in RML.

Syntactic Sugar Desugars to:
[v1;...;vn] v1 :: ... :: vn :: []
[p1;...;pn] p1 :: ... :: pn :: []
let x p1 ... pn = e1 in e2 let x = fun p1 -> ... -> fun pn -> e1 in e2
let rec f p p1 ... pn = e1 in e2 let rec f p = fun p1 -> ... -> fun pn -> e1 in e2

Step 1: Form a Partnership on CMS

As with A4, We recommend that you complete this assignment with a partner, but it is not required. However, having a partner is not needed to complete the assignment. It is definitely doable by one person. If you worked with a partner for A4, you should have a conversation to discuss which aspects of your partnership are working well and which aspects could be improved. We expect that most partners will continue, but you are also free to split up and find a new partner or go solo.

→ If you do have a partner, you must read the Partner Policy. ←

The deadline to form a CMS partnership is Monday, April 27, at 11:59 pm. There will be a short grace period. Then the assignment will close for a few hours. On Tuesday, April 28th, we will re-open the assignment, but you will no longer be able to form new partnerships. Instead, you will have to email one of your section TAs, Cc’ing your intended partner. The TA can then manually put the two of you into a CMS partnership. However, there will be a penalty for late partner formation: -5 points on Tuesday and -10 points on Wednesday. The penalty applies to both partners (no exceptions). Starting on Thursday, April 30th at 12:01 am, no new partnerships may be formed.

Why do we have this early partner formation deadline, and the penalties? Again, it’s to make sure you are working with your partner on the entire assignment, as required by the Partner Policy, rather than joining forces part way through.

If you want to split up with your partner before the final assignment deadline, you must acknowledge their influence on your work in the Authors interface. You may form a new partership, subject to the deadlines and penalties stated above.

Step 2: Get Started

Download the release code. There are many files, and you will need to read many of them before the assignment is over, but you don’t have to do so yet. Nevertheless, here is an overview to the files:

Create a new git repository for this assignment. Make sure the repo is private. Add the release code to your repo, referring back to the instructions in A1 if you need help with that. Make sure you unzip and copy the files correctly, including hidden dotfiles, as described in the A1 handout.

If you are using a partner, grant them access to the repo:

In the release code, there is a makefile provided with the usual targets:

As usual, you may not change the names and types of the files in the provided modules. The only files your should change are ast_factory.ml, ast.ml, eval.ml, and test.ml. You should follow the instructions carefully to make sure you do not change any of our code.

This assignment will be autograded, so your submission must pass make check. Now that you’re used to working with .mli files, we have omitted the comments about what you are allowed to change. Any names and types that we provide in an interface may not be changed, but you may of course add new declarations and definitions. If you’re ever in doubt about whether a change is permitted or not, just run make check: it will tell you whether you have changed the interface in a prohibited way.

Step 3: Implement an Evaluator!

Implement an evaluator for RML by filling in the missing code in ast.ml, ast_factory.ml, eval.ml, and test.ml. You are free to design your own types to represent ASTs in ast.ml. However, you must fill in the code in ast_factory.ml so the parser can construct them.

Implementation order.

The autograder test suite is engineered to do a good job of getting you partial credit on any language features you implement, as long as you follow the above order. But it’s not always possible to test each language feature completely in isolation, and of course we also need to test features in conjunction with one another to see whether they work together properly.

You should build this one language feature at a time. For language features that are unfamiliar or complex, you should consult the formal semantics carefully.

Implementation strategy. Here is a model work-flow you can follow to implement the interpreter.

You will need to do this for patterns, expressions, and definitions. A few notes on each:

Step 4: (Optional) Concurrency

After your interpreter is complete, you may find it fun to use the language you have implemented to gain some experience implementing basic concurrency tasks using promises as optional exercises. Caution: do not attempt this section until you have completed your interpreter and are reasonably certain it is correct.

Excercise 1: The Promised Reference

In this exercise, you will use RML’s promises to simulate the behavior of an RML reference. You implementation should not make use of the imperative feature of RML, only the concurrency.

Since the type system of RML does not yet support polymorphism, we will implement this only for integer references. We have provided the implementation of the following function:

val promise_reference : int -> (string * int) handle -> unit promise

The following function is meant to be the reference simulator. Intuitively, promise_reference v h is a function that simulates the behavior of of a reference where v is the value currently stored in the reference and h is the handle on which the reference expects to receive dereference and assignment commands.

If you examine our implementation, you will see that it receives and responds to messages on the input handle. A deref message causes the promise reference to send its current value back to the handle, while an assign message causes the promise reference to change its current value by recursively calling itself with the new value.

Given this implementation, your job is to implement three functions that act as an interface for promise_reference. These functions correspond to the ref keyword, the ! unary operator, and the := binary operator:

For ref_async, you will have to come up with the proper way to spawn an instance of promise_reference and return the resulting handle. For assign_async, you will simply need to make use of send to communicate to the instance that it should change its value. For deref_async, you will need to make use of send and recv to communicate with the instance and return a promise to the value the instance contains.

The release code for this excercise can be found in rml/async/prom-ref. This directory contains the file prom-ref.rml which implements promise_reference as above. It also contains ref.rml, assign.rml, and deref.rml. You will implement the three functions above in the respective files. Finally, test.rml includes the four functions and runs a basic program to test that your implementations are working. Full specifications of all these components can be found in the files.

Exercise 2: Human-In-The-Middle

Two students, Mike and John, are working together on A5 for their favorite CS class. Due to unforseen circumstances, they have to work remotely. To make matters worse, there is a bug in the network they are using so that they can’t communicate directly. Mike and John must send messages to each other with the help of their favorite TA, Timmy.

The files for this exercise are in rml/async/routing. To start, take a look at main.rml, which does some simple setup work before returning the unit. It spawns robots for Mike, John, and Timmy. You will notice that Mike and John do not have each other’s handles, so direct communication is impossible. However, they each have the handle for Timmy, and Timmy has both handles.

We have implemented simple functions for Mike and John so that they are able to have a brief conversation about their project as long as Timmy is helping them. Check out these implementations in mike.rml and john.rml. Right now, however, Timmy is busy and cannot help out yet. You can see in timmy.rml that he is not doing anything useful. Looking at these functions, try checking your understanding by guessing what will happen when you run main.rml.

Your task is to re-implement timmy.ml so that when main.rml is run, John and Mike have the following interaction:

"[John] Timmy, can you send this to Mike? - first message from john"
"[Mike] John says: first message from john"
"[Mike] Timmy, can you send this to John? - first message from mike"
"[John] Mike says: first message from mike"
"[John] Timmy, can you send this to Mike? - second message from john"
"[Mike] John says: second message from john"
"[Mike] Timmy, can you send this to John? - second message from mike"
"[John] Mike says: second message from mike"

Testing and Debugging

In order to make testing easier, we have provided you with an interactive shell that interprets RML expressions and definitions. Run make repl to start. The shell is similar to utop in that it does not stop taking your input until you enter ;; at the end of a line. As in utop, you may input RML expressions or definitions.

We have also provided you with the make run directive, which compiles your code and then prompts you for a file . When you supply it with a directory path, it invokes the lexer, parser, type-checker, and interpreter on the text stored in the given file. Any `.rml` files you create for `make run` should be stored in `rml/test`. The files in this directory will be ignored by `make zip`.

Finally, you are required to develop a test suite for eval.ml and submit it as part of the final .zip file. We have provided you with some starter code and a few example test cases in test.ml.

Hints and Common Errors

Here are some tips for your implementations:

Here are some common bugs in RML to watch out for:

Rubric

The only functions that the autograder test suite will directly call are Main.interp_file and Main.interp_expr. So feel free to test other functions all you want, but to receive points you must ensure that Main.interp_file and Main.interp_expr are also working. The latter is the function that the provided (although minimal) test suite already tests. However, you will likely want to test it further.

We will not assess coding standards or testing on this assignment. We trust that you will use what you have learned in the rest of the semester to your advantage.

Each of the language features in Good scope will be worth about the same number of points, and the same is true for the five features in Excellent scope. So please don’t feel as though you have to implement all the features: it’s perfectly fine to stop early. If all you do is the Satisfactory scope, you will still have learned a lot about interpreters.

Submission

Make sure your team’s NetIDs are in authors.mli, and set the hours_worked variable at the end of authors.ml.

Run make zip to construct the ZIP file you need to submit on CMS.

→ DO NOT use your operating system’s graphical file browser to construct the ZIP file. ←

Any mal-constructed ZIP files will receive a penalty of 15 points. Do not try to construct the ZIP yourself by hand. Use only the provided make zip command. Otherwise, your ZIP will potentially contain the wrong files, and you will lose points—possibly more than just the 15 points mentioned earlier, if you are missing essential files.

Ensure that your solution passes make finalcheck. Submit your zipfile on CMS. Double-check before the deadline that you have submitted the intended version of your file.

Congratulations! Your robot says “beep-bop.”

Acknowledgment: RML was inspired by an old CS 3110 assignment developed by Andrew Myers. Development of the current version was led by Timmy Zhu and Samwise Parkinson, with lots of help from the 2019sp and 2020sp course staff.