A8: RML Interpreter

A8: RML Interpreter

In this assignment, you will build an interpreter for an OCaml-like language called RML (Robot Meta-Language).

What you’ll do: You will implement the main evaluation function for the interpreter of RML. You will also develop a library in RML similar to the List library in OCaml and do some simple exercises using the asynchronous features of RML.

Note: Teams can unanimously stay together from A7 or work in partners of their choosing. If you select the former option, you should ask your TA to form the team on CMS by Friday.

Table of contents:

Changelog

4/29

Expression Semantics
let rec f = fun p -> e1 in e2 Binds f to closure clos = cl(p, e1, env_cl) and evaluates e2 with the new binding. Because e1 may refer to f, env_cl must also be updated to include the binding f => clos. (Because VFunRec has an environment that is mutable, a good approach may be to first set the closure environment initially to be a reference to the defining environment and then later re-assign it to be updated with the new binding f => clos.) Note: Our parser desugars let rec f p = e1 in e2 to LetRec(f, Fun(p, e1), e2).

RML: The Language

RML is a simple programming language that combines functional, imperative, and concurrent features. We will use it to implement a robots for a simple game in A9.

The Interpreter

The RML interpreter has the following phases:

  1. Initialization
    • rml_server.ml: Initialize a server that manages I/O and spawning.
    • rml_interpreter.ml: Initialize a robot from an RML file.
    • io.ml and serialize.ml: helper functions for I/O.
  2. Lexing and Parsing
    • lexer.mll: Translate an RML program into a stream of tokens.
    • parser.mly: Translate a stream of tokens into an abstract syntax tree
  3. Evaluation
    • types.ml: The types representing expressions (an abstract syntax tree), values, patterns, etc.
    • eval.ml: The big-step evaluator.
    • dwt.ml : Helper functions for evaluating asynchronous expressions.
    • pretty.ml : Helper functions for pretty printing.

Preprocessing

Writing include "<FILE>" directly substitutes the contents of FILE into the source code. FILE must either be a filename or the relative path to a file name. For example, to include a file that is two folders up, and then down two other folders, the relative path might be "../../folder1/folder2/file.rmx". An include statement is semantically equivalent to copying and pasting the contents of the file in front of the following code. You should only include at the top of a file, and you should only include files that consist of one or more let expressions where the final let expression ends with an in but no final expression.

For example:

// library.rmx
let x = 0 in
// include.rml
include "library.rmx"
println x

The preprocessor will turn the above code into a complete RML expression:

let x = 0 in
println x

Note that library files cannot be interpreted on their own, since they are incomplete RML expressions. Also note that non-library files cannot be included, since they are valid RML expressions. By convention, distinguish libraries from non-libraries by using the file extension .rmx for library files and .rml for non-library files.

Expressions in RML

What follows is a complete list of all the valid expressions in RML. We have formally defined the big-step operational semantics here. In addition to what is listed below, the concrete syntax of RML also allows grouping using parentheses as well as begin .. end similar to OCaml. Comments in RML are similar to C; use // for a single-line comments and /* .. */ for multi-line comments.

Values

Value Semantics
() The unit value.
n Integer values.
s String values.
true and false Boolean values.
clos Function closures. Closures can be understood as an order triple of the form cl(p,e,env), a function that matches p to its argument, uses the resulting bindings to update its defining environment env, and then evaluates e in the resulting environment.
(v1, v2) A pair of values
[] The empty list
v1 :: v2 Cons of v1 and v2
prom Promises (results of async operations).
loc Locations (results of ref operations).
hand Port handles (results of spawn operations).

Expressions (Functional)

Expression Semantics
() Unit literal
n Integer literals
s String literals. Use double quotes only. RML does not support escape sequences.
true and false Boolean literals
x Variables, which are evaluated by looking up their binding in an environment. If x is not bound in the current environment, the interpreter should raise UnboundVariable.
(e1, e2) Pairs. Evaluates to value (v1, v2) where e1 evaluates to v1 and e2 evaluates to v2 (in that order).
[] Empty list literal
e1 :: e2 Returns v1 :: v2 where e1 evaluates to v1 and e2 evaluates to v2 (in that order). If e2 does not evaluate to a list, the interpreter should raise ExpectedList.
unop e Returns unop applied to the result of evaluation of e
e1 binop e2 Returns binop applied to the results of evaluations of e1 and e2 (evaluated in that order).
if e1 then e2 else e3 Evaluates e1 to a boolean value b. If b is true then evaluate and return e2. Otherwise, evaluate and return e3. If e1 does not evaluate to a boolean, the interpreter should raise ExpectedBool.
e1; e2 Evaluates e1 to a unit value (possibly causing side effects). Then e2 is evaluated and returned. If e1 does not evaluate to a unit, the interpreter should raise ExpectedUnit
let p = e1 in e2 Evaluates e1 to values v1, binds v1 to pattern p, and evaluates e2 assuming the new bindings. If the result of evaluating e1 does not match p, then the interpreter should raise LetMismatch.
let rec f = fun p -> e1 in e2 Binds f to closure clos = cl(p, e1, env_cl) and evaluates e2 with the new binding. Because e1 may refer to f, env_cl must also be updated to include the binding f => clos. (Because VFunRec has an environment that is mutable, a good approach may be to first set the closure environment initially to be a reference to the defining environment and then later re-assign it to be updated with the new binding f => clos.) Note: Our parser desugars let rec f p = e1 in e2 to LetRec(f, Fun(p, e1), e2).
fun p -> e Anonymous function with body e and argument pattern p. Evaluates to a closure with the current environment (lexical scope).
e1 e2 Evaluates e1 to a closure (|env_cl,p_cl,e_cl|), then evaluates e2 to a value v2, then binds v2 to pattern p_cl, and then evaluates e_cl under env_cl assuming the new bindings. If e1 is not a closure, the interpreter should raise ExpectedFunction. If v2 does not match pattern p_cl, the interpreter should raise ArgumentMismatch.
match e with | p1 -> e1 | ... | pn -> en end Evaluates e to value v and binds v to the first matching pattern pj and then evaluates the corresponding case body ej assuming the new bindings. (Note the keyword end to denote end of match statements). If v does not match any of the patterns pi, the interpreter should raise InexhaustivePatterns.

Unary Operators

Operator Semantics
-e Integer negation. If e does not evaluate to an integer, the interpreter should raise ExpectedInt
not e Boolean negation. If e does not evaluate to a boolean, the interpreter should raise ExpectedBool.

Binary Operators

Operator Semantics
e1 + e2, e1 - e2, e1 * e2, e1 / e2, e1 % e2 Integer arithmetic operators (% is mod); behavior is the same as the corresponding operation in OCaml. If e1 or e2 does not evaluate to an integer, the interpreter should raise ExpectedInt. When attempting to divide or mod by 0, the interpreter may raise the same error that OCaml would raise in the analogous case.
e1 < e2, e1 > e2, e1 <= e2, e1 >= e2 Integer comparison operators; behavior is the same as the corresponding operation in OCaml. If e1 or e2 does not evaluate to an integer, the interpreter should raise ExpectedInt.
e1 = e2, e1 <> e2 Equality comparison for integers, strings, and booleans. If the types of e1 and e2 are not the same type, or if one of the types is not int, string, or bool, the interpreter should raise IncompatibleTypes.
e1 && e2, e1 || e2 Boolean logical operators; behavior is the same as the corresponding operation in OCaml, including short-circuiting. If e1 or e2 does not evaluate to a boolean, the interpreter should raise ExpectedBool.
e1 ^ e2 String concatenation operator; behavior is the same as the corresponding operation in OCaml. If e1 or e2 does not evaluate to a string, the interpreter should raise ExpectedString.

Patterns

Pattern Semantics
_ Matches anything and produces no bindings.
c c means any constant (unit, integer, boolean, and string literals). Matches itself only and produces no bindings.
x x is any variable name. Matches any value v and produces the variable binding x => v
(p1, p2) If p1 matches v1 producing bindings b1 and p2 matches v2 producing bindings b2, then (p1, p2) matches (v1, v2) and produces bindings b1 and b2.
[] matches [] and produces no bindings
p1 :: p2 If p1 matches v1 producing bindings b1 and p2 matches v2 producing bindings b2, then p1::p2 matches v1::v2 and produces bindings b1 and b2. (Note that v2 must be a list, since it’s on the right-hand side of ::.)

Expressions (Imperative)

You should note that in the formal semantics, the state sigma is passed through all big-step inference rules. However, eval.ml and types.ml do not have any explicit references to state. This is because you do not need to explicitly keep track of state. Rather, you should simply use OCaml imperative features to implement the following expressions, and let OCaml’s state be used to maintain RML’s state.

Expression Semantics
ref e Evaluates e to a value. A new location loc is allocated and value v is stored at this location. Then loc is returned.
!e Evaluates e to a location loc and returns the value stored at loc. If e does not evaluate to a location, the interpreter should raise ExpectedRef.
e1 := e2 Evaluates e1 to a location loc. Then evaluates e2 to value v2. Then stores v2 at loc. Returns a unit. If e1 does not evaluate to a location, the interpreter should raise ExpectedRef.

Expressions (Asynchronous)

You are required to implement the following async primitives as part of good scope. The implementations of these expression should all be trivial in terms of a helper function in dwt.mli. Note that the initial environment will include the values SELF and SERVER, both of which are handles; SELF is the handle of the current process, and SERVER is the handle of the main server.

Expression Semantics
return e Evaluates e to a value v and returns a promise for v that is already resolved with value v.
await p = e1 in e2 Evaluates e1 to a promise prom1 and immediately returns a promise prom. When prom1 resolves to v1, v1 is bound to p and e2 is evaluated under the new bindings to prom2. Finally, when prom2 resolves to v2, prom resolves to v2. If e1 does not evaluate to a promise, the interpreter should raise ExpectedPromise. If v does not match p, the interpreter should raise AwaitMismatch. If e2 does not evaluate to a promise, the interpreter should raise ExpectedPromise. Note that await p = e1 in e2 is semantically equivalent to OCaml’s Lwt.bind e1 (fun p -> e2).
join e Evaluates e to a list of promises [p1;...; pn] and immediately returns a promise prom. When all of the promises p1...pn are resolved to v1, ..., vn, then prom resolves to the list [v1;...;vn]. If e does not evaluate to a list, the interpreter should raise ExpectedList. If it does evaluate to a list and one of the elements is not a promise, the interpreter should raise ExpectedPromise.
pick e Evaluates e to a list of promises [p1; ...; pn] and immediately returns a promise prom. When one of the promises pi is resolved to vi, prom resolves to vi (note that this involves some non-determinism). If e does not evaluate to a list, the interpreter should raise ExpectedList. If it does evaluate to a list and one of the elements is not a promise, the interpreter should raise ExpectedPromise.
send e1 to e2 Evaluates e1 to a value v (the message) and evaluates e2 to a handle hand (the destination port) and sends message v to robot at port handle hand. The expression immediately evaluates to a unit.
recv e1 Evaluates e1 to a port handle hand and asynchronously listens for a message from the robot at that handle. The expression immediately evaluates to a promise prom that resolves to the value contained in the message once it is received.
spawn e1 with e2 Evaluates e1 to function closure f, then evaluates e2 to a value v then initializes a new robot whose expression is f v. If e1 does not evaluate to a closure, the interpreter should raise ExpectedFunction. The spawn expression evaluates to a new handle hand of the port of the new robot.
Async Implementation Hints:

All of the heavy-lifting for implementing the evaluation of these expressions has been abstracted into a helper module Dwt. Think of Dwt as an async library, like Lwt, that provides a datatype Dwt.t representing a promise and all necessary operations on promises.

Note: The methods send, recv, and spawn in Dwt will accept or produce string representation of values, which can be converted back and forth via Serialize.string_of_value and Serialize.value_of_string.

Caution: Although under the hood, Dwt is implemented with the Lwt library, you never need to use Lwt methods or the Lwt.t type ever in this assignment. In fact, we purposely make it impossible for you to use Lwt in the evaluator. (There is an ominously named function in Dwt that converts Dwt.t’s to Lwt.t’s. We expose that method because it’s needed by other helper modules, but it will complicate your code if you use the converter. For your protection, we name the method with a warning so you don’t accidentally use it.)

Built-in Functions

You are responsible for implementing the following built-in functions, when you implement function application. These functions should only be used when directly applied to an argument—e.g., we cannot use a built-in functions as an argument to a higher-order function. In addition, if the user explicitly rebinds the name of some built-in function to a value, that binding should take prededence. Implementations for the print_detail functions are not required, but you may find them useful for debugging. Helper functions for printing can be found in pretty.mli.

Function name Semantics
print e print e evaluates e to v, v is converted to string s via Pretty.print, then s is printed and a unit is returned. (Under print, closure values are printed as '<function>'. Locations, handles, and promises also have special formatting.)
println e println e evaluates e to v, v is converted to string s via Pretty.print, then s followed by a newline character is printed and a unit is returned. (Under println, closure values are printed as '<function'>. Locations, handles, and promises also have special formatting)
print_detail1 e The same as print except that closure values are printed as an ordered pair displaying the function body and the string '<env>'. Uses Pretty.print_detail1 instead.
print_detail2 e The same as print_detail1 except that the entire mapping for the environment of a closure is printed. If the environment itself maps variable names to closures, these closures are printed as '<function>'. Uses Pretty.print_detail2 instead.
print_detail3 e The same as print_detail2 except all nested closures and environments are printed recursively. Uses Pretty.print_detail3 instead.
string_of_int e Evaluates to the string representation of n, where e evaluates to n. If e does not evaluate to an integer, the interpreter should raise ExpectedInt.
sleep e sleep e evaluates e to an integer n and returns a promise of a unit that resolves after n milliseconds. If e does not evaluate to an integer, the interpreter should raise ExpectedInt.
random e random e evaluates e to an integer n and returns a random integer in the range [0..n-1] inclusive. If e does not evaluate to an integer, the interpreter should raise ExpectedInt. If n is not positive then the interpreter should raise Invalid_argument.
ignore e Evaluates e, possibly causing side effects, and returns a unit.

Syntactic Sugar

RML supports several forms of syntactic sugar. You do not need to do anything to implement these cases; they are desugared by the lexer and parser for you into one of the cases above. This section is merely for your reference as you are coding in RML.

Syntactic Sugar Desugars to:
[v1;...;vn] v1 :: ... :: vn :: []
[p1;...;pn] p1 :: ... :: pn :: []
let x p1 ... pn = e1 in e2 let x = fun p1 -> ... -> fun pn -> e1 in e2
let rec f p p1 ... pn = e1 in e2 let rec f p = fun p1 -> ... -> fun pn -> e1 in e2

Step 1: Team Maintenance

Teams can unanimously stay together from A7 or work in partners of their choosing. If you select the former option, you should ask your TA to form the team on CMS by Friday.

At your first team meeting about A8, please assign roles for this assignment.

After this assignment is due, you will submit a performance evaluation of your teammates to the professor, and those evaluations will become part of team members’ grades.

Step 2: Explore the Starter Code

IMPORTANT: Before you can build the interpreter, you need to install ppx_deriving_yojson, which makes it easy to serialize OCaml objects into JSON strings. We use serialization to send objects between robots and the server.

Run the following command from the terminal: opam install ppx_deriving_yojson.

(“Serialize,” in this context means to convert into a string. So, any time you see the word “serialize,” think “stringify!”)

Here is a list of the files along with a brief explanation of each:

The makefile supports the following targets:

As usual, you may not change the names and types of the files in the provided modules. In fact, the only files you should modify in the main directory are eval.ml and test.ml.

Step 3: Evaluation

Implement all of the functions in eval.ml. For now, you may omit the cases for all seven asynchronous features and the case for recursion. (You are free to add any helper functions to eval.ml as you see fit.)

Add tests to your test.ml file to test that your implementation up to this point is working properly.

This is the stopping point for a satisfactory solution.

Step 4: Recursion and Async

Add implementations for recursion and async. Add corresponding tests to test.ml. Unfortunately, there is no straightforward way to test asynchronous features in OUnit. However, the TAs for your discussion section will look at your code and grade it for correctness by hand.

This is the stopping point for a good solution.

Step 5: RML List Library

At this point, you should download the custom syntax highlighter we have written for RML. Click on extensions in VSCode and search “rml.” The extension is called rmlHighlight, published by cornell3110sp19. Click install. You may have to restart VSCode. Now your code will be colored for RML in your .rml and .rmx files! (this only works if you are using VSCode)

Now that you have a complete version of eval.ml, you should implement the following functions in RML. Each function resides in its own .rmx file in the rml/list/ directory. Each function currently has a trivial implementation that is incorrect in all meaningful cases. You should remove these stub implementations and replace them with your own implementations. You are also free to add include statements. You may not change the function headers, and you may not add helper functions to the files.

Here are the functions you need to implement.

The full specifications can be found in the corresponding .rmx files. Note that we give type annotations here for the sake of understanding, but the syntax of RML does not actually permit type annotations.

Step 6: RML Async Warmup

Now, you will implement a list map function that maps asynchronously. There are two functions you need to implement:

Clarification about [send_back f a]: This method should spawn a robot to invoke f with a and then send back the results. This method is non-blocking, meaning it should immediately return a promise (which will eventually resolve to the result of f applied to a).

You will also implement a library that simulates the behavior of references using asynchronous computations. There are four function you need to implement:

The idea is that ref_async will spawn an instance of loc using the value you want to store in the reference and its own handle (SELF). deref_async will send the newly spawned loc a message requesting the stored value, and assign_async should send loc a message requesting that the value be changed. This means that the implementation of loc should be some sort of infinite loop that receives messages from its parent process (provided by ref_async on spawn) and responds to them appropriately. Note that these asynchronous references are not particularly useful since they only receive from a single computation.

The full specifications are in the respective .rmx files, and there are corresponding .rml test files for each. The .rml files include comments indicating the expected behavior of your implementation; make sure your implementation displays this behavior before submission. You should note the coding practice demonstrated by these .rml files: sometimes, we expect the main computation to complete before the child computations it has spawned. If this is the case, we make the main computation sleep while all of its children are resolved. Otherwise, the children may try to send to the dead main computation, resulting in an error.

This is the stopping point for an excellent solution.

Scope

Here’s what we consider satisfactory, good, and excellent scope:

Testing and Debugging

In order to make testing easier, we have provided you with an interactive shell that interprets RML expressions. Run make shell to start. The shell is similar to utop in that it does not stop taking your input until you enter ;; at the end of a line. In order for the shell to work correctly, you must give it only complete RML expressions. Note that this is different from OCaml, since OCaml allows you to provide code of the form let x = e ... let x2 = e2 without and in keyword in between. As we have discussed previously in the course, this kind of expression may be understood to mean let x = e [in *the rest of your code*]. RML does not have this feature, which means you must always separate your functions with an explicit in keyword in both the shell and .rml files.

We have also provided you with the make run directive, which compiles your code and then prompts you for a file . When you supply it with a directory path, it invokes the lexer, parser, and interpreter on the text stored in the given file and prints the resulting value. Any `.rml` files you create for `make run` should be stored in `rml/test`. The files in this directory will be ignored by `make zip`. Note the distinction between `.rml` files and `.rmx` files; you should not pass a `.rmx` file into `make run`.

When you receive an error message in for RML, the line numbers may not be accurate due to the way we are handling includes. We apologize for this inconvenience.

Finally, you are required to develop a test suite for eval.ml and submit it as part of the final .zip file.

You should test your list library using the unit test file rml/list/test.rml. Of course, full unit testing in RML to the same degree as something like OUnit is not possible. To replicate a similar setup, test.rml currently includes a series of expressions of the form let () = println e in ... where e evaluates to a boolean - true if your test case passed and false if it failed. This way, your output will be a column of booleans that you can examine to see what passed and what failed. Please keep this convention in the file you submit. Also, keep in mind that the process of coding in small amounts and testing before moving on is especially important in RML, since error messages are not as clear. If you write a large chunk of code and run it all at once, it may crash and be very difficult to fix.

Hints and Common Errors

Here are some tips for your implementations:

Here are some common bugs in RML to watch out for:

Submission

Run make zip to construct the ZIP file you need to submit on CMS. Any mal-constructed ZIP files will receive a penalty of 20 points. The size of the files is limited in CMS to 1 MB. Please stay within the size allotted, and do not submit large (or even any) test directories. You should double check that interp.zip contains the following files:

Submit the .zip file to CMS. Double check before the deadline that you have submitted the intended version.

Congratulations! You’re all done!