A6: JoCalf

Calf

Quick links: JoCalf manual | Formal semantics

A baby camel is called a calf. In this assignment you will implement an interpreter for JoCalf, a young language whose parents are OCaml and JavaScript. She also has a little bit of DNA from Racket, an untyped functional language descended from Lisp and Scheme.

This assignment is more difficult than A5. On a similar assignment last year, the mean hours worked was 13.5, and the standard deviation was 7.4. But, we’ve reduced the point values that both Good and Excellent scope are worth this semester, so hopefully those numbers will go down. Please get started right away and make steady progress each day. Please track the time that you spend. We will ask you to report it.

What you’ll do: Parsing and lexing has already been implemented for you in the starter code, as has a read-evaluate-print loop (REPL). All you have to do is implement an evaluator based on the formal semantics.

Table of contents:

Step 1: Form a CMS Partnership
Step 2: Get Acquainted with JoCalf
Step 3: Explore the Starter Code
Step 4: Evaluation
Rubric
Submission

Step 1: Form a CMS Partnership

For your convenience, we have already copied your A5 partnership over into A6. But if you want to work with someone else, that is fine. As with previous partner assignments, you have until Saturday morning to form a partnership, and after that, the same penalties will be in place if you want to change partners. This is to ensure that if you work with a partner, you really do all the work together with them instead of one person getting credit at the last minute for the work of another person.

Step 2: Get Acquainted with JoCalf

First, read the tutorial in the first section of the JoCalf manual. Follow along with it in your own terminal by downloading a pre-compiled reference implementation of the interpreter from CMS. The reference implementation provides a top-level for JoCalf that you can use to experiment and confirm your understanding of the language.

Unfortunately, because of technical limitations, the reference implementation will run only in the CS 3110 VM. We’d love—and tried—to make Windows and OS X versions available, but it’s just not possible right now. The underlying reason has to do with an outstanding issue in an OCaml Unicode library.

If you get an error “permission denied” when trying to run the reference implementation, it’s because your OS is trying to protect you from running arbitrary code downloaded from the Internet. Just run chmod u+x jocalf to fix this.

Second, study the next sections of the language manual, titled Syntax, Values, and Evaluation. You will need a good grasp of those before going forward.

Third, skim through the rest of the manual (starting with the section titled Constants) to get a sense for what is there. You don’t need to read it in detail yet. Instead, as you implement each language feature, come back and study the relevant section at that time.

Fourth, look at the formal semantics. It gives a precise, mathematical description of the language using a big-step environment model. Again, you don’t need to read it in detail yet, but can come back as you implement each language feature.

As the Evaluation section of the language manual describes, there are three kinds of evaluation: expressions, definitions, and phrases. The formal semantics accordingly uses three different big-step evaluation relations:

Expressions: <e, env, st> ==> <r, st'>. This is like the big-step environment model we studied in class, with a couple differences.

First, there is a state as part of the machine configuration on both sides of the relation. That state is used to model mutable memory locations. Think of it as a map from memory addresses to values, similar to how the environment is a map from identifiers to values. The state on the right-hand side might be different from the state on the left-hand side, because mutable language features will cause the state to change.

Second, on the right-hand side, a result is produced. Results are either values or exceptions.
Definitions: <d, env, st> ==> <r, env', st'>. This relation is like the relation for expressions, which we just discussed, except that it also includes an environment on the right-hand side. That is because a definition introduces new names into the environment, which can then be used as part of later definitions.
Phrases: <p, env, st> ==> <r, env', st'>. A phrase is either an expression or definition, and is evaluated according to one of those two evaluation relations.

Step 3: Explore the Starter Code

There is a makefile provided with the usual targets: build, test, check, finalcheck, zip, bisect, docs, and clean. There is a new target, make repl, that will build the JoCalf REPL and run it.

We are back to an autograded assignment, hence your submission must pass make check. As always, if you’re ever in doubt about whether a change is permitted or not, just run make check: it will tell you whether you have changed the interface in a prohibited way.

Here is a guided tour of the files in the starter code:

Files you will need to change:
- ast.ml contains a few types that are predefined, because the parser needs them, as well as two types (expr and defn) that are currently defined as unit, as a placeholder. Those types are what you need to define for yourself to represent the JoCalf AST. We deliberately do not provide an ast.mli, so that all the values defined in ast.ml are exposed.
- Ast_factory is a compilation unit that the parser relies on to construct AST nodes. You need to implement the functions in ast_factory.ml. There are many of them, but they are all very short. Make sure you read the specs in ast_factory.mli.
- Eval implements the interpreter itself. You need to implement a few functions in eval.ml. The vast majority of the work in this assignment is implementing and testing Eval.eval_expr, which is the function that evaluates JoCalf expressions. Be sure to factor out helper functions, rather than making that one function be hundreds of line long. Also, make sure to read the specs in eval.mli, especially string_of_value and string_of_result.
- test.ml is the beginning of a test suite. We have supplied some public test cases that, with high probability, will also be in the course staff test suite when we autograde your submission. So, make sure that your interpreter gets exactly the expected output.
- The Authors compilation unit needs to be completed, as usual.
Files you will not need to change:
- lexer.mll and parser.mly contain a lexer and parser that have already been implemented for you. Do not change them. That would change the syntax of JoCalf, which is not permitted.
- Parse is a helper module already implemented for you that runs the parser.
- Main is a module already implemented for you that coordinates all the phases of the interpreter: taking in a string, parsing it, evaluating it, and converting the result of evaluation back to a string.
- Repl is a REPL already implemented for you. In addition to evaluating expressions and definitions, it supports three directives: #quit, #state (which displays the contents of the state), and #env (which display the contents of the environment). The latter two rely on Eval.string_of_state and Eval.string_of_env, which you will need to implement.
- .ocamlinit will automatically load and open Main for you. Feel free to modify this file to test however you wish. Using the provided REPL, however, is likely to be just as useful.

Step 4: Evaluation

Implement evaluation by completing ast.ml, ast_factory.ml, eval.ml, and test.ml. Do this one language feature at a time, rather than trying to complete an entire file at once. Implement and test a language feature, then move on to the next. The first couple you implement might take awhile as you get used to the code base. Then you hopefully will get into a rhythm and make steady progress through all the language features that are in common with Core OCaml. As you get into new language features that weren’t part of Core OCaml, make sure to study the language manual and formal semantics carefully.

Implementation order. In general, we recommend implementing language features in the same order they appear in the language manual. In more detail, here is a recommended order and some notes on the features:

Satisfactory scope: This level of scope corresponds roughly to the language features we covered in the textbook, lecture, and recitation; you are of course free to re-use any of that code we gave you.
- Constants: integers, strings, Booleans, and undefined. You don’t have to get the corner cases of negative integer literals and min_int and max_int working correctly right away; those are for Excellent scope.
- Three basic operators: the plus operators on integers (e.g., 1+1), and the short-circuit Boolean operators (e.g., true && false, false || false). You can leave the rest of the binary operators, as well as conversions to integers and Booleans from other types, for Excellent scope.
- Let expressions and definitions. This will require you to get environments working. Don’t worry about recursion or back-patching yet; leave that for Excellent scope, and just do non-recursive let for now.
- Anonymous functions and function application. This will require you to get closures working.
- If expressions.
Good scope: This level of scope involves adding mutability to the Satisfactory scope. It involves new features that we didn’t cover in class. The challenge here, therefore, is to use the formal semantics to guide your implementation.
- Sequences (i.e., semicolon).
- References. This is the first language feature that will cause you to mutate states in an interesting way.
- While loops. Be careful to implement these correctly, and not to accidentally turn every loop into an infinite loop.
- Exceptions. Hint: a clever implementation strategy for these is to implement them in terms of OCaml exceptions. Raise an OCaml exception for a JoCalf throw, and handle an OCaml exception for a JoCalf try. Doing so can avoid a lot of extra pattern matching; but, be careful to use the correct state according to the semantics.
Excellent scope: As always, you should regard the Excellent scope as a chance to dig deeper, rather than something mandatory.
- All the remaining features: objects and external functions.
- All the leftovers not finished above: negative and extremal integers; all the remaining binary and unary operators, including conversions between types; and recursive let expressions and definitions.

The autograder test suite is engineered to do a good job of getting you partial credit on any language features you implement, as long as you follow the above order. But it’s not always possible to test each language feature completely in isolation, and of course we also need to test features in conjunction with one another to see whether they work together properly.

Implementation strategy. Here is an algorithm you can follow to implement the evaluator.

Pick a syntactic form to implement—for example, boolean constants.
Implement a couple test cases for the form in test.ml.
Extend the AST type in ast.ml with a constructor for that form. E.g., add EBool of bool to expr.
Implement the factory function in ast_factory.ml for the form. E.g., let make_bool b = EBool b.
If necessary, extend the value type in eval.ml with a constructor. E.g., add VBool of bool to value. Also, make sure that string_of_value and string_of_result are implemented in eval.ml for whatever kind of value you just added. E.g., in string_of_value, add a branch that reads
```
  | VBool b -> string_of_bool b
```
Implement the ==> rule for that form from the formal semantics. E.g., for Boolean constants that rule is <b, env, st> ==> <b, st>, so in eval.ml add a new branch to eval_expr that reads
```
  | EBool b -> (RValue (VBool b), st)
```
(assuming that RValue is the constructor of your result type that represents values, as opposed to exceptions). For any branch that takes more than a single line of code, we recommend that you factor out a helper function.
Implement more test cases for the form in test.ml.

Rubric

25 points: submitted and compiles
50 points: satisfactory scope
20 points: good scope
5 points: excellent scope

The only function that the autograder test suite will directly call is Main.interp_expr. So feel free to test all you want in the REPL, which uses Main.interp_phrase, but to receive points you must ensure that Main.interp_expr is also working. The latter is the function that provided (and minimal) test suite already does test, and that you should test further.

We will not assess coding standards or testing on this assignment. We trust that you will use what you have learned in the rest of the semester to your advantage. The provided make bisect target is an excellent way to check your test suite’s code coverage and identify missing unit tests.

Each of the four language features in Good scope will be worth about the same number of points (5 each), and the same is true for the five features in Excellent scope (1 each). So please don’t feel as though you have to implement all the features: it’s perfectly fine to stop early. If all you do is the Satisfactory scope, you will still have learned a lot about interpreters.

Submission

Make sure your team’s NetIDs are in authors.mli, and set the hours_worked variable at the end of authors.ml.

Run make zip to construct the ZIP file you need to submit on CMS.

→ DO NOT use your operating system’s graphical file browser to construct the ZIP file. ←

Any mal-constructed ZIP files will receive a penalty of 15 points. Do not try to construct the ZIP yourself by hand. Use only the provided make zip command. Otherwise, your ZIP will potentially contain the wrong files, and you will lose points—possibly more than just the 15 points mentioned earlier, if you are missing essential files.

Ensure that your solution passes make finalcheck. Submit your zipfile on CMS. Double-check before the deadline that you have submitted the intended version of your file.

Congratulations! Your new-born calf snuggles with you.

Acknowledgment: JoCalf was inspired by the paper The Essence of JavaScript, by Arjun Guha, Claudiu Saftoiu, and Shriram Krishamurthi, corrected version of October 3, 2015. Prof. Guha was a postdoc at Cornell approx. 2013, supervised by Prof. Foster.