A6: JoCalf
Quick links: JoCalf manual | Formal semantics
A baby camel is called a calf. In this assignment you will implement an interpreter for JoCalf, a young language whose parents are OCaml and JavaScript. She also has a little bit of DNA from Racket, an untyped functional language descended from Lisp and Scheme.
This assignment is more difficult than A5. On a similar assignment last year, the mean hours worked was 13.5, and the standard deviation was 7.4. But, we’ve reduced the point values that both Good and Excellent scope are worth this semester, so hopefully those numbers will go down. Please get started right away and make steady progress each day. Please track the time that you spend. We will ask you to report it.
What you’ll do: Parsing and lexing has already been implemented for you in the starter code, as has a read-evaluate-print loop (REPL). All you have to do is implement an evaluator based on the formal semantics.
Table of contents:
- Step 1: Form a CMS Partnership
- Step 2: Get Acquainted with JoCalf
- Step 3: Explore the Starter Code
- Step 4: Evaluation
- Rubric
- Submission
Step 1: Form a CMS Partnership
For your convenience, we have already copied your A5 partnership over into A6. But if you want to work with someone else, that is fine. As with previous partner assignments, you have until Saturday morning to form a partnership, and after that, the same penalties will be in place if you want to change partners. This is to ensure that if you work with a partner, you really do all the work together with them instead of one person getting credit at the last minute for the work of another person.
Step 2: Get Acquainted with JoCalf
First, read the tutorial in the first section of the JoCalf manual. Follow along with it in your own terminal by downloading a pre-compiled reference implementation of the interpreter from CMS. The reference implementation provides a top-level for JoCalf that you can use to experiment and confirm your understanding of the language.
Unfortunately, because of technical limitations, the reference implementation will run only in the CS 3110 VM. We’d love—and tried—to make Windows and OS X versions available, but it’s just not possible right now. The underlying reason has to do with an outstanding issue in an OCaml Unicode library.
If you get an error “permission denied” when trying to run the reference
implementation, it’s because your OS is trying to protect you from running
arbitrary code downloaded from the Internet. Just run chmod u+x jocalf
to fix this.
Second, study the next sections of the language manual, titled Syntax, Values, and Evaluation. You will need a good grasp of those before going forward.
Third, skim through the rest of the manual (starting with the section titled Constants) to get a sense for what is there. You don’t need to read it in detail yet. Instead, as you implement each language feature, come back and study the relevant section at that time.
Fourth, look at the formal semantics. It gives a precise, mathematical description of the language using a big-step environment model. Again, you don’t need to read it in detail yet, but can come back as you implement each language feature.
As the Evaluation section of the language manual describes, there are three kinds of evaluation: expressions, definitions, and phrases. The formal semantics accordingly uses three different big-step evaluation relations:
-
Expressions:
<e, env, st> ==> <r, st'>
. This is like the big-step environment model we studied in class, with a couple differences.First, there is a state as part of the machine configuration on both sides of the relation. That state is used to model mutable memory locations. Think of it as a map from memory addresses to values, similar to how the environment is a map from identifiers to values. The state on the right-hand side might be different from the state on the left-hand side, because mutable language features will cause the state to change.
Second, on the right-hand side, a result is produced. Results are either values or exceptions.
-
Definitions:
<d, env, st> ==> <r, env', st'>
. This relation is like the relation for expressions, which we just discussed, except that it also includes an environment on the right-hand side. That is because a definition introduces new names into the environment, which can then be used as part of later definitions. -
Phrases:
<p, env, st> ==> <r, env', st'>
. A phrase is either an expression or definition, and is evaluated according to one of those two evaluation relations.
Step 3: Explore the Starter Code
There is a makefile provided with the usual targets: build, test, check,
finalcheck, zip, bisect, docs, and clean. There is a new target, make repl
,
that will build the JoCalf REPL and run it.
We are back to an autograded assignment, hence your submission must pass
make check
. As always, if you’re ever in doubt about whether a change
is permitted or not, just run make check
: it will tell you whether you
have changed the interface in a prohibited way.
Here is a guided tour of the files in the starter code:
-
Files you will need to change:
-
ast.ml
contains a few types that are predefined, because the parser needs them, as well as two types (expr
anddefn
) that are currently defined asunit
, as a placeholder. Those types are what you need to define for yourself to represent the JoCalf AST. We deliberately do not provide anast.mli
, so that all the values defined inast.ml
are exposed. -
Ast_factory
is a compilation unit that the parser relies on to construct AST nodes. You need to implement the functions inast_factory.ml
. There are many of them, but they are all very short. Make sure you read the specs inast_factory.mli
. -
Eval
implements the interpreter itself. You need to implement a few functions ineval.ml
. The vast majority of the work in this assignment is implementing and testingEval.eval_expr
, which is the function that evaluates JoCalf expressions. Be sure to factor out helper functions, rather than making that one function be hundreds of line long. Also, make sure to read the specs ineval.mli
, especiallystring_of_value
andstring_of_result
. -
test.ml
is the beginning of a test suite. We have supplied some public test cases that, with high probability, will also be in the course staff test suite when we autograde your submission. So, make sure that your interpreter gets exactly the expected output. -
The
Authors
compilation unit needs to be completed, as usual.
-
-
Files you will not need to change:
-
lexer.mll
andparser.mly
contain a lexer and parser that have already been implemented for you. Do not change them. That would change the syntax of JoCalf, which is not permitted. -
Parse
is a helper module already implemented for you that runs the parser. -
Main
is a module already implemented for you that coordinates all the phases of the interpreter: taking in a string, parsing it, evaluating it, and converting the result of evaluation back to a string. -
Repl
is a REPL already implemented for you. In addition to evaluating expressions and definitions, it supports three directives:#quit
,#state
(which displays the contents of the state), and#env
(which display the contents of the environment). The latter two rely onEval.string_of_state
andEval.string_of_env
, which you will need to implement. -
.ocamlinit
will automatically load and openMain
for you. Feel free to modify this file to test however you wish. Using the provided REPL, however, is likely to be just as useful.
-
Step 4: Evaluation
Implement evaluation by completing ast.ml
, ast_factory.ml
, eval.ml
, and
test.ml
. Do this one language feature at a time, rather than trying to
complete an entire file at once. Implement and test a language feature, then
move on to the next. The first couple you implement might take awhile as you
get used to the code base. Then you hopefully will get into a rhythm and make
steady progress through all the language features that are in common with
Core OCaml. As you get into new language features that weren’t part of
Core OCaml, make sure to study the language manual and formal semantics
carefully.
Implementation order. In general, we recommend implementing language features in the same order they appear in the language manual. In more detail, here is a recommended order and some notes on the features:
-
Satisfactory scope: This level of scope corresponds roughly to the language features we covered in the textbook, lecture, and recitation; you are of course free to re-use any of that code we gave you.
-
Constants: integers, strings, Booleans, and
undefined
. You don’t have to get the corner cases of negative integer literals andmin_int
andmax_int
working correctly right away; those are for Excellent scope. -
Three basic operators: the plus operators on integers (e.g.,
1+1
), and the short-circuit Boolean operators (e.g.,true && false
,false || false
). You can leave the rest of the binary operators, as well as conversions to integers and Booleans from other types, for Excellent scope. -
Let expressions and definitions. This will require you to get environments working. Don’t worry about recursion or back-patching yet; leave that for Excellent scope, and just do non-recursive
let
for now. -
Anonymous functions and function application. This will require you to get closures working.
-
If expressions.
-
-
Good scope: This level of scope involves adding mutability to the Satisfactory scope. It involves new features that we didn’t cover in class. The challenge here, therefore, is to use the formal semantics to guide your implementation.
-
Sequences (i.e., semicolon).
-
References. This is the first language feature that will cause you to mutate states in an interesting way.
-
While loops. Be careful to implement these correctly, and not to accidentally turn every loop into an infinite loop.
-
Exceptions. Hint: a clever implementation strategy for these is to implement them in terms of OCaml exceptions. Raise an OCaml exception for a JoCalf
throw
, and handle an OCaml exception for a JoCalftry
. Doing so can avoid a lot of extra pattern matching; but, be careful to use the correct state according to the semantics.
-
-
Excellent scope: As always, you should regard the Excellent scope as a chance to dig deeper, rather than something mandatory.
-
All the remaining features: objects and external functions.
-
All the leftovers not finished above: negative and extremal integers; all the remaining binary and unary operators, including conversions between types; and recursive let expressions and definitions.
-
The autograder test suite is engineered to do a good job of getting you partial credit on any language features you implement, as long as you follow the above order. But it’s not always possible to test each language feature completely in isolation, and of course we also need to test features in conjunction with one another to see whether they work together properly.
Implementation strategy. Here is an algorithm you can follow to implement the evaluator.
-
Pick a syntactic form to implement—for example, boolean constants.
-
Implement a couple test cases for the form in
test.ml
. -
Extend the AST type in
ast.ml
with a constructor for that form. E.g., addEBool of bool
toexpr
. -
Implement the factory function in
ast_factory.ml
for the form. E.g.,let make_bool b = EBool b
. - If necessary, extend the
value
type ineval.ml
with a constructor. E.g., addVBool of bool
tovalue
. Also, make sure thatstring_of_value
andstring_of_result
are implemented ineval.ml
for whatever kind of value you just added. E.g., instring_of_value
, add a branch that reads| VBool b -> string_of_bool b
- Implement the
==>
rule for that form from the formal semantics. E.g., for Boolean constants that rule is<b, env, st> ==> <b, st>
, so ineval.ml
add a new branch toeval_expr
that reads| EBool b -> (RValue (VBool b), st)
(assuming that
RValue
is the constructor of yourresult
type that represents values, as opposed to exceptions). For any branch that takes more than a single line of code, we recommend that you factor out a helper function. - Implement more test cases for the form in
test.ml
.
Rubric
- 25 points: submitted and compiles
- 50 points: satisfactory scope
- 20 points: good scope
- 5 points: excellent scope
The only function that the autograder test suite will directly call is
Main.interp_expr
. So feel free to test all you want in the REPL, which uses
Main.interp_phrase
, but to receive points you must ensure that
Main.interp_expr
is also working. The latter is the function that provided
(and minimal) test suite already does test, and that you should test further.
We will not assess coding standards or testing on this assignment. We trust
that you will use what you have learned in the rest of the semester to your
advantage. The provided make bisect
target is an excellent way to check
your test suite’s code coverage and identify missing unit tests.
Each of the four language features in Good scope will be worth about the same number of points (5 each), and the same is true for the five features in Excellent scope (1 each). So please don’t feel as though you have to implement all the features: it’s perfectly fine to stop early. If all you do is the Satisfactory scope, you will still have learned a lot about interpreters.
Submission
Make sure your team’s NetIDs are in authors.mli
, and set the hours_worked
variable at the end of authors.ml
.
Run make zip
to construct the ZIP file you need to submit on CMS.
→ DO NOT use your operating system’s graphical file browser to construct the ZIP file. ←
Any mal-constructed ZIP files will receive a penalty of 15 points. Do not try to
construct the ZIP yourself by hand. Use only the provided make zip
command.
Otherwise, your ZIP will potentially contain the wrong files, and you will lose
points—possibly more than just the 15 points mentioned earlier, if you are
missing essential files.
Ensure that your solution passes make finalcheck
. Submit your zipfile on
CMS. Double-check before the deadline that you have submitted the
intended version of your file.
Congratulations! Your new-born calf snuggles with you.
Acknowledgment: JoCalf was inspired by the paper The Essence of JavaScript, by Arjun Guha, Claudiu Saftoiu, and Shriram Krishamurthi, corrected version of October 3, 2015. Prof. Guha was a postdoc at Cornell approx. 2013, supervised by Prof. Foster.