# A4: OCalf
**Deadline:** Wednesday, 11/02/16, 11:59 pm
*This assignment may be done as an individual or with one partner.
Sharing of code is permitted only between you and your partner; sharing
among larger groups is prohibited. Be careful that you do not
end up working as a three or four person team with the rest of your final
project group: we have had AI cases in the past resulting from that
A baby camel is called a *calf*. In this assignment you will implement
an interpreter for OCalf, a functional language containing many of the
core features of OCaml.
OCalf is intended to be a sub-language of OCaml: unless otherwise
specified, OCalf programs should behave like OCaml programs.
Here is a Backus-Naur form (BNF) grammar for the syntax of OCalf expressions,
to give you a sense of what features are in the language:
e ::= (* expressions *)
| () | i | b | s | x
| e1 bop e2
| if e1 then e2 else e3
| let [rec] x = e1 in e2
| e1 e2
| fun x -> e
| C e
| match e with p1 -> e1 | ... | pn -> en
bop ::= (* binary operators *)
| + | - | * | > | < | = | >= | <= | <> | ^
p ::= (* patterns *)
| () | i | b | s | x | C p | (p1,p2)
i ::= integers
b ::= booleans
s ::= strings
x ::= variable identifiers
C ::= constructor identifiers
Some of the ways that OCalf differs from OCaml are the following:
* OCalf source-file integers are non-negative, and there is no unary integer negation.
Negative integers can nonetheless be created, e.g. by writing `0-10` to produce
* OCalf variants must be non-constant.
* OCalf has pairs, but not \\(k\\)-tuples for \\(k \geq 3\\).
* OCalf does not have built-in lists.
* OCalf does not have a module system.
* OCalf has no imperative features.
Parsing and lexing has been implemented for you, and the AST has
been designed for you. You will implement the rest of the
interpreter, including evaluation and type inference. Evaluation
will be implemented using the big-step environment model.
* Know the design of an interpreter.
* Implement a big-step environment model semantics, including closures.
* Implement type inference.
* Better understand OCaml itself by implementing many of its main features.
## Recommended reading
* [Lectures and recitations 13–17][web]
* You must implement the evaluation as described below in Part 2.
* You must implement type inference and checking as described below in Part 3.
* You must submit an [overview document][overview].
* You must use Git as part of your development process.
## What we provide
See below under Part 1 for a description of the release code.
## What to turn in
Submit files with these names on [CMS][cms]:
* `a4src.zip`, containing your solution code.
See below for how you must construct this file.
* `gitlog.txt` containing your git log.
* `overview.pdf` or `overview.txt`, containing your overview document.
Although you should most certainly build a test suite (and we provide
you with the start of one in `test.ml`), we will not be grading your
test suite as part of this assignment.
**To prepare `a4src.zip` for submission:**
From the directory that contains `main.ml`, run `make zip`. This will produce
a file named `a4src.zip` for you automatically containing all your source code
located in that directory.
Double check that you got all your source files as follows:
$ zipinfo -1 a4src.zip
If the output you receive is not **identical** to the above, something is wrong.
For example, output that included the following would be wrong:
because the `.ml` file is located in a subdirectory rather than being at the top
level directory inside the zip file.
Submissions that do not meet the above criteria regarding the output of
`zipinfo` **will be penalized.** As usual, make sure you are satisfied
with the version of the file you upload in CMS **before** the deadline.
**To prepare `gitlog.txt` for submission:** Run the following
command in the directory containing `main.ml`:
$ git log --stat > gitlog.txt
**Private repos are of the utmost importance.** A public repo would
share your code with the entire world, including your classmates, thus
violating the course policy on academic integrity. Therefore we require
that you keep all your CS 3110 related code in private repos.
## Grading issues
* **Late submissions:** Carefully review the course policies on
[submission and late assignments][late]. Verify before the deadline on CMS
that you have submitted the correct version.
* **Environment, names, and types:** Your solution must compile and run
under OCaml 4.03.0. You are required to adhere to the names and types
provided in the released `.mli` files. Your solution must compile
against the provided public test cases in `test.ml` without any
changes to those tests. Otherwise, your solution will receive minimal
* **Code style:** Refer to the [CS 3110 style guide][style].
Ugly code that is functionally correct will nonetheless be penalized.
Take extra time to think and find elegant solutions.
## Permitted OCaml features
Imperative features are now permitted. Side effects are not necessarily
bad! But we urge you to think carefully before choosing to use
Your solutions may not require linking any additional libraries/packages
## Part 0: Sanity check
We are not shipping a `make check` with this assignment. Instead, we stipulate
* Your submission must compile and run with OCaml 4.03.0 and OUnit 2.0.0.
* You may not change the provided `.mli` files.
* Your solution must compile against the provided test cases in `test.ml` without any
changes to those test cases whatsoever.
By now, you should be able to take responsibility for adhering to these
requirements yourself, without needing automated help.
## Part 1: Understand the OCalf codebase
Your preliminary task is to familiarize yourself with the structure of
the code we have shipped to you. We provide the following compilation
units, comprising both `.ml` and `.mli` files, in the release code:
* `Eval` and `Infer` have skeleton code for the functions that implement
evaluation and type inference. There are some unimplemented helper functions
in these files that the course staff found helpful in their own solution.
You are free to implement them, change them, or remove them
* `Ast` and `TypedAst` contain the type definitions used in evaluation and
type inference, respectively. The `TypedAst` module also contains `annotate`
and `strip` functions for converting between the two.
* `Printer` contains functions for printing out values of various types.
* `Parse` contains functions to construct ASTs from strings. It relies on
a lexer and parser implemented in `lexer.mll` and `parser.mly`.
Skim each of the `.mli` and `.ml` files in the release code,
and plan to come back and read them in more detail later as necessary.
**Compiling and running:**
From the directory containing `main.ml`, run `make` to compile the
interpreter. As the parser is compiled, it will produce a message `1
shift/reduce conflict`. This is expected behavior. Note that `make`
compiles the interpreter but does compile or run your test suite; use
`make test` for that.
To run the intepreter, we will use OCaml's own REPL. (For karma,
you can build a REPL for OCalf.) So after compiling, load utop.
We provide a `.ocamlinit` file that makes the OCalf interpreter's
functions available. For example,
"if true then 3110 else 0"
|> eval 
causes the string `"if true then 3110 else 0"` representing an OCalf expression to be
parsed into an OCalf AST, evaluated to an OCalf value, then converted to an OCaml string
suitable for printing. But note that evaluation won't yet succeed with the release
code, because `eval` is unimplemented.
From the directory containing `main.ml`, run `make test` to compile and run the OUnit test
suite in `test.ml`. Use that file as a basis for creating your own test suite.
## Part 2: Evaluation
Implement the function
eval : Eval.environment -> Ast.expr -> Eval.value
in `eval.ml`. This function evaluates OCalf expressions in the
big-step environment model semantics. For example,
eval  (parse_expr "if false then 3 + 5 else 3 * 5"));;
- : value = VInt 15
Note that we expose the representation of environments as association lists.
Arguably this is poor design: all the client of an environment needs to know
is that it is a dictionary, and the particular dictionary representation
Whenever evaluation reaches a place where the semantics gets *stuck*,
meaning that evaluation could not meaningfully proceed but the
expression being evaluated is not yet a value, `eval` should produce the
This will most often occur when `eval` is asked to evaluate an
expression that would not type check. For example, `3 + true` and
`match 3 with | 1 -> false` should both evaluate to `VError`.
Whenever a necessary evaluation of a subexpression evaluates to
`VError`, the expression that it is part of also evaluates to `VError`.
For example, `snd (3+true, 5)`, where `snd` is `fun p -> match p with
(x,y) -> y`, evaluates to an error, because `3+true` evaluates to an
error, and evaluation of a pair necessarily requires evaluating both
components of the pair, even though `snd` will eventually discard the
erroneous first component. But `if true then 3110 else (3+true)` would
not evaluate to `VError`, because evaluation of the `else` branch is not
necessary according to the semantics.
The `VError` constructor carries a string, and the value of that string
is unspecified: use it to produce a helpful error message. When the
autograder tests your interpreter, it will ignore the string entirely.
*Hint: when your evaluator detects an error, don't try to return it all
the way up through the call stack of recursive calls to `eval`. Instead,
make use of OCaml exceptions.*
Although you are free to tackle the implementation of `eval`
in any way you see fit, you are strongly encouraged to follow
the plan outlined below. Our test cases will proceed in the plan's
order, so following the plan will maximize your chances of
**Step 0:** Remind yourself of the big-step semantics of OCaml,
because you are essentially implementing that judgment.
**Step 1: Implement eval without LetRec or Match.**
Implement `eval` for unit, integers, Booleans, strings, `BinOp`, `If`,
`Var`, `Fun`, `Pair`, `Variant`, `App`, and `Let`. The behavior of the
OCalf binary comparison operators is specified only for *primitive*
types (`unit`, `int`, `bool`, `string`), and should be the same as the
OCaml operators on those types. On non-primitive types (e.g., pairs and
variants), the behavior of these operators is unspecified, so your
implementation may do whatever you like.
**Step 2: Implement LetRec.**
Implement `eval` for `LetRec`. This is tricky, because a binding is needed in
the environment for the defined name before its definition can be evaluated.
We'll solve this problem with a technique called *backpatching*.
To evaluate `let rec f = e1 in e2` using backpatching,
first evaluate `e1` to a value `v1` in an environment where `f` is bound to
a *dummy value*, which may be any value at all.
(If the value of `f` is used while evaluating `e1`, it is an
error. This would occur, for example, if the programmer wrote
`let rec x = 3 + x in ...`.) Then imperatively update the
binding of `f` to `v1`. This "ties the
knot," allowing `v1` to refer to itself. Finally evaluate
`e2` in the environment where `f` is bound to `v1`.
To support backpatching, the `environment` type contains
`binding ref`'s instead of `binding`'s, thus enabling bindings
to be mutated.
**Step 3: Implement Match.**
Implement `eval` for `Match`. [CLARIFICATION
10/28/16] Because at this point you are implementing only dynamic
evaluation—not static checking—of match expressions, you do
not need to implement any of the static checking that OCaml would
normally do of match expressions. Normally OCaml would do some extended
static checking for errors that are not just type errors (e.g., unused
match cases, multiple bindings of a variable in a pattern, or
inexhaustive match cases) then either emit a warning or a compile-time
error. You are not implementing that kind of static checking for OCalf.
But it is possible for evaluation of OCalf and OCaml match expressions
to get stuck if no pattern matches the value; consider how `match 0
with 1 -> true` raises `Match_failure` in OCaml. In that scenario,
OCalf evaluation ought to return `VError`, because that is the specified
behavior of OCalf when evaluation gets stuck.
## Part 3: Type inference
OCalf's type inference algorithm proceeds as follows:
1. Each node of the AST is *annotated* (aka *decorated*) with a unique type variable.
2. The AST is traversed to *collect* a set of equations between types
that must hold for the expression to be well-typed.
3. Finally, those equations are solved to find an assignment of types
to the original variables; this step is called *unification*.
If the unification phase discovers that the system is unsatisfiable,
then it is impossible to give a type to the expression, so a type error would be reported.
If the system is underconstrained, then there will be some
"leftover" variables after unification. The typechecker would give them
user-friendly names like `'a` and `'b`.
* * *
**Example.** Consider `(fun x -> 3 + x)`. The annotation phase would add a
new type variable to each subexpression:
Next, the collection phase would collect equations from each node:
* We know that `(fun x -> e) : t1 -> t2` if `e:t2 `under the assumption
`x:t1`. Stated differently, `(fun (x:t1) -> e:t2) : t3` if and only
if `t3 = t1 -> t2`. Therefore, the equation `'t05 = 't04 -> 't03` would be
collected from `fun` node.
* Similarly, the equations `'t03 = int`, `'t02 = int`, and `'t01 = int` would
be collected from the `+` node.
* The equation `'t02 = int` would be collected from the `3` node.
* We know that if `x` had type `t` when it was
bound then it has type `t` when it is used. Therefore, `'t01 = 't04`
would be collected from the `x: 't01` node.
Finally, the system of equations is solved in the unification phase, assigning
`int` to `'t01` through `'t04`, and `int -> int` to `'t05`.
* * *
We provide the annotation and unification phases of type inference
for you; your task is to implement the function
collect : variant_spec list -> annotated_expr -> equation list
in `infer.ml`. If `collect` itself discovers a type error that
means that the expression cannot be typed, regardless of the equations
that are generated, then `collect` may raise `Failure`.
**Advice:** We strongly encourage you to follow this plan:
**Step 1: Implement collect without Match or Variant.**
Implement `collect` for all syntactic forms except `Variant` and `Match`.
**Step 2: Partially implement collect for Match.**
Implement `collect` for `Match`, but omit handling of `PVariant` patterns.
Make sure the bindings from the patterns are available while checking the bodies
of the cases.
**Step 3: Implement collect for Variant.**
Extend `collect` to handle variants.
Deriving the correct constraints for variants is tricky.
Consider this code:
type 'a list = Cons of 'a * 'a list | Nil of unit
let x = Cons(1,Nil ()) in
let y = Cons(true,Nil ()) in
An overly-strict implementation of collection might report a type error,
if collection generates constraints that force the separate occurrences of `Nil` and `Cons`
in binding `x` and `y` to have the same types.
A better implementation would generate constraints with distinct type variables,
thus permitting the code above.
Implementing some of these karma problems might require changing the names or
types that appear in the `.mli` files in the release code. Do so with care.
Adding a new function to an interface file will not break our testing harness,
but changing the name or type of a previously existing function could, especially
`eval` and `collect`. Likewise, extending the AST types with new constructors
will not break our testing harness, but changing the name of an existing constructor
or the data it carries would. If in doubt, please ask!
Implement an OCalf REPL. Like utop, your REPL could provide various
directives, syntax highlighting, tab completion, etc.
In the same way that OCaml's `Pervasives` module is automatically opened,
automatically populate the initial OCalf environment with useful functions.
A good place to start would be functions for I/O (e.g., the
`print_* and read_*` functions in OCaml's `Pervasives`).
You will need to extend the OCalf notion of values to include functions
that are provided by OCaml, rather than implemented in OCalf.
In this, you could be inspired by OCaml's [external definitions][external].
Instead of externals naming C functions, though, OCalf externals could name
Add syntactic support for lists to the OCalf lexer and parser, along with a
built-in `list` type constructor. Implement a collection of functions
similar to the OCaml `List` module in OCalf.
**Caution:** This karma problem is likely to cause you to reorganize your code.
Make sure you make good use of source control, helper functions, and unit tests
to ensure that you don't break your existing code while working on it!
The type inference algorithm described above allows us to give expressions
polymorphic types: `fun x -> x` will be given the type `'a -> 'a`.
But it does not let us use them polymorphically. Consider the following
let (any:t1) = (fun (x:t3) -> (x:t4)):t2 in (any 1, any "where")
OCaml will give this expression the type `int * string`, but our naive
inference algorithm fails.
The solution is *let-polymorphism*:
Every time a variable defined by a `let`
expression is used, we should create a new type variable corresponding to the use. We
use the constraints generated while type checking the body of the `let` as
a template for a new set of constraints on the new variable.
In the above example, while type checking the body of the let expression, we
will generate the constraints `t1 = t2`, `t2 = t3->t4`, and
`t3 = t4`. While checking the expression `(any:t5) 1`, we can
recreate these constraints with new type variables `t1'`, `t2'`, and
so on. We will output the constraints `t1'=t2'`, `t2' = t3'->t4'` and
`t3'=t4'`. We also will output the constraint `t5=t1'`.
Because the type variables are cloned, they are free to vary independently.
Because we have cloned their constraints, they must be used in a manner that is
consistent with their definition.
Implement an OCalf interpreter in OCalf.
Start by defining an OCalf variant to represent OCalf ASTs.
Then implement an `eval` function in OCalf to evaluate
OCalf expressions. Since OCalf doesn't have imperative
features, you won't be able to implement `let rec` with
backpatching. Instead, you could use a technique
called *recursive closures*, which is described in
the [final two slides of a 3110 lecture from Spring 2015][rec-cl].
Note those slides use a different notation than we used this semester.