Next, the collection phase would collect equations from each node:

* We know that `(fun x -> e) : t1 -> t2` if `e : t2` under the assumption `x : t1`. Stated differently, `(fun (x:t1) -> e:t2) : t3` if and only if `t3 = t1 -> t2`. Therefore, the equation `'t05 = 't04 -> 't03` would be collected from the `fun` node.
* Similarly, the equations `'t03 = int`, `'t02 = int`, and `'t01 = int` would be collected from the `+` node.
* The equation `'t02 = int` would be collected from the `3` node.
* We know that if `x` had type `t` when it was bound, then it has type `t` when it is used. Therefore, `'t01 = 't04` would be collected from the `x : 't01` node.

Finally, the system of equations is solved in the unification phase, assigning `int` to `'t01` through `'t04`, and `int -> int` to `'t05`.

* * *

We provide the annotation and unification phases of type inference for you; your task is to implement the function

```
collect : variant_spec list -> annotated_expr -> equation list
```

in `infer.ml`. We strongly encourage you to follow this plan:

**Step 1: Implement collect without Match or Variant.** Implement `collect` for all syntactic forms except `Variant` and `Match`.

**Step 2: Partially implement collect for Match.** Implement `collect` for `Match`, but omit handling of `PVariant` patterns. Make sure the bindings from the patterns are available while checking the bodies of the cases.

**Step 3: Implement collect for Variant.** Extend `collect` to handle variants. Deriving the correct constraints for variants is tricky. Consider this code:

```
type 'a list = Cons of 'a * 'a list | Nil of unit
let x = Cons(1, Nil ()) in
let y = Cons(true, Nil ()) in
42
```

An overly strict implementation of collection might report a type error if it generates constraints that force the separate occurrences of `Nil` and `Cons` in the bindings of `x` and `y` to have the same types. A better implementation would generate constraints with distinct type variables, thus permitting the code above.
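The "distinct type variables" idea amounts to *instantiating* a constructor's declared types with fresh variables at each use. Here is a minimal OCaml sketch of that mechanism; the type representation and names (`typ`, `fresh`, `instantiate`) are hypothetical stand-ins, not the release code's actual definitions:

```ocaml
(* Illustrative type representation -- NOT the release code's [typ]. *)
type typ =
  | TInt
  | TBool
  | TVar of string                       (* a type variable like 'a or 't03 *)
  | TConstructor of typ list * string    (* e.g. TConstructor ([TInt], "list") *)

(* Generate a brand-new type variable on each call. *)
let counter = ref 0
let fresh () = incr counter; TVar (Printf.sprintf "'x%02d" !counter)

(* Replace every type variable in [t] according to the substitution. *)
let rec apply subst t = match t with
  | TInt | TBool -> t
  | TVar v -> (try List.assoc v subst with Not_found -> t)
  | TConstructor (args, n) -> TConstructor (List.map (apply subst) args, n)

(* Map each type parameter of a variant spec (e.g. ["'a"] for 'a list)
   to a fresh variable. *)
let instantiate params = List.map (fun p -> (p, fresh ())) params

(* Each occurrence of Nil : unit -> 'a list gets its own copy of 'a: *)
let nil_result = TConstructor ([TVar "'a"], "list")
let use1 = apply (instantiate ["'a"]) nil_result
let use2 = apply (instantiate ["'a"]) nil_result
(* use1 and use2 mention distinct variables, so constraining one to
   [int list] and the other to [bool list] is not a contradiction. *)
```

Because `use1` and `use2` mention different fresh variables, the equations collected for `x` and `y` in the example above can be solved independently.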
## Karma

Implementing some of these karma problems might require changing the names or types that appear in the `.mli` files in the release code. Do so with care. Adding a new function to an interface file will not break our testing harness, but changing the name or type of a previously existing function could, especially `eval` and `collect`. Likewise, extending the AST types will not break our testing harness, but changing the name of an existing constructor or the data it carries would. If in doubt, please ask!

**REPL:** Implement an OCalf REPL. Like utop, your REPL could provide various directives, syntax highlighting, tab completion, etc.

**Pervasives:** In the same way that OCaml's `Pervasives` module is automatically opened, automatically populate the initial OCalf environment with useful functions. A good place to start would be functions for I/O (e.g., the `print_*` and `read_*` functions in OCaml's `Pervasives`). You will need to extend the OCalf notion of values to include functions that are provided by OCaml, rather than implemented in OCalf. In this, you could be inspired by OCaml's [external definitions][external]. Instead of externals naming C functions, though, OCalf externals could name OCaml functions.

[external]: http://caml.inria.fr/pub/docs/manual-ocaml/intfc.html#external-declaration

**Lists:** Add syntactic support for lists to the OCalf lexer and parser, along with a built-in `list` type constructor. Implement a collection of functions similar to the OCaml `List` module in OCalf.

**let-polymorphism:** **Caution:** This karma problem is likely to cause you to reorganize your code. Make good use of source control, helper functions, and unit tests to ensure that you don't break your existing code while working on it!

The type inference algorithm described above allows us to give expressions polymorphic types: `fun x -> x` will be given the type `'a -> 'a`. But it does not let us use them polymorphically.
Consider the following expression:

```
let (any:t1) = (fun (x:t3) -> (x:t4)):t2 in
(any 1, any "where")
```

OCaml will give this expression the type `int * string`, but our naive inference algorithm fails. The solution is *let-polymorphism*: every time a variable defined by a `let` expression is used, we create a new type variable corresponding to that use, and we use the constraints generated while type checking the body of the `let` as a template for a new set of constraints on the new variable.

In the example above, while type checking the body of the `let` expression, we will generate the constraints `t1 = t2`, `t2 = t3 -> t4`, and `t3 = t4`. While checking the expression `(any:t5) 1`, we can recreate these constraints with new type variables `t1'`, `t2'`, and so on. We will output the constraints `t1' = t2'`, `t2' = t3' -> t4'`, and `t3' = t4'`. We will also output the constraint `t5 = t1'`. Because the type variables are cloned, they are free to vary independently; because we have cloned their constraints, they must be used in a manner consistent with their definition.

**Meta:** Implement an OCalf interpreter in OCalf. Start by defining an OCalf variant to represent OCalf ASTs. Then implement an `eval` function in OCalf to evaluate OCalf expressions. Since OCalf doesn't have imperative features, you won't be able to implement `let rec` with backpatching. Instead, you could use a technique called *recursive closures*, which is described in the [final two slides of a 3110 lecture from Spring 2015][rec-cl]. Note that those slides use a different notation than we used this semester.

[rec-cl]: http://www.cs.cornell.edu/Courses/cs3110/2015sp/lectures/8/lec08.pdf
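The recursive-closures technique can be sketched in plain OCaml before porting it to OCalf. In this sketch (the `expr` and `value` types are hypothetical stand-ins, not OCalf's), a recursive closure remembers its own name, and every application re-binds that name to the closure itself, so no mutation or backpatching is needed:

```ocaml
(* Minimal sketch of recursive closures. AST and values are illustrative. *)
type expr =
  | Int of int
  | Var of string
  | Leq of expr * expr
  | Sub of expr * expr
  | Mul of expr * expr
  | If of expr * expr * expr
  | Fun of string * expr
  | App of expr * expr
  | LetRec of string * string * expr * expr  (* let rec f x = e1 in e2 *)

type value =
  | VInt of int
  | VBool of bool
  | VClosure of string * expr * env
  (* A recursive closure also carries its own name, so applying it can
     re-introduce that name into scope -- the "recursive closure" trick. *)
  | VRec of string * string * expr * env
and env = (string * value) list

let rec eval env = function
  | Int n -> VInt n
  | Var x -> List.assoc x env
  | Leq (a, b) -> (match eval env a, eval env b with
      | VInt m, VInt n -> VBool (m <= n) | _ -> failwith "type error")
  | Sub (a, b) -> (match eval env a, eval env b with
      | VInt m, VInt n -> VInt (m - n) | _ -> failwith "type error")
  | Mul (a, b) -> (match eval env a, eval env b with
      | VInt m, VInt n -> VInt (m * n) | _ -> failwith "type error")
  | If (c, t, e) -> (match eval env c with
      | VBool true -> eval env t | VBool false -> eval env e
      | _ -> failwith "type error")
  | Fun (x, body) -> VClosure (x, body, env)
  | LetRec (f, x, body, rest) ->
      eval ((f, VRec (f, x, body, env)) :: env) rest
  | App (e1, e2) ->
      let v = eval env e2 in
      (match eval env e1 with
       | VClosure (x, body, cenv) -> eval ((x, v) :: cenv) body
       | VRec (f, x, body, cenv) as self ->
           (* Re-bind the function under its own name on every call. *)
           eval ((x, v) :: (f, self) :: cenv) body
       | _ -> failwith "not a function")

(* let rec fact n = if n <= 0 then 1 else n * fact (n - 1) in fact 5 *)
let fact5 =
  eval [] (LetRec ("fact", "n",
    If (Leq (Var "n", Int 0), Int 1,
        Mul (Var "n", App (Var "fact", Sub (Var "n", Int 1)))),
    App (Var "fact", Int 5)))
```

Since OCalf environments are immutable just like this one, the same shape of `VRec` case carries over directly when you write `eval` in OCalf itself.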