Problem Set 4: A Scheme Interpreter

Due Wednesday 10/21/09 11:59pm

Part 1: Logic

Note: In some browsers, the symbol for logical "and" may be rendered incorrectly as a small circle. It should look like an inverted ∨.

Propositional Logic
Prove the following statements using the inference rules provided in Lecture 13: Natural Deduction (don't use any of the tautologies).
1. A ∧ ¬B ⇒ ¬(A ⇒ B)
2. (A ⇒ C) ∧ (B ∧ ¬C) ⇒ ¬A
3. (¬A ∨ B) ⇔ (A ⇒ B)
Predicate Logic
Prove the following statements using the inference rules for propositional logic in Lecture 13: Natural Deduction and the rules for quantifiers in Lecture 12. Note that p, q are names of predicates.
1. ¬∃x. ⊥
2. (∀x. p(x) ⇒ q(x)) ⇒ ((∀x. p(x)) ⇒ (∀x. q(x)))
3. (∀x. p(x)) ⇒ (¬∃x.¬p(x))

What to Submit

Submit a file logic.txt or logic.pdf containing your solutions.

Part 2: A Scheme Interpreter

Overview

In this exercise, you will implement an interpreter for a simplified version of the Scheme programming language. Scheme is strongly related to LISP, the earliest functional language. The main difference between the two languages is that LISP is dynamically scoped, whereas Scheme is statically scoped, like OCaml.

We have provided low-level modules Lexer and Parser that can take a string input, parse it as a Scheme program, and output an internal representation of the program in the form of an abstract syntax tree (AST). This is defined in the module Ast, also provided. Your job will be to write functions for supplying input to the parser and for evaluating the AST that is returned.

The Scheme Language

Values

The language has six types of values:

Integers, which correspond to the OCaml type int
Strings, which correspond to the OCaml type string
Bools, which correspond to the OCaml type bool
Closures, which represent functions of one input and one output, along with an environment
Cons pairs
Nil.

There is also a seventh type, Undef, used internally for creating recursive definitions.

The following code, taken from heap.ml, declares the type value. When an expression is evaluated successfully, the resulting value is an entity of this type.

type value = Int of int | Str of string | Bool of bool
           | Closure of expr * env
           | Cons of value * value | Nil
           | Undef

Expressions

The lexical analyzer and parser convert an input program formatted in plain text to an AST, an entity of type Ast.expr, which can then be evaluated. The parser recognizes the following patterns:

Integers (Ast.Int_e of int) consist of sequences of one or more numeric characters. In our version of Scheme, this excludes negative integers, so that 42 and 007 are considered valid integers, whereas -1 is not
Strings (Ast.Str_e of string) consist of sequences of zero or more characters enclosed in double quotes, e.g. "Hello". Certain strings can be escaped by the backslash character:

Character Escaped

\ \\

" \"

<newline> \n

<tab> \t
Bools (Ast.Bool_e of bool) are specified by the strings #t for true or #f for false
Identifiers (ids) (Ast.Id_e of Ast.id) consist of a single lower- or upper-case letter followed by zero or more lower- or upper-case letters, numeric digits, underscores, and single-quote characters.

Character	Escaped
\	\\
"	\"
<newline>	\n
<tab>	\t

In the following expressions, the parentheses must be present:

(lambda x e) represents an anonymous function that takes a single value v as its input and evaluates e with the binding x=v. It is fully analogous to the OCaml fun x -> e. Its AST representation is an entity of type Ast.Fun_e of Ast.id * Ast.expr.
(define x e) may only be entered at the top level. When evaluated, it creates a top-level binding that binds the identifier x to the value obtained by evaluating e in the current top-level environment. This new binding is then added to the top-level environment. Its AST representation is an entity of type Ast.Def_e of Ast.id * Ast.expr.
(if e0 e1 e2) is a conditional test, fully analogous to the OCaml if e0 then e1 else e2. One difference, however, is that Scheme is not statically typed, but dynamically typed. This means that the interpreter must check after evaluating the test e0 that the result is a boolean, and if not, raise a runtime exception. If the result is true, it should evaluate and return the value of e1. If the result is false, it should evaluate and return the value of e2. Its AST is an entity of type
```
Ast.If_e of Ast.expr * Ast.expr * Ast.expr
```
.

(BO e1 e2) represents an application of a binary operation in any of the following cases:

BO	AST Name	Meaning
+	`Ast.Plus`	Integer addition
-	`Ast.Minus`	Integer subtraction
*	`Ast.Mul`	Integer multiplication
/	`Ast.Div`	Integer division
%	`Ast.Mod`	Integer modulus
=	`Ast.Eq`	Test for equality
!=	`Ast.Neq`	Test for inequality
<	`Ast.Lt`	Comparison (less than)
<=	`Ast.Leq`	Comparison (less than or equal to)
>	`Ast.Gt`	Comparison (greater than)
>=	`Ast.Geq`	Comparison (greater than or equal to)
&	`Ast.And`	Logical AND
\|	`Ast.Or`	Logical OR

A binary operation is represented by the following AST type:

Ast.Binop_e of op * expr * expr

(UO e) represents an application of a unary operation in any of the following cases:

UO	AST Name	Meaning
-	`Ast.Minus`	Multiply by -1
~	`Ast.Not`	Logical complement
car	`Ast.Car`	Take the first element of a `cons` pair
cdr	`Ast.Cdr`	Take the second element of a `cons` pair
null	`Ast.Null`	Take a cons pair as input. If that pair is `nil` then return true; otherwise return false
load	`Ast.Load`	Read a file (specified by a string-valued argument) from disk

A unary operation is represented by the following AST type:

Ast.Unop_e of op * expr

Notice that the minus sign is the only operator that can be used as both a unary and a binary operator.

Lists

There is another binary operator cons that can be used to create a pair. The expression (cons e1 e2), when evaluated, should recursively evaluate e1 and e2, then create a value of type Cons of value * value from the resulting values of e1 and e2. There are corresponding projections car and cdr that extract the first and second components of a pair, respectively; thus

(car (cons 1 2)) = 1
(cdr (cons 1 2)) = 2

The names car and cdr are historical; they stand for contents of the address register and contents of the decrement register, respectively, and refer to hardware components of the IBM 704 computer on which LISP was originally implemented in the late 1950s.

There is also a special entity nil that is used with cons to form lists. Lists in Scheme are sequences of objects combined using cons and terminated by nil. For example, the list

(cons 1 (cons 2 (cons 3 nil)))

is the Scheme equivalent of the OCaml list [1;2;3]. The object nil serves as the empty list, and is analogous to the OCaml []. The AST representation of nil is Ast.Nil_e.

There is a shorthand way to create lists in Scheme. Instead of typing

(cons 1 (cons 2 (cons 3 nil)))

one can type

(list 1 2 3)

and the resulting value is the same list. The output form of a list, i.e. the string that should be printed by your interpreter when you type either of the above expressions at the prompt, is (1 2 3).

There is an alternative syntax for cons. Instead of typing (cons 1 2), one can type (1 . 2). This is also the output form when the value is printed. Thus if you type (cons 1 2) at your Scheme interpreter, it should respond with (1 . 2). Note, however, that if the second component is nil instead of 2, it should respond with (1), because in that case it is a list of one element 1. This is true of any of the following input expressions:

(cons 1 nil) = (1)
(list 1) = (1)
(1 . nil) = (1)

More generally, if you have any sequence of objects combined with cons, but not terminated with nil, then the output form puts a dot before the last element. Thus

(cons 1 (cons 2 3)) = (1 2 . 3)
(cons 1 (cons 2 nil)) = (1 2)

Semantic Overview

Operations in Scheme are all represented in prefix notation and must be enclosed in parentheses. For instance, to add 5 and 7, we would type (+ 5 7). There is no notion of precedence and all compound expressions must be enclosed in parentheses. For instance, the OCaml expression 5 + (2 * 7) would be written (+ (5 (* 2 7))) in Scheme.

Comparisons may be performed on Int, String, and Bool types, while arithmetic operators are valid only on Ints, and boolean operations are valid only on Bools.

Values can be bound to variables in several ways. Top-level global variables are bound using the define keyword. Local variables are bound using the let keyword. The expression (let x 5 (* 2 x)) is equivalent to the OCaml let x = 5 in 2*x.

Finally, anonymous functions are defined using the lambda keyword. These can be bound to variables using define or let. For example, (lambda x (* 2 x)) in Scheme is equivalent to fun x -> 2*x in OCaml, and (let f (lambda x (* 2 x)) (f 4)) in Scheme is equivalent to let f = fun x -> 2*x in (f 4) in OCaml. If the function is recursive, use letrec and definerec instead. These behave identically to let and define, respectively, except that before the second argument is evaluated, the id given as the first argument is included in the environment bound to a dummy value (Undef); then after the second argument is evaluated, the binding is updated to that value. This means that the id may occur in the function body so that the function may call itself recursively. This is not possible with let and define.

An if expression evaluates its first argument, checks that it is a boolean, and if so, evaluates the second or third argument according as the value of the first argument is true or false, respectively. The values of second and third arguments need not be of the same type.

When applied to a cons pair, car returns the first component, while cdr returns the second. You can test whether a list is empty using the null operator.

The load operator can be used to load and interpret the contents of a file. The operators load, define, and definerec may be used only at the top level.

Your Tasks

All code you submit must adhere to the specifications defined in the given .mli files.

You must implement functions in the following files:

eval.ml This module is responsible for evaluating Scheme expressions. There is a function eval that takes an AST (Ast.expr) and an environment (Heap.env) and returns a value (Heap.value). Two easy cases have been done for you; you must implement the rest.
heap.ml This module defines Scheme values and environments. You must complete all the functions outlined in heap.mli. The function value_to_string creates a string from a value to display the result of an evaluation. For Str values, simply output the appropriate string, and for Int and Bool values, use OCaml's string casting methods. Closures should be displayed as <fun>, and nil as ().

Lists and cons pairs will need to be handled more carefully. The convention in Scheme is to output a cons pair as (x . y) for a single pair of values, unless y is nil, in which case it is a list of one element x, therefore should be displayed as (x). For nested cons of the form (cons x (cons y (cons z t))), output the value as (x y z . t), placing a period only before the final element, unless t is nil, in which case it is a list and should be displayed as (x y z). Note that the same rules apply if some of the components are cons pairs themselves; e.g., (cons (cons 1 2) (cons 3 4)) should be displayed as ((1 . 2) 3 . 4).
repl.ml This module contains the main read-eval-print loop, as well as functions for parsing the input and starting evaluation. First, complete the function eval_one. This should handle any top-level operations (define, definerec, and load) that cannot exist elsewhere in a program. Any definitions created with define or definerec, including those in loaded files, should be added to the top-level environment. Next, implement the function read_file, which reads code from a specified input file.

You may add function definitions to any .mli files, however you should be sure not to change any existing definitions.

Building Your Code

We have provided scripts buildProgram.bat and buildProgram.sh that you can use to build your project. This will produce a file interpreter which you can execute to launch the command line interface.

We have also provided a file input.txt with some Scheme expressions that you can use to test your interpreter. Here is a sample run from our solution:

? (load "input.txt")

>> <fun>
>> <fun>
>> <fun>
>> <fun>
>> <fun>
? (fact 3)

>> 6
? (define x (list 1 2 3 4 5))

>> (1 2 3 4 5)
? (length x)

>> 5
? (rev x)

>> (5 4 3 2 1)
? (fib 20)

>> (6765 4181 2584 1597 987 610 377 233 144 89 55 34 21 13 8 5 3 2 1 1 0)
? (let z (fib 20) (rev z))

>> (0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765)
?

What to submit

Submit a zip file ps4.zip containing the following files: eval.ml, repl.ml, heap.ml, ast.ml, util.ml, eval.mli, repl.mli, heap.mli, ast.mli, and util.mli.

Karma: Fun with Scheme!

Once you've completed your Scheme interpreter, why not write some useful code to test it out? Complete the following functions to gain valuable practice in coding in Scheme (oh, and also you may get some karma).

map and fold: Wish you had your favorite list traversal functions available in Scheme? Why not code them yourself? Write functions fold_left, fold_right, and map that behave the same as their OCaml counterparts.
flatten: As in OCaml, flatten should take in a list of lists and return a single list that is the concatenation of all the input lists. The ordering of the elements should be preserved.
quicksort: Implement quicksort such that (quicksort l) will return a list with the same elements as l, but sorted in ascending order.

What to submit

A file karma.txt containing your implementations, along with some sample runs of your interpreter on them.