You may do this problem set alone or in pairs. You will submit the files you changed through CMS.
Functions in SML can be treated as regular values, hence we can write expressions that "compute" functions. Such flexibility does not exist, for example, in Java and C/C++/C#.
In answering the questions below you must use only functions whose result depends explicitly on all of their arguments (e.g. you can not use constant functions, which are functions that return the same value irrespective of their arguments' values).
fun O x y = x y y
Function O is a curried function of two arguments. The first argument is a
2-argument function, which is applied to the duplicated second argument.
The type of O is (t1 -> t1 -> t2) -> t1 -> t2, where t1 and t2 are
generic, unspecified types. You can check this type signature if you type
its definition at the SML prompt (note that SML uses notations like 'a and
?.X1 for types that are not fully specified).
For this problem, edit prob1.sml.
(a) Provide specific values for x and y such that the value of expression
(O x y) is the integer 312.
(b) In the expression O (O (O <X>)) <Y> <Z>, replace each of the placeholders
<X> , <Y>, and <Z> with one or more appropriate expressions, such that the
entire expression evaluates to 2003.
Consider the definition below:
fun O2 x y z = x (y z)
Operator O2 is a curried function of three arguments that applies the 1-argument function f to the result of applying 1-argument function y to value z.
For this problem, put your answers in prob2.txt.
(a) Define in a manner similar to the explanation above, what
function O2 O2 O2 does.
(b) Prove that O2 o O2 = O2 O2 O2. The o operator is the function
composition operator defined in SML as (f o g) x = f (g (x)). Hint: Your
proof should not be a formal proof within the framework of the environment
model, but rather a sequence of transformations that show the equivalence
of the two expressions.
In lecture, we discussed proofs of algorithm correctness by induction. Here, you will need to construct such a proof.
fun sum(l:int list):int =
if (null l) then 0
else (hd l) + sum (tl l)
For this problem, put your answer in prob3.txt.
Prove that the function sum computes the sum of the integers in the provided list. Use the substitution model to explain its behavior. Your proof must contain all four parts (as defined in lecture) of an induction proof, clearly labeled. If a part is not labelled you will not receive credit for it.
In the last problem set, you were introduced to the idea of a calculator, that is, an arithmetic expression evaluator. Now, we are introducing an evaluator for a subset of ML. In this problem set, you will work with a fairly simple ML interpreter.
The most important part of this part of the problem set is understanding the interpreter. If you take the time to understand the interpreter before you begin, the following exercises will all be easy. If instead you start typing before thinking, expect to spend a long time on this section.
The Mini-ML language attempts to be a subset of ML. For reasons of simplicity, though, it does not behave exactly as ML does in some aspects.
The Mini-ML language has very few types. It has integers, booleans, functions, lists, tuples, and unit. There are no type conversions.
Because of the few types, there are fewer operators and expressions. The following is a representative list of what is still in the language:
+ - * mod div = > < >= <= :: @ andalso orelse ~ not 234 true () foo if/then/else let/in/end fn (x) => x #1 (3, true) val f = 3 hd tl null
The syntax of Mini-ML is very similar to that of ML. A formal BNF definition can be found in mini-ml.grm, which is a file that generates the parser. The semantics of Mini-ML is defined by the evaluator, found in evaluator.sml.
The fun keyword is not present. You can declare functions using fn and bind them to a variable:
val f = fn (x) => x + 1
(The parentheses around function arguments are mandatory.)
Note that recursive functions, as we know them, are impossible. (In the next problem set, we will discuss how it is actually possible to implement them. But for now, let us assume there is no way of doing this.)
No form of pattern matching exists. This means that any function
acting on a list needs to use the functions hd and tl rather than
using pattern matching.
All typing is done at runtime. When an attempt is made to apply a
primitive operation on a type which is not meaningful (i.e. adding
booleans, 87 as the condition of an if) an error is raised. If this code is never executed, though,
no error is raised. For example,
if true then 3 else 8 + [34]
is a perfectly acceptable expression. This also means that you do not need to declare the types of any expressions in Mini-ML.
The only types in Mini-ML are those defined by the interpreter. Users cannot create their own.
These language features are missing from Mini-ML.
Much like the calculator of problem set 2, the evaluator contains a few colon commands. Type :h at the Mini-ML prompt to get a list.
There are a few predefined values in Mini-ML. You can see what they are in ListEnvironment.
The dynamic typing allows you to put anything you want in a list. So
[2, true, (), fn (x) => x]
is a perfectly legal list. It is recommended that you keep your lists homogenous, however.
The outer structure of the interpreter is in the file interpreter.sml. The main read-eval-print loop is here. Basically, it takes input from the user and hands it to one of the functions in the Evaluator structure.
The Evaluator signature, found in evaluator.sml, only defines two functions, but they are integral. The evaluate function turns an expression into a value (as defined in values.sml). The evaluateDeclare function uses a declaration to modify an environment.
An environment is a mapping from names to values. It is where the active variables are stored. (More on this later.) The Interpreter is specialized over the particular environment using a functor, so to actually use the interpreter with ListEnvironment, for instance, you will have to do
structure I = InterpreterFn(ListEnvironment)
I.run()
Once a value has been determined, the read-eval-print loop calls the Printer.show function defined in printer.sml to display the computed value. The other structures exist mainly to define the types on which these ones operate.
The meaning of life is six times nine is forty-two. Unfortunately, the evaluator does not yet know this. Your job will be to enlighten it.
For questions a and b, put your answers in prob5.txt. Just include the line numbers you changed and what you changed those lines to.
a) Change the evaluator so that any expression that evaluates to 6 multiplied by any expression that evaluates to 9, or vice versa, evaluates to 42.
For instance:
Mini-ML =) 6 * 9;
42
Mini-ML =) (4 + 5) * (12 div 2)
42
b) Change the evaluator so that only an integer constant 6 multiplied directly by another integer constant 9 evaluates to 42. For instance:
Mini-ML =) 6 * 9
42
Mini-ML =) 9 * 6
54
Mini-ML =) (5+1) * 9
54
For question c, submit the changed file.
c) Implement lexicographical comparisons on lists using the <, <=, >, >= operators. Lexicographical essentially means dictionary order.
For nonempty lists L1 and L2,
For any lists L1 and L2,
Some examples:
Mini-ML =) [] < [1,2,3]
true
Mini-ML =) [1] > [1]
false
Mini-ML =) [1,2,4] > [1,2,3]
true
Mini-ML =) [1,2,3] >= [1,2]
true
Mini-ML =) [] <= []
true
The meaning of an expression like x + 3 is totally dependent on what x represents. In programming languages as in mathematics, we call symbols like x variables. A natural question to ask is how are values assigned to variables?
An environment is a mapping of variables to values. Any expression that involves some number of unbound variables will be meaningless without an environment to translate those variables into values. In this problem, we investigate the possibility of using data structures to make this process of looking up variables in the environment more efficient.
The default implementation of environment variables uses lists to store these values. To add a new variable binding to the environment, we simply add it with the desired value to the front of the list. To look up a variable, we search along the list until we either find the variable and get its value or fail to find it (causing an error).
There is a top-level environment which consists first of all builtin functions and also of any val bindings made at the top level. A sequence of variable bindings and expressions at the top level like
val x = 2
val y = x * x
val z = y + x
z
is equivalent to the following. On the right, we show the environment as it is represented by our ListEnvironment.
[]
let
val x = 2 [("x",Int_v 2)]
in
let
val y = x * x [("y",Int_v 4), ("x",Int_v 2)]
in
let
val z = y + x [("z",Int_v 6), ("y",Int_v 4), ("x",Int_v 2)]
in
z
end [("y",Int_v 4), ("x",Int_v 2)]
end [("x",Int_v 2)]
end []
Each new variable binding is equivalent to entering a new let.
Expressions have no effect on the Mini-ML environment. So when an expression like val x = 2 is evaluated at the
Mini-ML prompt, first the value of the expression (2) is determined.
Then the binding of "x" to 2 is made. It is in this
modified environment that all future expressions and declarations are
evaluated.
Here is a somewhat more complicated example involving scope.
let
val x = 2
val y = 3 + x
in
let
val x = (let val y = 2 in y * y end)
in
x + y
end
end
When multiple val declarations are made within the same let, this is equivalent to entering a new let for each new declaration. So the above is equivalent to the following. Again the right hand side shows our ListEnvironment.
[]
let
val x = 2 [("x", Int_v 2)]
in
let
val y = 3 + x [("y", Int_v 5),("x", Int_v 2)]
in
let
val x =
let
val y = 2 [("y", Int_v 2),("y", Int_v 5),("x", Int_v 2)]
in
y * y
end [("y", Int_v 5),("x", Int_v 2)]
[("x", Int_v 4),("y", Int_v 5),("x", Int_v 2)]
in
x + y
end [("y", Int_v 5),("x", Int_v 2)]
end [("x", Int_v 2)]
end []
Note that looking up a variable must use the most recent binding of
a value to that variable. The ListEnvironment works by adding new
bindings to the front of the list, and returning the first binding
that matches. So in the expression x + y above, the x
that is referred to is the first x reading from left to right: that
is, the most recently added x.
When a function expression is evaluated, it is necessary to
generate a closure for the body of that function. Briefly,
suppose you have something like Later, you evaluate
val f = let val y = 3 in fn (x) => y end
Now, you want let val y = 4 in f 2 end
y in the
body of the function that is bound here to f to refer to the y that
existed when f was defined, not to the y that has replaced it. f is defined,
we save a closure, which consists of a copy of the environment
together with the body expression of the function, inside every
function value. For technical reasons relating to SML's structure
system, the closure does not actually contain an environment but rather an exception that
wraps around an environment.
The reasons behind this involve mutual recursion among structures. Basically, you can have
datatype Foo = Blank | F of Bar
and datatype Bar = B of Foo
within the same structure. That is what we want to do with environments and values in Mini-ML: An environment should be able to contain function values, and a function value should be able to contain an environment. Unfortunately, types cannot be mutually recursive if they are defined in distinct structures. The workaround we use involves a peculiar property of exceptions. We declare a value as
datatype value =
Int_v of int
| Bool_v of bool
...
| Fn_v of (string option (* function name *)
* exn (* environment *)
* string list (* names of arguments *)
* exp) (* body *)
In the evaluator functor, we declare
exception ClosureEnv of env
and it is this exception that we store in a Fn_v. So when a function expression is evaluated, we take the current environment, wrap it in a ClosureEnv, and store it in the Fn_v that we produce.
(If you are interested in why we use an exception here, the reason is that the exn type is the only type in SML that is defined at one place but can be extended at another place. This is admittedly a hack, and there are better ways of handling closures, with techniques we will describe later in the course.)
When a function is applied to its arguments, the evaluator processes this first by evaluating the function expression and its argument expression. Then, a new environment is created by adding the names and values of the arguments to the environment inside the function closure. Finally, the expression representing the body of the function is evaluated in this environment.
The implementation of environments using lists has some advantages;
it is very simple and can be fast when the number of variables is
small. Also, it handles shadowing well. For instance, the expression
should evaluate to 3 rather than 2, as "inner" bindings
override "outer" ones. The list implementation simply puts
newer bindings on the front of the list and searches starting at the
beginning. It will use the first matching string it finds to
determine a value, so this process is handled fairly easily.
let val x = 2 in (let val x = 3 in x end) end
For this problem, you will implement RedBlackTreeEnvironment (in the file rbtreeEnvironment.sml), which uses red-black trees to handle lookup. The implementation of environment lookup is changeable simply by changing the various functors to use something other than ListEnvironment, so to test your environment you need to run
structure I = InterpreterFn(RedBlackTreeEnvironment)
I.run()
The most important part of this problem is that the RedBlackTreeEnvironment behaves exactly as the ListEnvironment does with respect to such things as scope and variable shadowing. Only its running time should differ from that of ListEnvironment.
You may adapt the red-black tree code we have given you, but note that it currently provides functionality for a set of integers, when you need to implement a mapping from strings to values. Take special care that shadowed variables are handled correctly.
You must implement all the functions in environment.sig. The
function lookupBinding looks up the specified string in the specified
environment and returns NONE if it is not found, or
SOME v where v is the appropriate value. The function
insertBinding inserts the specified value bound to the specified
string in the specified environment. It returns the new environment.
The topLevel environment is an environment containing the basic
primitives. These should be the same as the ones defined in
listEnvironment.sml.
The toString function deserves special attention. You need to implement this function so we can check that your implementation of red-black trees works correctly. The function should behave the following way:
For example (using R and B to represent red and black nodes):
B => bEE B / \ => brEErEE R R B / \ B R => bbEErbEEbEE / \ B B
This is not a difficult function to write, however, it is crucial that you get it right.
We have included support in the interpreter (in the colon functions) for timing how long an evaluation takes. Time 10,000 evaluations of the following (admittedly contrived) expression:
[allSet,arguments,alrm,andb,Array,anySet,app,array,abrt,Abs]
Time this using ListEnvironment and using your RBTreeEnvironment. Put the two times in a comment at the top of RBTreeEnvironment.