PS#3 is out; we hope you are hard at work on it.
Note that you should sort nested lists lexicographically as well.
The first prelim will focus on the RSM. The section before the prelim will be dedicated to a review. The best advice is to look at the first prelim from 1 year ago (it’s on the web site). Your prelim will look an awful lot like it!
Instead of substituting, we maintain an environment, which contains all the substitutions that we need to do. The contract of evaluate is now to evaluate an expression in an environment (i.e., with a set of substitutions). It is important to remember this: it now makes no sense to talk about evaluating an expression; it only makes sense to evaluate an expression in an environment. Whenever we see an occurrence of a variable, we look it up in the environment to get its value.
Let us now introduce the notion of an environment. A binding contains an identifier and a value. We will informally write them as answer: 42. An environment is just an ordered set of bindings. It can be viewed as a map from identifiers to values, where the value of an identifier is taken from the first binding for that identifier.
Basically, this follows the signature given in PS#3 (in environment.sig):
signature ENVIRONMENT = sig
  type env
  val lookupBinding : env -> string -> Values.value option
  val insertBinding : env -> string -> Values.value -> env
end
The semantics of lookup is that it returns the first binding that it finds, and that insert adds at the beginning (more precisely, returns a new env with the new binding at the start, then the old env).
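One simple way to get these semantics is to represent an environment as a list of bindings with the newest binding first. The following is only a sketch under that assumption, not the PS#3 implementation: the value datatype here is a hypothetical stand-in for Values.value, and emptyEnv is a name introduced for illustration.

```sml
(* Hypothetical stand-in for Values.value from PS#3. *)
datatype value = Int_v of int | Bool_v of bool

(* Assumed representation: a list of bindings, newest first. *)
type env = (string * value) list

(* Illustrative name, not part of the signature above. *)
val emptyEnv : env = []

(* Return the value of the first binding for id, if there is one. *)
fun lookupBinding (e : env) (id : string) : value option =
  case e of
    [] => NONE
  | (id', v) :: rest =>
      if id = id' then SOME v else lookupBinding rest id

(* Return a new env: the new binding, followed by the old env. *)
fun insertBinding (e : env) (id : string) (v : value) : env =
  (id, v) :: e

val env1 = insertBinding emptyEnv "x" (Int_v 3)
val env2 = insertBinding env1 "x" (Int_v 33)  (* shadows the older x *)

val r1 = lookupBinding env2 "x"   (* SOME (Int_v 33) *)
val r2 = lookupBinding env1 "x"   (* SOME (Int_v 3); env1 is unchanged *)
val r3 = lookupBinding env2 "y"   (* NONE *)
```

Because insertBinding returns a new list and leaves the old one alone, any code still holding env1 keeps seeing x bound to 3.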
We will describe the details in a moment, but basically whenever we evaluate an expression in an environment, if it is an identifier we look it up in the environment. Now, of course, there is a problem here, which is that the interpreter needs to start off with some environment in the main REPL.
In mini-ML, the initial environment is called the global environment, and contains only a few bindings (for example, for “hd” and “tl”).
Doing a let adds bindings to the environment. This now allows us to look at some examples. Before doing this, though, we need some notation for drawing environments. We will draw them as linked boxes, where the box contains bindings. Note that most of the time the value of the binding will go in the box. [One exception is coming, more eventually…]
The global environment will be drawn as a box in a box. An environment that extends an older environment will be drawn with the new binding in a box, and an arrow (up) to the old environment.
We can now work through some examples:
hd
let val x = 3 in 39 + x end
let val x = 3 in
let val x = x+30 in x+6 end end
This actually works out extremely well, and we get the right behavior (in terms of shadowing) essentially for free.
In the evaluator, environments are used in a small number of important places. When we see an identifier, we call lookup. When we see a let, we add bindings to the current environment.
The other, extremely important, place that environments are used is with functions (creating them and applying them). When we see a Fn_e expression we will create something called a closure.
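The two uses just described (lookup for identifiers, extension for let) can be sketched in a few lines. This is a minimal illustration, not the PS#3 evaluator: the constructor names Int_e, Var_e, Plus_e and Let_e are hypothetical, values are restricted to integers, and the environment is assumed to be a list with the newest binding first.

```sml
(* Hypothetical mini-AST; not the datatype used in PS#3. *)
datatype exp =
    Int_e of int
  | Var_e of string
  | Plus_e of exp * exp
  | Let_e of string * exp * exp    (* let val id = e1 in e2 end *)

(* Assumed representation: newest binding first, integer values only. *)
type env = (string * int) list

exception Unbound of string

fun lookup (en : env) (id : string) : int =
  case en of
    [] => raise Unbound id
  | (id', v) :: rest => if id = id' then v else lookup rest id

fun eval (e : exp) (en : env) : int =
  case e of
    Int_e n => n
  | Var_e id => lookup en id                  (* identifier: look it up *)
  | Plus_e (a, b) => eval a en + eval b en
  | Let_e (id, e1, e2) =>                     (* let: extend the env *)
      eval e2 ((id, eval e1 en) :: en)

(* let val x = 3 in 39 + x end *)
val first = eval (Let_e ("x", Int_e 3, Plus_e (Int_e 39, Var_e "x"))) []

(* let val x = 3 in let val x = x+30 in x+6 end end *)
val nested =
  eval (Let_e ("x", Int_e 3,
               Let_e ("x", Plus_e (Var_e "x", Int_e 30),
                      Plus_e (Var_e "x", Int_e 6)))) []
```

Here first is 42 and nested is 39. In the nested case the inner x+30 is evaluated in the environment where x is 3, and the body x+6 then finds the newer binding of x (33) first, which is exactly the shadowing behavior.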
Before getting into how closures are handled in the evaluator, we need to think about a few subtleties of the RSM. Because of the substitution model, when we evaluate a fn expression we essentially capture all the free variables that it uses. Example:
let val x = 3 in fn(y) => x + y end
This expression evaluates, by the RSM, to fn(y) => 3 + y. We can use it directly (i.e. substitute it anyplace we could say fn(y)=> 3 + y).
So for example we could say [warning: sample prelim question ahead!]
let val myfun =
      (let val x = 3 in fn(y) => x + y end)
    val x = 39
in
  myfun(x)
end
But remember that in the evaluator, we don’t do substitutions directly, we just keep a list of current substitutions.
So when we evaluate a fn expression in an environment, we need to hold on to the environment in which the expression is evaluated. This is accomplished by creating a closure.
A closure is a data structure with 3 parts: a list of arguments, a body, and an environment. I will draw a closure as a pair of circles. A binding for a closure (like for myfun in the example above) won’t be drawn directly in the environment, but rather with a pointer to it.
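When we apply a function (i.e. we see a combination) we evaluate the arguments, then bind them to the closure’s parameters in a new environment that extends the closure’s environment (then, we call eval on the body in this new environment). Note that the body is evaluated in the closure’s environment, not in the environment of the caller. It is extremely important to understand this, and well worth going through some examples!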
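To make the closure rules concrete, here is a small sketch of an evaluator with closures. It is written under several assumptions and is not the PS#3 code: constructor names other than Fn_e are hypothetical, functions take a single parameter for brevity, and environments are lists with the newest binding first.

```sml
(* Hypothetical mini-AST; only the name Fn_e comes from the notes. *)
datatype exp =
    Int_e of int
  | Var_e of string
  | Plus_e of exp * exp
  | Let_e of string * exp * exp
  | Fn_e of string * exp             (* one parameter, for brevity *)
  | App_e of exp * exp

(* A closure holds a parameter, a body, and the defining environment. *)
datatype value =
    Int_v of int
  | Closure_v of string * exp * (string * value) list

type env = (string * value) list

exception Unbound of string
exception TypeError of string

fun lookup (en : env) (id : string) : value =
  case en of
    [] => raise Unbound id
  | (id', v) :: rest => if id = id' then v else lookup rest id

fun eval (e : exp) (en : env) : value =
  case e of
    Int_e n => Int_v n
  | Var_e id => lookup en id
  | Plus_e (a, b) =>
      (case (eval a en, eval b en) of
         (Int_v x, Int_v y) => Int_v (x + y)
       | _ => raise TypeError "plus expects integers")
  | Let_e (id, e1, e2) => eval e2 ((id, eval e1 en) :: en)
  | Fn_e (param, body) => Closure_v (param, body, en)   (* capture en *)
  | App_e (f, arg) =>
      (case eval f en of
         Closure_v (param, body, cenv) =>
           (* argument is evaluated in the caller's env,
              body in an extension of the closure's env *)
           eval body ((param, eval arg en) :: cenv)
       | _ => raise TypeError "application of a non-function")

(* let val myfun = (let val x = 3 in fn(y) => x + y end)
       val x = 39
   in myfun(x) end *)
val program =
  Let_e ("myfun",
         Let_e ("x", Int_e 3, Fn_e ("y", Plus_e (Var_e "x", Var_e "y"))),
         Let_e ("x", Int_e 39, App_e (Var_e "myfun", Var_e "x")))

val result = eval program []
```

The result is Int_v 42: the argument x is looked up in the caller's environment (39), while the x inside the body is looked up in the closure's environment (3).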
One of the major limitations of ML is that it is impossible to add new special forms. What do I mean by a special form? A function that controls the evaluation of its arguments.
In ML, functions evaluate their arguments (be sure you know this fact). As a consequence, it is impossible to write a function ifnot such that
ifnot(x,y,z) = if (not x) then y else z
Why? Consider
fun ifnot(x:bool, y:int, z:int):int = if (not x) then y else z
Let's try it:
3+ifnot(1 > 2, 39, 42)
⇒ 42
So it works. But what about ifnot(1 > 2, 25, raise Fail "can't get here")? Because ML evaluates every argument before calling the function, the raise is evaluated, and the exception is raised, regardless of the value of the boolean expression. So in the SML implementation we have, we are stuck; there is no way to write a new special form. The special forms are fixed; we're stuck with the ones that happen to be built into the language.
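The behavior can be checked directly in SML. In this small sketch the handler and the name outcome are added only to make the result observable; they are not part of the original example.

```sml
fun ifnot (x : bool, y : int, z : int) : int =
  if not x then y else z

(* All three arguments are evaluated before ifnot is entered, so the
   raise happens even though the z branch would never be chosen. *)
val outcome =
  Int.toString (ifnot (1 > 2, 25, raise Fail "can't get here"))
  handle Fail msg => "raised Fail: " ^ msg
```

Here outcome is the string "raised Fail: can't get here", not "25": the body of ifnot never runs.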
But what about Mini-ML? A different story! We can modify the interpreter by adding new special forms to it. Here are the examples we will focus on:
ifnot(e1,e2,e3)
Here e2 is evaluated only if e1 is false, and e3 is evaluated only if e1 is true.
ifmaybe(e1,e2) [flips a coin]
Here e1 is evaluated only if the coin flip chooses e1; otherwise only e2 is evaluated.
letsubst(var,e1,e2) [substitute e1 for var in e2]
This allows the substituted expression to contain variables that are not yet bound, for instance
letsubst(x,z+3,let z:int = 39 in x end)
⇒ let z:int = 39 in z+3 end
⇒ 42
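As a sketch of how the first of these forms might be added, the interpreter only needs a new case that evaluates e1 and then evaluates exactly one of the other two subexpressions. The AST below is hypothetical (it is not the PS#3 datatype), environments are left out for brevity, and Fail_e is an invented stand-in for a raise expression.

```sml
(* Hypothetical mini-AST, with no variables so no environment is needed. *)
datatype exp =
    Int_e of int
  | Bool_e of bool
  | Greater_e of exp * exp
  | Fail_e of string                  (* stands in for raise Fail msg *)
  | Ifnot_e of exp * exp * exp        (* ifnot(e1,e2,e3) *)

datatype value = Int_v of int | Bool_v of bool

exception TypeError of string

fun eval (e : exp) : value =
  case e of
    Int_e n => Int_v n
  | Bool_e b => Bool_v b
  | Greater_e (a, b) =>
      (case (eval a, eval b) of
         (Int_v x, Int_v y) => Bool_v (x > y)
       | _ => raise TypeError "> expects integers")
  | Fail_e msg => raise Fail msg
  | Ifnot_e (e1, e2, e3) =>
      (* The interpreter, not the callee, decides what gets evaluated:
         only one of e2 and e3 is ever passed to eval. *)
      (case eval e1 of
         Bool_v false => eval e2
       | Bool_v true => eval e3
       | _ => raise TypeError "ifnot expects a boolean test")

(* ifnot(1 > 2, 25, raise Fail "can't get here") *)
val result =
  eval (Ifnot_e (Greater_e (Int_e 1, Int_e 2),
                 Int_e 25,
                 Fail_e "can't get here"))
```

Here result is Int_v 25 and no exception is raised, because the third subexpression is never evaluated.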
If we try to use the SML let special form instead, we cannot do this:
let val x = z + 3 in let val z:int = 39 in x end end
If z is not defined at the point where x is bound, this can't be evaluated. If z is already bound to another value, for instance 10, then the answer will be 13, which is not what we wanted.
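The contrast between letsubst and let can be sketched in a toy interpreter. Everything here is an assumption made for illustration: the constructor names are hypothetical, values are integers only, and subst is a deliberately naive textual substitution, so that a let inside the body can capture variables of the substituted expression, which is the behavior letsubst is meant to have.

```sml
(* Hypothetical mini-AST; not the PS#3 datatype. *)
datatype exp =
    Int_e of int
  | Var_e of string
  | Plus_e of exp * exp
  | Let_e of string * exp * exp
  | Letsubst_e of string * exp * exp   (* letsubst(var,e1,e2) *)

type env = (string * int) list

exception Unbound of string

fun lookup (en : env) (id : string) : int =
  case en of
    [] => raise Unbound id
  | (id', v) :: rest => if id = id' then v else lookup rest id

(* Replace occurrences of id in body by the unevaluated expression e.
   Naive on purpose: binders in body may capture variables of e. *)
fun subst (id : string) (e : exp) (body : exp) : exp =
  case body of
    Int_e _ => body
  | Var_e id' => if id = id' then e else body
  | Plus_e (a, b) => Plus_e (subst id e a, subst id e b)
  | Let_e (id', e1, e2) =>
      Let_e (id', subst id e e1,
             if id = id' then e2 else subst id e e2)
  | Letsubst_e (id', e1, e2) =>
      Letsubst_e (id', subst id e e1,
                  if id = id' then e2 else subst id e e2)

fun eval (e : exp) (en : env) : int =
  case e of
    Int_e n => n
  | Var_e id => lookup en id
  | Plus_e (a, b) => eval a en + eval b en
  | Let_e (id, e1, e2) => eval e2 ((id, eval e1 en) :: en)
  | Letsubst_e (id, e1, e2) =>
      eval (subst id e1 e2) en        (* e1 is not evaluated here *)

val zPlus3 = Plus_e (Var_e "z", Int_e 3)
val body = Let_e ("z", Int_e 39, Var_e "x")

(* letsubst(x, z+3, let z = 39 in x end), with z unbound outside *)
val viaSubst = eval (Letsubst_e ("x", zPlus3, body)) []

(* the same shape with an ordinary let, and z bound to 10 outside *)
val viaLet = eval (Let_e ("x", zPlus3, body)) [("z", 10)]
```

With these definitions viaSubst is 42 and viaLet is 13. Evaluating the ordinary let in the empty environment raises Unbound "z", matching the first case described above.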