An ML Interpreter

The kinds of semantics we have looked at are **operational semantics**:
descriptions of how to evaluate programs (there are other kinds of semantics,
such **axiomatic semantics**, which tell you how to prove statements about
programs). There is even more than one way to specify an operational semantics
for a given programming language. We have been exploring a particular
operational model of evaluation called the **substitution
model**. The key idea of the substitution model is that when a variable is
bound to a value (by pattern-matching), the value is substituted in place of all
occurrences of the variable that are bound by the pattern in question.

In a functional language, we can think of the execution of the program as a
series of **rewrite** steps applied to the program text. This is also how we
usually think about the evaluation of an arithmetic expression. For example, if
we see the expression (2+3)*4+3*4, we know that it evaluates in four steps:

(2+3)*4+3*4 -> 5*4+3*4 -> 20+3*4 -> 20+12 -> 32

In each step, we take some part of the expression and replace it with a new
expression. For example, in the first step we replace 2+3 with 5. Thus, each
rewrite step acts locally to replace a subexpression with its value. These local
rewritings are called **reductions**.

Sometime there are several rewrite steps we can choose for a given
expression; these different choice lead to different** evaluation orders**.
There are actually several possible evaluation orders for this expression; for
example, here is a different one:

(2+3)*4+3*4 -> (2+3)*4+12 -> 5*4+12 -> 20+12 -> 32

It doesn't matter what order we evaluate things in; we always get the same result regardless. This will also be true for SML as long as we stick to functional language features (that is, stay away from imperative features such as refs, arrays, :=, etc.) One benefit of functional programming is precisely that the result of evaluating an expression is always the same; it does not depend on the order of evaluation and it is always the same no matter how many times it is evaluated.

Here are some examples of simple SML evaluations:

#2(2+3*4, false) -> #2(2+12, false) -> #2(14, false) -> false false::(false orelse true)::nil -> false::true::nil

These evaluations use various reductions that are part of SML. For example,
there are lots of arithmetic reductions of the form

, In addition there are
reductions on tuples; as seen in the first example, we have a reduction*v*_{1} **op**
*v*_{2}->*v*_{3}

#i(v_{1},...,v) ->_{n}v<=_{i }(where 1i<=n)

Every SML expression form has its own reductions. For example, the `if`

..`then`

..`else`

expression has two reductions that capture the essential computational behavior
of the expression:

if true thene_{1}elsee_{2}->e_{1}if false thene_{1}elsee_{2}->e_{2}

When does the program stop? In arithmetic, it's when we reach a number,
because there are no further steps to take. In general, we have some set
of expressions in the programming language that can't be evaluated any further;
we call these expressions **values**. Values are things that you can type at
the SML prompt and get the same thing right back. For example, in SML, the
following are values:

1 true "hello" (true, "5", 1) fn(x:int) => x 5::4::nil (=[5,4])

The following expressions are not values, because an evaluation step can be performed on them:

```
1+2
true orelse false
(true, "5", 0+1)
(fn(x:int) => x) (3)
```

We can write a **BNF grammar** for values

, just as
we did earlier for expressions:*v*

c::=integer_const|bool_const|string_const | real_const|char_const

v::=c(* constants *) | (v1,...,vn) (* tuples of values *) | (fn (id:t):t' =>e) (* anonymous functions *) | {id1=v1, ...,idn=vn} (* records of values *) |Id|Id(v) (* data constructors *)

Anything described by this grammar is a value and thus a legal result of an
SML program. In other words, any tuple whose elements are values is a value
itself; any records whose fields are bound to values is a value, any data
constructor applied to a value is also a value, and any anonymous function is a
value—even if its body is an arbitrary expression *e*. In other words,
the body of a function is not evaluated at all until it is applied to an
argument.

How do we know that a program will always reach a value? Actually, we don't.
A program might go into an infinite loop. But no matter how long the program
executes, as long as it hasn't reached a value there will always be a reduction
to perform. For example, we'll never have to apply a reduction to `#`

...*i*(*v*_{1}*,*

where *,v _{n}*)

*i* > *n*

. The SML type checker ensures that
this and other bad things will never happen. This is what it means to say that
SML is type-safe.Of course, SML is quite a bit more complicated than 3rd-grade arithmetic. The
biggest difference is that in SML expressions can contain variables: *names*
that are bound to *values*. In the substitution model we handle variables
by *substituting* for them using the values to which they are bound. For
example, the expression `let val x=2 in x+3 end`

is evaluated by taking its right-hand side, `x+3`

, and substituting
all occurrences of `x`

with the value to which it bound, `2`

.
Therefore, it steps to `2+3`

and then to `5`

. In general,
an expression of the form `let val `

is evaluated by replacing it with *x*=*v* in *e'*
end

, but with
occurrences of *e'*

replaced by *x*

. We
denote the result of this substitution as *v*

;
that is, there is a reduction*e'***{***v/x***}**

`let val `

*x*=*v* in *e'*

Here are some examples of substitution:

x{true/x}= true x{true/y}= x (x+(2*x)){1/x}= 1 + (2*1) (x + let val x = 1 in x end){2/x}= (2 + let val x = 1 in x end) (fn x: int => x+1)(#1 x){(3,"three")/x} =(fn x: int => x+1)(#1 (3,"three"))

Occurrences of a variable in an expression can be either **bound***, ***unbound**,
or **binding** occurrences. For example, in the expression x+3, the variable
x is unbound: its meaning is not defined by the expression. In the expression ```
x + let val
x = 1 in x+3 end
```

, the first occurrence of `x`

is unbound; the second is a binding occurrence that binds `x`

to the
value 1 throughout the body of the let expression. The third occurrence is a
bound occurrence because it occurs within the **scope **of the second,
binding occurrence.

The last two substitution examples illustrate an important point: when we
substitute for some variable *x*, we don't
replace the binding or bound occurrences of *x*,
because that variable is really a different variable despite having the same
name.

We can also use substitution to explain the action of a function invocation. An expression of the form

`(fn(x: t) => e) (v)`

reduces to

e{v/x}

That is, we take the body of the function and replace all unbound occurrences
of

(which must have been bound by the binding occurrence
in the argument list) with the actual argument value *x** v*.

What about named functions? A declaration of the form

funf(x:t):t'=e

is mostly just syntactic sugar for the declaration

valf= fn(x:t) =>e

(it isn't completely syntactic sugar because a named function can refer to itself recursively. But that's another story.) So we can understand the evaluation of calls to non-anonymous functions as using the same rule that anonymous functions do. Here's an example:

let val y = 3 in fun f(x:int):int = x*y in f(2+y) end -> (let reduction) let fun f(x:int):int = x*3 in f(2+3) end -> (let reduction) (fn(x:int):int => x*3)(2+3) -> (+ reduction) (fn(x:int):int => x*3)(5) -> (fn application reduction) 5*3 -> (* reduction) 8

The other thing we have to keep in mind is that we can't perform reductions just anywhere. Each SML expression imposes some order on the evaluation of its subexpressions. For example, no reductions can be performed on the body of a let expression until all of its declarations have been evaluated and the results substituted into the body. Similarly, no evaluations are performed

When we talk about language semantics, we first need to say what it is we are defining the semantics of; that is, what is our representation of a "program". One obvious representation is the stream of bytes that are the ASCII codes for the characters in the program. However, this representation is not convenient for talking about language semantics.

Early in the course we commented on a similarity between BNF declarations and
`datatype`

declarations. In fact, we can define datatype declarations
that act like the corresponding BNF declarations. The values of these datatypes
then represent legal expressions that can occur in the language. For example,
our earlier BNF definition of legal SML types

(base types) b::= int | real | string | bool | char(types)t::=b|t_{1}->t_{2}|t_{1}*t_{2}*...*t| {_{n}id_{1}:t_{1},...,id:_{n}t}_{n}|id

has the same structure as the following datatype declarations:

type id = string datatype baseType = Int | Real | String | Bool | Char datatype type_ = Base of baseType | Arrow of type_*type_ | Product of type_ List | Record of (id*type_) List | DatatypeName of id

Any legal SML type expression can be represented by a value of type `type_`

that contains all the information in the type expression. This value is known as
the *abstract syntax* for that expression. It is abstract, because it
doesn't contain any information about the actual symbols used to represent the
expression in the program. For example, the abstract syntax for the (type)
expression `int*bool->{name: string}`

would be

Arrow( Product(Cons(Base Int, Cons(Base Bool, Nil))), Record(Cons(("name", Base String), Nil)))

It will be convenient to draw abstract syntax as trees. For example, the
expression above has the following **abstract syntax tree (AST)**:

`->`

is written where `Arrow`

could be as easily and as correctly written
instead.
Compilers typically use abstract syntax internally to represent the program
that they are compiling, and we can also use it to talk about operational
semantics. Inside a compiler it is the job of the **parser** to convert the
string-of-characters representation of the program into the abstract syntax.
Parsers can be built mostly automatically by giving the BNF grammar for the
language to an **parser generator**. To learn how parser generators work,
take CS 412!

Now that we have a representation of an SML program as a data structure, we
have the opportunity to precisely define the semantics of SML by writing a **definitional
interpreter**. An **interpreter** is a program that accepts as input
another program written in some language, and executes that program (or
simulates its execution, depending on your viewpoint). A definitional
interpreter is an interpreter written for the purpose of describing the
semantics of a programming language. Since its purpose is to help us understand
what SML programs are supposed to do, we will put a premium on clarity and worry
less about performance issues here. However, it is possible to produce a
reasonably fast interpreter using the basic approach shown here.

Below is a definitional interpreter for a subset of SML. Here are some things to notice about this interpreter:

- The first part of the code is a definition of the abstract syntax of the
simplified language. Because values and expressions overlap, we can't define
values as a separate datatype; therefore, there is a function
`is_value`

that figures out whether an expression is a value according to the rules above. `toString`

is a helpful function that prints out expressions in a more readable form than the AST it accepts as input. It isn't really part of the interpreter, though.`subst`

explains how substitution is done. Notice that the rules for substituting variables in`let`

and`fn`

expressions only substitute into the bodies of these expressions if the variable being substituted is not bound by the expression.`eval_binop`

implements all the reductions for primitive types.- The function
`eval`

takes an expression AST as input and gives the value that the expression evaluates to.- For possible kind of expression, it (1) recursively evaluates any
subexpressions, (2) performs the appropriate reduction for the resulting
expression and (3) applies
`eval()`

to finish the evaluation of the new reduced expression. - So each expression has some core code that performs the reduction, plus some code around it that specifies the order in which things should be evaluated.
- Since we have no type checker yet to make sure that there is always a legal reduction to be performed, the interpreter does run-time checking. For example, if a raw, unsubstituted variable shows up during evaluation, it must have been unbound in the original program because it was never substituted for by a containing expression that bound it.
- Evaluation is split into two functions
`eval`

and`eval'`

so that the interpreter not only reports the final result of evaluation, but also reports each intermediate step along the way.

- For possible kind of expression, it (1) recursively evaluates any
subexpressions, (2) performs the appropriate reduction for the resulting
expression and (3) applies

In Problem Set 5, you will be building an interpreter for a language that is not too different from ML, except that it is a concurrent language. Like this interpreter, your interpreter will have to implement reductions. Unlike this interpreter, your evaluator will only take one evaluation step at a time. This will be necessary in order to simulate the execution of multiple concurrent processes. So this interpreter is in some ways a good model of your code for problem set 5, but not in others.

code/interp1.sml | ||

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 |
(* A simple interpreter for a subset of SML *) (* we represent variables as strings *) type var = string (* various constants *) datatype const = Int of int | Real of real | String of string | Bool of bool (* various binary operations -- a good exercise is to add * more operations such as Equals or Divide ... *) datatype binop = Plus | Times | Minus | Concat | Lte (* how we represent expressions *) datatype exp = Const of const (* constants: 3, "foo", true, etc. *) | Binop of exp * binop * exp (* binary operations: e1 + e2, etc. *) | Var of var (* variables: x *) | Fn of var * exp (* functions: fn x => e *) | App of exp * exp (* applications: e1(e2) *) | Tuple of exp list (* tuples: (e1,e2,...,en) *) | Ith of int * exp (* projection: #i e *) | Let of var * exp * exp (* let: let val x = e1 in e2 *) | If of exp * exp * exp (* if: if e1 then e2 else e3 *) (* is_value(e) is whether e is a value. A value is a fully * evaluated expression. Only constants, functions, and tuples * of values are values in this subset of ML *) fun is_value(e:exp):bool = case e of Const _ => true | Binop(_,_,_) => false | Var _ => false | Fn(_,_) => true | App(_,_) => false | Tuple(es) => List.all is_value es | Ith(_,_) => false | Let(_,_,_) => false | If(_,_,_) => false (* Convert an expression to a string for display purposes *) fun toString (precedent:int) (e:exp):string = let val new_precedent = case e of Const _ => 100 | Var _ => 100 | Binop(_,Plus,_) => 4 | Binop(_,Minus,_) => 5 | Binop(_,Times,_) => 6 | Binop(_,Concat,_) => 4 | Binop(_,Lte,_) => 7 | Fn(_,_) => 1 | App(_,_) => 2 | Tuple(_) => 100 | Ith(i,e) => 2 | Let(_,_,_) => 100 | If(e1,e2,e3) => 3 val s = case e of Const(Int i) => Int.toString i | Const(Real r) => Real.toString r | Const(String s) => "\""^s^"\"" | Const(Bool true) => "true" | Const(Bool false) => "false" | Binop(e1,b,e2) => let val bs = case b of Plus => "+" | Times => "*" | Minus => "-" | Concat => "^" | Lte => "<=" in (toString new_precedent e1)^" "^bs^" "^(toString new_precedent e2) end | Var x => x | Fn(x,e1) => "fn "^x^" => "^(toString new_precedent e1) | App(e1,e2) => (toString new_precedent e1) ^ " " ^ (toString 100 e2) | Tuple(es) => let val ss = map (toString 0) es fun sep(ss) = case ss of [] => [")"] | [x] => [x,")"] | hd::tl => hd::","::(sep tl) in String.concat("("::(sep ss)) end | Ith(i,e1) => "#" ^ (Int.toString i) ^ " " ^ (toString new_precedent e1) | Let(x,e1,e2) => "let val "^x^" = "^(toString 0 e1)^" in "^(toString 0 e2)^" end" | If (e1,e2,e3) => "if "^(toString 0 e1)^" then "^(toString 0 e2)^" else "^ (toString new_precedent e3) in if (new_precedent > precedent) then s else "("^s^")" end (* convenient function for printing an expression *) fun print_exp(e:exp):unit = print (toString 0 e) (* pause for user input *) fun pause() = (TextIO.inputLine TextIO.stdIn; ()) (* raised when an error occurs during evaluation *) exception Eval_Error of string (* substitute v for x within e *) fun subst(v:exp, x:var, e:exp):exp = case e of Const _ => e | Binop(e1,b,e2) => Binop(subst(v,x,e1), b, subst(v,x,e2)) | Var y => if (x = y) then v else Var y | Fn(y,e1) => if (x = y) then Fn(y,e1) else Fn(y,subst(v,x,e1)) | App(e1,e2) => App(subst(v,x,e1), subst(v,x,e2)) | Tuple(es) => Tuple(map (fn e => subst(v,x,e)) es) | Ith(i,e1) => Ith(i,subst(v,x,e1)) | Let(y,e1,e2) => if (x = y) then Let(y,subst(v,x,e1),e2) else Let(y,subst(v,x,e1),subst(v,x,e2)) | If(e1,e2,e3) => If(subst(v,x,e1),subst(v,x,e2),subst(v,x,e3)) (* a version of substitute that prints out what's happening *) fun substitute(v:exp, x:var, e:exp):exp = (print "Substituting "; print_exp v; print " for "; print x; print " in "; print_exp e; print "\n"; pause(); subst(v,x,e)) (* given a binary operation b and two constants c1 and c2, returns * the constant we get when we perform the operation on the constants *) fun eval_binop(b:binop, c1:const, c2:const):const = case (b, c1, c2) of (Plus, Int i, Int j) => Int(i+j) | (Plus, Real r, Real s) => Real(r+s) | (Times, Int i, Int j) => Int(i*j) | (Times, Real r, Real s) => Real(r*s) | (Minus, Int i, Int j) => Int(i-j) | (Minus, Real r, Real s) => Real(r-s) | (Lte, Int i, Int j) => Bool(i <= j) | (Lte, Real r, Real s) => Bool(r <= s) | (Concat, String s, String t) => String(s^t) | (_,_,_) => raise Eval_Error("type mismatch for binop") (* eval'(e) is the result of evaluating e until we get a value *) fun eval'(e:exp):exp = case e of Const c => Const c | Binop (e1, b, e2) => let val v1 = eval e1 val v2 = eval e2 in case (v1, v2) of (Const c1, Const c2) => Const(eval_binop(b, c1, c2)) | _ => raise Eval_Error("bad binop expression") end | Var x => raise Eval_Error("unbound variable "^x) | Fn(x,e1) => Fn(x,e1) | App(e1,e2) => let val v1 = eval e1 val v2 = eval e2 in case v1 of Fn(x,e) => eval(substitute(v2, x, e)) | _ => raise Eval_Error("attempt to apply a non-function") end | Tuple(es) => Tuple(map eval es) | Ith(i,e) => let val v = eval e in case v of Tuple(vs) => if i < 1 orelse i > (length vs) then raise Eval_Error("tuple index out of bounds") else List.nth(vs,i-1) | _ => raise Eval_Error("attempt to project from non-tuple") end | Let(x,e1,e2) => let val v1 = eval e1 in eval(substitute(v1, x, e2)) end | If(e1,e2,e3) => let val v1 = eval e1 in case v1 of Const(Bool true) => eval e2 | Const(Bool false) => eval e3 | _ => raise Eval_Error("attempt to do if on non-bool") end (* a wrapper for evaluation that prints out the expression being * evaluated, pauses for user input, evaluates the expression * (using eval') and then prints out the result. *) and eval(e:exp):exp = if is_value e then e else (print "Evaluating: "; print_exp e; print "\n"; pause(); let val v = eval'(e) in print "Result of "; print_exp e; print "\n is "; print_exp v; print "\n\n"; v end) (************************************************************) (* Some sample expressions to evaluate *) (************************************************************) (* 300+12 *) val e1 = Binop(Const(Int 300), Plus, Const(Int 12)) (* fn x => x+1 *) val e2 = Fn("x", Binop(Var "x", Plus, Const(Int 1))) (* (fn x => x+1)(300+12) *) val e3 = App(e2,e1) (* fn x => x + ((fn x => x+1) 3) *) val e4 = Fn("x", Binop(Var "x", Plus, App(e2, Const(Int 3)))) (* (fn x => x + ((fn x => x+1) 3)) 4 *) val e5 = App(e4, Const(Int 4)) (* (fn x => (fn y => x + y)) *) val e6 = Fn("x", Fn("y", Binop(Var "x", Plus, Var "y"))) (* (fn x => (fn y => x + y)) 1 *) val e7 = App(e6, Const(Int 1)) (* (fn f => f(f(3))) (fn x => (fn y => x + y)) 1 *) val e8 = App(Fn("f",App(Var "f", App(Var "f", Const(Int 3)))), e7) (* let val f = (fn x => (fn y => x +y)) 1 * in * f(f(3)) *) val e9 = Let("f",e7,App(Var "f", App(Var "f", Const(Int 3)))) (* (3,4) *) val e10 = Tuple[Const(Int 3), Const(Int 4)] (* (fn x => (#2 x, #1 x)) *) val e11 = Fn("x",Tuple[Ith(2,Var "x"), Ith(1,Var "x")]) (* let val swap = (fn x => (#2 x, #1 x)) * in * swap(3,4) * end *) val e12 = Let("swap", e11, App(Var "swap", e10)) (* let val max = * fn x => if #1 x <= #2 x then #2 x else #1 x * in * max (3,4) * end *) val e13 = Let("max", Fn("x",If(Binop(Ith(1,Var "x"),Lte,Ith(2,Var "x")), Ith(2,Var "x"), Ith(1,Var "x"))), App(Var "max", e10)) (* This example shows something going wrong...what's wrong with * this code? * * let val fact = * fn n => if n <= 1 then 1 else n * fact(n-1) * * in * fact(3) * end *) val e14 = Let("fact", Fn("n",If(Binop(Var "n",Lte,Const(Int 1)), Const(Int 1), Binop(Var "n",Times, App(Var "fact", Binop(Var "n",Minus,Const(Int 1)))))), App(Var "fact", Const(Int 3))) (* Print expressions in full at SML prompt *) val dummy = (Compiler.Control.Print.printDepth := 1000) |

The language above doesn't support datatypes or pattern matching. Here is a definitional interpreter based on the substitution model that does support pattern matching.

code/interp2.sml | ||

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 |
(* Need to load the library -- change this path to point to your* version of SML *) CM.make' "/sml/lib/smlnj-lib.cm"; val print_expressions = ref true val print_substitutions = ref true val skip_values = ref true (* Some utility functions *) (* zip [a,b,c] [d,e,f] -> [(a,d), (b,e), (c,f) ] *) exception Zip fun zip (xs:'a list) (ys:'b list) : ('a * 'b) list = (case (xs, ys) of ([],[]) => [] | (x::xrest, y::yrest) => (x,y)::(zip xrest yrest) | _ => raise Zip) (* pause for user input *) fun pause() = (TextIO.inputLine TextIO.stdIn; ()) (* sort used for record expressions and patterns *) fun sort r = ListMergeSort.sort (fn ((x:string,_),(y:string,_)) => x < y) r (* error reporting *) exception Error of string fun error(s:string):'a = raise (Error s) (*************************************************************************) (* we represent identifiers (variables) as strings *) type id = string (* constants *) datatype const = Int of int | Real of real | String of string | Char of char (* binary operations *) datatype binop = Plus | Times | Minus | Equal | Concat (* expressions *) datatype exp = Const of const (* 3, 2.178, "foo", #"c", etc. *) | Id of id (* variables *) | Fn of (id*exp) (* anonymous functions: fn id => exp *) | App of (exp*exp) (* function application: exp1(exp2) *) | Binop of (exp*binop*exp) (* binary operations: eg., exp1 + exp2 *) | Tuple of (exp list) (* tuples: (3,"foo",true) *) | Ith of (int*exp) (* tuple projection: #i exp *) | Record of ((id*exp)list) (* records: {name="Greg", age=100} *) | Field of (id*exp) (* record projection: #name exp *) | DataCon of (id*(exp option))(* data constructors: true, false, nil, * NONE, SOME(exp), NODE{left=e1,elt=x,right=e2}*) | Case of (exp*(pat*exp)list)(* case: case exp of pat1=>e1 | ... | patn=>en *) | Let of (decl*exp) (* let: let decl in exp *) | Fun of (id*id*exp) (* recursive functions: fun f(x)=exp *) (* declarations *) and decl = Val_d of (pat*exp) (* val pat = exp *) | Fun_d of (id*id*exp) (* fun f(x) = exp *) (* patterns *) and pat = Wild_p (* wildcard: _ *) | Id_p of id (* variable: x *) | Const_p of int (* constant: 3 *) | DataCon_p of (id*pat option) (* data constructor: true, false, nil, * NONE, SOME(pat), NODE{left=p1,elt=p2,right=p3}*) | Tuple_p of pat list (* tuple patterns: (pat1,...,patn) *) | Record_p of (id*pat) list (* record patterns: {field1=pat1,...,fieldn=patn}*) (* returns true iff the expression is a value *) fun is_value (e:exp):bool = case e of Const _ => true | Fn _ => true | Tuple (es) => List.all is_value es | Record(ides) => List.all (fn (_,e) => is_value e) ides | DataCon(_,NONE) => true | DataCon(_,SOME(e)) => is_value e | _ => false (***************************************) (* Functions for printing out the code *) (***************************************) (* insert a separator between each element of a string list and * concatenate the whole thing. *) fun sep (s:string) (lis:string list) : string = let fun f lis = case lis of [] => [] | [x] => [x] | hd::tl => hd::s::(f tl) in List.foldr (op ^) "" (f lis) end (* convert a constant to a string *) fun const2s (c:const):string = case c of Int(i) => Int.toString i | Real(r) => Real.toString r | String(s) => "\"" ^ s ^ "\"" | Char(c) => "#\"" ^ (Char.toString c) ^ "\"" (* convert a binary operation to a string *) fun binop2s (b:binop):string = case b of Plus => "+" | Minus => "-" | Times => "*" | Concat => "^" | Equal => "=" (* maximum precedence *) val max_prec = 999; (* precedence for binary operations *) fun binop_prec(b) = case b of Plus => 7 | Times => 8 | Minus => 7 | Concat => 6 | Equal => 5 (* precedence for expressions *) fun prec(e:exp):int = case e of Const(c) => max_prec | Id(x) => max_prec | Fn(id,e) => 1 | App(e1,e2) => 2 | Binop(e1,b,e2) => binop_prec(b) | Tuple(es) => max_prec | Ith(i,e) => 2 | Record(ides) => max_prec | Field(x,e) => 2 | DataCon(x,NONE) => 2 | DataCon(x,SOME(e)) => 2 | Case(e,cases) => 1 | Let(d,e) => max_prec | Fun(f,x,e) => 1 (* convert an expression to a string *) fun exp2s (p:int) (e:exp):string = let val p' = prec(e) val e2s = exp2s p' val s = case e of Const(c) => const2s c | Id(x) => x | Fn(id,e) => "fn "^id^" => "^(e2s e) | App(e1,e2) => (e2s e1)^" "^(e2s e2) | Binop(e1,b,e2) => (e2s e1)^(binop2s b)^(e2s e2) | Tuple(es) => "("^(sep "," (List.map (exp2s 0) es))^")" | Ith(i,e) => "#"^(Int.toString i)^" "^(e2s e) | Record(ides) => ("{"^(sep "," (List.map (fn (x,e) => x^"="^(exp2s 0 e)) ides))^ "}") | Field(x,e) => "#"^x^" "^(e2s e) | DataCon(x,NONE) => x | DataCon(x,SOME(e)) => (case e of Tuple _ => (x ^ (e2s e)) | _ => (x ^ " " ^ (e2s e))) | Case(e,cases) => "case "^(exp2s 0 e)^" of "^ (sep " | " (List.map case2s cases)) | Let(d,e) => "let "^(decl2s d)^" in "^(exp2s 0 e)^" end" | Fun(f,x,e) => "fun "^f^"("^x^") = "^(exp2s 0 e) in if (p' > p) then s else "("^s^")" end (* convert a declaration to a string *) and decl2s (d:decl):string = case d of Val_d(p,e) => "val "^(pat2s p)^" = "^(exp2s 0 e) | Fun_d(f,x,e) => "fun "^f^"("^x^") = "^(exp2s 0 e) (* convert a pattern to a string *) and pat2s (p:pat):string = case p of Wild_p => "_" | Id_p(x) => x | Const_p(i) => Int.toString i | DataCon_p(id,NONE) => id | DataCon_p(id,SOME(p)) => id^" "^(pat2s p) | Tuple_p(ps) => "("^(sep "," (List.map pat2s ps))^")" | Record_p(idps) => "{"^(sep "," (List.map (fn (x,p) => x^"="^(pat2s p)) idps))^"}" (* convert a case expression to a string *) and case2s (p:pat,e:exp):string = (pat2s p)^" => "^(exp2s 0 e) (* print out an expression *) fun print_exp(e:exp):unit = print(exp2s 0 e); print "\n"; (* example data constructors true and false *) val True = DataCon("true",NONE); val False = DataCon("false",NONE); (* apply a binary operation to two constants *) fun apply_binop(b:binop,c1:const,c2:const):exp = case (b,c1,c2) of (Plus,Int i,Int j) => Const(Int(i+j)) | (Plus,Real i,Real j) => Const(Real(i+j)) | (Times,Int i,Int j) => Const(Int(i*j)) | (Times,Real i,Real j) => Const(Real(i*j)) | (Minus,Int i,Int j) => Const(Int(i-j)) | (Minus,Real i,Real j) => Const(Real(i-j)) | (Concat,String s1,String s2) => Const(String(s2 ^ s2)) | (Equal,Int i,Int j) => if i = j then True else False | (Equal,String i,String j) => if i = j then True else False | (Equal,Char i,Char j) => if i = j then True else False | (_,_,_) => error("bad binop application") (* raised when a pattern match fails *) exception MatchFail (* a substitution is a list of variables and their associated expressions *) type substitution = (id * exp) list (* match the value v against the pattern p to get a substitution *) fun match(v:exp,p:pat):substitution = case (v,p) of (_, Wild_p) => [] | (_, Id_p(x)) => [(x,v)] | (Const(Int i), Const_p j) => if (i = j) then [] else raise MatchFail | (DataCon(id,NONE), DataCon_p(id',NONE)) => if (id = id') then [] else raise MatchFail | (DataCon(id,SOME(v')), DataCon_p(id',SOME(p'))) => if (id = id') then match(v',p') else raise MatchFail | (Tuple(vs), Tuple_p(ps)) => List.foldr (op @) [] (map match (zip vs ps)) | (Record(idvs), Record_p(idps)) => let val idps = sort idps in List.foldr (op @) [] (map (fn ((_,v),(_,p)) => match(v,p)) (zip idvs idps)) end | (_, _) => raise MatchFail (* return the list of variables that occur in a pattern *) fun pat_vars(p:pat):id list = case p of Wild_p => [] | Id_p(x) => [x] | Const_p(_) => [] | DataCon_p(_,NONE) => [] | DataCon_p(_,SOME(p')) => pat_vars p' | Tuple_p(ps) => List.foldr (op @) [] (map pat_vars ps) | Record_p(idps) => List.foldr (op @) [] (map (fn (_,p) => pat_vars p) idps) (* substitute the value v for the variable x within the expression e *) fun subst(s as (x:id,v:exp),e:exp):exp = case e of Const _ => e | Id(y) => if (x = y) then v else e | Fn(y,e) => if (x = y) then e else Fn(y,subst(s,e)) | App(e1,e2) => App(subst(s,e1),subst(s,e2)) | Binop(e1,b,e2) => Binop(subst(s,e1),b,subst(s,e2)) | Tuple(es) => Tuple(List.map (fn e => subst(s,e)) es) | Ith(i,e) => Ith(i,subst(s,e)) | Record(ides) => Record(List.map (fn (lab,e) => (lab,subst(s,e))) ides) | Field(id,e) => Field(id,subst(s,e)) | DataCon(id,NONE) => e | DataCon(id,SOME(e)) => DataCon(id,SOME(subst(s,e))) | Case(e,cases) => Case(subst(s,e),List.map (subst_case s) cases) | Let(Val_d(p,e1),e2) => let val pvs = pat_vars(p) val d = Val_d(p,subst(s,e1)) in if List.exists (fn y => y = x) pvs then Let(d,e2) else Let(d,subst(s,e2)) end | Let(Fun_d(f,y,e1),e2) => if (x = f) then e else if (x = y) then Let(Fun_d(f,y,e1),subst(s,e2)) else Let(Fun_d(f,y,subst(s,e1)),subst(s,e2)) | Fun(f,y,e') => if (x = f) orelse (x = y) then e else Fun(f,y,subst(s,e')) (* substitute v for x within a case *) and subst_case (s as (x:id,v:exp)) (p:pat,e:exp) : pat*exp = if List.exists (fn y => y = x) (pat_vars p) then (p,e) else (p,subst(s,e)) (* substitute a substitution (list of variables and associated values) within * an expression *) fun substitute(S:substitution,e:exp):exp = (* print out the substitution *) (if (!print_substitutions) then (print "\nsubstituting ["; print (sep "," (List.map (fn (x,e) => "("^x^","^(let val s = exp2s 0 e in if (size s) > 15 then (String.substring(s,0,14)) ^ "..." else s end)^")") S)); print "]\n within "; print_exp e; print "\n"; pause()) else (); List.foldr subst e S) (* evaluate the expression e to get a value *) fun eval'(e:exp):exp = case e of Const(c) => Const(c) | Fn(x,e') => Fn(x,e') | Id(x) => error("Id: unbound variable "^x) | App(e1,e2) => let val v1 = eval e1 val v2 = eval e2 in case v1 of Fn(x,e') => eval(substitute([(x,v2)],e')) | _ => error("App: not a function") end | Binop(e1,b,e2) => (case (eval e1,eval e2) of (Const c1, Const c2) => apply_binop(b,c1,c2) | _ => error("Binop: arguments not constants")) | Tuple es => Tuple (map eval es) | Ith(i,e) => (case (eval e) of Tuple(vs) => List.nth(vs,i) | _ => error("Ith: not a tuple")) | Record lab_es => Record (sort (map (fn (lab,e) => (lab,eval e)) lab_es)) | Field(id,e) => (case (eval e) of Record(lab_vs) => (case List.find (fn (x,v) => x = id) lab_vs of SOME(_,v) => v | NONE => error("Field: record missing field "^id)) | _ => error("Field: not a record")) | DataCon(id,NONE) => DataCon(id,NONE) | DataCon(id,SOME(e)) => DataCon(id,SOME(eval e)) | Case(e,cases) => find_match(eval e,cases) | Let(Val_d(p,e1),e2) => let val v = eval e1 val S = match(v,p) in eval(substitute(S,e2)) end | Let(Fun_d(f,x,e1),e2) => eval(substitute([(f,Fun(f,x,e1))],e2)) | Fun(f,x,e) => eval (Fn(x,substitute([(f,Fun(f,x,e))],e))) (* print out the expression and evaluate it *) and eval(e:exp):exp = (* don't bother to evaluate expressions that are already values *) if (!skip_values andalso is_value e) then e else let val _ = if (!print_expressions) then (print("\nThe current expression being evaluated is: \n"); print_exp e; print "\n"; pause(); ()) else () val r = eval' e in (if (!print_expressions) then (print "the result of eval("; print_exp e; print ") is: \n"; print_exp r; print "\n\n"; pause()) else ()); r end (* given a list of cases from a case expression, try to match v * against the associated pattern. If this succeeds, then apply * the resulting substitution to the right-hand-side of the case * and evaluate it. Otherwise, go on to the next case. *) and find_match(v:exp,cases:(pat * exp) list):exp = case cases of [] => error("total match failure") | ((p,e)::rest) => (eval(substitute(match(v,p),e)) handle MatchFail => find_match(v,rest)) (**************************************************************************) (* Some examples to evaluate: swap_example, fact_example, append_example *) (**************************************************************************) val one = Const(Int(1)) val two = Const(Int(2)) val three = Const(Int(3)) val four = Const(Int(4)) fun plus(e1,e2) = Binop(e1,Plus,e2) (* let val swap = fn (p) => let val (x,y) = p in (y,x) end in swap(3+1,4+2) end *) val swap = Fn("p",Let(Val_d(Tuple_p[Id_p "x", Id_p "y"], Id "p"), Tuple[Id "y", Id "x"])) val swap_example = Let(Val_d(Id_p "swap",swap), App(Id "swap", Tuple[plus(three,one), plus(four,two)])) (* let fun fact(n) = if (n = 1) then 1 else n * fact(n-1) in fact 4 end *) fun ife e1 e2 e3 = Case(e1,[(DataCon_p("true",NONE),e2),(DataCon_p("false",NONE),e3)]); val fact = ("fact","n", ife (Binop(Id "n",Equal,one)) one (Binop(Id "n",Times,App(Id "fact",Binop(Id "n",Minus,one))))) val fact_example = Let(Fun_d(fact),App(Id "fact",Const(Int 4))) (* datatype intlist = Nil | Cons of (int * intlist) let fun append(p) = case p of (Nil, y) => y | (Cons(hd,tl), y) => Cons(hd,append(tl,y)) in append (Cons(1,Nil), Cons(2,Cons(3,Nil))) end *) fun cons(e1:exp,e2:exp):exp = DataCon("Cons",SOME(Tuple[e1,e2])) val nil_e:exp = DataCon("Nil",NONE) val nil_p:pat = DataCon_p("Nil",NONE) fun cons_p(p1:pat,p2:pat):pat = DataCon_p("Cons",SOME(Tuple_p[p1,p2])) fun pair_p(p1:pat,p2:pat):pat = Tuple_p[p1,p2] val append = ("append","p", Case(Id("p"),[(pair_p(nil_p,Id_p("y")), Id("y")), (pair_p(cons_p(Id_p("hd"),Id_p("tl")),Id_p("y")), cons(Id("hd"),App(Id("append"),Tuple[Id("tl"),Id("y")]))) ])) val append_example = Let(Fun_d(append),App(Id("append"), Tuple[cons(one,nil_e), cons(two,cons(three,nil_e))])) |