CS 312 Lecture 17:
An ML Interpreter

For the next few lectures, and in Problem Set 5, we will investigate programming languages more deeply. We have talked about what the various constructs of SML mean and how they are evaluated. We have seen that evaluation of SML programs can be described by rewrite rules that explain how to reduce SML subexpressions to other expressions. When we have fully described how to evaluate a program, we have obtained a semantics for the programming language. The word "semantics" means "meaning". A semantics for a programming language tells you how to determine what any program in that language means.

The kinds of semantics we have looked at are operational semantics: descriptions of how to evaluate programs (there are other kinds of semantics, such axiomatic semantics, which tell you how to prove statements about programs). There is even more than one way to specify an operational semantics for a given programming language. We have been exploring a particular operational model of evaluation called the substitution model. The key idea of the substitution model is that when a variable is bound to a value (by pattern-matching), the value is substituted in place of all occurrences of the variable that are bound by the pattern in question.

Evaluation

In a functional language, we can think of the execution of the program as a series of rewrite steps applied to the program text. This is also how we usually think about the evaluation of an arithmetic expression. For example, if we see the expression (2+3)*4+3*4, we know that it evaluates in four steps:

(2+3)*4+3*4 -> 5*4+3*4 -> 20+3*4 -> 20+12 -> 32

In each step, we take some part of the expression and replace it with a new expression. For example, in the first step we replace 2+3 with 5. Thus, each rewrite step acts locally to replace a subexpression with its value. These local rewritings are called reductions.

Sometime there are several rewrite steps we can choose for a given expression; these different choice lead to different evaluation orders. There are actually several possible evaluation orders for this expression; for example, here is a different one:

(2+3)*4+3*4 -> (2+3)*4+12 -> 5*4+12 -> 20+12 -> 32

It doesn't matter what order we evaluate things in; we always get the same result regardless. This will also be true for SML as long as we stick to functional language features (that is, stay away from imperative features such as refs, arrays, :=, etc.) One benefit of functional programming is precisely that the result of evaluating an expression is always the same; it does not depend on the order of evaluation and it is always the same no matter how many times it is evaluated.

Here are some examples of simple SML evaluations:

#2(2+3*4, false) -> #2(2+12, false) -> #2(14, false) -> false
false::(false orelse true)::nil -> false::true::nil

These evaluations use various reductions that are part of SML. For example, there are lots of arithmetic reductions of the form v1 op v2->v3, In addition there are reductions on tuples; as seen in the first example, we have a reduction

#i(v1,...,vn) -> vi       (where 1 <= i <= n)

Every SML expression form has its own reductions. For example, the if..then..else expression has two reductions that capture the essential computational behavior of the expression:

if true then e1 else e2 -> e1
if false then e1 else e2 -> e2

Expressions vs. Values

When does the program stop? In arithmetic, it's when we reach a number, because there are no further steps to take.  In general, we have some set of expressions in the programming language that can't be evaluated any further; we call these expressions values. Values are things that you can type at the SML prompt and get the same thing right back. For example, in SML, the following are values:

1
true
"hello"
(true, "5", 1)
fn(x:int) => x
5::4::nil         (=[5,4])

The following expressions are not values, because an evaluation step can be performed on them:

1+2
true orelse false
(true, "5", 0+1)
(fn(x:int) => x) (3)

We can write a BNF grammar for values v, just as we did earlier for expressions:

c ::= integer_const
    | bool_const
    | string_const
    | real_const
    | char_const
v ::= c                     (* constants *)
    | (v1,...,vn)           (* tuples of values *)
    | (fn (id:t):t' => e)   (* anonymous functions *)
    | {id1=v1, ..., idn=vn} (* records of values *)
    | Id  | Id(v)           (* data constructors *)

Anything described by this grammar is a value and thus a legal result of an SML program. In other words, any tuple whose elements are values is a value itself; any records whose fields are bound to values is a value, any data constructor applied to a value is also a value, and any anonymous function is a value—even if its body is an arbitrary expression e. In other words, the body of a function is not evaluated at all until it is applied to an argument.

How do we know that a program will always reach a value? Actually, we don't. A program might go into an infinite loop. But no matter how long the program executes, as long as it hasn't reached a value there will always be a reduction to perform. For example, we'll never have to apply a reduction to #i(v1,...,vn) where i > n. The SML type checker ensures that this and other bad things will never happen. This is what it means to say that SML is type-safe.

Variables

Of course, SML is quite a bit more complicated than 3rd-grade arithmetic. The biggest difference is that in SML expressions can contain variables: names that are bound to values. In the substitution model we handle variables by substituting for them using the values to which they are bound. For example, the expression let val x=2 in x+3 end is evaluated by taking its right-hand side, x+3, and substituting all occurrences of x with the value to which it bound, 2. Therefore, it steps to 2+3 and then to 5. In general, an expression of the form let val x=v in e' end is evaluated by replacing it with e', but with occurrences of x replaced by v. We denote the result of this substitution as e'{v/x}; that is, there is a reduction

let val x=v in e'

Here are some examples of substitution:

x{true/x} = true
x{true/y} = x
(x+(2*x)){1/x} = 1 + (2*1)
(x + let val x = 1 in x end){2/x} = (2 + let val x = 1 in x end)
(fn x: int => x+1)(#1 x){(3,"three")/x} = (fn x: int => x+1)(#1 (3,"three"))

Occurrences of a variable in an expression can be either bound, unbound, or binding occurrences. For example, in the expression x+3, the variable x is unbound: its meaning is not defined by the expression. In the expression x + let val x = 1 in x+3 end, the first occurrence of x is unbound; the second is a binding occurrence that binds x to the value 1 throughout the body of the let expression. The third occurrence is a bound occurrence because it occurs within the scope of the second, binding occurrence.

The last two substitution examples illustrate an important point: when we substitute for some variable x, we don't replace the binding or bound occurrences of x, because that variable is really a different variable despite having the same name.

We can also use substitution to explain the action of a function invocation. An expression of the form

(fn(x: t) => e) (v)

reduces to

e{v/x}

That is, we take the body of the function and replace all unbound occurrences of x (which must have been bound by the binding occurrence in the argument list) with the actual argument value v.

What about named functions? A declaration of the form

fun f(x: t):t' = e

is mostly just syntactic sugar for the declaration

val f = fn(x: t) => e

(it isn't completely syntactic sugar because a named function can refer to itself recursively. But that's another story.) So we can understand the evaluation of calls to non-anonymous functions as using the same rule that anonymous functions do. Here's an example:

let val y = 3 in
    fun f(x:int):int = x*y
in
  f(2+y)
end
->                           (let reduction)
let fun f(x:int):int = x*3
in
  f(2+3)
end
->                           (let reduction)
(fn(x:int):int => x*3)(2+3)
->                           (+ reduction)
(fn(x:int):int => x*3)(5)
->                           (fn application reduction)
5*3
->                           (* reduction)
8

Evaluation order

The other thing we have to keep in mind is that we can't perform reductions just anywhere. Each SML expression imposes some order on the evaluation of its subexpressions. For example, no reductions can be performed on the body of a let expression until all of its declarations have been evaluated and the results substituted into the body. Similarly, no evaluations are performed

Abstract syntax

When we talk about language semantics, we first need to say what it is we are defining the semantics of; that is, what is our representation of a "program". One obvious representation is the stream of bytes that are the ASCII codes for the characters in the program. However, this representation is not convenient for talking about language semantics.

Early in the course we commented on a similarity between BNF declarations and datatype declarations. In fact, we can define datatype declarations that act like the corresponding BNF declarations. The values of these datatypes then represent legal expressions that can occur in the language. For example, our earlier BNF definition of legal SML types

(base types)   b ::= int | real | string | bool | char
(types)        t ::= b | t1->t2 | t1*t2*...*tn | { id1 : t1,..., idn : tn } | id

has the same structure as the following datatype declarations:

type id = string
datatype baseType = Int | Real | String | Bool | Char
datatype type_ = Base of baseType | Arrow of type_*type_ | Product of type_ List
                 | Record of (id*type_) List | DatatypeName of id

Any legal SML type expression can be represented by a value of type type_ that contains all the information in the type expression. This value is known as the abstract syntax for that expression. It is abstract, because it doesn't contain any information about the actual symbols used to represent the expression in the program. For example, the abstract syntax for the (type) expression int*bool->{name: string} would be

Arrow( Product(Cons(Base Int, Cons(Base Bool, Nil))),
       Record(Cons(("name", Base String), Nil)))

It will be convenient to draw abstract syntax as trees. For example, the expression above has the following abstract syntax tree (AST):

In this diagram, the names of the nodes are not essential; for example, -> is written where Arrow could be as easily and as correctly written instead.

Compilers typically use abstract syntax internally to represent the program that they are compiling, and we can also use it to talk about operational semantics. Inside a compiler it is the job of the parser to convert the string-of-characters representation of the program into the abstract syntax. Parsers can be built mostly automatically by giving the BNF grammar for the language to an parser generator. To learn how parser generators work, take CS 412!

Definitional Interpreter

Now that we have a representation of an SML program as a data structure, we have the opportunity to precisely define the semantics of SML by writing a definitional interpreter. An interpreter is a program that accepts as input another program written in some language, and executes that program (or simulates its execution, depending on your viewpoint). A definitional interpreter is an interpreter written for the purpose of describing the semantics of a programming language. Since its purpose is to help us understand what SML programs are supposed to do, we will put a premium on clarity and worry less about performance issues here. However, it is possible to produce a reasonably fast interpreter using the basic approach shown here.

Below is a definitional interpreter for a subset of SML. Here are some things to notice about this interpreter:

In Problem Set 5, you will be building an interpreter for a language that is not too different from ML, except that it is a concurrent language. Like this interpreter, your interpreter will have to implement reductions. Unlike this interpreter, your evaluator will only take one evaluation step at a time. This will be necessary in order to simulate the execution of multiple concurrent processes. So this interpreter is in some ways a good model of your code for problem set 5, but not in others.


code/interp1.sml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
 
(* A simple interpreter for a subset of SML *)

(* we represent variables as strings *)
type var = string

(* various constants *)
datatype const = Int of int
               | Real of real
               | String of string
               | Bool of bool

(* various binary operations -- a good exercise is to add
 * more operations such as Equals or Divide ... *)
datatype binop = Plus | Times | Minus | Concat | Lte

(* how we represent expressions *)
datatype exp =
  Const of const               (* constants: 3, "foo", true, etc.  *)
| Binop of exp * binop * exp   (* binary operations: e1 + e2, etc. *)
| Var of var                   (* variables: x *)
| Fn of var * exp              (* functions: fn x => e *)
| App of exp * exp             (* applications:  e1(e2) *)
| Tuple of exp list            (* tuples: (e1,e2,...,en) *)
| Ith of int * exp             (* projection:  #i e *)
| Let of var * exp * exp       (* let:  let val x = e1 in e2 *)
| If of exp * exp * exp        (* if:  if e1 then e2 else e3 *)

(* is_value(e) is whether e is a value. A value is a fully
 * evaluated expression. Only constants, functions, and tuples
 * of values are values in this subset of ML *)
fun is_value(e:exp):bool =
  case e of
    Const _ => true
  | Binop(_,_,_) => false
  | Var _ => false
  | Fn(_,_) => true
  | App(_,_) => false
  | Tuple(es) => List.all is_value es
  | Ith(_,_) => false
  | Let(_,_,_) => false
  | If(_,_,_) => false

(* Convert an expression to a string for display purposes *)
fun toString (precedent:int) (e:exp):string =
  let
    val new_precedent =
      case e of
        Const _ => 100
      | Var _ => 100
      | Binop(_,Plus,_) => 4
      | Binop(_,Minus,_) => 5
      | Binop(_,Times,_) => 6
      | Binop(_,Concat,_) => 4
      | Binop(_,Lte,_) => 7
      | Fn(_,_) => 1
      | App(_,_) => 2
      | Tuple(_) => 100
      | Ith(i,e) => 2
      | Let(_,_,_) => 100
      | If(e1,e2,e3) => 3

    val s =
      case e of
        Const(Int i) => Int.toString i
      | Const(Real r) => Real.toString r
      | Const(String s) => "\""^s^"\""
      | Const(Bool true) => "true"
      | Const(Bool false) => "false"
      | Binop(e1,b,e2) =>
          let val bs =
            case b of
              Plus => "+"
            | Times => "*"
            | Minus => "-"
            | Concat => "^"
            | Lte => "<="
          in
            (toString new_precedent e1)^" "^bs^" "^(toString new_precedent e2)
          end
      | Var x => x
      | Fn(x,e1) => "fn "^x^" => "^(toString new_precedent e1)
      | App(e1,e2) =>
          (toString new_precedent e1) ^ " " ^ (toString 100 e2)
      | Tuple(es) =>
          let val ss = map (toString 0) es
              fun sep(ss) =
                case ss of
                  [] => [")"]
                | [x] => [x,")"]
                | hd::tl => hd::","::(sep tl)
          in
            String.concat("("::(sep ss))
          end
      | Ith(i,e1) => "#" ^ (Int.toString i) ^ " " ^ (toString new_precedent e1)
      | Let(x,e1,e2) =>
          "let val "^x^" = "^(toString 0 e1)^" in "^(toString 0 e2)^" end"
      | If (e1,e2,e3) =>
          "if "^(toString 0 e1)^" then "^(toString 0 e2)^" else "^
          (toString new_precedent e3)
  in
    if (new_precedent > precedent) then s else "("^s^")"
  end

(* convenient function for printing an expression *)
fun print_exp(e:exp):unit = print (toString 0 e)

(* pause for user input *)
fun pause() = (TextIO.inputLine TextIO.stdIn; ())

(* raised when an error occurs during evaluation *)
exception Eval_Error of string

(* substitute v for x within e *)
fun subst(v:exp, x:var, e:exp):exp =
  case e of
    Const _ => e
  | Binop(e1,b,e2) => Binop(subst(v,x,e1), b, subst(v,x,e2))
  | Var y => if (x = y) then v else Var y
  | Fn(y,e1) => if (x = y) then Fn(y,e1) else Fn(y,subst(v,x,e1))
  | App(e1,e2) => App(subst(v,x,e1), subst(v,x,e2))
  | Tuple(es) => Tuple(map (fn e => subst(v,x,e)) es)
  | Ith(i,e1) => Ith(i,subst(v,x,e1))
  | Let(y,e1,e2) =>
      if (x = y) then Let(y,subst(v,x,e1),e2)
      else Let(y,subst(v,x,e1),subst(v,x,e2))
  | If(e1,e2,e3) =>
        If(subst(v,x,e1),subst(v,x,e2),subst(v,x,e3))

(* a version of substitute that prints out what's happening *)
fun substitute(v:exp, x:var, e:exp):exp =
  (print "Substituting "; print_exp v; print " for "; print x; print " in ";
   print_exp e; print "\n"; pause(); subst(v,x,e))

(* given a binary operation b and two constants c1 and c2, returns
 * the constant we get when we perform the operation on the constants *)
fun eval_binop(b:binop, c1:const, c2:const):const =
  case (b, c1, c2) of
    (Plus, Int i, Int j) => Int(i+j)
  | (Plus, Real r, Real s) => Real(r+s)
  | (Times, Int i, Int j) => Int(i*j)
  | (Times, Real r, Real s) => Real(r*s)
  | (Minus, Int i, Int j) => Int(i-j)
  | (Minus, Real r, Real s) => Real(r-s)
  | (Lte, Int i, Int j) => Bool(i <= j)
  | (Lte, Real r, Real s) => Bool(r <= s)
  | (Concat, String s, String t) => String(s^t)
  | (_,_,_) => raise Eval_Error("type mismatch for binop")


(* eval'(e) is the result of evaluating e until we get a value *)
fun eval'(e:exp):exp =
  case e of
    Const c => Const c
  | Binop (e1, b, e2) =>
      let val v1 = eval e1
          val v2 = eval e2
      in
        case (v1, v2) of
          (Const c1, Const c2) => Const(eval_binop(b, c1, c2))
        | _ => raise Eval_Error("bad binop expression")
      end
  | Var x => raise Eval_Error("unbound variable "^x)
  | Fn(x,e1) => Fn(x,e1)
  | App(e1,e2) =>
      let val v1 = eval e1
          val v2 = eval e2
      in
        case v1 of
          Fn(x,e) => eval(substitute(v2, x, e))
        | _ => raise Eval_Error("attempt to apply a non-function")
      end
  | Tuple(es) => Tuple(map eval es)
  | Ith(i,e) =>
      let val v = eval e
      in
        case v of
          Tuple(vs) =>
            if i < 1 orelse i > (length vs) then
              raise Eval_Error("tuple index out of bounds")
            else List.nth(vs,i-1)
        | _ => raise Eval_Error("attempt to project from non-tuple")
      end
  | Let(x,e1,e2) =>
      let val v1 = eval e1
      in
        eval(substitute(v1, x, e2))
      end
  | If(e1,e2,e3) =>
      let val v1 = eval e1
      in
        case v1 of
          Const(Bool true) => eval e2
        | Const(Bool false) => eval e3
        | _ => raise Eval_Error("attempt to do if on non-bool")
      end
(* a wrapper for evaluation that prints out the expression being
 * evaluated, pauses for user input, evaluates the expression
 * (using eval') and then prints out the result.
 *)
and eval(e:exp):exp =
  if is_value e then e else
    (print "Evaluating: "; print_exp e; print "\n"; pause();
     let val v = eval'(e)
     in
       print "Result of "; print_exp e; print "\n is ";
       print_exp v; print "\n\n"; v
     end)

(************************************************************)
(* Some sample expressions to evaluate                      *)
(************************************************************)

(* 300+12 *)
val e1 = Binop(Const(Int 300), Plus, Const(Int 12))
(* fn x => x+1 *)
val e2 = Fn("x", Binop(Var "x", Plus, Const(Int 1)))
(* (fn x => x+1)(300+12) *)
val e3 = App(e2,e1)
(* fn x => x + ((fn x => x+1) 3) *)
val e4 = Fn("x", Binop(Var "x", Plus, App(e2, Const(Int 3))))
(* (fn x => x + ((fn x => x+1) 3)) 4 *)
val e5 = App(e4, Const(Int 4))
(* (fn x => (fn y => x + y)) *)
val e6 = Fn("x", Fn("y", Binop(Var "x", Plus, Var "y")))
(* (fn x => (fn y => x + y)) 1 *)
val e7 = App(e6, Const(Int 1))
(* (fn f => f(f(3))) (fn x => (fn y => x + y)) 1 *)
val e8 = App(Fn("f",App(Var "f", App(Var "f", Const(Int 3)))), e7)
(* let val f = (fn x => (fn y => x +y)) 1
 * in
 *     f(f(3))
 *)
val e9 = Let("f",e7,App(Var "f", App(Var "f", Const(Int 3))))
(* (3,4) *)
val e10 = Tuple[Const(Int 3), Const(Int 4)]
(* (fn x => (#2 x, #1 x)) *)
val e11 = Fn("x",Tuple[Ith(2,Var "x"), Ith(1,Var "x")])
(* let val swap = (fn x => (#2 x, #1 x))
 * in
 *   swap(3,4)
 * end
 *)
val e12 = Let("swap", e11, App(Var "swap", e10))
(* let val max =
 *    fn x => if #1 x <= #2 x then #2 x else #1 x
 * in
 *   max (3,4)
 * end
 *)
val e13 = Let("max", Fn("x",If(Binop(Ith(1,Var "x"),Lte,Ith(2,Var "x")),
                              Ith(2,Var "x"), Ith(1,Var "x"))),
             App(Var "max", e10))

(* This example shows something going wrong...what's wrong with
 * this code?
 *
 *  let val fact =
 *   fn n => if n <= 1 then 1 else n * fact(n-1)
 *
 * in
 *   fact(3)
 * end
 *)
val e14 =
  Let("fact",
      Fn("n",If(Binop(Var "n",Lte,Const(Int 1)),
                Const(Int 1),
                Binop(Var "n",Times,
                      App(Var "fact",
                          Binop(Var "n",Minus,Const(Int 1)))))),
      App(Var "fact", Const(Int 3)))

(* Print expressions in full at SML prompt *)
val dummy = (Compiler.Control.Print.printDepth := 1000)

 

Pattern Matching

The language above doesn't support datatypes or pattern matching. Here is a definitional interpreter based on the substitution model that does support pattern matching.

code/interp2.sml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
 
(* Need to load the library -- change this path to point to your* version of SML *)
CM.make' "/sml/lib/smlnj-lib.cm"; 
val print_expressions = ref true
val print_substitutions = ref true
val skip_values = ref true

(* Some utility functions *)

(* zip [a,b,c] [d,e,f] -> [(a,d), (b,e), (c,f) ] *)
exception Zip
fun zip (xs:'a list) (ys:'b list) : ('a * 'b) list = 
  (case (xs, ys) of
     ([],[]) => []
   | (x::xrest, y::yrest) => (x,y)::(zip xrest yrest)
   | _ => raise Zip)

(* pause for user input *)
fun pause() = (TextIO.inputLine TextIO.stdIn; ())

(* sort used for record expressions and patterns *)
fun sort r = ListMergeSort.sort (fn ((x:string,_),(y:string,_)) => x < y) r

(* error reporting *)
exception Error of string
fun error(s:string):'a = raise (Error s)

(*************************************************************************)

(* we represent identifiers (variables) as strings *)
type id = string

(* constants *)
datatype const = Int of int | Real of real | String of string | Char of char

(* binary operations *)
datatype binop = Plus | Times | Minus | Equal | Concat

(* expressions *)
datatype exp = 
  Const of const              (* 3, 2.178, "foo", #"c", etc. *)
| Id of id                    (* variables *)
| Fn of (id*exp)              (* anonymous functions: fn id => exp *)
| App of (exp*exp)            (* function application:  exp1(exp2) *)
| Binop of (exp*binop*exp)    (* binary operations:  eg., exp1 + exp2 *)
| Tuple of (exp list)         (* tuples:  (3,"foo",true) *)
| Ith of (int*exp)            (* tuple projection:  #i exp *)
| Record of ((id*exp)list)    (* records: {name="Greg", age=100} *)
| Field of (id*exp)           (* record projection:  #name exp *)
| DataCon of (id*(exp option))(* data constructors:  true, false, nil, 
                              * NONE, SOME(exp), NODE{left=e1,elt=x,right=e2}*)
| Case of (exp*(pat*exp)list)(* case:  case exp of pat1=>e1 | ... | patn=>en *)
| Let of (decl*exp)          (* let:  let decl in exp *)
| Fun of (id*id*exp)         (* recursive functions:  fun f(x)=exp *)

(* declarations *)
and decl = 
  Val_d of (pat*exp)         (* val pat = exp *)
| Fun_d of (id*id*exp)       (* fun f(x) = exp *)

(* patterns *)
and pat = 
  Wild_p                       (* wildcard:  _ *)
| Id_p of id                   (* variable:  x *)
| Const_p of int               (* constant:  3 *)
| DataCon_p of (id*pat option) (* data constructor:  true, false, nil, 
                             * NONE, SOME(pat), NODE{left=p1,elt=p2,right=p3}*)
| Tuple_p of pat list          (* tuple patterns:  (pat1,...,patn) *)
| Record_p of (id*pat) list (* record patterns: {field1=pat1,...,fieldn=patn}*)

(* returns true iff the expression is a value *)
fun is_value (e:exp):bool = 
  case e of
    Const _ => true
  | Fn _ => true
  | Tuple (es) => List.all is_value es
  | Record(ides) => List.all (fn (_,e) => is_value e) ides
  | DataCon(_,NONE) => true
  | DataCon(_,SOME(e)) => is_value e
  | _ => false

(***************************************)
(* Functions for printing out the code *)
(***************************************)

(* insert a separator between each element of a string list and 
 * concatenate the whole thing.  *)
fun sep (s:string) (lis:string list) : string = 
  let fun f lis = 
    case lis of
      [] => []
    | [x] => [x]
    | hd::tl => hd::s::(f tl)
  in 
    List.foldr (op ^) "" (f lis)
  end

(* convert a constant to a string *)
fun const2s (c:const):string = 
  case c of
    Int(i) => Int.toString i
  | Real(r) => Real.toString r
  | String(s) => "\"" ^ s ^ "\""
  | Char(c) => "#\"" ^ (Char.toString c) ^ "\""

(* convert a binary operation to a string *)
fun binop2s (b:binop):string = 
  case b of
    Plus => "+"
  | Minus => "-"
  | Times => "*"
  | Concat => "^"
  | Equal => "="


(* maximum precedence *)
val max_prec = 999;

(* precedence for binary operations *)
fun binop_prec(b) = 
  case b of
    Plus => 7
  | Times => 8
  | Minus => 7
  | Concat => 6
  | Equal => 5

(* precedence for expressions *)
fun prec(e:exp):int = 
  case e of
    Const(c) => max_prec
  | Id(x) => max_prec
  | Fn(id,e) => 1
  | App(e1,e2) => 2
  | Binop(e1,b,e2) => binop_prec(b)
  | Tuple(es) => max_prec
  | Ith(i,e) => 2
  | Record(ides) => max_prec
  | Field(x,e) => 2 
  | DataCon(x,NONE) => 2
  | DataCon(x,SOME(e)) => 2
  | Case(e,cases) => 1
  | Let(d,e) => max_prec
  | Fun(f,x,e) => 1

(* convert an expression to a string *)
fun exp2s (p:int) (e:exp):string = 
  let val p' = prec(e)
    val e2s = exp2s p'
    val s = 
      case e of
        Const(c) => const2s c
      | Id(x) => x
      | Fn(id,e) => "fn "^id^" => "^(e2s e)
      | App(e1,e2) => (e2s e1)^" "^(e2s e2)
      | Binop(e1,b,e2) => (e2s e1)^(binop2s b)^(e2s e2)
      | Tuple(es) => "("^(sep "," (List.map (exp2s 0) es))^")"
      | Ith(i,e) => "#"^(Int.toString i)^" "^(e2s e)
      | Record(ides) => 
          ("{"^(sep "," 
                (List.map (fn (x,e) => x^"="^(exp2s 0 e)) ides))^
           "}")
      | Field(x,e) => "#"^x^" "^(e2s e)
      | DataCon(x,NONE) => x
      | DataCon(x,SOME(e)) => 
          (case e of
             Tuple _ => (x ^ (e2s e))
           | _ => (x ^ " " ^ (e2s e)))
      | Case(e,cases) => 
             "case "^(exp2s 0 e)^" of "^
             (sep " | " (List.map case2s cases))
      | Let(d,e) => 
             "let "^(decl2s d)^" in "^(exp2s 0 e)^" end"
      | Fun(f,x,e) => 
             "fun "^f^"("^x^") = "^(exp2s 0 e)
  in 
    if (p' > p) then s else "("^s^")"
  end

(* convert a declaration to a string *)
and decl2s (d:decl):string = 
  case d of
    Val_d(p,e) => "val "^(pat2s p)^" = "^(exp2s 0 e)
  | Fun_d(f,x,e) => "fun "^f^"("^x^") = "^(exp2s 0 e)

(* convert a pattern to a string *)
and pat2s (p:pat):string = 
  case p of
    Wild_p => "_"
  | Id_p(x) => x
  | Const_p(i) => Int.toString i
  | DataCon_p(id,NONE) => id
  | DataCon_p(id,SOME(p)) => id^" "^(pat2s p)
  | Tuple_p(ps) => "("^(sep "," (List.map pat2s ps))^")"
  | Record_p(idps) => 
      "{"^(sep "," 
           (List.map (fn (x,p) => x^"="^(pat2s p)) idps))^"}"

(* convert a case expression to a string *)
and case2s (p:pat,e:exp):string = 
  (pat2s p)^" => "^(exp2s 0 e)

(* print out an expression *)
fun print_exp(e:exp):unit = 
  print(exp2s 0 e); print "\n";

(* example data constructors true and false *)
val True = DataCon("true",NONE);
val False = DataCon("false",NONE);

(* apply a binary operation to two constants *)
fun apply_binop(b:binop,c1:const,c2:const):exp =
  case (b,c1,c2) of
    (Plus,Int i,Int j) => Const(Int(i+j))
  | (Plus,Real i,Real j) => Const(Real(i+j))
  | (Times,Int i,Int j) => Const(Int(i*j))
  | (Times,Real i,Real j) => Const(Real(i*j))
  | (Minus,Int i,Int j) => Const(Int(i-j))
  | (Minus,Real i,Real j) => Const(Real(i-j))
  | (Concat,String s1,String s2) => Const(String(s2 ^ s2))
  | (Equal,Int i,Int j) => if i = j then True else False
  | (Equal,String i,String j) => if i = j then True else False
  | (Equal,Char i,Char j) => if i = j then True else False
  | (_,_,_) => error("bad binop application")

(* raised when a pattern match fails *)
exception MatchFail

(* a substitution is a list of variables and their associated expressions *)
type substitution = (id * exp) list

(* match the value v against the pattern p to get a substitution *)
fun match(v:exp,p:pat):substitution = 
  case (v,p) of
    (_, Wild_p) => []
  | (_, Id_p(x)) => [(x,v)]
  | (Const(Int i), Const_p j) => if (i = j) then [] else raise MatchFail
  | (DataCon(id,NONE), DataCon_p(id',NONE)) =>
      if (id = id') then [] else raise MatchFail
  | (DataCon(id,SOME(v')), DataCon_p(id',SOME(p'))) =>
        if (id = id') then match(v',p') else raise MatchFail
  | (Tuple(vs), Tuple_p(ps)) =>
          List.foldr (op @) [] (map match (zip vs ps))
  | (Record(idvs), Record_p(idps)) => 
          let val idps = sort idps
          in 
	    List.foldr (op @) [] (map (fn ((_,v),(_,p)) => match(v,p))
				  (zip idvs idps))
          end
  | (_, _) => raise MatchFail


(* return the list of variables that occur in a pattern *)
fun pat_vars(p:pat):id list = 
  case p of
    Wild_p => []
  | Id_p(x) => [x]
  | Const_p(_) => []
  | DataCon_p(_,NONE) => []
  | DataCon_p(_,SOME(p')) => pat_vars p'
  | Tuple_p(ps) => List.foldr (op @) [] (map pat_vars ps)
  | Record_p(idps) => 
      List.foldr (op @) [] (map (fn (_,p) => pat_vars p) idps)

(* substitute the value v for the variable x within the expression e *)
fun subst(s as (x:id,v:exp),e:exp):exp =
  case e of
    Const _ => e
  | Id(y) => if (x = y) then v else e
  | Fn(y,e) => if (x = y) then e else Fn(y,subst(s,e))
  | App(e1,e2) => App(subst(s,e1),subst(s,e2))
  | Binop(e1,b,e2) => Binop(subst(s,e1),b,subst(s,e2))
  | Tuple(es) => Tuple(List.map (fn e => subst(s,e)) es)
  | Ith(i,e) => Ith(i,subst(s,e))
  | Record(ides) => Record(List.map (fn (lab,e) => (lab,subst(s,e))) ides)
  | Field(id,e) => Field(id,subst(s,e))
  | DataCon(id,NONE) => e
  | DataCon(id,SOME(e)) => DataCon(id,SOME(subst(s,e)))
  | Case(e,cases) => Case(subst(s,e),List.map (subst_case s) cases)
  | Let(Val_d(p,e1),e2) => 
      let val pvs = pat_vars(p)
        val d = Val_d(p,subst(s,e1))
      in
        if List.exists (fn y => y = x) pvs then
          Let(d,e2)
        else
          Let(d,subst(s,e2))
      end
  | Let(Fun_d(f,y,e1),e2) => 
      if (x = f) then e
      else if (x = y) then Let(Fun_d(f,y,e1),subst(s,e2))
	   else Let(Fun_d(f,y,subst(s,e1)),subst(s,e2))
  | Fun(f,y,e') => 
             if (x = f) orelse (x = y) then e
             else Fun(f,y,subst(s,e'))
(* substitute v for x within a case *)
and subst_case (s as (x:id,v:exp)) (p:pat,e:exp) : pat*exp = 
  if List.exists (fn y => y = x) (pat_vars p) then (p,e)
  else (p,subst(s,e))


(* substitute a substitution (list of variables and associated values) within
 * an expression *)
fun substitute(S:substitution,e:exp):exp = 
  (* print out the substitution *)
  (if (!print_substitutions) then
     (print "\nsubstituting [";
      print (sep "," 
             (List.map (fn (x,e) => "("^x^","^(let val s = exp2s 0 e
                                               in 
                                                 if (size s) > 15 then
                                                   (String.substring(s,0,14)) ^ "..." else s
                                               end)^")") S));
      print "]\n within ";
      print_exp e;
      print "\n";
      pause())
   else ();
   List.foldr subst e S)

(* evaluate the expression e to get a value *)
fun eval'(e:exp):exp =
  case e of
    Const(c) => Const(c)
  | Fn(x,e') => Fn(x,e')
  | Id(x) => error("Id: unbound variable "^x)
  | App(e1,e2) =>
      let val v1 = eval e1
        val v2 = eval e2
      in
        case v1 of
          Fn(x,e') => eval(substitute([(x,v2)],e'))
        | _ => error("App: not a function")
      end
  | Binop(e1,b,e2) =>
      (case (eval e1,eval e2) of
         (Const c1, Const c2) => apply_binop(b,c1,c2)
       | _ => error("Binop: arguments not constants"))
  | Tuple es => Tuple (map eval es)
  | Ith(i,e) => 
         (case (eval e) of
            Tuple(vs) => List.nth(vs,i)
          | _ => error("Ith: not a tuple"))
  | Record lab_es => Record (sort (map (fn (lab,e) => (lab,eval e)) lab_es))
  | Field(id,e) =>
            (case (eval e) of
               Record(lab_vs) => 
                 (case List.find (fn (x,v) => x = id) lab_vs of
                    SOME(_,v) => v
                  | NONE => error("Field: record missing field "^id))
             | _ => error("Field: not a record"))
  | DataCon(id,NONE) => DataCon(id,NONE)
  | DataCon(id,SOME(e)) => DataCon(id,SOME(eval e))
  | Case(e,cases) => find_match(eval e,cases)
  | Let(Val_d(p,e1),e2) =>
               let val v = eval e1
                 val S = match(v,p)
               in
                 eval(substitute(S,e2))
               end
  | Let(Fun_d(f,x,e1),e2) => eval(substitute([(f,Fun(f,x,e1))],e2))
  | Fun(f,x,e) => eval (Fn(x,substitute([(f,Fun(f,x,e))],e)))

(* print out the expression and evaluate it *)
and eval(e:exp):exp = 
  (* don't bother to evaluate expressions that are already values *)
  if (!skip_values andalso is_value e) then e else
    let val _ = 
      if (!print_expressions) then
        (print("\nThe current expression being evaluated is: \n");
         print_exp e; print "\n"; pause(); ())
      else ()
	val r = eval' e
    in 
      (if (!print_expressions) then
         (print "the result of eval(";
          print_exp e;
          print ") is: \n";
          print_exp r;
          print "\n\n";
          pause())
       else ()); r
    end

(* given a list of cases from a case expression, try to match v 
 * against the associated pattern.  If this succeeds, then apply
 * the resulting substitution to the right-hand-side of the case
 * and evaluate it.  Otherwise, go on to the next case. *)
and find_match(v:exp,cases:(pat * exp) list):exp = 
  case cases of
    [] => error("total match failure")
  | ((p,e)::rest) =>
      (eval(substitute(match(v,p),e)) handle MatchFail => find_match(v,rest))

(**************************************************************************)
(* Some examples to evaluate:  swap_example, fact_example, append_example *)
(**************************************************************************)
val one = Const(Int(1))
val two = Const(Int(2))
val three = Const(Int(3))
val four = Const(Int(4))
fun plus(e1,e2) = Binop(e1,Plus,e2)

(*
let val swap = 
    fn (p) => 
      let val (x,y) = p
      in
	  (y,x) 
      end
in
   swap(3+1,4+2)
end
*)
val swap = Fn("p",Let(Val_d(Tuple_p[Id_p "x", Id_p "y"], Id "p"),
		      Tuple[Id "y", Id "x"]))
val swap_example = 
    Let(Val_d(Id_p "swap",swap), App(Id "swap", Tuple[plus(three,one),
						      plus(four,two)]))

(* 
let fun fact(n) = 
        if (n = 1) then 1 else n * fact(n-1)
in
    fact 4
end
*)

fun ife e1 e2 e3 = 
    Case(e1,[(DataCon_p("true",NONE),e2),(DataCon_p("false",NONE),e3)]);
val fact = 
    ("fact","n",
     ife (Binop(Id "n",Equal,one)) one
     (Binop(Id "n",Times,App(Id "fact",Binop(Id "n",Minus,one)))))
val fact_example = Let(Fun_d(fact),App(Id "fact",Const(Int 4)))

(*
datatype intlist = Nil | Cons of (int * intlist)

let fun append(p) = 
        case p of
	  (Nil, y) => y
        | (Cons(hd,tl), y) => Cons(hd,append(tl,y))
in
    append (Cons(1,Nil), Cons(2,Cons(3,Nil)))
end
*)
fun cons(e1:exp,e2:exp):exp = DataCon("Cons",SOME(Tuple[e1,e2]))
val nil_e:exp = DataCon("Nil",NONE)
val nil_p:pat = DataCon_p("Nil",NONE)
fun cons_p(p1:pat,p2:pat):pat = DataCon_p("Cons",SOME(Tuple_p[p1,p2]))
fun pair_p(p1:pat,p2:pat):pat = Tuple_p[p1,p2]
val append = 
    ("append","p",
     Case(Id("p"),[(pair_p(nil_p,Id_p("y")), Id("y")),
		   (pair_p(cons_p(Id_p("hd"),Id_p("tl")),Id_p("y")),
		    cons(Id("hd"),App(Id("append"),Tuple[Id("tl"),Id("y")])))
		   ]))
val append_example = Let(Fun_d(append),App(Id("append"),
					   Tuple[cons(one,nil_e),
						 cons(two,cons(three,nil_e))]))