Lecture 4: More ML

Administrivia

Problem set #1 was due Wednesday night 11:59PM. Due to some confusion over various ML details, we will extend the due date to Friday 7PM. (Don’t expect this to happen again!)

Problem set #2 will be handed out on Wednesday at 11:59PM.

RDZ is away next week; lectures will be given by Tibor Janosi.

Last time we gave you the formal evaluation rules for a small subset of ML, and then started to add procedures.

Today I will cover recursive functions of lists, and currying. Then I will return to the substitution model.

Recursive functions of lists

ML has built in support for lists (in fact, not just lists of integers). More precisely, for any type T, there is a type T list, which is a list of objects of type T. Note that ALL of the objects must be of type T.

This actually uses a feature called parameterized types, which we will talk about in section. For the moment, note the following. What is the type of hd? It takes an int list and gives back an int. Or it takes a bool list and gives back a bool. Or… Hmm, how do we write this? hd : 'a list -> 'a
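As a small illustration of that type, the same built-in hd can be applied to lists of different element types. The value names below are made up for this sketch.

```sml
(* hd : 'a list -> 'a, so one function works at every element type. *)
val firstInt  : int    = hd [1, 2, 3]        (* here 'a is int *)
val firstBool : bool   = hd [true, false]    (* here 'a is bool *)
val firstStr  : string = hd ["a", "b"]       (* here 'a is string *)
```

Note that hd raises an exception on the empty list, so it is only applied to non-empty lists here.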

It is easy and fun to write recursive functions on lists (which are, after all, a recursively defined datatype, much like our datatype MYLIST)

datatype MYLIST = EMPTY | CONS of int * MYLIST

fun mylength(lst:MYLIST):int = (* using our custom datatype *)

case lst of

EMPTY => 0

| CONS(x,rest) => 1+mylength(rest)

fun mylength2(lst:int list):int = (* using builtin datatype *)

case lst of

[] => 0

| x::rest => 1+mylength2(rest)

One sample built-in is length. What is the type of length? Not int list->int, but 'a list->int. More about this in section; for the moment this means the input is a T list for any type T.

More sample built-ins (you can define these recursively, and will undoubtedly need to on a prelim…)

lst1@lst2 appends lst2 to end of lst1. Note this is also slow, linear in lst1 length (why?)

fun myapp(l1:int list,l2:int list):int list =

case l1 of

[] => l2

| x::rest => x::myapp(rest,l2)

fun myrev(lst:int list):int list =

case lst of

[] => []

| x::rest => myrev(rest)@[x]

fun myrev2(lst:int list):int list =

let fun helper(ans:int list, rest:int list) =

case rest of

[] => ans

| x::more => helper(x::ans, more)

in

helper([], lst)

end

The second example is much faster, because the operation we do to each element of the list is :: rather than append! This style is called tail-recursive; we will see a precise definition of it later on, but for the moment just note that when helper calls itself it returns immediately. Also note the use of helper functions, and what are sometimes called accumulators (like ans).

There are lots of fun examples like this. They tend to show up on prelim #1 and the final. How about summing up the squares of the numbers in a list?

fun ssqr(lst:int list):int =

case lst of

[] => 0

| z::rest => z*z + ssqr(rest)

OK, now for the really fun ones. map f lst gives you a new list that is the result of applying f to each element of lst in turn. Examples:

map (fn x: int => x * x) [1, 2, 3, 4]

map (fn x: int => x > 0) [~1, 0, 1, 2]

What is the type of map? It takes an 'a->'b and an 'a list, and gives you back a 'b list. We could write it ourselves, at least for simple cases:

fun mp(f:int->int,lst:int list):int list =

  case lst of

    [] => []

  | x::rest => f(x)::mp(f,rest)
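For comparison, here is a sketch of a fully polymorphic version with the type described above. The name mymap is hypothetical, and the function is written in curried form so that it can be called the same way as the built-in map.

```sml
(* mymap : ('a -> 'b) -> 'a list -> 'b list  (hypothetical name) *)
fun mymap (f : 'a -> 'b) (lst : 'a list) : 'b list =
  case lst of
    [] => []
  | x :: rest => f x :: mymap f rest

val squares   = mymap (fn x : int => x * x) [1, 2, 3, 4]    (* [1,4,9,16] *)
val positives = mymap (fn x : int => x > 0) [~1, 0, 1, 2]   (* [false,false,true,true] *)
```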

Hmm, this looks a lot like mylength2, myapp, ssqr. Walk down the list; if it's empty we are done, otherwise do something to the head and call ourselves on the tail. More precisely, somehow combine the result of some computation on the head, and calling ourselves on the tail.

Can we abstract out this pattern (avoid writing the same code twice?) Yes, but it’s really hard to think about without a clean semantics.

foldl(comb, base, lst) captures this pattern. The last argument is the list we process. The second argument is what we return if that list is empty. The first argument is how we combine 2 arguments: the head of the list, and the result of the recursive call to ourselves.

You can do amazing things with foldl… Examples:

foldl (fn (x:int, s:int) => x + s) 0 [1, 2, 3, 4] (* 10 *)

How about counting the elements?

foldl (fn (x: int, y: int) => y+1) 0 [1, 2, 3, 4, 2]; (* 5 *)

How about summing the squares?

foldl (fn (x: int, y: int) => x*x + y) 0 [1, 2, 3, 4, 2]; (* 34 *)

IMPORTANT NOTES ABOUT FOLDL, which will save you a lot of grief on prelim 1.

· More generally, there are 2 types involved (let’s call them A and B).

o What is type A? It’s the type of each element of the list (3rd arg). Moreover, since the combiner (1st arg) takes as a first argument an element of the list, it is the type of the 1st argument to the combiner.

o What is type B? It’s the type of the base (2nd argument). Since the base can be returned (if the 3rd argument is []) it is also the return type of the combiner. Finally, since the combiner’s second argument is the result of a previous call to the combiner, B is also the type of the second argument to the combiner.

o In fact, the actual type of foldl is (hold on…): ('a * 'b -> 'b) -> 'b -> 'a list -> 'b

· Special case: The recursive call could be on the empty list (i.e., the list could have only 1 element). So be sure that comb works on a single element plus the base! In other words, the type at the end of the combiner above needs to be the same type as the middle argument!
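To make the two types concrete, here is a sketch of two calls to the built-in foldl in which A and B differ. The value names are invented for illustration.

```sml
(* A = string (the list elements), B = int (the base and the result). *)
val totalChars =
  foldl (fn (s : string, n : int) => size s + n) 0 ["ab", "cde", ""]
(* 5 *)

(* A = int, B = bool: is every element positive? *)
val allPositive =
  foldl (fn (x : int, ok : bool) => ok andalso x > 0) true [3, 1, 4]
(* true *)
```

In both calls the combiner's second argument, the base, and the combiner's result all have type B, as the notes above require.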

OK, we think we understand foldl…let’s square every element of a list:

foldl (fn (x: int, y: int list) => x*x::y) [] [1, 2, 3, 4] (* [16,9,4,1] *)

Huh?

foldl (fn (x: int, y: int list) => x::y) [] [1, 2, 3, 4] (* [4,3,2,1] *)

Well, at least it’s consistent.

How the heck do we think about what this function does?? It clearly captures a nice pattern of usage, and is a powerful abstraction. But without a more precise way to think about ML programs, we’re at a loss.
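One informal way to see why the lists come out reversed is to write down a definition that behaves like foldl on these examples and trace it. The sketch below uses the hypothetical name myfoldl. It is only a model of the built-in, not its actual implementation.

```sml
(* myfoldl : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b  (hypothetical name) *)
fun myfoldl (comb : 'a * 'b -> 'b) (base : 'b) (lst : 'a list) : 'b =
  case lst of
    [] => base
  | x :: rest => myfoldl comb (comb (x, base)) rest

(* Trace, writing cons for fn (x, acc) => x :: acc:
     myfoldl cons []      [1,2,3]
   = myfoldl cons [1]     [2,3]
   = myfoldl cons [2,1]   [3]
   = myfoldl cons [3,2,1] []
   = [3,2,1]
   The head is combined first, so it ends up deepest in the result. *)
val reversed =
  myfoldl (fn (x : int, acc : int list) => x :: acc) [] [1, 2, 3]

(* The built-in foldr walks from the right, so the order is preserved. *)
val squaresInOrder =
  foldr (fn (x : int, acc : int list) => x * x :: acc) [] [1, 2, 3, 4]
```

Here reversed is [3,2,1] and squaresInOrder is [1,4,9,16].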

Note: Languages like C and Java simply don’t support functions that are as powerful (and as hard to think about) as foldl.

Currying

This is another example of why we need a semantics. For example:

val ford =

fn(x:int)=>fn(y:bool)=>fn(z:int list)=>(if y then null(z) else x = 42)

What is the type and value of:

ford (* int->bool->int list->bool *)

ford(42) (* bool->int list->bool *)

ford(42)(true) (* int list->bool *)

ford(42)(true)[4,2] (* false *)
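The same steps can be written out one argument at a time. This sketch repeats the definition of ford so that it runs on its own; the names step1, step2 and r1 through r3 are invented for illustration.

```sml
val ford =
  fn (x : int) => fn (y : bool) => fn (z : int list) =>
    (if y then null z else x = 42)

(* Supplying one argument at a time yields a new function each time. *)
val step1 : bool -> int list -> bool = ford 42
val step2 : int list -> bool = step1 true

val r1 = step2 [4, 2]           (* false: y is true and [4,2] is not empty *)
val r2 = ford 42 false [4, 2]   (* true: y is false, so the test is x = 42 *)
val r3 = ford 42 true []        (* true: y is true and [] is empty *)
```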

Back to the Substitution Model:

Recap

| syntactic class | syntactic variable(s) and grammar rule(s) | examples |
|---|---|---|
| constants | c | ...`~2`, `~1`, `0`, `1`, `2` (integers) `1.0`, `~0.001`, `3.141` (reals) `true`, `false` (booleans) `"hello"`, `""`, `"!"` (strings) `#"A"`, `#" "` (characters) |
| unary operator | u | `~`, `not`, `size`, ... |
| binary operators | b | `+`, `*`, `-`, `>`, `<`, `>=`, `<=`, `^`, ... |
| expressions (terms) | e ::= c \| u e \| e₁ b e₂ \| `if` e `then` e `else` e | `~0.001`, `not true`, `2 + 2`, |

Rule #E1 [constants]: constants evaluate to themselves

eval(c) = c

Rule #E2 [unary ops]: to evaluate u e where u is a unary operation such as not or ~, evaluate e to a value v', then perform the appropriate unary operation on the value v' to get the result v.

eval(u e) = v where

  (0) eval(e) = v'

  (1) v = apply_unop(u,v')

Rule #E3 [binary ops]: to evaluate e1 b e2 where b is a binary operation such as +, *, -, etc. Evaluate e1 to a value v1, then evaluate e2 to a value v2, then perform the appropriate operation on the values v1 and v2 to get the result v.

eval(e1 b e2) = v where

  (0) eval(e1) = v1

  (1) eval(e2) = v2

  (2) v = apply_binop(b,v1,v2)

Rule #E4 [if]: to evaluate (if e then e1 else e2), evaluate e to a value v. Then depending on the (boolean) value of v, the value is either the result of evaluating e1 or e2.

eval(if e then e1 else e2) = v' where

  (0) eval(e) = v

  (1) if v = true then v' = eval(e1)

  (2) if v = false then v' = eval(e2)
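Rules E1 through E4 can be transcribed almost line for line into ML itself. The sketch below is one possible encoding, not part of the course code: the datatypes value, unop, binop and exp, their constructors, and the exception Stuck are all assumptions made for illustration, and only integers and booleans are modeled.

```sml
datatype value = Int of int | Bool of bool

datatype unop  = Neg | Not
datatype binop = Plus | Times | Minus | Greater | Less

datatype exp =
    Const of value                 (* c *)
  | Unop  of unop * exp            (* u e *)
  | Binop of exp * binop * exp     (* e1 b e2 *)
  | If    of exp * exp * exp       (* if e then e1 else e2 *)

exception Stuck   (* raised when an operator gets a value of the wrong kind *)

fun apply_unop (Neg, Int n)  = Int (~n)
  | apply_unop (Not, Bool b) = Bool (not b)
  | apply_unop _             = raise Stuck

fun apply_binop (Plus,    Int a, Int b) = Int (a + b)
  | apply_binop (Times,   Int a, Int b) = Int (a * b)
  | apply_binop (Minus,   Int a, Int b) = Int (a - b)
  | apply_binop (Greater, Int a, Int b) = Bool (a > b)
  | apply_binop (Less,    Int a, Int b) = Bool (a < b)
  | apply_binop _                       = raise Stuck

fun eval (Const c) = c                                   (* Rule E1 *)
  | eval (Unop (u, e)) = apply_unop (u, eval e)          (* Rule E2 *)
  | eval (Binop (e1, b, e2)) =                           (* Rule E3 *)
      let val v1 = eval e1
          val v2 = eval e2
      in apply_binop (b, v1, v2) end
  | eval (If (e, e1, e2)) =                              (* Rule E4 *)
      (case eval e of
         Bool true  => eval e1
       | Bool false => eval e2
       | _          => raise Stuck)

(* 3 * (if 1 > 2 then 5 else 7 + 7) *)
val example =
  eval (Binop (Const (Int 3), Times,
               If (Binop (Const (Int 1), Greater, Const (Int 2)),
                   Const (Int 5),
                   Binop (Const (Int 7), Plus, Const (Int 7)))))
```

With these definitions example is Int 42. Note that only one branch of the if is ever evaluated, exactly as Rule E4 says.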

Syntax and Semantics of Procedures

Now we need to add various things to our BNF table, to make fn part of the syntax, and to eval, to give fn the correct semantics. We also need to add identifiers, which are variable names. Both identifiers and anonymous functions are expressions, as is a particular expression called a combination. Finally, we need to add types.

| syntactic class | syntactic variable(s) and grammar rule(s) | examples |
|---|---|---|
| identifiers | x, y | `a`, `x`, `y`, `x_y`, `ford1000`, ... |
| constants | c | ...`~2`, `~1`, `0`, `1`, `2` (integers) `1.0`, `~0.001`, `3.141` (reals) `true`, `false` (booleans) `"hello"`, `""`, `"!"` (strings) `#"A"`, `#" "` (characters) |
| unary operator | u | `~`, `not`, `size`, ... |
| binary operators | b | `+`, `*`, `-`, `>`, `<`, `>=`, `<=`, `^`, ... |
| expressions (terms) | e ::= x \| c \| u e \| e₁ b e₂ \| `if` e `then` e `else` e \| `fn` `(`x₁:t₁, ..., x_n:t_n`)` `=>` e \| e `(`e₁`,` ...`,` e_n`)` | `ford`, `~0.001`, `not` `b`, `2 + 2`, |
| types | t ::= `int` \| `real` \| `bool` \| `string` \| `char` \| t₁`*`...`*`t_n`->`t | `int`, `string`, `int->int`, `bool*int->bool` |

Adding support to eval for this is subtler than it first appears. To begin with, we need to expand the definition of a value (i.e., the final result of evaluating an expression). For reasons that will eventually become clear (perhaps!), it is desirable to allow anonymous functions to be values. This results in the new rule:

Rule #E5 [functions]: anonymous functions evaluate to themselves

eval(fn (id:t) => e) = (fn (id:t) => e)

Finally, we need to figure out what the value is of a combination. Here, the key concept is that we substitute the value of the identifier for the identifier in the body, and then evaluate that. But it’s a little trickier than it at first appears…

Rule #E6 [combinations]: to evaluate e1(e2), evaluate e1 to a function (fn (id:t) => e), then evaluate e2 to a value v, then substitute v for the formal parameter id within the body of the function e to yield an expression e'. Finally, evaluate e' to a value v'. The result is v'.

eval(e1(e2)) = v'  where

  (0) eval(e1) = (fn (id:t) => e)

  (1) eval(e2) = v

  (2) substitute([(id,v)],e) = e'

  (3) eval(e') = v'

OK, what does it mean to substitute? The simple version is we simply replace the identifier with the value in the expression.

Does this work? On simple cases, yes. Let’s try it:

(fn(z) => z*z + 17)(2+3)

[Note: I will often drop types in lecture. Don’t do this when you are writing code!]

Looks good so far. But actually, it doesn’t work and we need to do something more subtle. Can anyone see why it doesn’t work to simply replace z in the body by 5? Well, let’s think of some other things that could be the body of the expression…

Consider another expression that has the value 17. By referential transparency we can use this instead of 17 and get the same answer. So far so good. But now suppose that the expression we use, which has the value 17, is actually

(fn(z) => z+7)(10)

So that makes our expression

(fn(z) => z*z + ((fn(z) => z+7)(10)))(2+3)

We substitute 5 for z everywhere in the body, including inside the inner function, and end up with something seriously wrong, namely 5*5 + (5+7) = 5*5 + 12 = 37. Not the answer to life at all…

Clearly we need to substitute carefully.

The simple rule is that you don’t substitute for the variable z inside an anonymous function whose parameter is the variable z. But we can look at this in more detail.
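The careful version of substitution can be sketched as a small interpreter. Everything below (the datatype exp, its constructors, the exception Stuck, and the functions subst and eval) is a hypothetical encoding for illustration. It assumes the whole program is closed, so substituted values never contain free variables.

```sml
datatype exp =
    Num of int
  | Var of string
  | Add of exp * exp
  | Mul of exp * exp
  | Fn  of string * exp     (* fn(id) => body, types dropped *)
  | App of exp * exp        (* e1(e2) *)

exception Stuck

(* subst (x, v, e): replace the free occurrences of x in e by v. *)
fun subst (x, v, e) =
  case e of
    Num _ => e
  | Var y => if x = y then v else e
  | Add (a, b) => Add (subst (x, v, a), subst (x, v, b))
  | Mul (a, b) => Mul (subst (x, v, a), subst (x, v, b))
    (* The key case: an inner fn with the same parameter rebinds x,
       so its body is left untouched. *)
  | Fn (y, body) => if x = y then e else Fn (y, subst (x, v, body))
  | App (f, a) => App (subst (x, v, f), subst (x, v, a))

fun eval e =
  case e of
    Num _ => e
  | Fn _ => e                                   (* Rule E5 *)
  | Var _ => raise Stuck                        (* unbound identifier *)
  | Add (a, b) =>
      (case (eval a, eval b) of
         (Num m, Num n) => Num (m + n)
       | _ => raise Stuck)
  | Mul (a, b) =>
      (case (eval a, eval b) of
         (Num m, Num n) => Num (m * n)
       | _ => raise Stuck)
  | App (e1, e2) =>                             (* Rule E6 *)
      (case eval e1 of
         Fn (id, body) =>
           let val v = eval e2
           in eval (subst (id, v, body)) end
       | _ => raise Stuck)

(* (fn(z) => z*z + ((fn(z) => z+7)(10)))(2+3) *)
val inner  = App (Fn ("z", Add (Var "z", Num 7)), Num 10)
val outer  = App (Fn ("z", Add (Mul (Var "z", Var "z"), inner)),
                  Add (Num 2, Num 3))
val answer = eval outer
```

Here answer is Num 42, because the z inside the inner function is left alone when 5 is substituted for the outer z.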

Let

We can make this issue clearer by introducing a new feature in ML that allows us to create temporary names for variables. This new feature does not add any power beyond what fn provides, but it is very convenient.

Suppose we want to evaluate the expression E with the variable z bound to 5. We can do this straightforwardly by writing the combination

(fn(z:int) => E)(5)

Let’s try it out on an example: eval(3 * (if (1 > 2) then 5 else (7+7)))

Unfortunately, this kind of code is pretty hard to read. Consider: evaluate E’ with z bound to 5 and y bound to z*z. In the above we replace E by ((fn(y)=>E’)(z*z)) thus producing the totally unreadable

(fn(z:int)=>

((fn(y:int)=>E’)(z*z)))

(5)

Not fun at all. Believe it or not, some pretty famous large programs have been written using this style, including the PhD thesis of MIT’s past provost (Joel Moses).

How do we do better? Well, informal definitions of special forms are best done by example. So here’s an example:

let val z:int = 5

in

   E

end

let val z:int = 5

in

   let val y:int = z*z

in

      E’

end

end

Much easier to read! Note that this val declaration is needed for a language feature we haven’t yet added (but will shortly), namely fun.

In fact there is an even easier to read version of this, namely:

let val z:int = 5

    val y:int = z*z

in

  E’

end
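To check that the three forms really agree, here is a sketch with a concrete body. The expression y + z is a made-up stand-in for E’, and the value names are invented.

```sml
(* The fn-only version: hard to read. *)
val viaFn =
  (fn (z : int) => ((fn (y : int) => y + z) (z * z))) 5

(* Nested lets. *)
val viaNestedLet =
  let val z : int = 5
  in
    let val y : int = z * z
    in y + z end
  end

(* One let with two declarations; y's declaration can see z. *)
val viaFlatLet =
  let val z : int = 5
      val y : int = z * z
  in y + z end
```

All three evaluate to 30 (that is, 5*5 + 5).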

Scope, Identifiers, and Substitution

Having briefly introduced let, we can now turn our attention to the issue of what it means to substitute a value for a variable.

Scope

We can define various functions but we need to avoid collisions. Often we only "need" a certain name within a certain piece of code (literally within). Where an identifier is defined is called its scope. This issue can be very confusing when you type things into ML, as opposed to loading a file into a fresh ML.

Here is a more complex function declaration which finds (an approximation to) the square root of a real number.

Underlying math fact: for any positive x, g, it is the case that g, x/g lie on opposite sides of sqrt(x).

(* Computes the square root of x using Heron of Alexandria's

 * algorithm (circa 100 AD). We "guess" that the square root

 * is 1.0 and then continue improving the guess until we're

 * within delta of the real answer.  The improvement is achieved

 * by averaging the current guess with x/guess.

*)

fun square_root(x: real): real =

let

    (* used to tell when the approximation is good enough *)

    val delta = 0.0001

    (* returns true iff the guess is good enough *)

    fun good_enough(guess: real): bool =

      abs(guess*guess - x) < delta

    (* improve the guess by averaging it with x/guess *)

    fun improve(guess: real): real =

      (guess + x/guess) / 2.0

    (* try a particular guess -- looping and improving the

     * guess if it's not good enough. *)

    fun try_guess(guess: real): real =

      if good_enough(guess) then guess

      else try_guess(improve(guess))

in

    (* start with a guess of 1.0 *)

    try_guess(1.0)

end

This example shows a number of things. First, you can declare local values (such as delta) and local functions (such as good_enough, improve, and try_guess). Notice that "inner" functions, such as improve, can refer to outer variables (such as x). Also notice that later definitions can refer to earlier definitions. For instance, try_guess refers to both good_enough and improve. Finally, notice that try_guess is a recursive function -- it's really a loop. It's similar to writing something like:

while (!good_enough(guess)) {

   guess = improve(guess);

}

in an imperative language such as Java or C.

If you type the square_root declaration above into the SML top-level, it responds with:

val square_root = fn : real -> real

indicating that you've declared a variable (square_root), that its value is a function (fn), and that its type is a function from reals to reals. All of the internal structure of the function definition is hidden; all we know from the outside is that its value is a simple function. In particular, the function "try_guess" is not defined!

After typing in the function, you might try it out on a real number such as 9.0:

- square_root(9.0);

  val it = 3.0000000014 : real

SML has evaluated the expression "square_root(9.0)" and printed the value of the expression (3.0000000014) and the type of the value (real).

At the moment we have only a sloppy, imprecise notion of exactly what happens when you type this expression into ML. In a few weeks we'll have a precise understanding (hopefully!)

If you try to apply square_root to an expression that does not have type real (say an integer or a boolean), then you'll get a type error:

- square_root(9);

stdIn:27.1-27.14 Error: operator and operand do not agree [literal]

operator domain: real

operand:         int

in expression:

  square_root 9

Binding and identifiers

Consider an ML expression like the one below:

let val ford = 3

    val ford:int->bool =

            fn(ford:int) => ford = ford

in

 ford(3)  (* how about ford(42)? *)

end

There are three different ways that one can use an identifier:

A binding occurrence, which binds the identifier to a particular value or type. For example, in the expression let val x:int = 1 in x end, the first occurrence is a binding occurrence that binds x to 1. Each binding occurrence introduces a new variable, and this new variable has a scope: a part of the program in which uses of that identifier refer to the variable. In this case the scope of the variable x is the body of the let expression.
A bound occurrence is a use of a variable in the scope of a variable binding. For each bound occurrence of a variable, there is a single corresponding binding of that variable. For example, in the expression (fn(x:int)=>x), the second occurrence of x is a bound occurrence; its corresponding binding occurrence is the first occurrence. At run time this variable will be bound to whatever value is passed to the function when it is invoked.
An unbound or free occurrence is a use of an identifier that does not lie within the scope of any corresponding binding occurrence. For example, in the expression let val y:ford = x+1 in y end, the use of x is an unbound occurrence because there is no containing binding of x. The identifier ford is also an unbound occurrence of a type identifier. A legal SML program cannot contain an unbound occurrence of an identifier. However, for the purpose of understanding how SML works, sometimes it is useful to write down syntactically legal fragments of SML programs and talk about the unbound variables that occur in them.

Given an occurrence of an identifier that is not a binding occurrence, there is a simple way to figure out whether it is bound or unbound, and if the former, to which binding occurrence the identifier is bound. An identifier is bound if it is in the scope of a binding occurrence. For ML programs, the scope of a variable can be seen by simply looking at the program text. If the variable lies within the scope of more than one binding occurrence, then one of those bindings shadows the rest. It will be the binding occurrence whose scope most tightly encloses the use of the identifier.
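A short sketch of shadowing, using invented value names. The second example is the ford expression from the start of this section, applied to 42.

```sml
val shadowDemo =
  let val x : int = 1                (* outer binding of x *)
  in
    (let val x : int = x + 10        (* inner binding; the x on the right is the outer one *)
     in x * 2 end)                   (* bound to the inner x, which is 11 *)
    + x                              (* outside the inner let: the outer x, which is 1 *)
  end

val fordDemo =
  let val ford = 3
      val ford : int -> bool =
            fn (ford : int) => ford = ford   (* both uses refer to the parameter *)
  in
    ford 42                          (* refers to the function, the tighter binding *)
  end
```

Here shadowDemo is 23 and fordDemo is true. In each case the binding whose scope most tightly encloses a use is the one that counts.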

In SML it is possible to figure out just by looking at the program code which occurrence binds each use of a variable. A language with this property is said to have lexical scoping: the scope of each variable is apparent from the lexical form of the program, without knowing anything about how the program runs. The alternative to lexical scoping is dynamic scoping, in which a given variable occurrence may have different binding occurrences depending on how the program runs. In most modern languages, such as Java or C, variables have lexical scope. However, some languages offer dynamic variable scoping: early Lisp dialects used it by default, and Perl provides it through local declarations. Dynamic scope is harder to implement efficiently, and can lead to unpleasant surprises for programmers because variables don't always mean what they expect.
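The difference shows up when a name is rebound after a function that uses it has been defined. A minimal sketch, with the hypothetical names addX and lexical:

```sml
val x = 1
fun addX (y : int) : int = x + y   (* this x is the binding on the line above *)

val x = 100                        (* a brand-new binding; addX is unaffected *)

val lexical = addX 0
(* 1 under lexical scoping. A dynamically scoped language would look up
   the most recent x at call time and give 100 instead. *)
```

This is also the source of the confusion mentioned earlier when typing definitions into the ML top level one at a time: redefining x does not change functions that were defined before.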

Substitution

Earlier we saw some rewriting rules that explained how to evaluate terms of the SML language. For example, we said that a simple let expression evaluates according to the following rewrite rule:

let val x:t = v in e end --> e (with occurrences of x replaced by v)

Remember that we use e to stand for an arbitrary expression (term), x to stand for an arbitrary identifier, and v to stand for a value -- that is, a fully evaluated term.

We now know this cannot be the full story, because e may contain occurrences of x whose binding occurrence is not this binding x:t = v. It doesn't make sense to substitute v for these occurrences. For example, consider evaluation of the expression:

let val x:int = 1

    fun f(x:int) = x

    val y:int = x+1

in

  fn(a:string) => x*2

end

The next step of evaluation replaces the occurrences of x in x+1 and x*2 with 1, because these occurrences have the first declaration as their binding occurrence. Notice that the two occurrences of x inside the function f, which are respectively a binding and a bound occurrence, are not replaced. Thus, the result of this rewriting step is

let fun f(x:int) = x

    val y:int = 1+1

in

  fn(a:string) => 1*2

end

Let's write the substitution e{v/x} to mean the expression e with all unbound occurrences of x replaced by the value v. Then we can restate the rule for evaluating let more simply:

let val x:t = v in e end --> e{v/x}

This works because any occurrence of x in e must be bound by exactly this declaration val x:t = v. Here are some examples of substitution:

x{2/x} = 2
x{2/y} = x
(fn(y:int)=>x) {"hi"/x} = (fn(y:int)=>"hi")
f(x) { fn(y:int)=>y / f } = (fn(y:int)=>y)(x)

One of the features that sets ML apart is the ability to write complex patterns containing binding occurrences of variables. Pattern matching in ML causes these variables to be bound in such a way that the pattern matches the supplied value. This can be a very concise and convenient way of binding variables. We can generalize the notation used above by writing e{v/p} to mean the expression e with all unbound occurrences of variables appearing in the pattern p replaced by the values obtained when p is matched against v. Using this notation, we can express the let rule simply:

let val p = v in e end --> e{v/p}
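Two small instances of this rule, with invented value names. In the second one the pattern does not cover the empty list, so a compiler will typically warn that the binding is not exhaustive.

```sml
(* Pattern (a, b) matched against (1, 2): a is replaced by 1, b by 2. *)
val sumPair =
  let val (a, b) = (1, 2)
  in a + b end                     (* rewrites to 1 + 2 *)

(* Pattern x :: rest matched against [10, 20, 30]:
   x is replaced by 10, rest by [20, 30]. *)
val headPlusLen =
  let val (x :: rest) = [10, 20, 30]
  in x + length rest end           (* rewrites to 10 + length [20, 30] *)
```

Here sumPair is 3 and headPlusLen is 12.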

What if a let expression introduces multiple declarations? Such an expression is identical in effect to a series of nested let expressions. Thus, we can use the following rewrite that pulls out the first declaration so the rules above apply.

let d₁...d_n  in e end  -->

  let d₁ in let d₂...d_n in e end end

We can use the same substitution operator to give a more precise rule for what happens when a function is called. Consider a function declared as fun f(p) = e, where f is the identifier naming the function. Then the expression for a function call whose argument has been evaluated, f(v), is rewritten as follows:

f(v) --> e{v/p}

Similarly, consider a call to an anonymous function:

(fn( p )=> e )( v ) --> e{v/p}

We have seen a model of how programs evaluate in SML. It is important to realize that this is just a model. The actual implementation of SML evaluation is quite different (and much more complex to explain). This model is an abstraction that hides the complexity you don't need to know about. Some aspects of the model should not be taken too literally. For example, you might think that function calls take longer if an argument variable appears many times in the body of the function. It might seem that calls to function f are faster if it is defined as fun f(x) = x*3 rather than as fun f(x) = x+x+x because the latter requires three substitutions. Actually the time required to pass arguments to a function typically depends only on the number of arguments. Chances are the second definition (x+x+x) is at least as fast as the first (x*3).

Let syntax and semantics

OK, we now need to add a syntax and semantics for let. Conceptually it’s pretty easy, but there are a few details.