The mysterious 3rd section now also has a permanent home in 215 Upson.
Problem set #1 is out. Get started on it!
Demo session tonight 7-8PM in B7 Upson.
See web page for consulting schedule for this semester.
Minor PS#1 simplifications to make your life easier:
· last problem is only 5% so don’t kill yourselves (it’s hard)
· How to sort a list in ML: ListMergeSort.sort (fn(x:int,y:int)=>(x>y)) [1,~1,3,~7] BUT this doesn’t work in Windows…
Last time we gave you the formal evaluation rules for a small subset of ML, and then started to add procedures.
Today’s lecture will have two parts. To start with, I’m going to show you very informally how to do various things in ML that you need for the problem set. This is the standard (and, to me, wrong-headed) way in which programming languages are taught, namely “anecdotally” – here is a construct, here is sort of what it does, here is an example.
Then I’m going to show you some examples of 1-line programs that are really hard to understand. And not because they are badly written, but because a vague understanding of the language just isn’t good enough.
Datatypes can be just named symbolic constants, containers for separate underlying types (these are "union types"), or recursive types.
datatype ORDER = LESS | GREATER | EQUAL (* predefined in the standard library, where it is spelled order *)
datatype BAG = INT of int | REAL of real | BOOL of bool
datatype LIST = EMPTY | CONS of int * LIST
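As a quick illustration of the three flavors, here is a sketch that builds one value of each kind. The value names (cmp, b1, b2, oneTwo) are made up for this example, and the library's own order type stands in for ORDER.

```sml
(* Datatypes repeated from above so the block stands alone. *)
datatype BAG = INT of int | REAL of real | BOOL of bool
datatype LIST = EMPTY | CONS of int * LIST

(* Symbolic constants: Int.compare returns the library's order type. *)
val cmp : order = Int.compare (3, 7)      (* LESS *)

(* Union type: each constructor wraps a different underlying type. *)
val b1 : BAG = INT 42
val b2 : BAG = BOOL true

(* Recursive type: the list 1, 2 built by hand. *)
val oneTwo : LIST = CONS (1, CONS (2, EMPTY))
```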
let
val x: int = 42
in
E (*arbitrary expression *)
end
let
val x: int = 42
val y: int = x+3
in
E (*arbitrary expression *)
end
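To make the second form concrete, here is a sketch with the arbitrary expression E replaced by x + y (an assumption for illustration). Each val can use the bindings that come before it.

```sml
val result : int =
  let
    val x : int = 42
    val y : int = x + 3   (* y may refer to x, bound just above *)
  in
    x + y                 (* stands in for E *)
  end
(* result = 87 *)
```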
datatype PAIRORNIL = PAIR of int * int | NIL
let
val p:PAIRORNIL = PAIR(6,9)
val q:PAIRORNIL = NIL
in
case q of (* or p *)
PAIR(x,y) => x * y (* note that this case establishes a binding for x and y! *)
| NIL => 42
end
let
val p:PAIRORNIL = PAIR(6,9)
in
case p of
PAIR(6,y) => 1
| PAIR(x,9) => 2
| PAIR(x,y) => 3
| NIL => 42
end
let
val p:PAIRORNIL = PAIR(5,9)
in
case p of
PAIR(6,y) => 1
| PAIR(x,9) => 2
| PAIR(x,y) => 3
| NIL => 42
end
let
val p:PAIRORNIL = PAIR(5,10)
in
case p of
PAIR(6,y) => 1
| PAIR(x,9) => 2
| PAIR(x,y) => 3
| NIL => 42
end
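The three examples above differ only in the value of p, so the same case can be wrapped in a function. This is a sketch; the name classify is made up. Patterns are tried top to bottom and the first one that matches wins.

```sml
datatype PAIRORNIL = PAIR of int * int | NIL

fun classify (p : PAIRORNIL) : int =
  case p of
    PAIR (6, y) => 1
  | PAIR (x, 9) => 2
  | PAIR (x, y) => 3
  | NIL => 42

val a = classify (PAIR (6, 9))    (* 1: the first pattern matches, so the second is never tried *)
val b = classify (PAIR (5, 9))    (* 2 *)
val c = classify (PAIR (5, 10))   (* 3 *)
val d = classify NIL              (* 42 *)
```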
let
val p:PAIRORNIL = PAIR(5,10)
in
case p of
PAIR(x,y) => 24 (* Warning: match nonexhaustive *)
end
let
val p:PAIRORNIL = PAIR(5,10)
in
case p of
PAIR(x,y) => 24 (* No warning *)
| NIL => 42
end
let
val p:PAIRORNIL = PAIR(5,10)
in
case p of
PAIR(_,_) => 24 (* Same thing *)
| NIL => 42
end
let
val r = {name="daffy", iq=25}
in
case r of
{name = _, iq = 25} =>
"dummy"
| _ => "not so dumb"
end
let
val r = {name="daffy", iq=25}
in
case r of
{iq = _, name = "daffy"} =>
"self-described genius"
| {name = _, iq = 25} =>
"dummy"
| _ => "not so dumb"
end
ML has built-in support for lists (in fact, not just lists of integers). More precisely, for any type T, there is a type T list, which is a list of objects of type T. Note that ALL of the objects must be of type T.
As an example, int list is a type. You can create a list by using square brackets. So for example [1,2,3] is an int list. You can get the first element (head, sometimes I’ll slip up and say “CAR”) by using hd(lst) and the rest of the list (“CDR”) by using tl(lst). You can create a new list that has the element x in front of the old list lst by doing x::lst. And you can test if a list is empty (written []) by using null. These all have equivalent operations using the LIST custom datatype I defined above, but the built-ins are faster.
Examples:
let
val lst:int list = [1, 2, 3]
in
hd(lst) + hd(tl(tl(lst)))
end
let
val lst:int list = 5::[6,7]
in
hd(lst) + hd(tl(tl(lst)))
end
Based upon these primitive operations on lists we can build all kinds of cool recursive functions.
fun mylength(lst:LIST):int = (* using our custom datatype *)
case lst of
EMPTY => 0
| CONS(x,rest) => 1+mylength(rest)
fun mylength2(lst:int list):int = (* using builtin datatype *)
case lst of
[] => 0
| x::rest => 1+mylength2(rest)
A sample built-in is length. What is the type of length? Not int list->int, but 'a list->int. More about this later; for the moment this means the input is a T list for any type T.
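A small sketch of what 'a list->int means in practice: the same built-in length works on lists of any element type.

```sml
val n1 : int = length [1, 2, 3]            (* int list: 3 *)
val n2 : int = length ["a", "b"]           (* string list: 2 *)
val n3 : int = length [[1, 2], [], [3]]    (* int list list: 3 *)
```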
More sample built-ins (you can define these recursively, and will undoubtedly need to on a prelim…)
rev(lst) reverses a list (takes time linear in the length of the list).
lst1@lst2 appends lst2 to the end of lst1. Note that this is also slow: it takes time linear in the length of lst1 (why?)
fun app(l1:int list,l2:int list):int list =
case l1 of
[] => l2
| x::rest => x::app(rest,l2)
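In the same style, here is one way rev could be defined recursively so that it still takes linear time. This is a sketch, not the library's definition; the names myrev and revonto are made up. The helper carries the part reversed so far along as a second argument.

```sml
(* revonto(lst, acc) puts the elements of lst, reversed, in front of acc. *)
fun revonto (lst : int list, acc : int list) : int list =
  case lst of
    [] => acc
  | x :: rest => revonto (rest, x :: acc)

fun myrev (lst : int list) : int list = revonto (lst, [])

val r = myrev [1, 2, 3]   (* [3, 2, 1] *)
```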
There are lots of fun examples like this. They tend to show up on prelim #1 and the final. How about summing up the squares of the numbers in a list?
fun ssqr(lst:int list):int =
case lst of
[] => 0
| z::rest => z*z + ssqr(rest)
OK, now for the really fun ones. map f lst gives you a new list that is the result of applying f to each element of lst in turn. Examples:
map (fn x: int => x * x) [1, 2, 3, 4]
map (fn x: int => x > 0) [~1, 0, 1, 2]
What is the type of map? It takes an 'a->'b and an 'a list, and gives you back a 'b list. We could write it ourselves, at least for simple cases (I don’t really want to do parameterized types yet…):
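One possible version, as a sketch: it is restricted to functions from int to int so that no parameterized types are needed, and the name mymap is made up.

```sml
fun mymap (f : int -> int, lst : int list) : int list =
  case lst of
    [] => []
  | x :: rest => f x :: mymap (f, rest)

val squares = mymap (fn (x : int) => x * x, [1, 2, 3, 4])   (* [1, 4, 9, 16] *)
```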
Hmm, this looks a lot like mylength2, app, ssqr. Walk down the list; if it’s null we are done, otherwise do something to the head and call ourselves on the tail. More precisely, somehow combine the result of some computation on the head, and calling ourselves on the tail.
Can we abstract out this pattern (and avoid writing the same code twice)? Yes, but it’s really hard to think about without a clean semantics.
foldl comb base lst captures this pattern. The last argument is the list we process. The second argument is what we return if that list is empty. The first argument is how we combine 2 arguments: the head of the list, and the result of the recursive call to ourselves.
IMPORTANT NOTE ABOUT FOLDL, which will save you a lot of grief on prelim 1. The recursive call could be on the empty list (i.e., the list could have only 1 element). So be sure that comb works on a single element plus the base! In other words, the type at the end of the fn below needs to be the same type as the middle argument!
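A small sketch of that last point: below, the list elements are ints but the base is a bool, so the fn must return a bool. The name anyNegative is made up for this example.

```sml
fun anyNegative (lst : int list) : bool =
  foldl (fn (x : int, found : bool) => found orelse x < 0) false lst

val t1 = anyNegative [1, ~1, 3]   (* true *)
val t2 = anyNegative [7]          (* false: comb sees 7 and the base *)
val t3 = anyNegative []           (* false: the base itself *)
```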
You can do amazing things with foldl… Examples:
foldl (fn (x:int, s:int) => x + s) 0 [1, 2, 3, 4] (* 10 *)
How about counting the elements?
foldl (fn (x: int, y: int) => y+1) 0 [1, 2, 3, 4, 2]; (* 5 *)
How about summing the squares?
foldl (fn (x: int, y: int) => x*x + y) 0 [1, 2, 3, 4, 2]; (* 34 *)
OK, let’s square every element of a list:
foldl (fn (x: int, y: int list) => x*x::y) [] [1, 2, 3, 4] (* [16,9,4,1] *)
Huh?
foldl (fn (x: int, y: int list) => x::y) [] [1, 2, 3, 4] (* [4,3,2,1] *)
Well, at least it’s consistent.
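One way to see where the reversal comes from is to write a fold of this kind by hand and unroll a call. The sketch below is restricted to int elements and int list results; myfoldl and cons are made-up names, and this is not the library's source. The head is combined with the base first, and that result becomes the base for the tail.

```sml
(* myfoldl: a hypothetical stand-in for the built-in, int elements only. *)
fun myfoldl (comb : int * int list -> int list)
            (base : int list)
            (lst : int list) : int list =
  case lst of
    [] => base
  | x :: rest => myfoldl comb (comb (x, base)) rest

val cons = fn (x : int, y : int list) => x :: y

(* Unrolling the call by hand: 1 is combined first and 4 last,
   so 4 ends up at the front. *)
val unrolled = cons (4, cons (3, cons (2, cons (1, []))))   (* [4, 3, 2, 1] *)
val viaMine  = myfoldl cons [] [1, 2, 3, 4]                 (* [4, 3, 2, 1] *)
val viaLib   = foldl cons [] [1, 2, 3, 4]                   (* [4, 3, 2, 1] *)
```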
How the heck do we think about what this function does?? It clearly captures a nice pattern of usage, and is a powerful abstraction. But without a more precise way to think about ML programs, we’re at a loss.
Note: Languages like C and Java simply don’t support functions that are as powerful (and as hard to think about) as foldl.
For our particular function we would write it as
fn (z:int):int => z*z
To use this on an argument we simply write
(fn (z:int):int => z*z)(2+3)
Now we need to add various things to our BNF table, to make fn part of the syntax, and to eval, to give it the correct semantics. We also need to add identifiers, which are variable names. Both identifiers and anonymous functions are expressions, as is a particular expression called a combination. Finally, we need to add types.
syntactic class        syntactic variable(s) and grammar rule(s)    examples
identifiers            x, y
constants              c                                            ...
unary operator         u
binary operators       b
expressions (terms)    e ::= x | c | u e | e1 b e2 |
types                  t ::=
Adding support to eval for this is subtler than it first appears. To begin with, we need to expand the definition of a value (i.e., the final result of evaluating an expression). For reasons that will eventually become clear (perhaps!), it is desirable to allow anonymous functions to be values. This results in the new rule:
Rule #E5 [functions]: anonymous functions evaluate to themselves
eval(fn (id:t) => e) = (fn (id:t) => e)
Finally, we need to figure out what the value is of a combination. Here, the key concept is that we substitute the value of the identifier for the identifier in the body, and then evaluate that. But it’s a little trickier than it at first appears…
Rule #E6 [combinations]: to evaluate e1(e2), evaluate e1 to a function (fn (id:t) => e), then evaluate e2 to a value v, then substitute v for the formal parameter id within the body of the function e to yield an expression e'. Finally, evaluate e' to a value v'. The result is v'.
eval(e1(e2)) = v' where
(0) eval(e1) = (fn (id:t) => e)
(1) eval(e2) = v
(2) substitute([(id,v)],e) = e'
(3) eval(e') = v'
OK, what does it mean to substitute? The simple version is we simply replace the identifier with the value in the expression.
Does this work? On simple cases, yes. Let’s try it:
(fn(z) => z*z + 17)(2+3)
[Note: I will often drop types in lecture. Don’t do this when you are writing code!]
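Following rule E6 step by step on this example (with the types written back in), as a sketch:

```sml
(* eval((fn (z:int) => z*z + 17)(2+3))
   (0) eval(fn (z:int) => z*z + 17) = fn (z:int) => z*z + 17   by rule E5
   (1) eval(2+3) = 5
   (2) substitute([(z,5)], z*z + 17) = 5*5 + 17
   (3) eval(5*5 + 17) = 42                                      *)
val answer : int = (fn (z : int) => z * z + 17) (2 + 3)   (* 42 *)
```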
Looks good so far. But actually, it doesn’t work and we need to do something more subtle. Can anyone see why it doesn’t work to simply replace z in the body by 5? Well, let’s think of some other things that could be the body of the expression…
Consider another expression that has the value 17. By referential transparency we can use this instead of 17 and get the same answer. So far so good. But now suppose that the expression we use, which has the value 17, is actually
(fn(z) => z+7)(10)
So that makes our expression
(fn(z) => z*z + ((fn(z) => z+7)(10)))(2+3)
We substitute 5 for z in the body and end up with something seriously wrong, namely 5*5 + 12 = 37 (the inner z+7 became 5+7 instead of 10+7). Not the answer to life at all…
Clearly we need to substitute carefully.
The simple rule is that you don’t substitute for the variable z inside an anonymous function whose parameter is also the variable z. But we can look at this in more detail.
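The careful rule can be sketched as a tiny interpreter. Everything here is an assumption made for illustration: the EXP datatype, its constructor names, subst, eval and the Stuck exception are hypothetical, and types are dropped. The sketch also relies on substituted values having no free variables of their own, which holds when we start from a complete program.

```sml
(* A hypothetical miniature of the language: just enough to show substitution. *)
datatype EXP =
    NUM of int
  | VAR of string
  | PLUS of EXP * EXP
  | TIMES of EXP * EXP
  | FN of string * EXP          (* fn (id) => body, types dropped *)
  | APP of EXP * EXP            (* the combination e1(e2) *)

(* subst(id, v, e): replace the occurrences of id in e that refer to the
   outer binding by v. *)
fun subst (id : string, v : EXP, e : EXP) : EXP =
  case e of
    NUM n => NUM n
  | VAR x => if x = id then v else VAR x
  | PLUS (a, b) => PLUS (subst (id, v, a), subst (id, v, b))
  | TIMES (a, b) => TIMES (subst (id, v, a), subst (id, v, b))
  | FN (x, body) =>
      if x = id then FN (x, body)              (* inner id hides outer id: stop *)
      else FN (x, subst (id, v, body))
  | APP (f, a) => APP (subst (id, v, f), subst (id, v, a))

exception Stuck

fun eval (e : EXP) : EXP =
  case e of
    NUM n => NUM n
  | VAR _ => raise Stuck
  | FN (x, body) => FN (x, body)               (* rule E5 *)
  | PLUS (a, b) =>
      (case (eval a, eval b) of
         (NUM m, NUM n) => NUM (m + n)
       | _ => raise Stuck)
  | TIMES (a, b) =>
      (case (eval a, eval b) of
         (NUM m, NUM n) => NUM (m * n)
       | _ => raise Stuck)
  | APP (e1, e2) =>                            (* rule E6 *)
      (case eval e1 of
         FN (id, body) => eval (subst (id, eval e2, body))
       | _ => raise Stuck)

(* (fn(z) => z*z + ((fn(z) => z+7)(10)))(2+3) *)
val inner = APP (FN ("z", PLUS (VAR "z", NUM 7)), NUM 10)
val prog =
  APP (FN ("z", PLUS (TIMES (VAR "z", VAR "z"), inner)),
       PLUS (NUM 2, NUM 3))
val result = eval prog    (* NUM 42 *)
```

Because subst stops at the inner fn(z), the inner z+7 still sees 10, and the whole expression evaluates to 25 + 17.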
We can make this issue clearer by introducing a new feature in ML that allows us to create temporary names for variables. This new feature does not add any power beyond what fn provides, but it is very convenient.
Suppose we want to evaluate the expression E with the variable z bound to 5. We can do this straightforwardly by writing the combination
(fn(z:int) => E)(5)
Let’s try it out on an example: eval(3 * (if (1 > 2) then 5 else (7+7)))
Unfortunately, this kind of code is pretty hard to read. Consider: evaluate E’ with z bound to 5 and y bound to z*z. In the above we replace E by ((fn(y)=>E’)(z*z)) thus producing the totally unreadable
(fn(z:int)=>
((fn(y:int)=>E’)(z*z)))
(5)
Not fun at all. Believe it or not, some pretty famous large programs have been written using this style, including the PhD thesis of MIT’s past provost (Joel Moses).
How do we do better? Well, informal definitions of special forms are best done by example. So here’s an example:
let val z:int= 5
in
E
end
let val z:int= 5
in
let val y:int = z*z
in
E’
end
end
Much easier to read! Note that this val declaration is needed for a language feature we haven’t yet added.
In fact there is an even easier-to-read version of this, namely:
let val z:int= 5
val y:int = z*z
in
E’
end
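To check that the let form and the nested fn form compute the same thing, here is a sketch with E’ replaced by y + z (an assumption for illustration):

```sml
val viaFn : int =
  (fn (z : int) => ((fn (y : int) => y + z) (z * z))) (5)

val viaLet : int =
  let
    val z : int = 5
    val y : int = z * z
  in
    y + z
  end
(* both are 30 *)
```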
OK, we now need to add a syntax and semantics for let. Conceptually it’s pretty easy, but there are a few details.