Recitation 3: Value-Carrying and Recursive Datatypes, and Higher-Order Functions

Algebraic Datatypes and Even More Pattern Matching:

A record (or tuple) is logically like an "and".  For instance, a tuple of type int * real * string is an object that contains an int and a real and a string.   Datatypes, in the most general form, are used for defining "or" types -- when something needs to be one type or another.  In particular, suppose we want to define a new type "number" that includes both ints and reals.  This can be accomplished in SML by the following datatype definition:

datatype num = Int_num of int | Real_num of real

This declaration gives us a new type (num) and two constructors Int_num and Real_num.  The Int_num constructor takes an int as an argument and returns a num, while the Real_num constructor takes a real as an argument and returns a num.  In this fashion, we can create a type that is the (disjoint) union of two other types. 

But how do we use a value of type num?  We can't apply an operation such as + to it, because + is only defined for either ints or reals.  In order to use a value of type num, we have to use pattern matching to deconstruct it.   For example, the following function computes the maximum of two nums:

fun num_to_real(n:num):real =
  (case n of
     Int_num(i) => Real.fromInt(i)
   | Real_num(r) => r)
fun max(n1:num, n2:num):num =
  let val r1:real = num_to_real(n1)
      val r2:real = num_to_real(n2)
  in
    if r1 >= r2 then Real_num(r1) else Real_num(r2)
  end

The strategy is simple:  convert the numbers to reals and then compare the two real numbers, returning the larger of the two.  In order to make the return value a num (as opposed to a real), we have to put the result in a Real_num constructor.  We could just as well have written this as:

fun max2(n1:num, n2:num):num =
  let val r1:real = num_to_real(n1)
      val r2:real = num_to_real(n2)
  in
    Real_num(if r1 >= r2 then r1 else r2)
  end

Here, we've wrapped the whole if-expression with the Real_num constructor.  This is one advantage of treating if as an expression as opposed to a statement.  

Notice that in the function num_to_real, we use a case-expression to determine whether the number n is an integer or real.  The pattern Int_num(i) matches n if and only if n was created using the Int_num data constructor, whereas Real_num(r) matches n if and only if it was created with the Real_num data constructor.  Also, notice that in the Int_num(i) case, we have bound the underlying integer carried by the data constructor to the variable i and that this is used in the expression Real.fromInt(i).  Similarly, the Real_num(r) pattern extracts the underlying real value carried by the data constructor and binds it to r.  

So, for instance, calling num_to_real(Int_num(3)) matches the first pattern, binds i to 3, and then returns Real.fromInt(i) = Real.fromInt(3) = 3.0.  Calling num_to_real(Real_num(4.5)) fails to match the first pattern, succeeds in matching the second pattern, binds r to 4.5, and then returns r = 4.5.  
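
The two calls traced above can be checked directly. This is a minimal sketch that repeats the definitions so it can be pasted into a fresh session; the names three and four_and_a_half are made up for illustration.

```sml
datatype num = Int_num of int | Real_num of real

fun num_to_real(n:num):real =
  (case n of
     Int_num(i) => Real.fromInt(i)
   | Real_num(r) => r)

(* matches the first pattern, binding i to 3 *)
val three : real = num_to_real(Int_num(3))

(* fails the first pattern, matches the second, binding r to 4.5 *)
val four_and_a_half : real = num_to_real(Real_num(4.5))
```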

Here is an alternative definition of max on numbers.

fun max2(n1:num, n2:num):num =
  (case (n1, n2) of
     (Real_num(r1), Real_num(r2)) =>
     Real_num(Real.max(r1,r2))
   | (Int_num(i1), Int_num(i2)) =>
     Int_num(Int.max(i1,i2))
   | (_, Int_num(i2)) =>
     max2(n1, Real_num(Real.fromInt(i2)))
   | (Int_num(i1), _) =>
     max2(n2, Real_num(Real.fromInt(i1))))

Notice that the case expression in max2 matches a tuple of the numbers n1 and n2.  Thus, all of the patterns in the case expression are of the tuple form.  For example, the pattern (Real_num(r1), Real_num(r2)) matches if and only if both the numbers are reals.  

In the third and fourth patterns, we've used a "wildcard" (or default) pattern.  For instance, the third pattern (_, Int_num(i2)) matches iff the first number is anything, but the second is an integer.  In this case, we simply convert the integer to a real and then call ourselves recursively.  Similarly, the fourth pattern (Int_num(i1), _) matches iff the first number is an integer and the second number is anything.  In this case, we convert the first number to a real and call ourselves recursively.

Now suppose we call max2 with two integers max2(Int_num(3), Int_num(4)).  It appears as if this matches any of the last three cases, so which one do we select? The answer is that we try the matches in order.  So the second pattern will succeed and the other patterns won't even be tried.
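
The top-to-bottom matching order can be observed with a small experiment. The function which_case below is hypothetical (it is not part of max2); it uses the same four patterns but simply reports which clause fired.

```sml
datatype num = Int_num of int | Real_num of real

(* Clauses are tried in order; the first one that matches wins. *)
fun which_case(n1:num, n2:num):string =
  (case (n1, n2) of
     (Real_num(_), Real_num(_)) => "first"
   | (Int_num(_), Int_num(_)) => "second"
   | (_, Int_num(_)) => "third"
   | (Int_num(_), _) => "fourth")

(* Two integers also fit the third and fourth patterns,
 * but the second clause is reached first. *)
val a = which_case(Int_num(3), Int_num(4))
val b = which_case(Real_num(1.5), Int_num(4))
val c = which_case(Int_num(3), Real_num(2.5))
```

Here a is "second", b is "third", and c is "fourth".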

Another question is, how do we know if there is a case for every situation?  For instance, suppose we accidentally wrote:

fun max3(n1:num, n2:num):num =
  (case (n1, n2) of
     (Int_num(i1), Int_num(i2)) =>
     Int_num(Int.max(i1,i2))
   | (_, Int_num(i2)) =>
     max3(n1, Real_num(Real.fromInt(i2)))
   | (Int_num(i1), _) =>
     max3(n2, Real_num(Real.fromInt(i1))))

Now there is no case for when n1 and n2 are both reals.  If you type this into SML, it will complain that the pattern match is not exhaustive (SML/NJ reports this as a "match nonexhaustive" warning).  This is wonderful because it tells you your code might fail since you forgot a case!  In general, we will not accept code that has a match nonexhaustive warning.  That is, you must make sure you never turn in code that doesn't cover all of the cases.

What happens if we put in too many cases?  For instance, suppose we wrote:

fun max2(n1:num, n2:num):num =
  (case(n1, n2) of
     (Real_num(r1), Real_num(r2)) =>
     Real_num(Real.max(r1,r2))
   | (Int_num(i1), Int_num(i2)) =>
     Int_num(Int.max(i1,i2))
   | (_, Int_num(i2)) =>
     max2(n1, Real_num(Real.fromInt(i2)))
   | (Int_num(i1), _) =>
     max2(n2, Real_num(Real.fromInt(i1)))
   | (_, _) => n1)

Then SML complains again that the last case will never be reached.  Again, this is wonderful because it tells us there's some useless code that we should either trim away, or reexamine (in its context) to see why it will never be executed.  Again, we will not accept code that has redundant patterns.  

So how can the SML type-checker determine that a pattern match is exhaustive and that there are no dead cases?  The reason is that patterns can only test a finite number of things (there are no loops in patterns), the tests are fairly simple (e.g., is this a Real_num or an Int_num?) and the set of datatype constructors for a given type is closed.  That is, after defining a datatype, we can't simply add new data constructors to it.  Note that if we could, then every pattern match would be potentially inexhaustive.  

At first, this seems to be a shortcoming of the language.  Adding new constructors is something that happens all the time, just as adding new subclasses happens all the time in Java programs.  The difference in SML is that, if you add a new data constructor to a datatype declaration, then the compiler will tell you where you need to examine or change your code through "match nonexhaustive" warnings.  This makes pattern matching an invaluable tool for maintaining and evolving programs. 

So sometimes, by limiting a programming language we gain some power.  In the case of the pattern-matching sub-language of ML, the designers have restricted the set of tests that can be performed so that the compiler can automatically tell you where you need to look at your code to get it in synch with your definitions.
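
To make the maintenance story concrete, suppose a third constructor is added to num. Rat_num below is a hypothetical constructor invented for this sketch; it is not part of the notes. Once it is added, the two-clause version of num_to_real no longer covers every case, and the compiler points at it until a clause like the last one is supplied.

```sml
(* Rat_num (numerator, denominator) is a hypothetical third constructor. *)
datatype num = Int_num of int | Real_num of real | Rat_num of int * int

fun num_to_real(n:num):real =
  (case n of
     Int_num(i) => Real.fromInt(i)
   | Real_num(r) => r
     (* Without this clause the match is flagged as not exhaustive. *)
   | Rat_num(p, q) => Real.fromInt(p) / Real.fromInt(q))

val half : real = num_to_real(Rat_num(1, 2))
```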


Recursive Datatypes: Integer Lists

We can also use datatypes to define some useful data structures. One simple data structure that we're used to is singly linked lists. It turns out that SML has lists built in, but we can write them ourselves using datatypes. Suppose we want to have values that are linked lists of integers. A linked list is either empty, or it has an integer followed by another list containing the rest of the list elements. This leads to a very natural datatype declaration:

(* This datatype defines integer lists as either Nil (empty) or
 * a "Cons" cell containing an integer and an integer list.  The
 * term "Cons" comes from Lisp.
 *)
datatype intlist = Nil | Cons of (int * intlist)

(* Here are some example lists *)
val list1 = Nil 		(* the empty list:  []*)
val list2 = Cons(1,Nil) 	(* the list containing just 1:  [1] *)
val list3 = Cons(2,Cons(1,Nil)) (* the list [2,1] *)
val list4 = Cons(2,list2)       (* also the list [2,1] *)
(* the list [1,2,3,4,5] *)
val list5 = Cons(1,Cons(2,Cons(3,Cons(4,Cons(5,Nil)))))
(* the list [6,7,8,9,10] *)
val list6 = Cons(6,Cons(7,Cons(8,Cons(9,Cons(10,Nil)))))

(* test to see if the list is empty *)
fun is_empty(xs:intlist):bool = 
    case xs of
      Nil => true
    | Cons(_,_) => false

(* Return the number of elements in the list *)
fun length(xs:intlist):int = 
    case xs of
      Nil => 0
    | Cons(i:int,rest:intlist) => 1 + length(rest)

(* Notice that the case expressions for lists all have the same
 * form -- a case for the empty list (Nil) and a case for a Cons.
 * Also notice that for most functions, the Cons case involves a
 * recursive function call. *)
(* Return the sum of the elements in the list *)
fun sum(xs:intlist):int = 
    case xs of
      Nil => 0
    | Cons(i:int,rest:intlist) => i + sum(rest)

(* Create a string representation of a list *)
fun toString(xs: intlist):string = 
    case xs of
      Nil => ""
    | Cons(i:int, Nil) => Int.toString(i)
    | Cons(i:int, Cons(j:int, rest:intlist)) => 
       Int.toString(i) ^ "," ^ toString(Cons(j,rest))
    
(* Return the first element (if any) of the list *)
fun head(is: intlist):int = 
    case is of
      Nil => raise Fail("empty list!")
    | Cons(i,tl) => i

(* Return the rest of the list after the first element *)
fun tail(is: intlist):intlist = 
    case is of
      Nil => raise Fail("empty list!")
    | Cons(i,tl) => tl

(* Return the last element of the list (if any) *)
fun last(is: intlist):int = 
    case is of
      Nil => raise Fail("empty list!")
    | Cons(i,Nil) => i
    | Cons(i,tl) => last(tl)

(* Return the ith element of the list, counting from 1 *)
fun ith(is: intlist, i:int):int = 
    case (i,is) of
      (_,Nil) => raise Fail("empty list!")
    | (1,Cons(x,tl)) => x
    | (n,Cons(x,tl)) =>
        if (n <= 0) then raise Fail("bad index")
        else ith(tl, n - 1)

(* Append two lists:  append([1,2,3],[4,5,6]) = [1,2,3,4,5,6] *)
fun append(list1:intlist, list2:intlist):intlist = 
    case list1 of
      Nil => list2
    | Cons(i,tl) => Cons(i,append(tl,list2))

(* Reverse a list:  reverse([1,2,3]) = [3,2,1].
 * Notice that we compute this by reversing the tail of the
 * list first (e.g., compute reverse([2,3]) = [3,2]) and then
 * append the singleton list [1] to the end to yield [3,2,1]. *)
fun reverse(list:intlist):intlist = 
    case list of
      Nil => Nil
    | Cons(hd,tl) => append(reverse(tl), Cons(hd,Nil)) 

fun inc(x:int):int = x + 1;
fun square(x:int):int = x * x;

(* given [i1,i2,...,in] return [i1+1,i2+1,...,in+1] *)
fun addone_to_all(list:intlist):intlist = 
    case list of
      Nil => Nil
    | Cons(hd,tl) => Cons(inc(hd), addone_to_all(tl))

(* given [i1,i2,...,in] return [i1*i1,i2*i2,...,in*in] *)
fun square_all(list:intlist):intlist = 
    case list of
      Nil => Nil
    | Cons(hd,tl) => Cons(square(hd), square_all(tl))

(* given a function f and [i1,...,in], return [f(i1),...,f(in)].
 * Notice how we factored out the common parts of addone_to_all
 * and square_all. *)
fun do_function_to_all(f:int->int, list:intlist):intlist = 
    case list of
      Nil => Nil
    | Cons(hd,tl) => Cons(f(hd), do_function_to_all(f,tl))

(* now we can define addone_to_all in terms of do_function_to_all *)
fun addone_to_all(list:intlist):intlist = 
    do_function_to_all(inc, list);

(* same with square_all *)
fun square_all(list:intlist):intlist = 
    do_function_to_all(square, list);

(* given [i1,i2,...,in] return i1+i2+...+in (also defined above) *)
fun sum(list:intlist):int = 
    case list of
      Nil => 0
    | Cons(hd,tl) => hd + sum(tl)

(* given [i1,i2,...,in] return i1*i2*...*in *)
fun product(list:intlist):int = 
    case list of
      Nil => 1
    | Cons(hd,tl) => hd * product(tl)

(* given f, b, and [i1,i2,...,in], return f(i1,f(i2,...,f(in,b))).
 * Again, we factored out the common parts of sum and product. *)
fun collapse(f:(int * int) -> int, b:int, list:intlist):int = 
    case list of
      Nil => b
    | Cons(hd,tl) => f(hd,collapse(f,b,tl))

(* Now we can define sum and product in terms of collapse *)
fun sum(list:intlist):int = 
    let fun add(i1:int,i2:int):int = i1 + i2
    in 
        collapse(add,0,list)
    end

fun product(list:intlist):int = 
    let fun mul(i1:int,i2:int):int = i1 * i2
    in
        collapse(mul,1,list)
    end

(* Here, we use an anonymous function instead of declaring add and mul.
 * After all, what's the point of giving those functions names if all
 * we're going to do is pass them to collapse? *)
fun sum(list:intlist):int = 
    collapse((fn (i1:int,i2:int) => i1+i2),0,list);

fun product(list:intlist):int = 
    collapse((fn (i1:int,i2:int) => i1*i2),1,list);

(* And here, we just pass the operators directly... *)
fun sum(list:intlist):int = collapse(op +, 0, list);

fun product(list:intlist):int = collapse(op *, 1, list);
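
As a quick check of the two factored-out helpers, the sketch below repeats their definitions and combines them. The names xs, squares, total, and count are introduced here for illustration only.

```sml
datatype intlist = Nil | Cons of (int * intlist)

fun do_function_to_all(f:int->int, list:intlist):intlist =
    case list of
      Nil => Nil
    | Cons(hd,tl) => Cons(f(hd), do_function_to_all(f,tl))

fun collapse(f:(int * int) -> int, b:int, list:intlist):int =
    case list of
      Nil => b
    | Cons(hd,tl) => f(hd,collapse(f,b,tl))

(* the list [1,2,3] *)
val xs = Cons(1,Cons(2,Cons(3,Nil)))

(* the list [1,4,9] *)
val squares = do_function_to_all(fn (x:int) => x * x, xs)

(* 1 + 4 + 9 = 14 *)
val total = collapse(op +, 0, squares)

(* length expressed with collapse: ignore each element, add 1 *)
val count = collapse(fn (_:int, n:int) => n + 1, 0, xs)
```

The last line suggests that length, like sum and product, fits the same collapse pattern.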

Datatype constructors

When we create a new datatype, we are not just creating a new type, we are also creating new values and functions. For example, let us consider the intlist definition above. Examine the following SML interactive session:

  Standard ML of New Jersey, Version 110.0.7, September 28, 2000 [CM; autoload enabled]
  - datatype intlist = Nil | Cons of (int * intlist);
  datatype intlist = Cons of int * intlist | Nil
  - Nil;
  val it = Nil : intlist
  - Cons;
  val it = fn : int * intlist -> intlist
  

In the first line, we defined the datatype intlist. Note, however, that the next two inputs show that we now have a new value, Nil, of type intlist, and a new function, Cons, of type int * intlist -> intlist. A function of this kind is known as a data constructor, and it is similar to a Java constructor in that it is the main way of creating a value of the new type. In other words, when we define a new datatype, we are not just defining a type; we are also defining new values (constructors that carry no data, such as Nil) and data constructor functions (value-carrying constructors, such as Cons).
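
Because a value-carrying constructor is an ordinary function, it can be passed to other functions. The sketch below assumes the built-in list type and the Basis functions map and foldr, which these notes have not introduced; the names nums and xs are illustrative.

```sml
datatype num = Int_num of int | Real_num of real
datatype intlist = Nil | Cons of (int * intlist)

(* Int_num has type int -> num, so it can be mapped over a built-in list. *)
val nums : num list = map Int_num [1, 2, 3]

(* Cons has type int * intlist -> intlist, the shape foldr expects,
 * so this converts a built-in list into an intlist. *)
val xs : intlist = foldr Cons Nil [1, 2, 3]
```

Here xs is Cons(1,Cons(2,Cons(3,Nil))).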


Higher-order functions

Functions are values just like any other value in SML. What does that mean exactly? It means that we can pass functions as arguments to other functions, that we can store functions in data structures, and that we can return functions as results from other functions. The full implication of this will not hit you until later, but believe us, it will.

Let us look at why it is useful to have higher-order functions. The first reason is that it allows you to write more general code, hence more reusable code. As a running example, consider functions double and square on integers:

fun double (x:int):int = 2 * x
fun square (x:int):int = x * x

Let us now come up with a function to quadruple a number. We could do it directly, but for utterly twisted motives decide to use the function double above:

fun quad (x:int):int = double (double (x))

Straightforward enough. What about a function to raise an integer to the fourth power?

fun fourth (x:int):int = square (square (x))

There is an obvious similarity between these two functions: what they do is apply a given function twice to a value. By passing the function to be applied as an argument to a new function, apply_twice, we can abstract this functionality and thus reuse code:

fun apply_twice (f:int -> int, x:int):int = f (f (x))

Using this, we can write:

fun quad (x:int):int = apply_twice(double,x)
fun fourth (x:int):int = apply_twice(square,x)

The advantage is that the similarity between these two functions has been made manifest. Doing this is very helpful. If someone comes up with an improved (or corrected) version of apply_twice, then every function that uses it profits from the improvement.

The function apply_twice is a so-called higher-order function: it is a function from functions to other values. Notice that the type of apply_twice is ((int -> int) * int) -> int.

In order not to pollute the top level namespace, it can be useful to locally define the function to pass in as an argument. For example:

fun fourth (x:int):int = 
  let 
    fun square (y:int):int = y * y
  in
    apply_twice (square,x)
  end

However, it seems silly to define and name a function simply to pass it in as an argument to another function. After all, all we really care about is that apply_twice gets a function that squares its argument. We can do that using some new syntax:

fun fourth (x:int):int = apply_twice (fn (y:int) => y*y,x)

We introduce a new expression to denote "a function that expects such and such argument and returning such an expression":

e ::= ...  |  fn (id : type) => e

The fn expression creates an anonymous function: a function without a name. Only the argument carries a type annotation. Unlike top-level functions, the return type of an anonymous function is not declared (it is inferred automatically). What is the type of fn (y:int) => y = 3?

Answer: int -> bool

The declaration val square : int -> int = fn (y:int) => y*y has the same effect as fun square (y:int):int = y * y. In fact, the declaration using fun is just syntactic sugar for the more tedious val declaration (strictly speaking, for val rec, which lets the function refer to itself).
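
For a non-recursive function such as square, a plain val is enough. For a recursive function, the val form needs the keyword rec so that the body can mention the name being defined. A small sketch follows; the function fact is an illustration and does not appear elsewhere in these notes.

```sml
(* non-recursive: a plain val binding of an anonymous function *)
val square : int -> int = fn (y:int) => y * y

(* recursive: val rec makes fact visible inside its own body *)
val rec fact = fn (n:int) => if n = 0 then 1 else n * fact(n - 1)

val sixteen = square(4)
val one_twenty = fact(5)
```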

Anonymous functions are useful for creating functions to pass as arguments to other functions, but are also useful for writing functions that return other functions! Let us revisit the apply_twice function. We now write a function twice which takes a function as an argument and returns a new function that applies the original function twice:

fun twice (f: int->int) = 
  fn (x:int) => f (f (x))

This function takes a function f (of type int->int) as an argument, and returns a value fn (x:int) => f (f (x)), which is a function which when applied to an argument applies f twice to that argument. Thus, we can write

val fourth = twice (fn (x:int) => x * x)
val quad = twice (fn (x:int) => 2 * x)

and trying to evaluate fourth (3) does indeed result in 81.

Here are more examples of useful higher-order functions that we will leave you to ponder (and try out at home):

fun compose (f:int -> int, g:int -> int) = 
  fn (x:int) => f (g (x))
fun ntimes (f:int -> int,n:int) = 
  if (n=0)
    then (fn (x:int) => x)
  else compose (f, ntimes (f,n-1))
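
A few sample uses may help when trying these out. The definitions are repeated so the sketch stands alone; the helper inc and the names of the bound results are chosen here for illustration.

```sml
fun compose (f:int -> int, g:int -> int) =
  fn (x:int) => f (g (x))
fun ntimes (f:int -> int, n:int) =
  if (n=0)
    then (fn (x:int) => x)
  else compose (f, ntimes (f,n-1))

fun double (x:int):int = 2 * x
fun inc (x:int):int = x + 1

(* compose applies its second argument first: double(inc(6)) = 14 *)
val inc_then_double = compose (double, inc)
val fourteen = inc_then_double (6)

(* double applied three times: 2 * 2 * 2 * 5 = 40 *)
val times_eight = ntimes (double, 3)
val forty = times_eight (5)

(* zero applications gives the identity function *)
val seven = ntimes (double, 0) (7)
```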

Implementing Binary Trees with Tuples

In recitation, we saw an example of using a datatype to define integer lists in terms of an empty list (Nil) and cons cells (a head containing an integer and a tail consisting of another list). We were able to iterate over the list, doing various manipulations on the data, and we were able to represent this concisely using higher-order functions. Today we're going to start by doing the same thing with binary trees to make sure everyone is very comfortable with pattern matching in case expressions.

The obvious way to start is with the following datatype, which we saw at the end of the last lecture:

datatype inttree = Leaf | Branch of (int * inttree * inttree)

This defines a type inttree to be either a leaf node (containing no data) or a branch node (containing an int and left and right subtrees). We could have defined a leaf node to contain an integer and no subtrees (some people do this), but then we'd need another constructor to represent the empty tree. Consider the representation of a generic tree.

The first logical function to write is is_empty:

fun is_empty (xs:inttree) : bool =
    case xs of
        Leaf => true
      | _ => false

Then, just as we computed the length of a list, we can count the non-leaf nodes in a tree:

fun size (xs:inttree) : int =
    case xs of
        Leaf => 0
      | Branch(_, left, right) => 1 + size(left) + size(right)

The pattern matching done in this function is very powerful. (If you don't see the power yet, you certainly will when the datatypes become as complicated as our definition of expressions in ML.) We can make very trivial changes to this function to compute several other interesting values:
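
For instance, here is a sketch of two such variants. The names tree_sum and height, and the sample tree t, are chosen here for illustration and are not taken from the notes. Each one keeps the shape of size and changes only what the Branch case computes.

```sml
datatype inttree = Leaf | Branch of (int * inttree * inttree)

(* sum of all the integers stored in the tree *)
fun tree_sum (xs:inttree) : int =
    case xs of
        Leaf => 0
      | Branch(i, left, right) => i + tree_sum(left) + tree_sum(right)

(* number of Branch nodes on the longest path down from the root *)
fun height (xs:inttree) : int =
    case xs of
        Leaf => 0
      | Branch(_, left, right) => 1 + Int.max(height(left), height(right))

val t = Branch(1, Branch(2, Leaf, Leaf),
                  Branch(3, Branch(4, Leaf, Leaf), Leaf))

(* 1 + 2 + 3 + 4 = 10 *)
val ten = tree_sum(t)

(* the longest path is 1, 3, 4 *)
val three = height(t)
```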


Implementing Binary Trees with Records

For both lists and trees, we've been using tuples to represent the nodes. But with trees, there may be some confusion with respect to the order of the fields: does the datum come before or after the left subtree? We can solve this problem using a record type. We can define it as

datatype inttree = Leaf | Branch of { datum:int, left:inttree, right:inttree }

(Note: Binary trees are simple enough that this probably would not be adequate motivation to use records in a real program, since we now have to remember the field names and spell them out every time we use them. Skipping to the next section on polynomials is fine if running low on time.)

Using this new representation, we can write size as

fun size (xs:inttree) : int =
    case xs of
        Leaf => 0
      | Branch{datum=i, left=lt, right=rt} => 1 + size(lt) + size(rt)

We've written several functions to analyze trees, but we don't yet have a way to generate large trees easily, so if you want to try these functions in your compiler, you'd have a lot of typing to do to spell out a tree of depth 10. Let's get the compiler to do it for us.

fun complete_tree (i:int, depth:int) : inttree =
    case depth of
        0 => Leaf
      | _ => Branch{datum=i,
                    left=complete_tree(2*i, depth-1),
                    right=complete_tree(2*i+1, depth-1)}

This function will take an integer i and a depth and recursively create a complete tree of the given depth whose nodes are given distinct indices based on i. If we start with i=1, then we get a complete tree whose level-order (breadth-first) node listing is 1, 2, 3, etc. Consider the example given by

val test_tree = complete_tree(1,3)
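
To see what this builds, the sketch below writes the same tree out by hand and compares the two. The helper single and the names by_hand and same are introduced here for illustration only.

```sml
datatype inttree = Leaf | Branch of { datum:int, left:inttree, right:inttree }

fun complete_tree (i:int, depth:int) : inttree =
    case depth of
        0 => Leaf
      | _ => Branch{datum=i,
                    left=complete_tree(2*i, depth-1),
                    right=complete_tree(2*i+1, depth-1)}

(* a node whose two subtrees are both leaves *)
fun single (i:int) : inttree = Branch{datum=i, left=Leaf, right=Leaf}

(* root 1; children 2 and 3; grandchildren 4, 5, 6, 7 *)
val by_hand =
    Branch{datum=1,
           left=Branch{datum=2, left=single(4), right=single(5)},
           right=Branch{datum=3, left=single(6), right=single(7)}}

(* inttree contains only ints, so trees can be compared with = *)
val same = (complete_tree(1,3) = by_hand)
```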

Now that we have an example tree to work on, we need a cleaner way to visualize the tree than looking at the compiler's representation of records. Let's write a function to print the contents of a tree in order:

fun print_inorder (xs:inttree) : unit =
    case xs of
        Leaf => ()
      | Branch{datum=i, left, right} => (print_inorder(left);
                                         print(" " ^ Int.toString(i) ^ " ");
                                         print_inorder(right))

Notice that here we did not provide names for binding the left and right subtrees. Writing a record label alone is just syntactic sugar for binding a variable of the same name to that field's value, so we could have written "datum=i, left=left, right=right". Anyway, our function behaves as follows on our test tree:

- print_inorder(test_tree);
 4  2  5  1  6  3  7 val it = () : unit

We could have applied many other functions to each element of the tree. A standard data structure operation is apply, which executes a given function on every element. The function is evaluated for side-effects only; the return value is ignored. How could we write apply_inorder for our trees?

fun apply_inorder (f:int->unit, xs:inttree) : unit =
    case xs of
        Leaf => ()
      | Branch{datum, left, right} => (apply_inorder(f,left);
                                       f(datum);
                                       apply_inorder(f,right))

Using this, we can write a very short version of print_inorder:

fun print_inorder (xs:inttree) : unit =
    apply_inorder(fn (i:int) => print(" " ^ Int.toString(i) ^ " "), xs)

Another common operation is map, which generates a copy of the data structure in which a given function has been applied to every element. We can write map_tree as

fun map_tree (f:int->int, xs:inttree) : inttree =
    case xs of
        Leaf => Leaf
      | Branch{datum=i, left, right} => Branch{datum=f(i),
                                               left=map_tree(f,left),
                                               right=map_tree(f,right)}

How could we use this to triple every element of a tree?

val tripled_tree = map_tree(fn (i:int) => i*3, test_tree)
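
One way to inspect the result without printing is to flatten the tree into a built-in list. The sketch below assumes the built-in list operators :: and @, which these notes have not covered, and the function to_list is a hypothetical helper written for this example.

```sml
datatype inttree = Leaf | Branch of { datum:int, left:inttree, right:inttree }

fun complete_tree (i:int, depth:int) : inttree =
    case depth of
        0 => Leaf
      | _ => Branch{datum=i,
                    left=complete_tree(2*i, depth-1),
                    right=complete_tree(2*i+1, depth-1)}

fun map_tree (f:int->int, xs:inttree) : inttree =
    case xs of
        Leaf => Leaf
      | Branch{datum=i, left, right} => Branch{datum=f(i),
                                               left=map_tree(f,left),
                                               right=map_tree(f,right)}

(* in-order listing of the elements as a built-in list *)
fun to_list (xs:inttree) : int list =
    case xs of
        Leaf => []
      | Branch{datum, left, right} => to_list(left) @ (datum :: to_list(right))

val original = to_list(complete_tree(1,3))
val tripled = to_list(map_tree(fn (i:int) => i*3, complete_tree(1,3)))
```

The list original is [4,2,5,1,6,3,7], matching the output of print_inorder, and tripled is [12,6,15,3,18,9,21].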