CS312 Lecture 3: Recursive Datatypes

Administrivia

val sectionsize_335 = 8;
val sectionsize_1115 = 35;

Problem set #1 is due Tuesday at 4PM. Future problem sets will be due Thursday at 1AM (yes, we mean AM). This is intended to give you a chance to get some sleep and come to lecture. Note that this means 9 hours before Thursday's lecture starts, not 15 hours after it starts. Also note that due dates are absolutely firm (the submission system shuts off).

A few people in CS312 inevitably try to cheat. Don't be among them: we have extremely sophisticated tools for catching cheaters, including copious records of old solutions (note that problem sets change from year to year). The possible penalties for being caught include getting a 0 on the assignment, an F in the course (dropping not allowed), or a letter in your permanent file (part of the process of being expelled from Cornell). We are hard-nosed about this; don't find this out for yourself.

Recursive Datatypes

In recitation you should have seen datatypes, which are SML types that can have more than one kind of value. This involved a new kind of declaration, a datatype declaration:

decl ::= ... | datatype id = constructor_decl_1 | ... | constructor_decl_n
constructor_decl ::= id | id of type

expr ::= ... | id expr | case expr of pattern_1 => expr_1 | ... | pattern_n => expr_n
pattern ::= (id : type, ..., id : type) | id(id : type, ..., id : type) | ...

We can use datatypes to define many useful data structures. In particular, suppose we want to define a new type "number" that includes both ints and reals.  This can be accomplished in SML by the following datatype definition:

datatype num = Int_num of int | Real_num of real

This declaration gives us a new type (num) and two constructors Int_num and Real_num.  The Int_num constructor takes an int as an argument and returns a num, while the Real_num constructor takes a real as an argument and returns a num.  In this fashion, we can create a type that is the (disjoint) union of two other types. 
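
To use a num, a function can take it apart with a case expression that has one branch per constructor. The following is a minimal sketch; the helper name num_to_real is hypothetical and is not part of the lecture's code.

```sml
datatype num = Int_num of int | Real_num of real

(* num_to_real is a hypothetical helper: it converts either kind of num to a real. *)
fun num_to_real (n : num) : real =
  case n of
    Int_num i => Real.fromInt i   (* an int payload is widened to a real *)
  | Real_num r => r               (* a real payload is returned unchanged *)

val a = num_to_real (Int_num 3)
val b = num_to_real (Real_num 2.5)
```

Because real is not an equality type in SML, results like a and b would be compared with Real.== rather than =.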

We can even define data structures that act like the natural numbers, demonstrating that we don't really have to have numbers built into SML! A natural number is either the value zero or the successor of (next number after) some other natural number. This definition (Peano arithmetic) leads naturally to the following definition for values that act like natural numbers nat:

datatype nat = Zero | Next of nat

This is how you might define the natural numbers in a mathematical logic course. We have defined a new type nat, and Zero and Next are constructors for values of this type. This datatype is more sophisticated than the ones we saw in recitation: it is a recursive datatype because the definition of what a nat is mentions nat itself. 

Question: after we enter this datatype into ML, what is the type of Zero? What about Next?
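
One way to check an answer to this question is to bind each constructor to a name with an explicit type annotation; the bindings below type-check only if the annotations are right. The names z and next_fn are illustrative and do not appear elsewhere in these notes.

```sml
datatype nat = Zero | Next of nat

val z : nat = Zero               (* Zero is itself a value of type nat *)
val next_fn : nat -> nat = Next  (* Next is a function from nat to nat *)
val one = next_fn z              (* same value as Next(Zero) *)
```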

Using this definition, we can define some values that act like natural numbers:

val zero = Zero
val one = Next(Zero)
val two = Next(Next(Zero))
val three = Next(two)
val four = Next(three)

When we ask the compiler what four represents, we get

- four;
val it = Next (Next (Next (Next Zero))) : nat

Thus four is a nested data structure. The equivalent Java definitions would be

public interface nat { }
public class Zero implements nat { }
public class Next implements nat { nat value; Next(nat v) { value = v; } /* etc */ }

nat zero = new Zero();
nat one = new Next(new Zero());
nat two = new Next(new Next(new Zero()));
nat three = new Next(two);
nat four = new Next(three);

And in fact the Java objects representing the various numbers are actually implemented similarly to the SML values representing the corresponding numbers.

Now we can write functions to manipulate values of this type.

fun iszero(n : nat) : bool = 
  case n of
    Zero => true
  | Next(m) => false

The case expression allows us to do pattern matching on expressions. Here we're pattern-matching a value of type nat. If the value is Zero we evaluate to true; otherwise we evaluate to false.

fun pred(n : nat) : nat = 
  case n of
    Zero => raise Fail "predecessor on zero"
  | Next(m) => m

Here we determine the predecessor of a number. If the value of n matches Zero then we raise an exception, since zero has no predecessor in the natural numbers. If the value matches Next(m) for some value m (which of course also must be of type nat), then we return m.
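
As a rough illustration of both branches, the sketch below calls pred on a nonzero value and then shows one possible way to catch the exception raised for Zero. The wrapper safe_pred is hypothetical, and mapping the failure back to Zero is only one of several reasonable choices.

```sml
datatype nat = Zero | Next of nat

fun pred(n : nat) : nat =
  case n of
    Zero => raise Fail "predecessor on zero"
  | Next(m) => m

val p = pred (Next (Next Zero))   (* matches Next(m) with m = Next Zero *)

(* safe_pred is a hypothetical wrapper: it catches Fail and returns Zero instead. *)
fun safe_pred (n : nat) : nat =
  pred n handle Fail _ => Zero

val q = safe_pred Zero            (* the exception is caught, so q is Zero *)
```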

Similarly we can define a function to add two numbers: (See if the students can come up with this with some coaching.)

fun add(n1:nat, n2:nat) : nat = 
  case n1 of
    Zero => n2
  | Next(n_minus_1) => add(n_minus_1, Next(n2))
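
To see why this works, it may help to trace a small call by hand. The sketch below repeats the definition so that it stands alone and records the trace of add(two, one) in a comment: each step moves one Next from the first argument to the second until the first argument is Zero.

```sml
datatype nat = Zero | Next of nat

fun add(n1:nat, n2:nat) : nat =
  case n1 of
    Zero => n2
  | Next(n_minus_1) => add(n_minus_1, Next(n2))

(* Trace of add(two, one); each step applies one branch of the case:
     add(Next(Next Zero), Next Zero)
   = add(Next Zero, Next(Next Zero))
   = add(Zero, Next(Next(Next Zero)))
   = Next(Next(Next Zero))                                          *)
val one = Next(Zero)
val two = Next(Next(Zero))
val three = add(two, one)
```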

If you were to try evaluating add(four,four), the compiler would respond with:

- add(four,four);
val it = Next (Next (Next (Next (Next #)))) : nat

The compiler correctly performed the addition, but it has abbreviated the output because the data structure is nested so deeply. To easily understand the results of our computation, we would like to convert such values to type int:

fun nat_to_int(n:nat) : int = 
  case n of
    Zero => 0
  | Next(n) => 1 + nat_to_int(n)

That was pretty easy. Now we can write nat_to_int(add(four,four)) and get 8. How about the inverse operation?

fun int_to_nat(i:int) : nat =
  if (i < 0) then raise Fail "int_to_nat on negative number"
  else
    let fun loop(i:int) : nat =
              case i of
                0 => Zero
              | _ => Next(loop(i-1))
    in
        loop(i)
    end

We've defined a local recursive function loop that takes a natural number (of type int) and evaluates to the corresponding value of type nat.
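
A quick sanity check is that the two conversions undo each other. The sketch below repeats both functions so that it stands alone, converts 5 to a nat and back, and confirms that a negative argument raises Fail. The names five, back, and rejected are illustrative only.

```sml
datatype nat = Zero | Next of nat

fun nat_to_int(n:nat) : int =
  case n of
    Zero => 0
  | Next(n) => 1 + nat_to_int(n)

fun int_to_nat(i:int) : nat =
  if (i < 0) then raise Fail "int_to_nat on negative number"
  else
    let fun loop(i:int) : nat =
              case i of
                0 => Zero
              | _ => Next(loop(i-1))
    in
        loop(i)
    end

val five = int_to_nat 5          (* five nested applications of Next around Zero *)
val back = nat_to_int five       (* converts back to the int we started with *)

(* rejected is true exactly when int_to_nat raises Fail on a negative input. *)
val rejected = (int_to_nat ~1; false) handle Fail _ => true
```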

To determine whether a natural number is even or odd, we can write a pair of mutually recursive functions:

fun even(n:nat) : bool =
  case n of
    Zero => true
  | Next(n) => odd(n)
and odd (n:nat) : bool =
  case n of
    Zero => false
  | Next(n) => even(n)

You have to use the keyword and to combine mutually recursive functions like this. Otherwise the compiler would flag an error when you refer to odd before it has been defined.
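
The calls bounce back and forth between the two functions, removing one Next each time. The sketch below repeats the definitions so that it stands alone and traces even applied to three in a comment; the binding names are illustrative.

```sml
datatype nat = Zero | Next of nat

fun even(n:nat) : bool =
  case n of
    Zero => true
  | Next(n) => odd(n)
and odd (n:nat) : bool =
  case n of
    Zero => false
  | Next(n) => even(n)

val three = Next(Next(Next(Zero)))

(* Trace: even(three) = odd(two) = even(one) = odd(Zero) = false *)
val three_is_even = even(three)
val three_is_odd = odd(three)
```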

Finally we can define multiplication in terms of addition. (See if the students can figure this out.)

fun mul(n1:nat, n2:nat) : nat =
  case n1 of
    Zero => Zero
  | Next(n_minus_1) => add(n2, mul(n_minus_1,n2))
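
As a small check, the sketch below repeats the definitions it needs and multiplies three by four. Each recursive call of mul contributes one more copy of n2 through add, so the result should convert to 12. The binding names are illustrative.

```sml
datatype nat = Zero | Next of nat

fun add(n1:nat, n2:nat) : nat =
  case n1 of
    Zero => n2
  | Next(n_minus_1) => add(n_minus_1, Next(n2))

fun mul(n1:nat, n2:nat) : nat =
  case n1 of
    Zero => Zero
  | Next(n_minus_1) => add(n2, mul(n_minus_1,n2))

fun nat_to_int(n:nat) : int =
  case n of
    Zero => 0
  | Next(n) => 1 + nat_to_int(n)

val three = Next(Next(Next(Zero)))
val four = Next(three)

(* mul(three, four) unfolds to add(four, add(four, add(four, Zero))) *)
val twelve = nat_to_int(mul(three, four))
val nothing = mul(Zero, four)    (* the Zero branch ignores n2 entirely *)
```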