Problem Set 2: Even More SML

Issued: September 10, 2002
Due: September 19, 1:00AM


Instructions: you will do this problem set by modifying the source files in ps2.zip and submitting the program that results. The program that you submit must compile without any warnings. Programs that do not compile or compile with warnings will receive an automatic zero.

Beginning with this project, you'll be using the SML Compilation Manager (CM) to compile all of your files together. It will also let you get at some of the libraries that were previously inaccessible. It's actually very easy, but the directions are a bit complicated at first. There are several things to do:


Update 9/16

Clarifications on the order of arguments:


Update 9/15

Clarification for insert and lookup: however you do this problem, your solution must satisfy the following invariants:


Part 1: Dictionaries, Lists, and Polymorphic Fun

We often need a data structure to group related pairs of values. Such a data structure is called a dictionary. Operations on a dictionary include inserting a key/value pair and looking up the value that corresponds to a given key.

We can implement dictionaries in ML by using a list of pairs, i.e. an associative list. For the moment, assume that the keys and values are both strings.

  1. To insert into a associative list, we can simply add a key/value pair to the head of the list. By appending to the head and by searching the list from its head toward its tail, we have an easy way to "overwrite" old entries with new ones. Write function insert with type string -> string -> (string * string) list -> (string * string) list that implements the insertion method described above.

  2. Write function lookup with type string -> (string * string) list -> string. Given a key and an associative list, this function returns the value corresponding to that key. For example, given a phonebook, we can retrieve Garfield's phone number with expression lookup "Garfield" phonebook. If the key is not present in the list, your code should raise a Fail exception. (Yes, our phone numbers can contain dashes and special characters like * and #, so they can't be simple integers.)

  3. Why restrict ourselves to strings? We can think of other kinds of key/value pairs. For example, we may want a data structure that maps Cornell ids to student names. Instead of rewriting the lookup function for each possible type of key/value pair, we will create its polymorphic version, polymorphicLookup, whose type will be 'a -> ('a * 'a -> bool) -> ('a * 'b) list -> 'b.

    Here the function of type 'a * 'a -> bool is a comparison function that returns true if and only if both arguments are equal.

  4. Use your polymorphicLookup to write function lookupCUID : int -> (int*string) list -> string that takes an id number and an associative list of (int * string) key/value pairs and returns the corresponding string.

Part 2: More Recursive Types

Consider the following declaration for nested lists:

datatype 'a llist = LList of 'a llist list | Elem of 'a

A nested list thus consists of an element of unspecified type, or of a list of nested lists. Consider some examples:

- List([]);
- Elem(3);
- LList([Elem(1), LList([Elem(2), LList([]), Elem(3)]), Elem(4)]);
  1. Write function flatten that takes a nested list as its unique argument and returns a "flat" list of all elements in the nested list. For example: flatten(LList([Elem(1), LList([Elem(2), LList([]), Elem(3)]), Elem(4)])) returns [1, 2, 3, 4]. Note that the elements in the resulting list are in the same order as in the nested list. Your solution must have the same property.

  2. Write function depth that takes a nested list as its unique argument and return the highest level of nesting in the list. For example: depth(Elem(5))= 0, depth(LList([])) = 1, depth(LList([Elem(1), LList([Elem(2), LList([]), Elem(3)]), Elem(4)])) = 3.

  3. Write function cutOff that takes a non-negative integer n, and a nested list, and returns only those parts of the nested list that are at a level equal to or less than n in the original nested list. Thus we must have depth(cutOff(n, lst)) = Int.min(n, depth(lst)). Some examples:

    cutOff(0, Elem(3)) = Elem(3)
    cutOff(1, LList([Elem(1), LList([Elem(2)])])) = LList([Elem(1)])
    cutOff(100, LList([Elem(1), LList([Elem(2)])])) = LList([Elem(1), LList([Elem(2)])])

    Given the above, what should cutOff(0, List([])) return? Well, a nested list of depth 0, i.e. an Elem. But there is no obvious value we could use to build an Elem, and we can't easily pick one (say, 0), as the nested list is a polymorphic type. Rather, this case and all the analogous ones should end with the function indicating a failure. This can be achieved by evaluating the expression raise Fail "message" for all the cases identified as impossible.

Part 3: Pattern Matching & Regular Expressions

A regular expression is a string containing a mixture of ordinary characters and operators. There are only three operators: +, *, and ?. All the other characters are ordinary. Except for ?, operators must immediately follow an ordinary character. Operators do no appear in ordinary strings (i.e. in strings that are not regular expressions).

Regular expressions encode sets of strings; a given string matches the regular expression if and only if the respective string is in the set of strings denoted by the regular expression.

We define the meaning of regular expressions recursively:

You can assume that all regular expressions are correct, and that ordinary strings do not contain any regular expression operators.

Consider the following examples:

Regular Expression String Matches?
"abc" "abc" yes
"a?c" "abc" yes
"ab" "abc" no
"a+b?c*" "aaaabx" yes
"a+b?c*" "abqcccc" yes
"a+b?c*" "bx" no
  1. Write a function match that takes a regular expression and an ordinary string, and returns a boolean indicating whether the string matches the regular expression or not. Make sure that you do not forget to check degenerate cases!


Prepared by Tibor and Abhishek for CS312 Fall 2002.