Problem Set 3: Abstractions and Specifications

Due Wednesday, October 3, 2007, 11:00 pm.

Sep 24: Part 2, f : Fold must perform a preorder (not inorder) traversal. Also, in part 1f, the function times() was missing from poly.sig and poly.sml. The files have been updated in ps3.zip.

Instructions

This problem several parts. You will write your solutions in the appropriate files and submit the files separately on CMS. To get you started, we are providing a zip file ps3.zip containing several files. You can edit the necessary files, and submit your solution using CMS when you have finished. Remember that non-compiles and warnings will receive an automatic zero.

You will compile multiple structures at once using the Compilation Manager (CM). We provided a file sources.cm that contains a list of all files that need to be compiled. Use the compilation manager by typing CM.make(); at the command prompt. This command will automatically compile all of the programs that have changed since the last time CM was invoked.

[Download the zip file ps3.zip]

Part 1: Multivariate Polynomials

A multivariate polynomial is a polynomial with multiple variables. For instance, P(x,y) = x²y+ 3xy³ - 2y + 1 is a polynomial with two variables. Since the actual names of variables are not relevant, we can use canonical names: x₁ for the first variable, x₂ for the second, and so on. For instance, the polynomial P becomes: P(x₁,x₂) = x₁²x₂+ 3x₁x₂³ - 2x₂ + 1. A polynomial is a sum of terms, where each term is the product between an non-zero integer coefficient and some non-negative, integer powers of the coefficients. That is, each term is of the form c₁ x₁^p1... x_n^pn. In the example above, x₁²x₂, 3x₁x₂³, -2x₂, and 1 are the four terms of polynomial P.

Polynomilas will be implemented as follows. Each term c*x₁^p1*...*x_n^pn is implemented using a record that contains the coefficient c and powers p1, ..., pn: {coeff=c, powers=[p1, .., pn]}. Terms with coefficients equal to zero are omitted. A polynomial is a record that holds the number of variables and a sorted list of terms. Polynomial zero has no terms.

type term = {coeff:int, powers:int list} type poly = {nvars:int, terms:term list}

For instance, the above polynomial P is represented as:

{ nvars=2, terms=[{coeff=1, powers=[2,1]}, {coeff=3, powers=[1,3]}, {coeff=~2, powers=[0,1]}, {coeff=1, powers=[0,0]}] }
The list of powers in each term must have length equal to the number of variables in the polynomial. This is also the case for terms where some, or all the powers are zero. For instance, {coeff=~2, powers=[0,1]} or {coeff=1, powers=[0,0]}. The list of terms is sorted by the lexicographic ordering of the powers in each term, in decreasing order. In the example polynomial P, term x₁²x₂ must occur before term 3x₁x₂³ (because power of x₁ decreases), and -2x₂ must occur before term 1 (because power of x₁ is the same, but the power of x₂ decreases). The provided function comparePowers yields the lexicographic ordering of two lists of powers. Finally, no duplicate terms are allowed in a polynomial, i.e., 2xy + 4x + 3xy should be 5xy + 4x.

Work to be done:

Specifications. Before beginning work on implementation, you need to complete the specifications. Take a look at poly.sig. The existing specs may be missing, incomplete, or even incorrect. Examine each function and decide what needs to be properly specified. You should not add new exceptions to the interface. In case of errors, the functions will raise Fail.
Document your implementation with a description of the abstraction function and of the representation invariant.
RepOK. Write a repOK function that checks if a polynomial satisfies the representation invariant. If it does, then the same polynomial is returned, otherwise Fail is raised. Document your repOk function with the representation invariant that this function checks.
```
val repOK : poly -> poly  
```
Operations. Implement the following two functions that construct polynomials. Function zero builds a polynomial with value zero; its argument represents the number of variables. Function singleton builds a polynomial representing a single term. Given a coefficient c and a list [p₁,...,p_n], singleton builds the polynomial c *x₁^p1*... *x_n^pn.
```
val zero : int -> poly
val singleton : int * int list -> poly
```
Write a function that evaluates a polynomial for given values of its variables. The values of variables are provided as a list of integers:
```
val eval : int list * poly -> int
```
Write functions that performs the addition and multiplication of polynomials:
```
val plus  : poly * poly -> poly
val times : poly * poly -> poly	 
```
We have provided an incomplete function that generates a string representation of a polynomial, with variable names starting from 'a' and going up to 'z'. Hence, it requires that the polynomial has at most 26 variables. Improve this function in the following ways. First, do not print variables with zero powers. Second, do not print powers of 1. Third, do not print the coefficient 1, except for the constant term 1. For instance the result for the above polynomial should be: "a^2b+ 3ab^3 - 2b + 1"
```
val toString : poly -> string	  
```

To do: Submit your solutions in the poly.sig and poly.sml files.

Part 2: Prefix Trees

Several algorithms, including the compression algorithms and string matching algorithms, make use of dictionaries where the keys are strings (or sequences) of values of a certain type. A prefix tree or trie (pronounced "try") is an efficient data structure for implementing the map ADT for this kind of maps. The set of possible elements in the strings is referred to as the alphabet of the trie.

A trie is a tree structure whose edges are labeled with the elements of the alphabet. Each node can have up to N children, where N is the size of the alphabet. Nodes correspond to strings of elements of the alphabet. More precisely, each node corresponds to the sequence of elements traversed on the path from the root to that node. The map binds a string s to a value v by storing the value v in the node of s. The advantage of this scheme is that if efficiently represents common string prefixes, and lookups or inserts are achieved using quick searches starting from the root.

Suppose we want to build a map from strings of ASCII characters to integer codes. In this case, edges are labeled with ASCII characters, and integer codes are stored in (some of) the nodes. Here is an example:

The above trie describes the map {"to"->7, "tea"->3, "ten"->12, "in"->5, "inn"->9}. The values (shown in blue) are stored in the nodes. Note that nodes might not have values. For instance the node that corresponds to string "te" doesn't have a value. Hence, "te" is not in the map.

Your task is to do the following:

Interface specifications. Before you begin implementing the trie data structure, you need to finish its specifications in the signature TRIE. The existing specifications may be incomplete, incorrect, or nonexistent.
Representation. You must decide how you will represent a trie. Briefly inspect the functions below to decide if certain pieces of information would make their implementation simpler. Then define an appropriate trie type.
```
type 'a trie = (* your implementation here *)
```
Next to the representation type write a comment giving the abstraction function and representation invariant for tries.
Implement the function repOK. If the input trie satisfies the representation invariant, then repOK simply returns the input trie. Otherwise, repOK raises Fail.
Basic operations. Implement the empty value, representing an empty trie. Then implement the functions put, get, size, and tabulate. The put function adds a new mapping to a trie. If the key already exists in the trie, its binding is updated. The function get returns the binding of a string in the trie. If the string does not exist in the trie, then an exception NotFound must be raised.
```
val empty: 'a trie
val put: 'a trie * string * 'a -> 'a trie
val get: 'a trie * string -> 'a
```
Note: the course staff will use these functions to generate the inputs for testing other functions. Therefore, be careful to implement them correctly.
Additional operations. Implement the following additional operations. Function size returns the size (i.e., number of bindings) of a trie; tabulate n f creates a new trie with n bindings, where binding k is f(k), for k between 0 and n-1; and fold f v t. performs folding over trie t, starting with initial accumulator value v, applying function f at each step. Computing the size of the trie must be an O(1) operation. Fold must visit key-value bindings using a preorder traversal of the trie.
```
val size: 'a trie -> int
val tabulate: int -> (int -> string * 'a) -> 'a trie
val fold: (string * 'a * 'b -> 'b) -> 'b -> 'a trie -> 'b
```
Sorting. Write a function sort: string list -> string list that sorts a list of strings using a trie. The returned list must be sorted in ascending lexicographical order. The algorithm must have O(n) complexity, where n is the sum of the lengths of all strings.

To do: turn in the solution in the files trie.sig and trie.sml.

Part 3 :Specification Refinement

Consider the following six possible specifications for a function f of type int*int -> int :

Returns: f(a, b) = a*b
Returns: f(a, b) >= 0
Returns: f(a, b) = max(a, b). Checks: a, b >= 0
Returns: f(a, b) >= b
Returns: f(a, b) = a*a
Returns: f(a, b) >= 0. Requires: a is even

Draw a diagram that describes the refinement relation among these specifications. The diagram should contain the names A through G. If specification 1 is a refinement of specification 2, draw a line between the two corresponding letters, and make sure that the name of specification 2 is higher in the diagram than that of spec 1. For example, the diagram should contain at least the following arc:

     B
     |
     E

When there is a line between 1 and 2, and between 2 and 3, the line between 1 and 3 can be omitted in the diagram. (This kind of diagram is known as a Hasse diagram, after its inventor).

To do: turn in the solution in the files written.txt.

Part 4 (Optional) Higher-order fun

The following problem will count for 5 bonus points (but the total max is still 100). Consider the following four higher-order functions.

fun peel f x = (f (x div 10) (x mod 10))
fun sub f x y = (f (Int.abs(y*y-x)))
fun swap f x y = (f y x)
fun dummy x = x

We can think of these four functions as basic building blocks, and construct new functions with them, without using any operators, literals, constants, keywords, functions, etc. For instance, we can define "val magic = peel(sub dummy)". The problem asks you to write three magic expressions magic1, magic2, and magic3 that will yield the values indicated below. For each answer, provide a high-level explanation (i.e., not the step-by-step evaluation) why they produce the desired results:

magic1 42 = 14
magic2 4 = 5 
magic3 312 = 62

Turn in the solution in the file hofun.sml.