Problem Set 3: Specification and Testing

Due Thursday, February 22, 2007, 11:59 pm.

Last updated February 17. Changed text is in this style. Changes:

In this problem set, you will implement two abstract data types. Although there are a few tricky parts, getting the code to work should not be too difficult; the challenging part is to produce code that is as clear and well specified as possible, following the methodology given in class. We will be setting a high standard.

What to do: Write your solutions in the appropriate files and submit the files separately on CMS. To get you started, we are providing a zip file containing several stub files. You can edit the necessary files, and submit your solution using CMS when you have finished. Remember that non-compiles and warnings will receive an automatic zero.

You will compile multiple structures at once using the Compilation Manager (CM). We are providing a file that contains a list of all files that need to be compiled. Use the compilation manager by typing CM.make(); at the command prompt. This command will automatically compile all of the files that have changed since the last time CM was invoked.

Part 1: Multisets

A multiset (also known as a bag) is a set that may contain duplicate elements. For example, {1,1,2,2,2,3} is a multiset that happens to contain some duplicate elements. Of course, an ordinary set cannot contain duplicates. As with ordinary sets, the order of the elements is not important, so {3,1,2,1} and {2,1,1,3} are the same. However, {3,1,2,1}, {1,2,3}, and {1,1,1,2,3} are different multisets because they do not contain the same number of 1's.

For this problem, you will create a multiset ADT, including several multiset operations. This will be a parameterized multiset, that is, a multiset that can contain any type of value, not just integers:

type 'a multiset =  (* Your type goes here *)

The functions that you will need to implement are as follows:

val empty: ('a * 'a -> order) -> 'a multiset

This function creates an empty multiset, which we write as ∅ or {}. Note that the parameter is a comparison function—this function is used to implement union, intersect, and difference. Because this multiset is parameterized, it can be used for types or datatypes that do not have =, >, and < defined.

val add: 'a * 'a multiset -> 'a multiset
val remove: 'a * 'a multiset -> 'a multiset

These functions should add and remove elements, respectively. The remove function should only remove one element.

add(42, {3,42}) = {3,42,42}
remove(42,{3,42,42}) = {3,42}
val frequency: 'a * 'a multiset -> int
val isEmpty: 'a multiset -> bool

The frequency function should return the number of times a given element occurs in the multiset, and isEmpty should return true if and only if the multiset contains no elements.

frequency(3,{1,2,3,3}) = 2
frequency(42,{1,2,3,3}) = 0
isEmpty({}) = true
val union: 'a multiset * 'a multiset -> 'a multiset
val intersect: 'a multiset * 'a multiset -> 'a multiset
val difference: 'a multiset * 'a multiset -> 'a multiset

These functions should perform the usual set operations, extended to take element frequency into account.

Let s1 = {1,2,3,4,4,5}, and s2 = {3,4,5,5}. Then,
union(s1,s2) = union(s2,s1) = {1,2,3,3,4,4,4,5,5,5}
intersect(s1,s2) = intersect(s2,s1) = {3,4,5}
difference(s1,s2) = {1,2,4}
difference(s2,s1) = {5}
val fromList: 'a list * ('a * 'a -> order) -> 'a multiset
val toList: 'a multiset -> 'a list

These functions convert to and from lists. The list should have one element for each element in the multiset.

fromList([1,2,3,2], = {1,2,2,3}
fromList([], = {}
toList({1,2,2,3}) = [1,2,2,3] (or [3,2,1,2], etc.)
toList({}) = []
val fold: ('a * 'b -> 'b) -> 'b -> 'a multiset -> 'b
val exists: ('a -> bool) -> 'a multiset -> bool

These are standard higher-order operations on data structures. Like List.fold, The fold function should apply the argument function to each element and produce an accumulated value. The function should be applied once per occurrence of each element, in case an element appears more than once in the multiset. The function exists should return true if the multiset contains at least one element for which the argument function evaluates to true.

The first step is to think about how you want to design your multiset, considering the functionality required. Choose your representation for multisets carefully. You might think to represent multisets as lists:

type 'a multiset = 'a list

and to use list operations to implement the multiset operations. However, this implementation would likely be inefficient for some applications. You should assume that clients of the multiset abstraction will be using multisets with many duplicate occurrences, and design your implementation accordingly. However, you should not need to create any fancy data structures to accomplish this.

Make sure to carry out the following tasks:

  1. Document any clarifications or changes to the problem specification at the top of the source file in a comment with the form, (* SPECIFICATION CHANGES... *).
  2. Define the type that represents 'a multiset. Choose wisely.
  3. Make sure to include the representation invariant (RI) and abstraction function (AF) in a comment provided above the representation type you defined.
  4. Implement the function repOK: 'a multiset -> 'a multiset, which raises an error if the input set does not satisfy the RI and otherwise acts as an identity function.

  5. Correct the incomplete/buggy function specifications defined in multiset.sig.
  6. In multiset.sml, implement the specified functions so that the run time of any operation (except fromList) is at worst O(n2) in the number of unique elements or O(m) in the total number of elements within the multiset. Runtime of fromList should be O(nm) where n is the number of unique elements and m is the total number of elements. Include an informal justification of the runtime in a comment (just a few sentences).

    (For karma points, make your implementation better than O(n2). Could you do better?.

To do: submit the files multiset.sml and multiset.sig.

Part 2: Interval arithmetic

An interval is a connected subset of the real numbers. For example, we write [x,y] to mean the interval that is set of all real numbers between x and y. In general, an interval is any set of real numbers such that if a and b are in the set, then every c such that a ≤ c ≤ b is also in the set. Therefore, the empty set ∅ is also an interval. More information about intervals may be found in Wikipedia.

In this problem, you will implement an abstract data type for closed intervals on the extended real numbers (The extended reals are the real numbers augmented with positive and negative infinity. Closed intervals include all limit points of Cauchy sequences within the interval.). So we will not worry about open intervals such as (0,1) or half-open intervals such as [0,∞). Examples of valid closed intervals include [100, 200], [-1, 1], [10,10], ∅, [-∞, ∞], [-∞, 10], and [1, ∞].

For this problem, your task is to implement a number of operations on intervals. First, you will need to define how you will represent the type interval. Your representation should take into account the possibility of all the different cases above. In a comment above the type you define, write out your representation invariant and abstraction function for your implementation of the interval type. Implement the function repOK as in Problem 1.

type interval = (* Your type goes here *)
val repOK: interval -> interval

Also implement the toInterval, lowerBound, upperBound functions, which translate between reals and intervals.

val toInterval: real * real -> interval
val lowerBound: interval -> real
val upperBound: interval -> real
val isEmpty: interval -> bool

It is probably useful to note that the type real directly supports values representing infinities: Real.posInf and Real.negInf. Reals also have a value NAN (not a number) which should not appear in any interval. You can test for NAN with Real.isNan.

You will also implement interval arithmetic on these intervals. Interval arithmetic lifts ordinary arithmetic operations to work on intervals, by returning the smallest interval that contains all the possible results that could be obtained by applying the arithmetic operation in question to elements of the operand intervals. For example, to add two intervals I and J, the result should be the smallest interval containing all possible results: {z | ∃ x ∈ I and y ∈ J and z = x+y}. Since this set is always an interval, it is in fact the correct answer. For example, [1,2] + [3,5] = [4,7] but [1,2] + ∅ = ∅. Because results conservatively include all possible outcomes, interval arithmetic is useful for bounding the results that can come out of a computation where some of the inputs are only approximately known.

The remaining functions you will need to implement are:

val point: real -> interval

The point function creates an interval starting and ending at the given real value.

val plus: interval * interval -> interval
val times: interval * interval -> interval

The plus and times functions perform addition and multiplication on intervals, in the standard fashion.

val neg: interval -> interval
val abs: interval -> interval

The neg and abs functions respectively negate an interval (switch the sign of both ends) and take the absolute value of an interval (putting the whole interval in the positive reals).

val pow: interval * int -> interval

The pow function raises both components of an interval to the specified integer power.

val recip: interval -> interval
val sqrt: interval -> interval
val exp: interval -> interval
val ln: interval -> interval

These functions perform their respective mathematical operations of the reciprocal, the square root, applying the exponential function, and applying the natural logarithm to the interval.

Here are a few examples of using these operations:

[1.0, 2.0] + [3.0, 4.0] = [4.0, 6.0] (plus)
[1.0, ∞] + [-∞, 4.0] = [-∞, ∞] (plus)
[1.0, 2.0] * [3.0, 4.0] = [3.0, 8.0] (times)
|[~1.0,2.0]| = [0.0, 2.0] (abs)
[3.0, 4.0] ^ 2 = [9.0, 16.0] (pow)
1/[1.0, 2.0] = [0.5, 1.0] (recip)
1/[0.0, 1.0] = [1.0, ∞] (recip)

Make sure to carry out the following tasks:

  1. Document any clarifications or changes to the problem specification at the top of the relevant source file in a comment with the form, (* SPECIFICATION CHANGES... *).
  2. Define the type that represents interval. Choose wisely.
  3. Make sure to include the representation invariant (RI) and abstraction function (AF) in a comment provided above the representation type you defined.
  4. Implement the function repOK: interval->interval, which raises an error if the input set does not satisfy the RI and otherwise acts as an identity function.
  5. Make any necessary corrections or clarifications to the function specifications defined in interval.sig.
  6. Implement all the interface operations in interval.sml. Make sure to think about all the cases. Watch out for negative numbers, zero, and infinities!

To do: submit the files interval.sig and interval.sml.

Finding zeros using intervals

One use of intervals is to assist in finding the roots (zeros) of functions. Consider, say, the function f(x) = 6 - x*x*x. For a continuous function such as this, if we know an interval on which it has a zero, we can find the zero of f on that interval using binary search. Write a function that finds a zero of a this function f on any interval, assuming that the interval contains a zero. It should return the value of the zero to within a relative or absolute accuracy of 10-4; that is, if z is the value of the zero and z' is the value returned, then |(z'-z)/z| ≤ 0.0001 or (z'-z) ≤ 0.0001.

Your task is to first implement a version of this function f that operates on intervals, and second to implement the function find_zero: interval -> real, which should compute the cube root of 6.

To do: submit the file interval-fz.sml.

Unit testing

Please submit a file named test_cases.txt, listing the test cases you used to validate the functions in interval.sig.

For each function, you should provide a small number of representative test cases that cover all of the inputs that you consider interesting. As a whole, your tests should verify that your functions behave correctly for obvious inputs as well as tricky corner cases. You should also check that your functions handle invalid input as specified, when required.

Try to avoid redundant test cases. There’s no reason to submit test cases that check that both neg [1.0, 2.0] and neg [1.0, 3.0] behave correctly. These test cases verify similar behavior, and it is very unlikely that one would fail while the other would succeed.

Please write the function name, parameters, and expected output in the following format:

plus [1.0, 2.0] [3.0, 4.0] = [4.0, 6.0]
lower_bound [~1.0, 1.0] = ~1.0
sqrt [~2.0, ~1.0] = empty (* the empty interval *)

When we grade your test cases, we will a) check that you accounted for enough interesting inputs and b) don’t have a lot of redundant test cases. If you would like to justify a particular testing decision, for whatever reason, then feel free to write a short explanation at the end of the file.

You should develop a good testing strategy and practice it throughout the assignment. We recommend that you create a test harness so that you can run all of your tests automatically, although you are not required to use a specific strategy. Please look at the course notes on testing to learn more about testing strategies.

Please include a very brief description of your testing strategy after listing your test data. This description should only be about two or three sentences, and it doesn't need to be very detailed.

To do: submit a file named test_cases.txt, listing the test cases you used to validate each function in interval.sig with a very short description of your testing strategy. Test carefully. If you incorrectly describe the result of running your code on your test cases, you will receive no credit for testing.

Part 3: Specification refinement

Consider the following six possible specifications for a function

val f: int -> int
  1. (* Returns: f(x) >= 0 *)
  2. (* Returns: f(x) >= 0. Requires: x > 0 *)
  3. (* Returns: f(x) is the natural log of x. Requires: x > 0 and x = ey for some y*)
  4. (* Returns: f(x) >= 0 and f(x) < 100 *)
  5. (* Returns: f(x) >= 0 and f(x) < x. Requires: x > 0 *)
  6. (* Returns: f(x) >= 0. Requires: x is even *)
  7. (* Returns: 100 > f(x) >= 0. Checks: x is even *)

Draw a diagram that describes the refinement relation among these specifications. The diagram should contain the names A through G. If specification 1 is a refinement of specification 2, draw a line between the two corresponding letters, and make sure that the name of specification 2 is higher in the diagram than that of spec 1. For example, the diagram should contain at least the following arcs:


A line between 1 and 2 may be omitted if there is some other specification in the diagram that is a refinement of spec 2 and that spec 1 refines. In the above diagram the line between D and B is omitted even though D refines B. (This kind of diagram is known as a Hasse diagram, after its inventor.)

To do: Hand in the solutions to this problems in a file written.txt. These must be in ASCII text format, and all lines should be shorter than 78 characters.

Part 4 (Karma problem): Higher-order fun

Consider the following four higher-order functions.

fun peel f x = (f (x div 10) (x mod 10))
fun sub f x y = (f (Int.abs(y*y-x)))
fun swap f x y = (f y x)
fun dummy x = x

We can think of these four functions as basic building blocks, and construct new functions with them, without using any operators, literals, constants, keywords, functions, etc. For instance, we can define "val magic0 = peel(sub dummy)".  The problem asks you to write four magic expressions magic1, magic2, magic3 and magic that will yield the values indicated below.  For each answer, provide a high-level explanation (i.e., not the step-by-step evaluation) of why they produce the desired results:

magic1 32 = 7
magic2 4 = 5
magic3 151 = 15
magic 1337 = 42

(Maybe) To do: Turn in the solution in the file hofun.sml.