CS 312 Problem Set 3
Implementing a Data Abstraction

Issued: February 11, 2003
Due: February 25, 11:59pm

Instructions: you will do this problem set by modifying the source files in ps3.zip and submitting the program that results. The program that you submit must compile without any warnings. Programs that do not compile or compile with warnings will receive an automatic zero.

This problem set is not algorithmically difficult. The goal is to do a high quality implementation of a data abstraction. Proper code documentation is very important; think about what specifications and other comments the program ought to include, and write them carefully.

Some parts of the assignment ask you to justify design decisions in a way that would not be appropriate for production code; place such justifications in a comment that reads:

(* Design justification: <your justification here> *)

If it is necessary to change the problem set specification add a comment that reads:

(* Specification change: <description and justification here> *)

It should not be necessary to do this; we recommend talking to the course staff if you are.

Background: A multiset (or bag) is a set that may contain more than one occurrence of a particular element. An ordinary set (e.g., {1,2,3}) cannot contain duplicate elements; a multiset (e.g. {1,1,1,2,2,3}) can. Note that multiset do not keep track of any ordering of their elements, so the multisets {1,2,2,3} and {2,3,1,2} are equal. However, {1,2,2,3} is not equal to {1,2,3} because they do not contain the same number of 2's. Multisets are useful for a variety of purposes; for example, keeping track of a collection of experimental measurements.

In this problem set you will finish defining a multiset data abstraction, implement a module that satisfies this abstraction, and test your module. The main goal for this problem set is to write correct code with clear specifications, and to do a good job of testing this implementation. We will be looking for code that is easy to understand and maintain. The run-time efficiency of your implementation will carry some weight but will be a secondary consideration.

Part 1: Multiset specification (20 pts)

Before you begin implementing, you need to finish the specification. The existing specifications may be incomplete, incorrect, or nonexistent. If you find it desirable to change the specification of a function in a way that is not a refinement of the specification given, include an additional comment justifying your design change. Think carefully about what specification would be a good design for each function from the standpoint of both user and implementer.

Some exceptions are declared in the current interface. You are not allowed to add new exceptions to the interface. If you specify that an operation raises an exception under some condition, it should be one of the existing exceptions.

To do: Correct and complete the specifications in the file ms.sig.

Part 2: Multiset implementation (40 pts)

Representation. Your first task is to decide how you will represent multisets. You should consider what functions are likely the most important and select a representation that will result in simple and reasonably efficient code. Add a design justification explaining your reasoning.

type 'a multiset = your implementation here

Along with the representation type, include a comment giving the abstraction function and representation invariant.
In addition you should implement the function repOK. This function is an identity for any multiset that satisfies the representation invariant; for representations that do not satisfy the invariant, repOKraises Fail.

val repOK : 'a multiset -> 'a multiset

Simple operations. Implement the empty function, which creates a new, empty multiset.
```
val empty : ('a * 'a -> order) -> 'a multiset
```
It takes in a comparison function. This will help you if you want to keep your internal representation sorted, because >, <, and = are not defined on all datatypes (e.g., reals). You can assume that the comparison function is well-behaved (it correctly implements a total order on elements of type 'a).
Implement add and delete, which allow items to be added and removed from a multiset. delete removes one element. Remember that this is a functional implementation, so both operations produce a new multiset.

val add : 'a * 'a multiset -> 'a multiset
Example: add(42,{3,42}) = {3,42,42}

val delete: 'a * 'a multiset -> 'a multiset
Example: delete(42,{3,42,42}) = {3,42}

The function member is used to determine if an element exists in the multiset. The function frequency returns the number of occurrences of a given element in the multiset, and isEmpty returns true if there are no elements in the multiset and false otherwise. The function numItems returns the number of elements in the multiset. Implement all of these functions.

val member : 'a multiset * 'a -> bool

val frequency : 'a multiset * 'a -> int
Example: frequency({1,2,3,3},3) = 2
         frequency({1,2,3,3},42) = 0

val isEmpty : 'a multiset -> bool

val numItems : 'a multiset -> int
Example: numItems({1,1,2}) = 3

The union operation creates a new multiset with all of the elements in each multiset.

val union : 'a multiset * 'a multiset -> 'a multiset
Example: union({3,1,2},{4,2})={1,2,2,3,4}

Converters. The fromList operation creates a multiset from a list, and toList creates a list from a multiset. The list should have one element for each element in the multiset.

val fromList : 'a list * ('a * 'a -> order) -> 'a multiset
Example: fromList([1,2,3,2], Int.compare) = {1,2,2,3}
         fromList([], Int.compare) = {}

val toList : 'a multiset -> 'a list
Example: toList({1,2,2,3}) = [1,2,2,3]
         toList({}) = []

Higher-order operations. fold folds over each element in the multiset. If an element occurs more than once it should be folded over more than once. There is only one fold operation because multisets do not order their elements.

val fold : ('a * 'b -> 'b) -> 'b -> 'a multiset -> 'b

The filter function returns a multiset with only the elements for which the given function returns true.

val filter : ('a -> bool) -> 'a multiset -> 'a multiset

tabulate creates a new multiset of size n where each element is calculated by applying a function to 0..n-1.

val tabulate : int * (int -> 'a) * ('a * 'a -> order) ->'a multiset
Example: tabulate(5, fn (x:int) => x mod 2, Int.compare) = {0,0,0,1,1}

The nMin and nMax operations return a new multiset containing the n smallest or n largest elements in the given multiset, as defined by the ordering from the comparison function.

val nMin : int * 'a multiset -> 'a multiset
Example: nMin(2,{1,2,2,2,3,3}) = {1,2}

val nMax : int * 'a multiset -> 'a multiset
Example: nMax(2,{1,2,2,2,3,3}) = {3,3}

To do: select and document a representation for multisets and implement repOk, empty, add, delete, member, frequency, isEmpty, numItems, fromList, toList, fold, filter, tabulate, union, nMin, and nMax in the file ms.sml.

Part 3: Add your own operation (7 pts)

In the previous part, you implemented a number of functions for manipulating multisets. These are not the only operations that one might like to have available for use on multisets. Think of another useful operation that is worth adding to the multiset interface, add its specification to ms.sig along with a justification for your choice, and implement it in ms.sml.

To do: Add appropriate code and comments to ms.sig and ms.sml.

Part 4: Black-box testing (20 pts)

For this problem, you will be creating test cases that test the functionality of multisets. The signature testms.sig looks as follows:

signature TESTMS = sig

  datatype fn_tests = EmptyMS | TabulateMS | AddMS | DeleteMS | FromlistMS |
                     TolistMS | IsemptyMS | MemberMS | FrequencyMS | NumitemsMS | FoldMS |
                     UnionMS | FilterMS | NminMS | NmaxMS

  datatype test_result = PASS of fn_tests | FAIL of (fn_tests * string)

  type test = unit -> test_result
  type test_generator = unit -> test list

  (* Returns: all 20-25 tests *)
  val allTests : test_generator

  (* Returns: five good tests, at most one per testable function *)
  val hardTests : test_generator

  (* Effects: Prints out the results from all tests in the list. *)
  val printTests : test_result list -> unit

  (* Performs unit testing on the multiset ADT and returns a list of
   * results from the tests returned by the test_generator. *)
  val runTests : test_generator -> test_result list
end

In testms.sml, add functions of the type unit->test_result. Each function should test a different aspect of a multiset operation. You may find it useful to use toList and fromList in your testing.

Each testing function returns a test_result that indicates whether the test passed or failed and which function it was testing. If the test failed the test result contains a string with details about how the test failed.

We have already implemented printTests, so you can view the results of your test cases by executing printTests(runTests allTests).

While you should be testing all of your functions (and note that we will test all of your functions ourselves), you should only submit test cases for empty, add, delete, fromList, and nMax. You should test both basic functionality and corner cases. To isolate individual functions for testing, you may find it useful to define additional identifiers in the structure. These identifiers, bound to useful values (e.g., multiset constants) can then be used in the test cases proper. Each test case can then use just the one function that is to be tested.

We will run some of your test cases against some other buggy implementations of multisets to see how good your test cases are. Implement hardTests() so it returns a list of these test cases. The list should contain exactly five tests, one for each of the five functions empty, add, delete, fromList, and nMax. Remember that you can only interact with multisets through the signature. You cannot assume anything about the underlying implementation; therefore, you should write black-box tests.

To do: create 20–25 test cases that test empty, add, delete, fromList, and nMax, and implement allTests() in testms.sml that returns them. Also implement hardTests() so it returns a list of just 5 of these test cases suitable for testing other multiset implementations.

Part 5 : Written problem (10 pts)

Consider the following code:
```
fun f h i g =
  case g of (g, 3) => h g
          | (_, 4) => i g
          | (f, 0) => i (f,f)
          | (g, j) => f h i (j-1, g-1)
```
What is the type of the function f? Justify your answer. (Hint: look at the types of h, i, and g. Watch out for shadowing!)

Given the following datatype definition:

datatype T = Roll of int*T -> int

Write out each step of the evaluation of the following expression according to the evaluation model given in class.

let fun factr(x:int, f: T) : int  =
  case (x, f) of
    (1, _) => 1
  | (y, Roll(g)) => x * g(y-1,f)
in
  factr(3, Roll(factr))
end

How can writing good specifications protect you from malicious co-workers? Discuss briefly.

To do: Hand in the solutions to these problems in a file written.txt. These must be in ASCII text format, and all lines should be shorter than 78 characters.

CS 312 Problem Set 3 Implementing a Data Abstraction