Problem Set 2: Data Structures

Due Thursday, September 22, 11:59:59 pm.


Note (9/10/2005): we have added template files. You will extend them to create your solutions

Instructions: This problem set has three parts. You will submit a total of six separate files: part1.sml , part2.sml , part3a.sml, part3b.sml, part3c.sml, and part3d.txt. Modify the .sml template files from ps2files.zip. We will autograde your homework, so do not modify the .sig files, and do not rename or change the signatures of the variables or functions that we provide. The autograder will directly access these name bindings. Do not modify sources.cm, as it will not be submitted. Make sure that your code compiles under CM with the provided sources.cm before submitting! To compile the code with CM, change to the directory containing your PS2 files, start SML, and evaluate

  CM.make();
Non-compiling code will receive an automatic zero. All your files should have your name, netID, and your section  in a comment at the top. Do NOT compress the files into a zip file. You must submit each file individually using CMS.


Part 1: Roman Numbers

A) To learn the Hindu number system (also known as the Arabic number system for some inexplicable reason), the Emperor roMuLus wants to write a function that converts integers in this system to Roman numerals. Write a function

intToRoman : int -> RomanNumber
that converts integers between 1 and 999 to their Roman equivalents. Roman numbers should be elements of the following datatype:
datatype RomanDigit  =  I | V | X | L | C | D | M

datatype RomanNumber = Empty | Cons of RomanDigit * RomanNumber

The following chart shows how integers are written as Roman numerals.

          1  2   3    4  5   6  7    8    9
ones      I  II  III  IV V  VI VII  VIII IX
tens      X  XX  XXX  XL L  LX LXX  LXXX XC
hundreds  C  CC  CCC  CD D  DC DCC  DCCC CM

To convert an int (say 435) into a Roman numeral, translate the hundreds digit (CD), then the tens digit (XXX) and then the ones digit (V), and then concatenate all of them to obtain CDXXXV.

B) Emperor roMuLus has an Indian counterpart Emperor kaMaL who is learning Roman numerals. Help him by writing a function

romanToInt: RomanNumber -> int
that performs the conversion from RomanNumbers to int.

C) Write a function:

romanToString: RomanNumber -> string
which returns a readable string representation of a Roman number.
For this part you must complete and submit the file part1.sml . The file contains a structure RomanNumber containing: You only need to add the implementation for the three functions above.

Part2: Permutations

A 312 student named eMiL wants to write a function
 perms: int -> int list list
that takes an integer n as a parameter, and returns a list containing all permutations of the integers between 1 and n. Each permutation is represented as a list of integers. For example,
 perms(2) = [[1,2],[2,1]]
The order in which permutations are stored is irrelevant. Help eMiL by writing his code for him. Start by writing a function
insertAt: int * int * (int list) -> int list
A call insertAt(e,p,l) should insert an integer e at a position p in a list l.

Use this function to produce new permutations from existing smaller permutations. For instance, to produce all permutations for n = 3, you can extend the permutations of 2 by adding 3 at every available position:

 perms(3) = [[1,2,3], [1,3,2], [3,1,2],   (* extending [1,2] *)
             [2,1,3], [2,3,1], [3,2,1]]   (* extending [2,1] *)

For this part you must modify and submit the file part2.sml. Maintain the structure of the file, simply add an implementation for the functions:
insertAt : int * int * (int list) -> int list 
perms : int -> int list list

Part 3: Priority queues

A new Major League team called aMeLie is about to be formed. The team's owner, Mike Lewis, is looking for good players from other major league teams. To choose these players, he has given each player a priority, which is an integer value that determines how valuable that player is. A priority of 1 means that the player is most valuable. Help Mike by implementing priority queues in three different ways.

A priority queue is a data structure that stores records of type elem shown below. The field value is a string that specifies the last name of the player, while the field key is the priority of the player.

type elem = {key: int, value: string};
For instance an element {key = 1, value = "Galarraga"} can be used to denote a player named Andres Galarraga, and associate him with the highest priority. To compare the priority of two elements, you can use the following function:
fun elemLT (e1: elem, e2: elem): bool =
    let val {key = k1, value = _ } = e1
        val {key = k2, value = _ } = e2
    in
        (k1 < k2)
    end;

Priority queues support two main operations: insert and findMin. Insert adds a new element (with an associated priority) into the structure. FindMin returns the element in the set with highest priority (minimum key value). Whenever findMin is applied on an empty priority queue, the operation must raise an exception:

exception NoElements;

We have divided this part in four subproblems. First you will study three different implementations of priority queues: using sorted lists, binary trees as heaps and binomial heaps. Finally we will ask you to compare your solutions and give insights about your experience.


3a: Sorted list implementation

First we would like to look at priority queues implemented as a sorted list. The priority queue will be a list sorted by the key values. For this sub-problem use the following datatype:
 
type Slist = elem list;
We want you to implement two functions: insert and findMin. The function insert takes an element and a priority queue (as a sorted list) and returns a list with the element inserted in the appropriate position. The input and returned list must be sorted. The function findMin simply takes a priority queue and returns a pair. The first component of the pair is the element in the queue with lowest key value (min). The second component of the pair is the queue produced when removing the min element.
For this subproblem you must modify and submit the file part3a.sml, providing an implementation for the insert and findMin functions:
insert: elem * Slist -> Slist
findMin: Slist -> (elem * Slist)

3b: Heaps using a binary tree

A different implementation of priority queues can be done using complete binary trees, as you know from CS 211. The complete binary tree must maintain the heap property with respect to the element keys. A tree is said to be a heap if the value in a node is smaller than the values of the child nodes. For instance, consider the following trees (showing only the key values):
      2                       2     
     / \                     / \    
    3   7                   8   7   
   / \                     / \  
  6  15                   6  15 
The tree on the left is a heap, but the tree on the right is not (since 6 is less than 8).

To implement this data structure, we will use the following datatypes:

datatype Btree = Nil | E of elem * Btree * Btree; 
type PQTree = Btree * int;
A Btree can be either empty (Nil), or a node (E(x,l,r)), where x is the element stored in the node, l and r are the left and right subtrees respectively.

The int in type PQTree stores the number of elements in the Btree.

Inserting and removing an element in the tree has two challenges: keeping the tree balanced, and maintaining the heap property.


For this subproblem you must modify and submit the file part3b.sml, providing an implementation for the insert and findMin operations.
insert: elem * PQTree -> PQTree
findMin: PQTree -> (elem * PQTree)

3c: Binomial heaps

Binomial heaps are collections of binomial trees. A binomial tree is an ordered tree (based on key values), and is defined recursively. Binomial trees are not binary trees - a node can have more than two children! Given a size parameter k, the structure of a binomial tree is specified unambiguously; we refer to any binomial tree with this structure as being a Bk tree. Thus two instances of Bk trees are identical in structure, but they do not necessarily contain the same keys.

There are no empty binomial trees. Tree B0 consists of a single node. Binomial tree Bk+1 consists of two instances of binomial tree Bk linked so that the root of one instance is the leftmost child of the root of the other instance. Graphically,

Binomial tree Bk has height equal to k, it contains 2k nodes, and its root has degree k (it has k children, the most of all nodes in the respective tree). We show below the structure of Bk for k = 0, 1, 2, 3:

An alternative, but equivalent definition of Bk+1, is a tree which root has k+1 children and each children is the root of a binomial B0 to Bk:

The directions of the edges linking nodes to their children do not matter; what matters is the number and the order of the children. Each node of a binomial tree has a key that is smaller than the key of any of its descendants. By convention, we will refer to the leftmost child of a node as being the first child of the respective node.

A binomial heap consists of an array of binomial trees. We say that a binomial tree Bk has rank k. There is at most one binomial tree of the same rank in the binomial heap. It is possible for a binomial heap to contain no trees, i.e. to be empty. A binomial heap containing a total of n nodes will contain exactly m binomial trees, where m is the number of digits 1 in the base-2 representation of n. In addition, the heap will contain a tree of rank r for all values r that are the exponents of 2 that correspond to digits 1 in the binary expansion of m. For example, if a heap contains 11 nodes (1011 in binary), then the heap will contain an instance each of B0, B1, and B3.

We now provide SML definitions for binomial trees and heaps:

datatype BinomialT = HE of elem * BinomialT list;

datatype 'a Maybe = Zero | One of 'a;

type Bheap = (BinomialT Maybe) list;

We note that the datatype constructor HE is not necessary, we could use only a pair of type elem * BinomialT list. The reason for this is that we don't have empty binomial trees. We chose to use a datatype constructor to make the various constructs more easily recognizable. The leftmost child of a node is the first in the list of its children. The heap type presents an encoding of an array using lists. If the ith element of a list is Zero, means the array is empty at the position i. Otherwise, when the ith element of the list is One(bt), means the position i of the array holds a binomial tree (bt) which must be an instance of Bi. For instance, a valid heap would be a list: [One(B0), One(B1) , Zero, One(B3)].

We want you to implement the operations insert and findMin for binomial heaps. Inserting an element consist of adding a binomial B0, and then recursively normalize the heap structure to maintain that there is at most one binomial tree for each rank. This can be done by merging binomials of the same rank into a binomial of the next rank. For example, the following diagrams show how the insertion happens step by step:

Step
Input
Adding new elem
Normalizing
After normalization
The merging of binomials is quite simple: add one binomial as child of the other, choosing as the parent the one with smaller root.

findMin needs to look for the minimum element (which is the root of one of the binomial trees), and removing the element. The removal will separate the children of that binomial tree into smaller binomials. A normalization of the structure may also be necessary. For example, in the picture below suppose the min element is the root of B2. The findMin operation would change the binomial heap as follows:

Step
Input
Removing element
After normalization

For this subproblem you modify and submit the file part3c.sml, providing an implentation for the insert and findMin operations.
insert: elem * Bheap -> Bheap
findMin: Bheap -> (elem * Bheap)

3d: Discussion


For this subproblem you must submit your answers in the file part3d.txt.