CS410, Summer 1998, Homework 3

CS410, Summer 1998 Homework 3

Due: 11:35 AM, Tuesday July 14 (Bastille Day)

Last update: July 7, 10:35PM.

Note: No late homework will be accepted.

Note: You are responsible for reading and understanding the course policy on academic integrity.

Begin with an empty red-black tree. Do the following operations in this order and show the tree after each operation. Assume the numbers below are the keys and the ordinary < relation is used. (Remember always to color the root black.)
- Insert 1
- Insert 2
- Insert 3
- Insert 4
- Insert 5
- Delete 4
- Insert 4
- Delete 3
- Delete 4
- Delete 5
- Delete 1
Be sure to color all your nodes. (Feel free to use "clear" instead of "red".)
Repeat the previous problem with 2-3 trees. Obviously, you will not include any colors.
Prove the following: If the difference between the longest and the shortest root-to-leaf path in a binary tree is a constant factor, then the length of the longest path is O(log n). That is, if the tree has n nodes, its shortest path has length s and its longest path has length h, and h <= ks, then h is O(log n).

Hint: Do not use induction. Lower bound the number of nodes in the tree in terms of h, explain the correctness of your bound, and conclude that h is O(log n).
CLR Exercise 14.2-5, page 267.
Suppose you have records (for example, addressbook entries) and you want to be able to look them up by either of two different fields (for example, name or phone number). How would you implement a dictionary that efficiently handled insertion, deletion, and lookup by either field. Hint: This is an easy problem.
Normally we do not allow duplicate keys in a dictionary. Suppose we want to use a BST to implement a dictionary that allows duplicated keys, and inserts/deletes/searches for all elements with a particular key when asked. We will have to modify our definition of a BST of course.
1. Suppose we modify our definition such that duplicated keys can be in either subtree of a node with the same key. Explain why this would be a bad idea.
2. Suppose we modify our definition such that duplicated keys are always in the left subtree. Explain why this could be just as bad an idea.
3. Come up with a good idea. Your solution should support the search and delete in time O(d + log n) and insert in time O(log n) where n is the size of the tree and d is the number of nodes with the asked-for key. (Hint: This is not hard.)
In this problem, we want to augment the 2-3 tree data structure so that it can also support the operation rank(key) which takes a key and determines how many items in the tree have keys less than key. rank(key) should run in time O(log n).
1. This requires we have extra information at the internal nodes of the 2-3 tree. Describe the extra information you need and how you use it to implement rank.
2. Maintaining the extra information requires additions to the insert and delete operations. Describe these additions. Be sure the additions do not violate the O(log n) running time of the operations.
3. Suppose we now want to add the operation numBetween(key1, key2) which returns the number of items in the tree with keys between key1 and key2, not counting key1 or key2. How would you implement this operation. (You may assume key2 >= key1.)