CS410, Summer 1998 Homework 7

Due: 11:35 AM, Tuesday August 4

Last update: July 30, 10:00AM.

Note: You are responsible for reading and understanding the course policy on academic integrity.

  1. Write a program to find the intersection of two sets in time O(nlog n) in the comparison model (where n is the total number of elements). Note: Although we did not emphasize this in class, the comparison model allows "equals" as well as "less than".

  2. Merge sort is list-friendly except that we must follow n/2 pointers before we can divide the list in half (where n is the length of the list). Matters are worse if we don't know the length of the list since we must then do 3n/2 pointers traversals -- n to determine the length of the list and n/2 to get the middle element. These figures do not include the pointer traversals for finding the middle in the recursive calls.
    1. Including all recursive calls, how many (in terms of n) total pointer traversals will be used to find middles if we never know how long a list is? What is this number when n is roughly 1 million (or 220)?
    2. Including all recursive calls, how many (in terms of n) total pointer traversals will be used to find middles if we always know how long a list is? What is this number when n is roughly 1 million (or 220)?
    3. Give a version of merge sort for a linked list that uses n pointer traversals to determine the length of the original list and then knows (without following pointers) the lengths of all smaller lists used in the algorithm. How many total pointer traversals will this algorithm use to find middles (both in terms of n and for n = 1 million)?


  3. CLR Problem 10-1, page 193.

  4. Suppose we want to print the ith smallest and the jth smallest from a set of elements. One trivial solution is:
    quick-select2(array, i, j, low, high) =
      print quick-select(array, i, low, high);
      print quick-select(array, j, low, high);
    
    Write a more clever version of quick-select2 that does not separate the two problems until i and j are separated by a partition.

  5. Suppose radix sort is expected to take time (c1*n*d) where c1 is a constant, n is the number of elements to be sorted, and d is the number of "digits" in each element. Suppose insertion sort is expected to take time (c2*m2) where c2 is a constant and m is the number of elements to be sorted. Assume the elements to be sorted are distributed uniformly across the set of possible keys. Let b be the number of possible values for a particular "digit".
    1. Show how to sort n elements in expected time (c1*n*w + c2*(n/bw)2*bw) for any value of w between 0 and d.
    2. Explain why your method needs the assumption of uniform distribution in order for the expected time to be true.
    3. Let c1 = 100, c2=1, b=10, and n=100,000. Give the running time of your algorithm for each value of w between 0 and 5. Conclude what value for w would be best for these parameters.
    4. In the previous part, why is the formula actually misleading when w=5? Does this change the answer for the best value for w?


  6. [Sedgewick, Exercise 15.19, page 622] Write a program that prints out all keys in a trie that have the same initial t bits as a given search key. Assume you are given a function digit(i,k) which returns the ith bit of the key k.