1. (a) O(n log n): Sort the numbers using mergesort, which takes O(n log n) time; output the i largest, which takes O(i) <= O(n) time.

   (b) O(n + i log n): Build a heap of the n elements, which takes O(n) time, then extract-max i times, which takes O(i log n) time.

   (c) O(n + i log i): Find the i-th largest number using linear-time selection and partition around it to isolate the i largest, which takes O(n) time; then sort those i numbers, which takes O(i log i) time.

2. Search on the hash value first; compare keys only if the hash values match. Hash values are typically numbers and can be compared in one step, whereas the time to compare strings is proportional to the length of the strings in the worst case. This saves a long string comparison whenever the hash values don't match.

3. Linear probing:

       index:  0   1   2   3   4   5   6   7   8   9  10
       key:   22  88   -   -   4  15  28  17  59  31  10

   Quadratic probing with c1 = 1 and c2 = 3, i.e. h(k,i) = (k + i + 3i^2) mod 11:

       index:  0   1   2   3   4   5   6   7   8   9  10
       key:   22   -  88  17   4   -  28  59  15  31  10

   Double hashing with h2(k) = 1 + (k mod 10), i.e. h(k,i) = (k + i + i(k mod 10)) mod 11:

       index:  0   1   2   3   4   5   6   7   8   9  10
       key:   22   -  59  17   4  15  28  88   -  31  10

   (A sketch that reproduces these tables appears after problem 5.)

4. Let A be the adjacency matrix of the graph. Note that there can be at most one sink, since if there were two, say u and v, then A(u,v)=0 (since u is a sink) but A(u,v)=1 (since v is a sink), a contradiction. We will maintain a current guess as to which vertex is the sink. Pick an arbitrary vertex as the first guess and put the rest of the vertices in a bag. Now while the bag is nonempty, draw a vertex out (say v) and check it against the current guess (say u). If A(u,v)=0, then discard v, since it cannot be the sink; the current guess is still u. If A(u,v)=1, then discard u, since it cannot be the sink; make v the new current guess. When the bag is empty, the current guess (say w) is the only possible sink; check whether it is a sink by checking that A(w,v)=0 for all v and A(v,w)=1 for all v except w. The entire algorithm makes at most 3n-2 probes of the adjacency matrix. (A sketch appears after problem 5.)

5. Solution 1. Divide the computation into log m stages. In each stage, merge the sorted lists pairwise, using the merge routine of mergesort. Merging two lists takes time proportional to the sum of the lengths of the lists, and each of the n elements appears in exactly one list, so the total time to merge all pairs in each stage is O(n). There are log m stages, since we start with m lists and each stage halves the number of lists. Therefore the total time is O(n log m).

   Solution 2. Create a heap whose elements are the m sorted lists, using their first (minimum) elements as keys. This takes time O(m). Now repeatedly perform extract-min on the heap, giving the list whose first element is minimum among all remaining elements; remove the first element from this list and place it on an output queue, then reinsert the remainder of the list back into the heap. Each extract-min costs O(log m) since there are at most m lists in the heap, and it is performed n times. Therefore the total time is O(n log m). (This solution is sketched below.)
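   To double-check problem 3, the probe sequences can be run mechanically. The following is a minimal Python sketch; the insertion order 10, 22, 31, 4, 15, 28, 17, 88, 59 and the helper name insert_all are assumptions made for illustration (the solution above does not state the order), but this order reproduces exactly the tables shown.

       M = 11
       KEYS = [10, 22, 31, 4, 15, 28, 17, 88, 59]   # assumed insertion order

       def insert_all(probe):
           """Insert every key using open addressing with probe function probe(k, i)."""
           table = [None] * M
           for k in KEYS:
               for i in range(M):
                   slot = probe(k, i) % M
                   if table[slot] is None:
                       table[slot] = k
                       break
           return table

       linear    = insert_all(lambda k, i: k + i)                 # h(k,i) = (k + i) mod 11
       quadratic = insert_all(lambda k, i: k + i + 3 * i * i)     # c1 = 1, c2 = 3
       double    = insert_all(lambda k, i: k + i * (1 + k % 10))  # h2(k) = 1 + (k mod 10)

       for name, t in [("linear", linear), ("quadratic", quadratic), ("double", double)]:
           print(name, t)   # None marks an empty slot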
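   For problem 4, here is a sketch of the elimination scan, assuming the vertices are numbered 0..n-1 and A is an n x n 0/1 matrix; the function name find_sink is illustrative.

       def find_sink(A):
           """Return the universal sink of adjacency matrix A, or None if there is none.
           Makes at most 3n - 2 probes of A."""
           n = len(A)
           u = 0                          # current guess
           for v in range(1, n):          # the "bag": n - 1 probes
               if A[u][v]:                # u has an outgoing edge, so u is not the sink
                   u = v
           # Verify the survivor: row u must be all 0, column u all 1 (except A[u][u]).
           if any(A[u][v] for v in range(n)):
               return None
           if any(not A[v][u] for v in range(n) if v != u):
               return None
           return u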
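   For solution 2 of problem 5, here is a sketch using Python's heapq module; each heap entry carries the index of its source list, so ties on keys are broken by the index and the lists themselves are never compared.

       import heapq

       def merge_sorted_lists(lists):
           """Merge m sorted lists containing n elements in all, in O(n log m) time."""
           heap = [(lst[0], j, 0) for j, lst in enumerate(lists) if lst]  # (key, list index, position)
           heapq.heapify(heap)                                            # O(m)
           out = []
           while heap:
               key, j, pos = heapq.heappop(heap)          # extract-min: O(log m)
               out.append(key)
               if pos + 1 < len(lists[j]):                # reinsert the rest of list j
                   heapq.heappush(heap, (lists[j][pos + 1], j, pos + 1))
           return out

       print(merge_sorted_lists([[1, 4, 7], [2, 5], [3, 6, 8]]))   # [1, 2, 3, 4, 5, 6, 7, 8]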
6. For each hash value s, we could keep all the elements in the hash table with hash value s in a balanced tree, such as a 2-3 tree or red-black tree. The hash table would contain a pointer to the tree in location s. To access an element, we would compute the hash value s, then search the tree for the given key. The worst-case time for search, insert, or delete is O(log n), where n is the number of elements in the hash table.

   We don't really care about the worst-case time, because under reasonable assumptions on the distribution of inputs, the expected number of collisions for hash value s is much less than n, namely O(1 + n/m), where m is the size of the table and n is the number of elements inserted so far. For n = O(m), this is constant. So on average it is not worth the extra overhead of the balanced-tree scheme; a simple linked list suffices.

7. (a) Adjacency matrix: exist(u,v): O(1) (check whether the (u,v) entry is 1); insert(u,v): O(1) (set the (u,v) entry to 1); delete(u,v): O(1) (set the (u,v) entry to 0); out(u): O(n) (scan row u); in(u): O(n) (scan column u).

   (b) Adjacency list: exist(u,v): O(min(outdegree(u), indegree(v))) (search the shorter of the out-list of u and the in-list of v); insert(u,v): O(min(outdegree(u), indegree(v))) (insert the edge (u,v) at the front of the out-list of u and the in-list of v after checking that the edge is not already there); delete(u,v): O(min(outdegree(u), indegree(v))) (find the edge as in exist(u,v), then unlink it from the out-list of u and the in-list of v); out(u): O(outdegree(u)) (traverse the out-list of u); in(u): O(indegree(u)) (traverse the in-list of u).

   (c) Use an adjacency matrix, but let the elements of the matrix be list elements that can be linked together instead of just boolean values. For each u, link all the nonzero elements of row u together to form the out-list of u, and link all the nonzero elements of column u together to form the in-list of u. Then exist(u,v), insert(u,v), and delete(u,v) are O(1) (list elements can be linked and unlinked in constant time), and in(u) and out(u) are O(indegree(u)) and O(outdegree(u)), respectively, by traversing the in-list or out-list of u. (A sketch appears after problem 8.)

8. Let H be the set of all functions h: {0,1,...,n-1} -> {0,1,...,m-1}. To show that H is universal, we must show that for all x, y in {0,1,...,n-1} with x != y, |{h in H | h(x) = h(y)}| = |H|/m. There are m^n elements of H, since there are m ways to choose the value of h(0), m ways to choose the value of h(1), and so on, and all these choices are independent; therefore the right-hand side is m^(n-1). But for any x, y with x != y, there are also m^(n-1) elements of {h in H | h(x) = h(y)}: we can choose the value of h(z) for each z != y (including x) in m possible ways, all these choices being independent, and thereafter there is exactly one way to choose the value of h(y), since it must equal h(x).

   H is an impractical choice for universal hashing because there is no succinct way to represent the elements of H so that one can be chosen at random. To represent an arbitrary element of H in a single coding scheme, such as an input/output table, would require log(m^n) = n log m bits, where n is the number of keys.
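   For problem 7(c), here is a sketch of the matrix-of-list-nodes representation; the class and field names (EdgeNode, Digraph, out_head, in_head) are illustrative, not taken from the solution above. Each present edge owns one node that sits in its matrix cell and is doubly linked into both the out-list of its tail and the in-list of its head, so exist/insert/delete are O(1) while out(u) and in(v) are output-sensitive.

       class EdgeNode:
           """One nonzero cell of the matrix, linked into two doubly linked lists."""
           __slots__ = ("u", "v", "out_prev", "out_next", "in_prev", "in_next")

           def __init__(self, u, v):
               self.u, self.v = u, v
               self.out_prev = self.out_next = None   # links along row u (out-list of u)
               self.in_prev = self.in_next = None     # links along column v (in-list of v)

       class Digraph:
           def __init__(self, n):
               self.cell = [[None] * n for _ in range(n)]   # the adjacency "matrix"
               self.out_head = [None] * n                   # head of each row's out-list
               self.in_head = [None] * n                    # head of each column's in-list

           def exist(self, u, v):                           # O(1)
               return self.cell[u][v] is not None

           def insert(self, u, v):                          # O(1)
               if self.cell[u][v] is not None:
                   return
               e = EdgeNode(u, v)
               e.out_next = self.out_head[u]                # link at front of out-list of u
               if e.out_next is not None:
                   e.out_next.out_prev = e
               self.out_head[u] = e
               e.in_next = self.in_head[v]                  # link at front of in-list of v
               if e.in_next is not None:
                   e.in_next.in_prev = e
               self.in_head[v] = e
               self.cell[u][v] = e

           def delete(self, u, v):                          # O(1): unlink from both lists
               e = self.cell[u][v]
               if e is None:
                   return
               if e.out_prev is not None:
                   e.out_prev.out_next = e.out_next
               else:
                   self.out_head[u] = e.out_next
               if e.out_next is not None:
                   e.out_next.out_prev = e.out_prev
               if e.in_prev is not None:
                   e.in_prev.in_next = e.in_next
               else:
                   self.in_head[v] = e.in_next
               if e.in_next is not None:
                   e.in_next.in_prev = e.in_prev
               self.cell[u][v] = None

           def out(self, u):                                # O(outdegree(u))
               e, result = self.out_head[u], []
               while e is not None:
                   result.append(e.v)
                   e = e.out_next
               return result

           def in_(self, v):                                # O(indegree(v))
               e, result = self.in_head[v], []
               while e is not None:
                   result.append(e.u)
                   e = e.in_next
               return result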
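   The counting argument in problem 8 can be checked by brute force for small parameters: enumerate all m^n functions as tuples of values and count, for each pair x != y, how many of them collide on that pair. A minimal sketch (the values n = 4, m = 3 are just an example):

       from itertools import product

       def collisions(n, m, x, y):
           """Number of functions h: {0..n-1} -> {0..m-1} with h(x) == h(y)."""
           return sum(1 for h in product(range(m), repeat=n) if h[x] == h[y])

       n, m = 4, 3
       for x in range(n):
           for y in range(n):
               if x != y:
                   assert collisions(n, m, x, y) == m ** (n - 1)   # = |H| / m
       print("each pair collides under exactly", m ** (n - 1), "of the", m ** n, "functions")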