Computer Science 409: Homework 3
Where do Most Students Live

Homework 3: Handed out: 2/8/01; Due: 2/22/01

Note: This assignment involves programming, which will have to be done using J++ in the CSUGLab. Even if you originally use another variant of Java for your program, make sure it runs in the CSUGLab.

As usual, you can discuss this assignment with anyone you like, but you must write it up yourself.

Purpose

The purpose of this assignment is to use hash tables and heaps to solve some interesting problems efficiently. It is also to emphasize that although space is often plentiful it should not be wasted without reason. Other goals include learning to use standard libraries and writing polymorphic data structures (polymorphic is defined below).

The Fun Part

The following two files contain the entire student directories for Cornell University and Ithaca College, respectively:

We want to know which addresses have the most Cornell students living at them. This should be done in time O(n log m) assuming hashing behaves, where n is the number of students and m is number of most-common-addresses asked for. We also want to know what addresses have both Cornell students and Ithaca students living at them. This should be done in time O(n) where n is the number of total students, assuming hashing behaves. Space requirements are discussed in more detail below.

Overview

You have to implement the class Heap and write 2 static methods in the provided class called BigHouses.

The first two methods take 3 arguments: the name of the file to be used as input, a value for m (the number of most-common-addresses desired), and a PrintWriter to which output should be sent. The last method takes 3 arguments: the two input file names, and a PrintWriter object.

You should use the Hashtable class provided by the java.util library. Your Heap class should be polymorphic over different kinds of objects that have integer keys. That means your Heap class should be able to store any instance of any class that implements the Heapable interface. Your Heap is being used as a priority queue and should implement the priorityQueueInterface.

Here are more specific requirements for each part:

Write-up

You need to write a 1 - 3 page lab report. Your report should contain at least the following:

Turn-in procedures

Turn in the report in class, also mail a copy of your code to pabo@cs.cornell.edu

Hints

We use Java's Vector to maintain a list of names, A Vector holds Objects, so you have to cast anything you take out of a Vector. The Hashtables are the same way, and your Heaps will be similar, so you will be doing quite a bit of casting.

You need to write a class that implements Heapable, and in your printBigHousesOnePass create instances of that class to be added to the Heap. Interface is like an abstract class (everything listed in an interface is abstract), but classes may implement multiple interfaces. If you're unfamiliar with interfaces, you can just think of the Heapable interface as an abstract class; the differences aren't relevant to this assignment. Of course, code using the Heap can only get Heapable out of Heap, you need to cast this to a particular kind of Heapable object so it can use the actual data in the object.

Use the heap to keep track of only the m addresses with the largest number of people (you need to put the min at the top of the Heap for this purpose) Then sort the m winners and print them out in reverse-sorted order

Note that algorithms for Heap given in class are written for array starting at index 1, and keeping the max at the top. You need to adjust this for this assignment.

Do not bother providing a buildHeap method in your Heap class; this is not an efficiency bottleneck for this assignment. You should, however, provide methods min() and extractMin() for efficiency reasons. You must also provide the method heapify(), for obvious reasons.

START EARLY!!!!!!

Input and Output

The input files are in a very simple format. There is one name or address per line. It alternates name, address, name, address, etc. The name corresponds to the address on the following line. The provided code has a method that takes a file name and returns a BufferedReader object. This object has a readLine() method which returns a String which is the next line of the file. It returns null after the end-of-file has been reached.

For output in your methods, all you have to do is call the println(String s) method of the Printwriter object passed to the methods.

The methods described above could throw an IOException. Hence the methods you are writing also declare that they may throw such an exception. You should not catch such exceptions.

If you are using a version of Java that doesn't have BufferedReader, it should be simple enough to write your own readLine() method for an input object.

Don't forget that your methods should work for any files in the format described above. Only the main method should explicitly mention the files being used.

We assume that two addresses are equal if and only if they are identical Strings. Hence the following are all different addresses as far as we're concerned:

Code We are Providing You

Grading

Style - 5%
Write-up - 30%
Code - 65%