CS211          Homework 4

Due: Thursday, Oct. 26, at the beginning of class

 

Watch this space for later updates.

[10/25] Having problems importing HashSet or TreeSet on HW4? See the newsgroup for help.

[10/22] In the example for problem 2, the correct output should be "30 22" instead of "30 23". This mistake has been corrected.

[10/22] In problem 3e, a semicolon was missing and has been corrected.

 

Purpose

 

There are three sub-problems in this homework : (1) and (2) require coding; (3) needs no coding at all.  Problem (1) is designed to encourage use of the Java API Documentation to find out what you need while coding for yourself.  Problem (2) provides practice on BSTs and on a form of tree traversal. The last problem is here so that you can practice with the ideas of runtime analysis.

 

What to hand in:

We plan to grade the homework based on your electronic submission.  The hardcopy is required mainly for backup.

 

Problem (1)

 

Write a program that reads in words from a text file and then prints out both the number of words and the number of unique words.

 

Input:

 

You can assume that the input file is a text file consisting of only letters (a-z, A-Z), commas(,), periods(.), blanks, tab characters and newline characters. A word is defined in usual sense and consists of only letters. Make sure your program ignores case; that is, "The", "the", "THE" and "tHE" are all considered to be the same word.

 

Output:

 

Output will be two numbers:

- The first is the total number of words in the file.

- The second is the number of unique words in the file. For this second number, even if a word appears multiple times, it's only counted once.

Example:

 

For the following input file,

 

The Test.   Each sentence HAS words,    which are

    separated  by Blanks or... each word consists

of    letters. If ...  a program that     tests

if thE sentence has...    , wORd that is A ..

 

output should be

 

30 22

 

Exceptions:

 

For this problem, you don't have to deal with exception handling.

 

Provided code:

 

The main() method in WordCounter.java includes code for opening a file. It opens a dialog box for a user to choose a file to read, and then creates textFile, a File object associated with the selected file. You are to complete the remaining part of main(). You are allowed to define additional data or methods if needed, although it seems unnecessary.

 

Hints:

 

There are several built-in Java classes that can make your job easier and the program pretty short. It is strongly recommended that you first browse the Java API, particularly the packages java.lang, java.io, and java.util, to find out what could be helpful for coding this problem. For example, look for BufferedReader and related classes such as FileReader for reading lines from a File. Using either HashSet or TreeSet makes for a convenient way to implement a set (a set has no duplicate elements; see Set as well.)  How can you break a string into tokens (i.e., words)?  How can you easily compare two words (see String) ignoring case?  Not every detail is explained here. Figure it out yourself!

 

Click here for on-line Java API Documentation

 

 

Problem (2)

 

A Binary Search Tree (BST) is a binary tree for which each node has the BST property: if the node contains data x then all data in the left subtree are less than x and all data in the right subtree are greater than x. Here we assume that no two nodes contain the same data. In general, BSTs support operations such as search, add and remove. For this assignment however, your BST will need only the add operation. Note that this a standard BST; you are not expected to maintain a balanced tree.

 

Below is the outline of class BinarySearchTree that you will complete.

 

public class BinarySearchTree
{

    static class Node {
        String data; // data field
        Node lchild; // link to left child
        Node rchild; // link to right child

        Node (String data) {
            this.data = data;
            lchild = null;
            rchild = null;
        }
    }

 

    private Node root; // root node of tree

 

    BinarySearchTree() { root = null; } // create an empty BST

    public void add (String data) { ... } // add to BST a new node holding data
    public String toString () { ... } // returns a string representation of BST

}

 

The Node class represents a node in the tree. It is defined as a static nested class; thus, classes outside of BinarySearchTree can refer to this class as BinarySearchTree.Node.  Within BinarySearchTree, the class can be referred to simply as Node. The field root is a reference to the root node of the tree (null if the tree is empty). Constructor BinarySearchTree() initializes a new empty BST.  Your job is to write the methods add() and toString().

 

The method add(String data) will add to the BST a new node containing data. Recall that we don't allow multiple instances of the same data, so add(data) should not change the BST if it already has a node containing data.

 

The method toString() is supposed to return a tree-structured string that shows the contents of the BST. As an example, suppose strings "jan", "feb", "mar", "apr", "may" and "jun" are added to an empty tree. If the strings are added in the order listed then the resulting tree should look like this:

      jan

     /   \

   feb   mar

  /     /   \

apr   jun   may

 

Your method toString() should return a single string that looks like this:

 

        may

    mar

        jun

jan

    feb

        apr

 

This is a picture of the BST, but it's drawn sideways. Each node occupies a single line. The root node is on the far left. The left subtree of root node is found by moving down, and the right subtree is found by moving up. The level of each node is indicated by the number of blanks that appear before it. As shown above, there should be (4 * k) blanks before a node at level k (the level of the root is 0; its children are at level 1; etc.)

 

Hints:

 

The add() operation can be written either iteratively or recursively.  The toString() method is most easily written recursively. Basically, you need to do something very similar to an inorder tree-traversal.  One way to write the recursive part is as a static method that takes 2 arguments: a Node (indicating the subtree to be printed) and a String (indicating how much indentation is needed for this subtree).  You can use "\n" within a String to indicate a newline.

 

Provided code:

 

BinarySearchTree.java - You must complete the methods add() and toString().

 

 

 

Problem (3)

 

For each of following code segments, determine its running time (as a Big-O bound) in terms of n. You should find the best possible bound.

 

(a)

 

for (i = 0; i < n; i++)

    for (j = 0; j < n; j++)

        sum += j;

 

(b)

 

for (i = 0; i < n; i++)

    a[i] = 0;

for (j = 0; j < n; j++) {

    a[j] = j;

    sum += j * j;

}

 

(c)

 

for (i = 0; i < n; i += 2)

    for (j = 0; j < n*n; j++)

        sum += (j - i);

 

(d)

 

for (i = n - 1; i >= 1; i--)

    for (j = 0; j < i; j++)

        if (a[j] > a[j + 1]) {

            temp = a[j];

            a[j] = a[j + 1];

            a[j + 1] = temp;

        }

 

(e)

 

// assume that link is an integer array and each element link[i] has one of -1, 0, ..., n-1.

 

for (i = 0; i < n; i++)

    if (link[i] != -1) {

        j = i;

        while (link[j] != -1) {

            temp = link[j];

            link[j] = -1;

            j = temp;

        }

    }