CS211 Assignment 4: Enumerations of tags and links

Due Tuesday, 2 October

Note

In class this week, we are discussing inner classes and their use to implement a window that closes and has two buttons. The code for this can be retrieved from the course website. Bring up the website, click on "Handouts", and read the information provided for the lectures of 25-27 September.
 
 

What to hand in

We will tell you later what to hand in.

Purpose of assignment

To get you to write two enumerations, in preparation for the next project, which will be to write a program that prints a list of all links reachable from a given web page. That will be neat!

Html tags

Html pages are filled with "tags" like <b> and </b> --these two tags say to begin using boldface and end printing boldface, respectively. Each tag begins with "<" and ends with ">".

If you have never seen an html page, then look at this one. Open this document in your favorite browser (from the CS211 course page). If it's Netscape, select menu View item Page source. If it's Internet Explorer, select menu View item Source. Notice that most tags come in pairs, e.g. <p> and </p>, <i> and </i>, <head> and </head>, <title> and </title>, <a href="..."> and </a>.

Links

A link is a URL that appears in a tag on an html page. If a tag contains one of the following, where there may be blanks on either side of = and where "xxx" stands for any sequence of characters, then xxx is a link:

href="xxx"
src="xxx"

We assume that a tag contains at most one link.
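
The definition of a link above can be sketched as a small pattern match. This is only an illustration under stated assumptions: class LinkFinder and method linkIn are hypothetical names that are not part of the assignment, matching is case-sensitive, and "xxx" is taken to end at the next double quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the assignment): finds the link in one tag. */
class LinkFinder {
    // "href" or "src", optional blanks on either side of '=', then the quoted link.
    // Assumption: the link text itself contains no double quote.
    private static final Pattern LINK =
        Pattern.compile("(?:href|src) *= *\"([^\"]*)\"");

    /** = the link in tag, or null if tag contains no link. */
    static String linkIn(String tag) {
        Matcher m = LINK.matcher(tag);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(linkIn("<a href = \"index.html\">"));  // index.html
        System.out.println(linkIn("<img src=\"logo.gif\">"));     // logo.gif
        System.out.println(linkIn("<b>"));                        // null
    }
}
```

Because a tag is assumed to contain at most one link, taking the first match is enough.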

Class TagEnumeration

Your first task is to write a class called TagEnumeration that implements interface java.util.Enumeration and that enumerates the tags in a BufferedReader. It should have a constructor with specification

        /** Constructor: an enumeration of the tags in br.    Precondition: br != null. */
        public TagEnumeration (BufferedReader br)

Since the class implements interface Enumeration, it has to implement two methods:

        /**  = "the buffered reader contains another tag to process" */
        public boolean hasMoreElements()

        /** = the next tag in the buffered reader --as an instance of class String*/
        public Object nextElement()

Here are some comments on this class. Our solution uses three private variables:

       private BufferedReader br;  // The BufferedReader to be read
       private String line= "";    // Input that has been read but not yet processed
       private String tag;         // The next tag to return --the last tag formed
                                   // but not yet returned (null if no more)

The definitions of these variables imply that the constructor should look for the next tag and store it (or null) in variable tag. This same process will have to be carried out elsewhere as well, so it makes sense to have the following kind of method in the class:

        /** Store the next tag in 'tag', if it exists.
            Set tag to null if it does not exist. */
        private void getReadyForNext()
 

Method hasMoreElements should always be called first, and calls to hasMoreElements and nextElement are expected to alternate. Before hasMoreElements can return true or false, it has to find the next tag (if there is one) and store it in field tag (or store null there if there are no more tags). To look for the next tag, find the next "<" in the input and then look for the next ">". If found, the tag is everything between and including the "<" and ">".

Note that "<" and ">" are part of the tag and should be in the String that is returned by method nextElement.

You should check out class TagEnumeration completely before proceeding to the next part. Use any .html or .htm page as the input, create a BufferedReader for it, create a TagEnumeration for the BufferedReader, and use it to print all the tags on the Java console.
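
A test driver for this can be quite small. The sketch below is one possible shape, not a required design: class EnumerationDemo and method listAll are hypothetical names, the input comes from a string rather than a file, and a StringTokenizer stands in for the enumeration so that the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Enumeration;
import java.util.StringTokenizer;

/** Hypothetical test driver: lists whatever an Enumeration produces. */
class EnumerationDemo {
    /** = the elements of e, one per line. */
    static String listAll(Enumeration<?> e) {
        StringBuilder sb = new StringBuilder();
        while (e.hasMoreElements()) {
            sb.append(e.nextElement()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // For a real page, one would wrap a file instead, e.g.
        //   new BufferedReader(new FileReader("page.html"))
        BufferedReader br = new BufferedReader(new StringReader("<p> <b> </b> </p>"));

        // Stand-in only. With your class written, this line would become:
        //   Enumeration e = new TagEnumeration(br);
        Enumeration<?> e = new StringTokenizer(br.readLine());

        System.out.print(listAll(e));
    }
}
```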

Note that you shouldn't have to use loops that sequence over the characters of String variable line. Instead, use methods that exist in class String for finding the next '<' or the next '>'.
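
As an illustration of the String methods meant here, the sketch below finds the first complete tag in a single string using indexOf and substring. Class TagFinder and method firstTag are hypothetical names, and the sketch deliberately ignores a tag that is split across input lines, which variable line would have to account for.

```java
/** Hypothetical helper (not part of the assignment): finds a tag in one string. */
class TagFinder {
    /** = the first complete tag in s, including its "<" and ">",
          or null if s contains no complete tag. */
    static String firstTag(String s) {
        int open = s.indexOf('<');             // position of the next "<", or -1
        if (open < 0) return null;
        int close = s.indexOf('>', open + 1);  // position of the next ">" after it, or -1
        if (close < 0) return null;
        return s.substring(open, close + 1);   // the end index is exclusive, hence + 1
    }

    public static void main(String[] args) {
        System.out.println(firstTag("Some <b>bold</b> text"));  // <b>
        System.out.println(firstTag("no tags here"));           // null
    }
}
```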

Class LinkEnumeration

Your second task is to write a class called LinkEnumeration that implements interface java.util.Enumeration and that enumerates the links in a BufferedReader. It should have a constructor with specification

        /** Constructor: an enumeration of the links in br.    Precondition: br != null. */
        public LinkEnumeration (BufferedReader br)

Since the class implements interface Enumeration, it has to implement two methods:

        /**  = "the buffered reader contains another link to process" */
        public boolean hasMoreElements()

        /** = the next link in the buffered reader --as an instance of class String*/
        public Object nextElement()

Here are some comments on class LinkEnumeration.

Method hasMoreElements should be called first, and calls to hasMoreElements and nextElement are expected to alternate.

Since links appear within tags, this class should create a TagEnumeration te (say) for the BufferedReader and enumerate its tags. Thus, method hasMoreElements can simply repeatedly get tags from te until it finds one that contains a link; then, it can store that link in a private variable. Method nextElement can then return the value of that variable --this is similar to how TagEnumeration works.
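
The look-ahead idea in this paragraph can be sketched as follows. This is an illustration under assumptions, not the expected solution: class LinkFilter is a hypothetical name, its constructor takes any enumeration of tags instead of a BufferedReader so that the example runs without a TagEnumeration, and the pattern takes "xxx" to end at the next double quote.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: enumerates the links found in an enumeration of tags. */
class LinkFilter implements Enumeration<Object> {
    // "href" or "src", optional blanks on either side of '=', then the quoted link.
    private static final Pattern LINK =
        Pattern.compile("(?:href|src) *= *\"([^\"]*)\"");

    private final Enumeration<?> te;  // source of tags (stand-in for a TagEnumeration)
    private String link;              // link found but not yet returned (null if none pending)

    LinkFilter(Enumeration<?> te) {
        this.te = te;
    }

    /** = "there is another link". Gets tags from te until one contains a link. */
    public boolean hasMoreElements() {
        while (link == null && te.hasMoreElements()) {
            Matcher m = LINK.matcher((String) te.nextElement());
            if (m.find()) link = m.group(1);
        }
        return link != null;
    }

    /** = the next link, as an instance of class String. */
    public Object nextElement() {
        if (!hasMoreElements()) throw new NoSuchElementException();
        String result = link;
        link = null;  // forces the next call of hasMoreElements to search again
        return result;
    }

    public static void main(String[] args) {
        Enumeration<String> tags = Collections.enumeration(Arrays.asList(
            "<p>", "<a href=\"a.html\">", "</a>", "<img src = \"b.gif\">"));
        LinkFilter links = new LinkFilter(tags);
        while (links.hasMoreElements()) {
            System.out.println(links.nextElement());  // a.html, then b.gif
        }
    }
}
```

Searching only while link is null keeps a repeated call of hasMoreElements from skipping a link that has not yet been returned.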

You may assume that a tag contains at most one link.

You should check out class LinkEnumeration the way you did class TagEnumeration. Use any .html or .htm page as the input, create a BufferedReader for it, create a LinkEnumeration for the BufferedReader, and use it to print all the links on the Java console.