Discussion 3 handout
Group members (names & NetIDs)
Objectives
- Explore ADTs [Abstract Data Types, another name for the interfaces for various Datastructures like Bags/Lists/Stacks/Queues that we’ve seen and Data Structures (classes) in the context of Java’s Collections library
- Select and compose ADTs [Also gives you practice on Generics]
- Evaluate implementations of ADTs for efficiency in a particular use case
- Some more practice on evaluating efficiency in general and notation.
- Some time dedicated to A3.
Prepration: Demo code and example data
Please download dis03.zip, extract it to a known location on your computer, and open it as a project in IDEA.
Task1: Text file index
Your goal is to create a reverse index of a text file. For each line in the file, you want to keep track of the distinct words that occur on that line. Then using this index, you can report which lines a given word occurs on.
This problem can be solved with fairly little code by leveraging Java’s built-in implementations of appropriate ADTs. These are essentially, Java’s standard implementations for some common Datastructures. Some of which we’ve discussed over the past few lectures. By solving the problem in terms of ADT operations (rather than jumping straight to code), it is easier to verify that your solution will work, and the same algorithm can readily be implemented in multiple languages.
Identify ADTs
The primary operation of our index is to report which distinct words appear on a given line. What would be an appropriate return type for this query?
Its secondary operation is to report which line numbers a given word occurs on. What would be an appropriate return type for this query?
Select a combination of ADTs (from those supported by Java’s collections framework) that would be suitable for storing the information needed to create the index. Specify their generic types.
Construct the index
In “Index.java”, declare fields of the types you selected above. Initialize these fields by constructing instances of appropriate classes from the Java collections framework.
Implement the constructor for Index
to populate these fields.
Query the index
Read the specification for wordsOneLine()
, then implement it using your fields. Note the word “creates” in the spec. Depending on your choice of fields, it might be possible to return an existing collection from your index’s state instead of creating a new one; why do you think creating a new collection is important?
Read the specification for linesWithWord()
, then implement it using your fields.
Look at main()
, which constructs an index for the included file “Austen.txt”. Add additional code to answer the following questions about the resulting index and check your results with a neighboring group or with a consultant:
-
How many dictinct words appear on line 18?
-
Does the word “young” appear on line 13?
-
How many different lines does the word “with” appear on?
Task2: Notes on efficiency
Take a look at the java snippet of code below. It’s runtime complexity is O(n)
. That is it scales linearly with the size of the input n.
for (int i = 0; i < n; i++) {
// Do some constant amount of work ....
}
Take a look at another java snippet of code below. It’s runtime complexity is O(n^2)
. That is it scales quadratically with the size of the input n.
for (int i = 0; i < n; i++) {
// Note that because of how j is initialized, this code runs a little less
// than n^2 times. But that's fine because we simply want to find an upper bound
// on how this code behaves as we scale the input size n.
for (int j = i + 1; j < n; j++) {
// Do some constant amount of work ....
}
}
What is the O() (also called Big-O) complixity of the snippet of code below.
for (int i = 0; i < n; i++) {
for (int j = i + 1; j < n; j++) {
for (int k = j; k < n; k++) {
for (int l = 0; l < 10; l++) {
// Do some constant amount of work ....
}
}
}
}
What is the O() (also called Big-O) complixity of the snippet of your linesWithWord()
method from Task 1.
Task3: Work on A3
This remaining time is just meant for you to work on A3. At the end of class, in the deliverable for this discussion, just turn in a brief description of what part of A3 you worked on.
Submission
Open the assignment page for “Discussion activity 3” in CMSX and turn in your solution in one document. Note that this is graded for participation and not credit.