Discussion 10: Reverse Indexing with Hashing

Solutions

Exercise 1: Reverse Index by Hand

Forward map (courses → students)

Course	Enrolled Students (in order)
"CS 1110"	["s1111", "s1234", "s9001"]
"CS 2110"	["s1234", "s9001", "s1111", "s9001", "s2222"]
"CS 2800"	["s7777", "s1111", `null`, "s1234"]
"CS 5154"	["s9001", "s2222", "s7777"]

Reverse map (students → courses)

Student	Enrolled Courses (in order)
"s1111"	["CS 1110", "CS 2110", "CS 2800"]
"s1234"	["CS 1110", "CS 2110", "CS 2800"]
"s9001"	["CS 1110", "CS 2110", "CS 5154"]
"s2222"	["CS 2110", "CS 5154"]
"s7777"	["CS 2800", "CS 5154"]

Exercise 2: Programming the Reverse Index

(a)

buildReverseIndex() Constructor

Most of the work of the EnrollmentIndex class is done by its constructor, which initializes both its forward and reverse indices to satisfy their class invariants. Complete the definition of this constructor to build these indices in one pass over the rosters map. It may help to reference the Map interface documentation to remind yourself what operations are available.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29


/**
 * Constructs the forward and reveres enrollment indices based on the given `rosters` that
 * associating each course with its list of enrolled students (possibly containing duplicate
 * and/or null values).
 */
public EnrollmentIndex(Map<Course, List<Student>> rosters){
  coursesToStudents = new LinkedHashMap<>();
  studentsToCourses = new LinkedHashMap<>();

  for (Course c : rosters.keySet()) {
    for (Student s : rosters.get(c)) {
      if (s == null) {
        continue;
      }
      if(!coursesToStudents.containsKey(c)) {
        coursesToStudents.put(c, new ArrayList<>());
      }
      if (!studentsToCourses.containsKey(s)) {
        studentsToCourses.put(s, new ArrayList<>());
      }
      List<Course> schedule = studentsToCourses.get(s);
      List<Student> roster = coursesToStudents.get(c);
      if (!schedule.contains(c)) {
        schedule.add(c);
        roster.add(s);
      }
    }
  }
}

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29


/**
 * Constructs the forward and reveres enrollment indices based on the given `rosters` that
 * associating each course with its list of enrolled students (possibly containing duplicate
 * and/or null values).
 */
public EnrollmentIndex(Map<Course, List<Student>> rosters){
  coursesToStudents = new LinkedHashMap<>();
  studentsToCourses = new LinkedHashMap<>();

  for (Course c : rosters.keySet()) {
    for (Student s : rosters.get(c)) {
      if (s == null) {
        continue;
      }
      if(!coursesToStudents.containsKey(c)) {
        coursesToStudents.put(c, new ArrayList<>());
      }
      if (!studentsToCourses.containsKey(s)) {
        studentsToCourses.put(s, new ArrayList<>());
      }
      List<Course> schedule = studentsToCourses.get(s);
      List<Student> roster = coursesToStudents.get(c);
      if (!schedule.contains(c)) {
        schedule.add(c);
        roster.add(s);
      }
    }
  }
}

(b)

Index Query Methods

Complete the definitions of the isEnrolled(), courseCountOf(), and enrollmentOf() methods, which run small queries against the indices that you built. Determine which index (forward or reverse) to use to get the best performance. Make sure that you carefully consider the case that a student or course may not be present in the index to avoid propagating any exceptions.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20


/** Returns whether the given student `s` is enrolled in the given course `c`. */
public boolean isEnrolled(Student s, Course c){
  return coursesToStudents.containsKey(c) && coursesToStudents.get(c).contains(s);
}

/** Returns the number of courses in which `s` is enrolled. */
public int courseCountOf(Student s){
  if (!studentsToCourses.containsKey(s)) {
    return 0;
  }
  return studentsToCourses.get(s).size();
}

/** Returns the number of students enrolled in course `c`. */
public int enrollmentOf(Course c){
  if (!coursesToStudents.containsKey(c)) {
    return 0;
  }
  return coursesToStudents.get(c).size();
}

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20


/** Returns whether the given student `s` is enrolled in the given course `c`. */
public boolean isEnrolled(Student s, Course c){
  return coursesToStudents.containsKey(c) && coursesToStudents.get(c).contains(s);
}

/** Returns the number of courses in which `s` is enrolled. */
public int courseCountOf(Student s){
  if (!studentsToCourses.containsKey(s)) {
    return 0;
  }
  return studentsToCourses.get(s).size();
}

/** Returns the number of students enrolled in course `c`. */
public int enrollmentOf(Course c){
  if (!coursesToStudents.containsKey(c)) {
    return 0;
  }
  return coursesToStudents.get(c).size();
}

(c)

Runtime Analysis

Determine the runtimes of your implementations of the methods from part (b) in terms of \(|S|\), the total number of students, and \(|C|\), the total number of courses.

If you only had the forward index coursesToStudents, which of these operations would have a strictly worse runtime? Explain your answer.

In the isEnrolled() method, the calls to containsKey() and get() on the coursesToStudents (forward) map require expected \(O(1)\) time. The contains() call on the List returned by get() performs a linear search on this list, so has time linear in its length. In the worse case (where every single student is enrolled in one course), this has runtime \(O(|S|)\).

In the courseCountOf() method, the calls to containsKey() and get() on the studentsToCourses (reverse) map require expected \(O(1)\) time. In addition, the size() call on the List returned by get() requires \(O(1)\) time, for an overall \(O(1)\) runtime.

In the enrollmentOf() method, the calls to containsKey() and get() on the coursesToStudents (forward) map require expected \(O(1)\) time. In addition, the size() call on the List returned by get() requires \(O(1)\) time, for an overall \(O(1)\) runtime.

If we only had the forward index, we would not be abe to efficiently implement the courseCountOf() method. To determine which courses student s was enrolled in, we’d need to iterate over all of the keys in the coursesToStudents map and check whether its roster contains s.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10


/** Returns the number of courses in which `s` is enrolled. */
  public int courseCountOf(Student s) {
  int count = 0;
  for (Course c : coursesToStudents.keySet()) {
    if (coursesToStudents.get(c).contains(s)) {
      count++;
    }
  }
  return count;
}

 1
 2
 3
 4
 5
 6
 7
 8
 9
10


/** Returns the number of courses in which `s` is enrolled. */
  public int courseCountOf(Student s) {
  int count = 0;
  for (Course c : coursesToStudents.keySet()) {
    if (coursesToStudents.get(c).contains(s)) {
      count++;
    }
  }
  return count;
}

This loop runs for \(O(|C|)\) iterations. In each iteration, the contains() call requires an \(O(|S|)\) linear search of the student list, for an overall \(O(|C| \cdot |S|)\) runtime.

(d)

Iterator Methods

Complete the definitions of the students(), coursesOf(), and studentsIn() methods, which each return an Iterator. In the case of a student or course that is not present in the index, return Collections.emptyIterator(). You should not need to define any nested iterator classes to implement these methods.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29


/** 
 * Returns an iterator over all the students who are enrolled in at least one course, 
 * in the order that they first appeared in the rosters. 
 */
public Iterator<Student> students() {
  return studentsToCourses.keySet().iterator();
}

/** 
 * Returns an iterator over all the courses in which `s` is enrolled, listed in the 
 * order that they appeared in the rosters. 
 */
public Iterator<Course> coursesOf(Student s){
  if (!studentsToCourses.containsKey(s)) {
    return Collections.emptyIterator();
  }
  return studentsToCourses.get(s).iterator();
}

/** 
 * Returns an iterator over all the students who are enrolled in the given course `c`. 
 * Each student should be yielded at most once, and no null entries should be returned. 
 */
public Iterator<Student> studentsIn(Course c){
  if (!coursesToStudents.containsKey(c)) {
    return Collections.emptyIterator();
  }
  return coursesToStudents.get(c).iterator();
}

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29


/** 
 * Returns an iterator over all the students who are enrolled in at least one course, 
 * in the order that they first appeared in the rosters. 
 */
public Iterator<Student> students() {
  return studentsToCourses.keySet().iterator();
}

/** 
 * Returns an iterator over all the courses in which `s` is enrolled, listed in the 
 * order that they appeared in the rosters. 
 */
public Iterator<Course> coursesOf(Student s){
  if (!studentsToCourses.containsKey(s)) {
    return Collections.emptyIterator();
  }
  return studentsToCourses.get(s).iterator();
}

/** 
 * Returns an iterator over all the students who are enrolled in the given course `c`. 
 * Each student should be yielded at most once, and no null entries should be returned. 
 */
public Iterator<Student> studentsIn(Course c){
  if (!coursesToStudents.containsKey(c)) {
    return Collections.emptyIterator();
  }
  return coursesToStudents.get(c).iterator();
}