13. Linked Data

In the previous lecture, we introduced abstract data types (ADTs) and data structures to model collections of elements with a common type. We focused on the list ADT (in particular, our bespoke CS2110List interface), which models a collection of linearly-ordered data with an arbitrary size that can be accessed by index. We also developed one implementation of the list ADT that is backed by a dynamic array data structure. This implementation has some great features. In particular, the random access guarantee of the backing storage array lets us access (get()) and modify (set()) the element at any index of the list in \(O(1)\) time.

Our dynamic array implementation also has some drawbacks. Notably, the contiguous memory layout of arrays means that insertions and deletions could require expensive \(O(N)\) data shifts, and the fixed capacity of arrays necessitated occasional \(O(N)\) resize operations. While we saw that we could amortize the complexity of the add() method over multiple calls, its \(O(N)\) worst-case time complexity left something to be desired.

In today’s lecture, we’ll consider another data storage strategy, linked data, and see how we can leverage this idea to develop another CS2110List implementation with different performance characteristics.

Linked Data

An array stores a collection of data in a centralized way. The single array object can directly access all its entries (by either directly storing the values of primitive entries or storing a reference to each of its entry objects). It is this centralization that enables an array’s powerful random access guarantee. However, this same centralization imposes the fixed-capacity constraint on arrays; an object’s memory must be allocated upon its construction, meaning an array must determine how much space to reserve for its entries upfront.

As an alternative to this centralized approach of an array, we can conceptualize a data structure with decentralized storage. No longer would any object “see” the entire collection of the data structure. Instead, different objects would each have a “localized” view of a small portion of the overall structure. Critically, these objects need to be connected (or linked) together in some way that allows us to navigate the entire structure and access all its data.

Decentralization offers a more flexible method of data storage at the trade-off of some additional complexity of navigating the structure. To offer a real-world analogy before we dive into the details of programming these decentralized structures, let’s consider the process of writing a reference book for some subject (for example, math or a subset of math such as geometry). When the subject was new and its foundational concepts were being developed, it was conceivable that one person could master it in its entirety. All the information could be written in a single book volume, and the index of that book would provide an efficient (binary search) way to find any definition, theorem, or example.

Soon, though, as more is discovered, the amount of information will exceed what can be reasonably printed in one book, and the work will need to be split into volumes. The last page of the first volume presents a reference to the next volume. If you want to master the entire subject, you’ll need to read the entire first volume, use the reference to track down a copy of the second volume, and continue your reading. As the subject develops further, more and more volumes will be needed. Each of the publications on its own will contain only a tiny fraction of what is known about the subject and provide references only to the closest related publications. Nevertheless, by following these references, it should (in theory) still be possible to navigate through all the publications to access any desired information about the subject.

Remark:

At some point, different branches of the subject will begin to diverge, so the reading will stop being nice and sequential and instead branch out into a tree or web of ideas. We'll talk about these other ways to link data soon, but for now we'll restrict our focus to linear (or sequential) linking.

As another complication to the analogy, references tend to be reverse chronological with the most recent work referencing the work that it builds upon, which cites the work that it builds upon, etc. Thus, when we're just starting out in the subject, we'll often need to follow a lot of references before we "link back" to a sufficiently foundational presentation that we can understand.

Modeling Linked Data

As our reference books analogy hopefully demonstrates, storage objects in a decentralized data structure have a dual responsibility. First, they are responsible for storing (or carrying) a portion of the data. Second, they are responsible for conveying how to access additional data beyond what they store. Let’s take this idea to its extreme, where each of these storage objects stores one data element. To link these objects together, we can imagine arranging them in a line and having each object store a reference to the next object in the line (and having the last object in the line simply remember the fact that it is the last object). Then, as long as we can access the first object in the line, we can follow these references to eventually reach any element. The data structure that we are describing is a linked chain, and each of the storage objects that compose it is a node.

Definition: Linked Chain, Node

A linked chain is a data structure that stores linearly-ordered data in a collection of nodes, where each node is an auxiliary object that stores one data element and a reference to one or more other nodes.

To create a linked chain data structure, we will need to define two classes, the class for the LinkedChain itself as well as the auxiliary (i.e., supplementary or supporting) class for the chain Nodes. Both of these classes will need to be generic on the type of data that the chain stores (which we’ll again represent with the generic type parameter T). Let’s start by considering the Node class.

class Node<T> { ... }

What fields should a Node have? First, it will need a field to store its data element (data) that will have the generic type T. Next, it will need a field to store a reference to the next Node (next). This field will have the generic type Node<T>, the same as the type we are defining. In this way, these Nodes are our first example of a recursive type.

class Node<T> { 
  /** The element stored in this node. */
  T data;

  /** A reference to the next node in the list. */
  Node<T> next;
}

We’ll need a way to model the end of the list, which we’ll do with a special Node object. This Node will have data == null and next == null (signaling that there isn’t a “next” node in the chain). Otherwise, all the earlier nodes must have data != null (they must store an element) and next != null (they must point to the next element in the chain).
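
To make this convention concrete, here is a minimal sketch that wires up a two-element chain by hand and walks it from the first node to the empty end node. The ChainDemo class, the two-argument Node constructor, and the toList() helper are assumptions made for illustration; they are not part of the classes defined in these notes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo class (not from the notes) that builds a chain by hand. */
class ChainDemo {
  /** Same fields as the Node class above; the constructor is an added convenience. */
  static class Node<T> {
    T data;
    Node<T> next;

    Node(T data, Node<T> next) {
      this.data = data;
      this.next = next;
    }
  }

  /** Collects the elements reachable from `head`, stopping at the empty end node. */
  static <T> List<T> toList(Node<T> head) {
    List<T> elements = new ArrayList<>();
    Node<T> current = head;
    while (current.next != null) { // only the end node has next == null
      elements.add(current.data);
      current = current.next;
    }
    return elements;
  }

  public static void main(String[] args) {
    Node<String> end = new Node<>(null, null);        // empty end node
    Node<String> banana = new Node<>("banana", end);  // last element
    Node<String> apple = new Node<>("apple", banana); // first element
    System.out.println(toList(apple)); // prints [apple, banana]
  }
}
```

Because the chain always ends in an empty node, the traversal never has to treat an empty chain as a special case: calling toList() on the end node alone simply returns an empty list.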

Remark:

Some references will prefer to have the last Node storing an element serve as the end of the list (with next == null). This removes the need for the final "empty" Node in the chain, but it adds complexity to operations on linked chains. I prefer this approach with "empty" Nodes for singly-linked chains. See Exercise 13.8.a for more information about the alternate approach.

Now that we have modeled the Node type, let’s model the LinkedChain. In its simplest implementation, all that the chain must store is a reference to its first node, which we’ll call the head node.

Definition: Head

We use the term head to refer to the first node in a linked chain.

public class LinkedChain<T> {
  /** The first node in this linked chain. */
  private Node<T> head;
}

Through the head, all the other Nodes can be accessed. When we use a linked chain to implement the CS2110List interface, we’ll use additional fields to improve the performance of our code.

Visualizing Linked Data

Next, let’s visualize the structure of a linked chain. Suppose that chain is a LinkedChain<String> object storing two elements, “apple” and “banana”, in that order. We can visualize this using our usual object diagram notation.

This visualization includes complete information about the structure of this LinkedChain, including all six objects in its representation. While such a complete representation is a nice reference to have, it becomes cumbersome to draw, even for modestly sized chains. For this reason, we will often use an abbreviated representation, which we’ll refer to as a node diagram that visualizes the underlying connectivity of the linked data but abstracts away many of the other visual details.

Definition: Node Diagram

A node diagram provides a simplified visualization of a linked structure.

We represent the nodes in a node diagram using rounded rectangles with vertical bars to delineate their data and pointers to (an)other node(s). Where it is practical (such as for primitive wrappers or Strings), we store the data directly within the node visualization instead of drawing a reference to a separate object. In this way, we can simplify the depiction of the “apple” node in the above diagram as follows.

Object Diagram

Node Diagram

Additionally, we omit the main LinkedChain object from the visualization and instead use labeled arrows to depict the node objects referenced by its fields. Thus, the node diagram for the linked chain is:

Encapsulation with Nested Classes

We just saw that using an auxiliary Node class provides a natural representation of a linked chain. However, Nodes are an implementation detail. We would not want to expose their existence to the client of the linked chain, who should only be able to interact with the chain object as a whole. In other words, we would like a way to encapsulate the definition of this class. One thought is to use visibility modifiers on the class; however, there isn’t a natural option if Node is a “sibling class” of LinkedChain; we’d like Node to be marked private, but this would prevent access from within the LinkedChain class, where it is necessary. protected visibility also does not work since LinkedChain is not a subclass of Node. Since public visibility provides no encapsulation, we’ll need a different mechanism. The main issue is that Node currently can’t distinguish between the LinkedChain class and another potential client. We need a way to relate these classes, and we can do this by nesting them.

Definition: Nested Class, Outer Class

A nested class is one that is declared within the scope of another class, called its outer class.

There are two principal variants of nested classes that we will consider, differentiated through the use of the static keyword.

Definition: Static Nested Class, Inner Class

A static nested class is a nested class declared with the static keyword. Instances of a static nested class do not have access to the fields of the outer class instance that constructed them.

An inner class is a nested class declared without the static keyword. Instances of an inner class have access to the fields of the outer class instance that constructed them.

In the case of our linked chain, Node objects will not need access to the head field of the LinkedChain that they are a part of, so it makes sense to model Node as a static nested class.
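
The difference between the two variants is easiest to see side by side. The following sketch uses a hypothetical Counter class (unrelated to LinkedChain, with names chosen only for illustration) that declares one inner class and one static nested class.

```java
/** Hypothetical outer class used only to contrast the two kinds of nested class. */
class Counter {
  private int count = 0;

  /** Inner class: each instance is tied to the Counter that created it. */
  class Incrementer {
    void increment() {
      count++; // reads and writes the outer instance's field
    }
  }

  /** Static nested class: no outer instance, so `count` is not in scope here. */
  static class Label {
    final String text;

    Label(String text) {
      this.text = text;
    }
  }

  int count() {
    return count;
  }

  public static void main(String[] args) {
    Counter counter = new Counter();
    Counter.Incrementer inc = counter.new Incrementer(); // needs an outer instance
    inc.increment();
    inc.increment();
    Counter.Label label = new Counter.Label("clicks");   // no outer instance needed
    System.out.println(label.text + ": " + counter.count()); // prints clicks: 2
  }
}
```

A Node plays the role of Label here: it only stores its own data and next fields, so it has no use for a link back to the chain that owns it.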

Remark:

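Static nested classes are preferred to inner classes whenever possible. Every inner class instance carries a hidden reference to the outer class instance that created it, whereas a static nested class instance does not, so static nested classes use less memory and have simpler dependencies. When declaring a nested class, you should always consider whether it is possible for it to be static.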

This gives us the encapsulated class structure.

LinkedChain.java

public class LinkedChain<T> {
  /**
   * A node in this singly-linked list.
   */
  private static class Node<T> {
    /**
     * The element stored in this node. `null` is used to indicate the node at 
     * the end of the chain.
     */
    T data;

    /**
     * A reference to the next node in the list. May only be null if 
     * `data == null`.
     */
    Node<T> next;

    // Node methods
  }

  /** The first node in this linked chain. */
  private Node<T> head;

  // LinkedChain methods
}

Remark:

Note the choice of visibility modifier for this static nested class. The Node class is marked as private so that it will not be accessible to any external classes. It will, however, be accessible anywhere within the LinkedChain class, which is its class scope. This achieves the encapsulation that we desired. The choice of visibility modifier for the fields of the Node class is irrelevant, as these will be fully visible everywhere within the LinkedChain class and invisible outside of it regardless of this choice.

Next, let’s use linked chains to provide an alternate implementation of the CS2110List interface, which we’ll call a SinglyLinkedList.

`SinglyLinkedList` Class

We refer to our list as singly-linked because each node has a single reference to the next node. This contrasts with a doubly-linked list that has references to both the next and previous nodes in the chain.

Definition: Singly-/Doubly-Linked List

In a singly-linked list, each Node contains a single reference to the next Node after it in the linked chain. Links can only be traversed "forward" in the list.

In a doubly-linked list, each Node contains references both to the next and previous Nodes in the linked chain. The list can be traversed in both the forward and backward directions.

Java’s LinkedList class is a doubly-linked list, and we explore doubly-linked lists more in Exercise 13.3. Both list implementations have their trade-offs. Doubly-linked lists require more memory (to store the backward references) and have a more complicated class invariant to manage. The lack of backward references presents a design challenge when implementing some of the methods of singly-linked lists.
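
For comparison, a doubly-linked node might look like the following sketch. The DNode class, the link() helper, and the use of null (rather than an empty node) to mark both ends are assumptions made for brevity; this is not the design developed in these notes or in Exercise 13.3.

```java
/** Hypothetical sketch of a doubly-linked node; not the design used in these notes. */
class DoublyLinkedDemo {
  static class DNode<T> {
    T data;
    DNode<T> prev; // the extra backward reference
    DNode<T> next;

    DNode(T data) {
      this.data = data;
    }
  }

  /** Links `second` directly after `first`, updating both directions. */
  static <T> void link(DNode<T> first, DNode<T> second) {
    first.next = second;
    second.prev = first;
  }

  /** Walks backward from `last`, joining the elements with spaces. */
  static <T> String backward(DNode<T> last) {
    StringBuilder sb = new StringBuilder();
    for (DNode<T> current = last; current != null; current = current.prev) {
      if (sb.length() > 0) {
        sb.append(' ');
      }
      sb.append(current.data);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    DNode<String> a = new DNode<>("apple");
    DNode<String> b = new DNode<>("banana");
    DNode<String> c = new DNode<>("cherry");
    link(a, b);
    link(b, c);
    System.out.println(backward(c)); // prints cherry banana apple
  }
}
```

Note that link() must update two references to keep the structure consistent. This is the more complicated invariant mentioned above: every forward link must be mirrored by a backward link.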

State Representation

Following the general linked chain design that we outlined above, we’ll declare a SinglyLinkedList class that implements our CS2110List interface and contains a static nested Node class. Since the fields of Node are visible within the SinglyLinkedList class, there is no need to declare any accessor or mutator methods for them. We’ll only define a Node constructor that initializes both fields. Within the SinglyLinkedList class, we’ll have three fields. Similar to the LinkedChain class, we’ll define a head field that references the first Node in the chain. Additionally, we’ll define a tail field that references the last Node. We’ll see that this tail reference will significantly improve the runtime of the add() method. Finally, we’ll define a size field that will improve the efficiency of the size() method (similar to the DynamicArrayList class).

SinglyLinkedList.java

/** An implementation of the CS2110List ADT using a singly-linked list. */
public class SinglyLinkedList<T> implements CS2110List<T> {
  /** A node in this singly-linked list. */
  private static class Node<T> {
    /**
     * The element stored in this node. `null` is used to indicate the 
     * node at the end of the list.
     */
    T data;

    /**
     * A reference to the next node in the list. May only be `null` if `data == null`.
     */
    Node<T> next;

    /**
     * Constructs a new Node object with the given `data` and `next` reference.
     */
    Node(T data, Node<T> next) {
      this.data = data;
      this.next = next;
    }
  }

  /** The first node in this list. */
  private final Node<T> head;

  /** The final (empty) node in this list. */
  private Node<T> tail;

  /**
   * The current size of this list, equal to the number of `Node`s in the chain 
   * starting at (and including) `head` and ending at (and excluding) `tail`.
   */
  private int size;

  @Override
  public int size() {
      return size;
  }

  // ... other `SinglyLinkedList` methods 
}

What is the class invariant for the SinglyLinkedList? First, there are the constraints on the Nodes. Every node except for the tail must have non-null data and next, and the tail node must have data == null and next == null. Additionally, the value of size must be correct; it must be the number of “links” that we must follow to traverse the list from the head node to the tail node. We can write an assertInv() method that checks these invariants.

/**
 * Asserts the SinglyLinkedList class invariant.
 */
private void assertInv() {
  Node<T> current = head;
  int currentSize = 0;
  while (current != tail && currentSize <= size) {
      assert current.data != null;
      assert current.next != null;
      currentSize++;
      current = current.next;
  }
  assert currentSize == size; // size is correct, no circular linking
  // when the above assertion succeeds, we know that current == tail
  assert current.data == null;
  assert current.next == null;
}

We’ve included a somewhat subtle, but important check in our loop guard, namely that currentSize <= size. If this is ever false, we will know that the class invariant is not satisfied, and we will fail the following assert statement, so we can break out of the loop. This extra guard avoids the possibility of entering an infinite loop if we accidentally “circularly link” our chain of Nodes.

It is a good idea to include such a check when you are developing a linearly linked structure.
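
The sketch below isolates this bounded traversal so that its behavior on a circular chain can be observed directly. The CycleGuardDemo class and its boundedCount() helper are hypothetical names introduced for illustration; the loop guard is the same one used in assertInv().

```java
/** Hypothetical demo of the bounded traversal; names are illustrative only. */
class CycleGuardDemo {
  static class Node<T> {
    T data;
    Node<T> next;

    Node(T data, Node<T> next) {
      this.data = data;
      this.next = next;
    }
  }

  /**
   * Counts the links followed from `head` before reaching `tail`, giving up once
   * the count exceeds `size`. Returns `size` only when `tail` is reached after
   * exactly `size` links.
   */
  static <T> int boundedCount(Node<T> head, Node<T> tail, int size) {
    Node<T> current = head;
    int count = 0;
    while (current != tail && count <= size) {
      count++;
      current = current.next;
    }
    return count;
  }

  public static void main(String[] args) {
    Node<String> tail = new Node<>(null, null);
    Node<String> b = new Node<>("B", tail);
    Node<String> a = new Node<>("A", b);
    System.out.println(boundedCount(a, tail, 2)); // prints 2: chain is consistent

    b.next = a; // bug: A -> B -> A -> ..., so tail is now unreachable
    System.out.println(boundedCount(a, tail, 2)); // prints 3: loop stopped early
  }
}
```

Without the count <= size condition, the second call would never return. With it, the traversal stops after size + 1 steps, and the mismatch between the count and size reports the problem.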

Constructor

For the SinglyLinkedList constructor, we must initialize the fields to model an empty list satisfying the class invariant. We should initialize size = 0, but what about head and tail? tail must always refer to a Node with data == null and next == null, so we can construct such a node and assign a reference to it to tail. When the list is empty, it contains no other Nodes, so this tail node is the first node in the chain, meaning it is also the head node.

SinglyLinkedList.java

/**
 * Constructs a new, initially empty, SinglyLinkedList.
 */
public SinglyLinkedList() {
  size = 0;
  tail = new Node<>(null, null);
  head = tail;
  assertInv();
}

To test our SinglyLinkedList implementation, we can use the same CS2110List tests that we wrote for the previous lecture. All we need to do is add a second CS2110ListTest subclass, SinglyLinkedListTest, whose constructList() method calls the SinglyLinkedList constructor.

CS2110ListTest.java

class SinglyLinkedListTest extends CS2110ListTest {
    @Override
    public <T> CS2110List<T> constructList() {
        return new SinglyLinkedList<>();
    }
}

When we run these tests, we see that the testEmptyAtConstruction() test passes, confirming that our constructor works as intended and establishes the class invariant. We are now ready to implement the rest of the CS2110List methods for the SinglyLinkedList.

Remark:

While we have provided the complete code for all the SinglyLinkedList methods (hidden behind spoilers in these lecture notes), we recommend that you take some time to implement them yourself. You can use the discussion and animations below to help guide you through their implementations, which can be rather subtle and intricate. The best way to become comfortable managing linked structures, which we will do consistently throughout the rest of the course, is to practice implementing them and work through debugging these implementations.

Locating Elements by Index

We’ll separate the remaining CS2110List methods according to some of their behaviors, which we will extract into private helper methods. The first behavior we will consider is accessing a particular Node in the list based on its index. This will be required for all the methods that have an index as one of their parameters: insert(), get(), set(), and remove(). The random access guarantee of the backing storage array in our DynamicArrayList gave us access to all its contents. However, the decentralized nature of linked lists precludes this and motivates the following nodeAtIndex() helper method.

SinglyLinkedList.java

/**
 * Returns a reference to the node at the given `index` (counting from 0) in this 
 * linked list. Requires that `0 <= index <= size`.
 */
private Node<T> nodeAtIndex(int index) { ... }

Let’s think about how to define such a method. If you were given the following SinglyLinkedList<Character>, how would you determine the value of nodeAtIndex(2)?

Most likely, you started at the beginning of the list and then traversed the chain of Nodes, counting up (starting from 0) as you went. When your count reached the target index (2), you knew that you were at the desired node and could return a reference to it. Since this “traversing and counting” procedure can require an arbitrarily large number of steps (depending on the value of index), we’ll model it with a loop.

Remark:

So far, the loops that we have considered in this course have iterated over the entries of arrays, so we could model the progress of the loop with an array diagram. Now, we will start to iterate over more abstract things, such as the linked structure of our list object. The same steps for developing loops (determine the loop variables, write the loop invariant, etc.) still apply.

During the loop, we’ll need to keep track of our current position in the list, which we can do with a Node variable current. We’ll also need to keep track of how many nodes we have traversed so far, which we can do with an int variable i. What is the loop invariant that describes the relationship between current and i?

View the loop invariant.

At any point in the loop, i keeps track of the number of nodes that we have traversed so far, and current holds a reference to the next node that we will visit. Therefore, our loop invariant is that current is the node at index i in the list.

Now, we can use this loop invariant to complete the definition of nodeAtIndex().

nodeAtIndex() definition

SinglyLinkedList.java

/**
 * Returns a reference to the node at the given `index` (counting from 0) in this linked list.
 * Requires that `0 <= index <= size`.
 */
private Node<T> nodeAtIndex(int index) {
  assert 0 <= index && index <= size; // defensive programming

  int i = 0;
  Node<T> current = head;
  /* Loop invariant: `current` is the Node at index `i` in this list. */ 
  while (i < index) {
      current = current.next;
      i++;
  }
  return current;
}

After our defensive programming assertion, we initialize i to 0 (since we have not traversed any Nodes before entering the loop) and initialize current to head (the first node that we will visit). We guard our loop on the condition i < index, as this indicates that we must continue our traversal to reach the node at the desired index. We break out of the loop when i == index; the loop invariant tells us that, at that point, current is the correct return value. Within the loop, we advance current to point to the next node in the chain. From our Node specifications, we see that current.next stores a reference to this node, so we can reassign current = current.next. Then, we must increment i (since we have "visited" one more node) to restore the loop invariant.

Note that the specification for nodeAtIndex() allows the possibility that index = size, in which case the tail node will be returned. Now, we can use nodeAtIndex() to define the get() and set() methods.

get() definition

SinglyLinkedList.java

@Override
public T get(int index) {
    assert index < size;
    return nodeAtIndex(index).data;
}

The get() method has a stricter pre-condition than nodeAtIndex(), namely that index < size, which we verify with an additional defensive programming assertion. After this, we use nodeAtIndex() to locate the desired node and return its data field, which contains its element.

set() definition

SinglyLinkedList.java

@Override
public void set(int index, T elem) {
  assert elem != null; // defensive programming
  assert index < size;

  Node<T> node = nodeAtIndex(index);
  node.data = elem;
  assertInv();
}

After our defensive programming assertions, we again use nodeAtIndex() to obtain a reference to the desired node. This time, we assign elem to its data field, which has the effect of modifying the element stored at this index of the list. Since set() is a mutating method, we should end its definition by re-asserting the class invariant.
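
Because nodeAtIndex() always starts its traversal at head, reaching index i requires following i links, in contrast to the \(O(1)\) access of the dynamic array. The following sketch makes this cost visible by counting links. The HopCountDemo class, its hops counter, and the range() helper are hypothetical instrumentation, not part of SinglyLinkedList.

```java
/** Hypothetical hop-counting demo; not part of the SinglyLinkedList class. */
class HopCountDemo {
  static class Node<T> {
    T data;
    Node<T> next;

    Node(T data, Node<T> next) {
      this.data = data;
      this.next = next;
    }
  }

  /** Number of `next` references followed so far (instrumentation only). */
  static long hops = 0;

  /** Same traversal as nodeAtIndex(), with a hop counter added. */
  static <T> Node<T> nodeAtIndex(Node<T> head, int index) {
    Node<T> current = head;
    for (int i = 0; i < index; i++) {
      current = current.next;
      hops++;
    }
    return current;
  }

  /** Builds the chain 0, 1, ..., n-1 followed by an empty end node. */
  static Node<Integer> range(int n) {
    Node<Integer> head = new Node<>(null, null);
    for (int value = n - 1; value >= 0; value--) {
      head = new Node<>(value, head);
    }
    return head;
  }

  public static void main(String[] args) {
    int n = 1000;
    Node<Integer> head = range(n);
    for (int i = 0; i < n; i++) {
      nodeAtIndex(head, i); // what a get(i) call would do internally
    }
    System.out.println(hops); // prints 499500, i.e., 0 + 1 + ... + 999
  }
}
```

Visiting every element of an \(N\)-element list through repeated index lookups therefore follows \(N(N-1)/2\) links in total, whereas a single traversal that advances one Node reference follows only \(N\).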

Finding Elements

The next two methods that we will consider, contains() and indexOf(), involve searching for a particular element within the list. Both of these methods will require loops that have a very similar structure to the nodeAtIndex() loop. Since the elements in a list are not necessarily sorted, both of these methods essentially perform a linear search over the list. Take some time to design the loop invariants for these methods and use them to complete their definitions in accordance with their specifications. Think carefully about how you should compare elements within these methods.

contains() definition

SinglyLinkedList.java

@Override
public boolean contains(T elem) {
  Node<T> current = head;
  /* Loop invariant: None of the Nodes before `current` contain `elem`. */
  while (current != tail) {
    if (current.data.equals(elem)) {
        return true;
    }
    current = current.next;
  }
  return false;
}

For this method, we require a single loop variable, current, which keeps track of our current position in the list (i.e., it references the next Node that we will "visit"). This variable is initialized to head (the start of the list). The loop is guarded by the condition current != tail, which says that we have not yet reached the end of the list (i.e., visited all nodes). When the loop guard becomes false, our loop invariant tells us that none of the nodes before current == tail contain elem, meaning elem is not in the list and we should return false. Within the loop body, we check whether current contains elem, and if so return true. Otherwise, we can re-assign current = current.next, advancing it one position forward in the list, while maintaining the loop invariant.

Notice that two different notions of equality are checked within this method. Within the loop guard, we check for reference equality when we write current != tail. We want to know when the current and tail variables point to the same Node object, as this will signal that we have reached the end of the list. Within the if condition, we compare current.data and elem using the equals() method. We want to know if these elements have the same value (even if these values are stored in different objects). Note that equals() for wrapper classes behaves like == for their primitive types.
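
The difference between these two notions of equality can be seen in a short sketch. The EqualityDemo class and its two helper methods are hypothetical names used only for illustration.

```java
/** Hypothetical demo of reference equality versus value equality. */
class EqualityDemo {
  /** Reference equality: do both variables point to the same object? */
  static boolean sameObject(Object a, Object b) {
    return a == b;
  }

  /** Value equality, as defined by the class's equals() method. */
  static boolean sameValue(Object a, Object b) {
    return a.equals(b);
  }

  public static void main(String[] args) {
    String first = new String("apple");
    String second = new String("apple"); // a distinct object with the same characters
    String alias = first;                // a second reference to the same object

    System.out.println(sameObject(first, second)); // prints false
    System.out.println(sameValue(first, second));  // prints true
    System.out.println(sameObject(first, alias));  // prints true
  }
}
```

Had contains() compared elements with ==, a list holding first would report that it does not contain second, even though both represent the string "apple".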

indexOf() definition

SinglyLinkedList.java

@Override
public int indexOf(T elem) {
  int i = 0;
  Node<T> current = head;
  /* Loop invariant: `current` is the Node at index `i` in this list,
   * None of the nodes before `current` contain `elem`. */
  while (!current.data.equals(elem)) {
    current = current.next;
    i++;
  }
  return i;
}

This method has the exact same structure as nodeAtIndex(), just with a different loop guard. Both loops simultaneously increment an index-tracking variable i while using a Node variable current to traverse the nodes in the list. While in nodeAtIndex(), we wanted to traverse until we reached a particular index, in indexOf() we want to traverse until we find a particular node. Therefore, we guard the loop on the condition that !current.data.equals(elem) (i.e., the element that we are "visiting" is not the target element); this is the same loop guard as linear search. When we break out of the loop, it will be because current.data.equals(elem), and the loop invariant ensures that this is the first occurrence of elem in this list. Note that the pre-condition that contains(elem) is true prevents the possibility of a `NullPointerException`.
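The parallel between the two traversals can be seen side by side in the following sketch, which works on a bare chain of nodes rather than the full class. The nodeAtIndex() body shown here is a reconstruction from the description above (the guard i < index is an assumption), and the class name TraversalDemo is hypothetical.

```java
/** Illustrative sketch: the two traversals differ only in their loop guard. */
class TraversalDemo {
    /** A bare node; the "empty" tail node has data == null and next == null. */
    static class Node<T> {
        T data;
        Node<T> next;

        Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }
    }

    /** Walks forward until the counter reaches `index`. Requires a valid index. */
    static <T> Node<T> nodeAtIndex(Node<T> head, int index) {
        int i = 0;
        Node<T> current = head;
        while (i < index) { // guard on the position
            current = current.next;
            i++;
        }
        return current;
    }

    /** Walks forward until `elem` is found. Requires that `elem` is in the chain. */
    static <T> int indexOf(Node<T> head, T elem) {
        int i = 0;
        Node<T> current = head;
        while (!current.data.equals(elem)) { // guard on the contents
            current = current.next;
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // Chain: "A" -> "B" -> "B" -> (empty tail)
        Node<String> tail = new Node<>(null, null);
        Node<String> head = new Node<>("A", new Node<>("B", new Node<>("B", tail)));
        System.out.println(indexOf(head, "B"));        // 1 (the first occurrence)
        System.out.println(nodeAtIndex(head, 2).data); // B
    }
}
```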

Adding Elements

Next, we’ll consider the methods that add new Nodes to the linked list, add() and insert(). Most of the work of these methods is identical: both must create a new Node object, “wire” it into the chain, and then update the fields to restore the class invariant. Let’s extract this common work into a private helper method that we’ll call spliceIn().

SinglyLinkedList.java

/**
 * Adds the given `elem` to the given `node` in this list, moving the `node`'s current 
 * contents to a new subsequent node to make space. Requires that `elem != null`.
 */
private void spliceIn(Node<T> node, T elem) { ... }

Observe that this method accepts two arguments, a reference to a node where we’d like the new element to go (in place of its current contents) and the element elem that we’d like to place there. Let’s visualize the effect of this method. Suppose that we want to add ‘B’ into a list at the index where ‘C’ currently resides.

This should result in a list that looks like:

How do we accomplish this? We need to be careful as we develop this procedure so that we will never lose access to any objects or data that we’ll need. It will be helpful to draw a series of node diagrams to keep track of our progress. To complete this “splice”, we’ll need to construct a new Node object to add to the list. Which Node in the “after” picture will this be? You might be tempted to answer the ‘B’ node; after all, there was no ‘B’ node in the “before” picture. However, this will not work. To “wire” this ‘B’ node into the chain, its preceding node (the ‘A’ node) would need to update the reference stored in its next field. However, we have no good way to access the ‘A’ node, since our node variable points to later in the list.

In singly-linked lists, we can't go "backwards", so all the work must happen "ahead" of the node that we can reference.

The following animation walks through an alternate approach.
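The same rewiring can also be traced in code. The sketch below operates on a bare chain of nodes and deliberately leaves out the size and tail bookkeeping; the class name SpliceInSketch and the render() helper are hypothetical and exist only for this illustration.

```java
/** Illustrative sketch of the splice-in rewiring on a bare chain of nodes. */
class SpliceInSketch {
    static class Node<T> {
        T data;
        Node<T> next;

        Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }
    }

    /** Places `elem` in `node`, moving the node's old contents to a new node right after it. */
    static <T> void spliceIn(Node<T> node, T elem) {
        node.next = new Node<>(node.data, node.next); // the new node takes over the old contents
        node.data = elem;                             // the existing node now holds `elem`
    }

    /** Renders the chain from `head` up to (and excluding) the empty tail node. */
    static <T> String render(Node<T> head) {
        StringBuilder sb = new StringBuilder();
        for (Node<T> n = head; n.data != null; n = n.next) {
            sb.append(n.data).append(" -> ");
        }
        return sb.append("(tail)").toString();
    }

    public static void main(String[] args) {
        Node<Character> tail = new Node<>(null, null);
        Node<Character> c = new Node<>('C', tail);
        Node<Character> head = new Node<>('A', c);
        System.out.println(render(head)); // A -> C -> (tail)

        spliceIn(c, 'B'); // we only hold a reference to the 'C' node, never to 'A'
        System.out.println(render(head)); // A -> B -> C -> (tail)
    }
}
```

Note that the 'A' node is never touched: its next reference still points to the same Node object, which now stores 'B', while the newly constructed node stores 'C'.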

We can use the ideas from this animation to develop a definition of the spliceIn() method. Since this is a mutating method, we must make sure to restore the class invariant by considering all the fields.

spliceIn() definition

SinglyLinkedList.java

/**
 * Adds the given `elem` to the given `node` in this list, moving the `node`'s current contents
 * to a new subsequent node to make space. Requires that `elem != null`.
 */
private void spliceIn(Node<T> node, T elem) {
    assert elem != null; // defensive programming

    node.next = new Node<>(node.data, node.next);
    node.data = elem;
    if (node == tail) {
        tail = node.next;
    }
    size++;
    assertInv();
}

We must increment the size field to account for the new Node that has been spliced into the list. The head node cannot change as a result of the splice since the splice happens "after" the node pointer. Even if node == head, that same node object will still sit at the start of the list after the splicing is complete. A more subtle scenario happens when node == tail. We trace through this possibility in the following animation.
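The node == tail case can be checked with a stripped-down list. This is a minimal sketch rather than the full SinglyLinkedList: the class name TailSpliceInDemo and its helper methods are hypothetical, and it assumes that an empty list consists of a single "empty" node that serves as both head and tail.

```java
/** Illustrative mini list (not the full SinglyLinkedList) that tracks `tail` across splices. */
class TailSpliceInDemo<T> {
    private static class Node<T> {
        T data;
        Node<T> next;

        Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }
    }

    // Assumption: an empty list is one "empty" node that is both head and tail.
    private final Node<T> head = new Node<>(null, null);
    private Node<T> tail = head;
    private int size = 0;

    private void spliceIn(Node<T> node, T elem) {
        node.next = new Node<>(node.data, node.next);
        node.data = elem;
        if (node == tail) {
            // The old tail now holds `elem`; the new node (data == null) ends the chain.
            tail = node.next;
        }
        size++;
    }

    void add(T elem) { spliceIn(tail, elem); }

    int size() { return size; }

    T first() { return head.data; }

    /** True if `tail` is the empty node that ends the chain. */
    boolean tailIsLastAndEmpty() { return tail.data == null && tail.next == null; }

    public static void main(String[] args) {
        TailSpliceInDemo<String> list = new TailSpliceInDemo<>();
        list.add("A");
        list.add("B");
        System.out.println(list.size());               // 2
        System.out.println(list.first());              // A
        System.out.println(list.tailIsLastAndEmpty()); // true
    }
}
```

Without the tail update, tail would keep pointing at the node that now stores the added element, and the next add() would overwrite the wrong position.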

Now that we have defined the spliceIn() helper method, defining the add() and insert() methods should be very straightforward.

add() definition

SinglyLinkedList.java

@Override
public void add(T elem) {
    spliceIn(tail, elem);
}

The add() method performs a splice at the end of the list, which we can achieve by calling spliceIn() with node = tail.

insert() definition

SinglyLinkedList.java

@Override
public void insert(int index, T elem) {
  spliceIn(nodeAtIndex(index), elem);
}

The insert() method performs a splice to a node at a particular index in the list, which we can obtain using our nodeAtIndex() helper method.

Removing Elements

It remains to implement the remove() and delete() methods to handle deletions from the list. Similar to spliceIn(), we’ll extract the common behavior of these methods into a private helper method spliceOut() that accepts a Node reference and takes care of the “re-wiring”.

SinglyLinkedList.java

/**
 * Removes the given `node` from this linked list and returns its `data` field.
 * Requires that `node != tail`.
 */
private T spliceOut(Node<T> node) { ... }

To plan out the definition of this method, let’s again start with “before” and “after” pictures. Suppose that we wish to remove the node containing ‘B’ from the following list.

After we do this, we’ll need to return ‘B’, and the list will be in the following state:

We see that we’ll need to remove a Node object from the chain, but which one will it be? While our first intuition may be to remove the ‘B’ node, this will again be thwarted by the single-linking; there is no easy way to update the next reference of the ‘A’ node. Instead, it will be easier to remove the ‘C’ node using the process suggested by the following animation.
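The removal process can be traced in code as well. As before, this is a sketch on a bare chain that omits the size and tail bookkeeping; SpliceOutSketch and render() are hypothetical names used only for illustration.

```java
/** Illustrative sketch of the splice-out rewiring on a bare chain of nodes. */
class SpliceOutSketch {
    static class Node<T> {
        T data;
        Node<T> next;

        Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }
    }

    /** Removes the element stored in `node` by absorbing its successor. Requires a non-tail node. */
    static <T> T spliceOut(Node<T> node) {
        T removed = node.data;
        node.data = node.next.data; // pull the successor's contents back into this node
        node.next = node.next.next; // bypass the successor, unlinking it from the chain
        return removed;
    }

    /** Renders the chain from `head` up to (and excluding) the empty tail node. */
    static <T> String render(Node<T> head) {
        StringBuilder sb = new StringBuilder();
        for (Node<T> n = head; n.data != null; n = n.next) {
            sb.append(n.data).append(" -> ");
        }
        return sb.append("(tail)").toString();
    }

    public static void main(String[] args) {
        Node<Character> tail = new Node<>(null, null);
        Node<Character> c = new Node<>('C', tail);
        Node<Character> b = new Node<>('B', c);
        Node<Character> head = new Node<>('A', b);

        System.out.println(spliceOut(b)); // B
        System.out.println(render(head)); // A -> C -> (tail)
    }
}
```

The Node object that originally stored 'B' stays in the chain and now stores 'C'; it is the original 'C' node that becomes unreachable.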

We can use the ideas from this animation to develop a definition of the spliceOut() method. Since this is another mutating method, we must again make sure to restore the class invariant by considering all the fields.

spliceOut() definition

SinglyLinkedList.java

/**
 * Removes the given `node` from this linked list and returns its `data` field. 
 * Requires that `node != tail`.
 */
private T spliceOut(Node<T> node) {
  assert node != tail;
  T removed = node.data;
  node.data = node.next.data;
  node.next = node.next.next;
  size--;
  if (node.next == null) {
      tail = node;
  }
  assertInv();
  return removed;
}

We must decrement the size field to account for the Node that has been spliced out of the list. Similar to spliceIn(), the head node will not change since the splice happens after the node pointer. We must again contend with the possibility of updating the tail pointer. For spliceOut(), this occurs when the last element of the list is removed, as demonstrated by the following animation.
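The tail update can be checked with a small hand-built list. This is a sketch under simplifying assumptions: the class TailSpliceOutDemo is hypothetical, stores Strings only, and exposes its fields so that the effect on tail is easy to observe.

```java
/** Illustrative mini list of Strings that tracks `tail` while splicing out. */
class TailSpliceOutDemo {
    static class Node {
        String data;
        Node next;

        Node(String data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    // Hand-built chain: "A" -> "B" -> (empty tail)
    Node tail = new Node(null, null);
    Node last = new Node("B", tail); // the node holding the last element
    Node head = new Node("A", last);
    int size = 2;

    /** Removes `node` (which must not be `tail`) and returns its element. */
    String spliceOut(Node node) {
        String removed = node.data;
        node.data = node.next.data;
        node.next = node.next.next;
        size--;
        if (node.next == null) {
            tail = node; // `node` absorbed the old empty tail, so it is the tail now
        }
        return removed;
    }

    public static void main(String[] args) {
        TailSpliceOutDemo list = new TailSpliceOutDemo();
        System.out.println(list.spliceOut(list.last)); // B
        System.out.println(list.tail == list.last);    // true
        System.out.println(list.tail.data == null);    // true
        System.out.println(list.size);                 // 1
    }
}
```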

Now that we have defined the spliceOut() helper method, defining the remove() and delete() methods should be very straightforward.

remove() definition

SinglyLinkedList.java

@Override
public T remove(int index) {
  assert index < size;
  return spliceOut(nodeAtIndex(index));
}

The remove() method performs a splice of a node at a particular index in the list, which we can obtain using our nodeAtIndex() helper method.

delete() definition

SinglyLinkedList.java

@Override
public void delete(T elem) {
  Node<T> current = head;
  /* Loop invariant: None of the Nodes before `current` contain `elem`. */
  while (!current.data.equals(elem)) {
      current = current.next;
  }
  spliceOut(current);
}

The delete() method performs a splice of the first node that contains a particular element. We can locate this node using a similar loop as in the indexOf() method.

Complexity of `SinglyLinkedList`

We’ll again let the variable \(N\) denote the size of our list. In this way, we’ll be able to directly compare the performance characteristics of the SinglyLinkedList and the DynamicArrayList from the previous lecture.

Space Complexity

Besides the \(O(1)\) space required for the main SinglyLinkedList object, each Node object takes up \(O(1)\) space to store a reference to its data element and the next node. Therefore, the overall memory footprint of a SinglyLinkedList is \(O(N)\). Moreover, none of the methods of the SinglyLinkedList require more than \(O(1)\) additional storage for local variables, so they all have worst case \(O(1)\) space complexity. This is an improvement over the \(O(N)\) space complexity of the additive methods of the DynamicArrayList. The decentralized storage of linked lists allowed us to avoid reallocating and copying large chunks of memory. In addition, the contiguous memory requirement of a DynamicArrayList may present a challenge in memory-constrained systems. Such a requirement is not present for SinglyLinkedLists.

Time Complexity

Next, let’s analyze the worst-case time complexities for the SinglyLinkedList methods. Since the method definitions are still relatively short, we summarize these analyses below. Recall that we do not factor the runtime of any assertInv() calls into our analysis, as assert statements will be ignored (or omitted) in production code.

size(): \(O(1)\), consisting of a single memory access.

nodeAtIndex(): \(O(N)\). More specifically, we require \(O(\texttt{index})\) time to linearly traverse the linked chain from head to reach the target node. Since \(\texttt{index}\) can be as large as \(N\), our worst-case time complexity is \(O(N)\).

get(): \(O(N)\), bounded by the runtime of nodeAtIndex().

set(): \(O(N)\), bounded by the runtime of nodeAtIndex().

contains(): \(O(N)\), each iteration of the while loop does \(O(1)\) work, and there can be at most \(N\) iterations.

indexOf(): \(O(N)\), each iteration of the while loop does \(O(1)\) work, and there can be at most \(N\) iterations.

spliceIn(): \(O(1)\), this method consists of a fixed number of assignment statements and one conditional statement. We use the node parameter to directly access the location of the splice, eliminating the need for a traversal over the list.

add(): \(O(1)\), the tail field gives us direct access to the end of the list, where the splice takes place.

insert(): \(O(N)\), bounded by the runtime of nodeAtIndex().

spliceOut(): \(O(1)\), similar to spliceIn(), this method consists of a fixed number of assignment statements and one conditional statement.

remove(): \(O(N)\), bounded by the runtime of nodeAtIndex().

delete(): \(O(N)\), each iteration of the while loop does \(O(1)\) work, and there can be at most \(N\) iterations.

These analyses reveal some performance differences between SinglyLinkedLists and DynamicArrayLists. The decentralized storage of linked lists permits faster update operations given a reference to the splice location. In particular, SinglyLinkedLists support \(O(1)\) insertion at the head and tail and \(O(1)\) removal at the head. This improves upon the DynamicArrayList, which has \(O(N)\) worst-case insertions and removals due to shifting and resizing. Even the \(O(1)\) insertion at the end of a DynamicArrayList can only be obtained with an amortized analysis (as opposed to the worst-case time complexity guarantee for a SinglyLinkedList). On the other hand, DynamicArrayLists outperform SinglyLinkedLists for methods that can leverage the random access guarantee of arrays to get \(O(1)\) access to a particular index (e.g., get() and set()).

The preceding paragraph illustrates that neither the SinglyLinkedList nor the DynamicArrayList is strictly preferable to the other. Rather, the better data structure will vary across different use cases. For applications where many modifications happen at the start or end of a list (we’ll see examples of this soon), SinglyLinkedLists are likely a better choice. They are also beneficial in applications where resource requirements make expensive array copies (even periodic ones) undesirable. For applications where data is not added or deleted, but instead is frequently queried (such as some databases), DynamicArrayLists are likely a better choice because of their random access guarantees.

Main Takeaways:

A linked data structure has decentralized storage, where different objects each store a portion of the data as well as references to other storage objects.
We call the auxiliary storage objects in a linked data structure its nodes. To encapsulate a linked data structure, we often define the nodes in a nested class within the data structure class.
Instances of static nested classes do not have access to fields of the outer class. Inner classes are non-static nested classes, and their objects can access outer class fields.
Node diagrams provide a simplified representation of the connections in a linked structure and are useful for developing methods.
Linked lists support some worst-case \(O(1)\) insertion and removal methods. Accessing an element in a linked list via its index (e.g., in get() or set()) is a worst-case \(O(N)\) operation since the decentralized storage of linked lists does not provide a random access guarantee.

Exercises

Exercise 13.1: Check Your Understanding

(a)

Consider the above immutable singly-linked chain of chars pointed to by Node variable h. We would like to replace the first element with the value ‘H’ and point to the resulting list from a new Node variable j. How many new Node objects must be created in order to do this?

(b)

Class Node represents an un-encapsulated, mutable node in a singly-linked chain.

class Node {
  Object data;
  Node next;
}

Now consider the following algorithm.

static Node alg1(Node n) {
  Node ans = null;
  while (n != null) {
    Node t = n.next;
    n.next = ans;
    ans = n;
    n = t;
  }
  return ans;
}

What does alg1() do? Hint: Work out a small example on paper with a node diagram.

(c)

Suppose we have the same Node class from the previous subproblem.

static Node alg2(Node n) {
  if (n == null || n.next == null) {
    return n;
  }
  Node r = n.next;
  Node t = r.next;
  r.next = n;
  n.next = alg2(t);
  return r;
}

What does alg2() do?

This algorithm recursively swaps every two adjacent nodes in a singly-linked list.
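One way to check this answer is to run alg2() on a short chain. The harness below copies Node and alg2() from the exercise and adds two hypothetical helpers, chain() and render(), purely for building and printing test chains. Here the chain ends with a null next reference, as in the exercise's un-encapsulated Node class.

```java
/** Illustrative harness for alg2(); chain() and render() are helpers added for this demo. */
class Alg2Demo {
    static class Node {
        Object data;
        Node next;
    }

    static Node alg2(Node n) {
        if (n == null || n.next == null) {
            return n; // zero or one node left: nothing to swap
        }
        Node r = n.next;     // second node of the pair becomes the new front
        Node t = r.next;     // start of the remaining chain
        r.next = n;
        n.next = alg2(t);    // the first node of the pair links to the processed remainder
        return r;
    }

    /** Builds a null-terminated chain holding the given values in order. */
    static Node chain(Object... values) {
        Node head = null;
        for (int i = values.length - 1; i >= 0; i--) {
            Node n = new Node();
            n.data = values[i];
            n.next = head;
            head = n;
        }
        return head;
    }

    /** Renders the chain as "x -> y -> z". */
    static String render(Node n) {
        StringBuilder sb = new StringBuilder();
        for (; n != null; n = n.next) {
            sb.append(n.data);
            if (n.next != null) {
                sb.append(" -> ");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(alg2(chain(1, 2, 3, 4, 5)))); // 2 -> 1 -> 4 -> 3 -> 5
    }
}
```

With an odd number of nodes, the final node has no partner and stays in place.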

Exercise 13.2: Efficiency of Singly-Linked Lists

Why does our implementation of a linked list have \( O(1) \) addition to both the head and tail, but only \(O(1) \) removal from the head?

Exercise 13.3: Doubly-Linked Lists

In a doubly-linked list, each node has two pointers: next, pointing to the next node, and prev, pointing to the previous node. We can also visualize an implementation of this doubly-linked list as a node diagram, where the first box now represents the prev pointer.

Consider the partial class definition for a DoublyLinkedList. In this implementation, we have an "empty" node at both the start and end of the doubly-linked list. This makes implementations for certain methods far simpler.

/** A doubly linked list. */
public class DoublyLinkedList<T> implements CS2110List<T> {
  /** A node in this doubly-linked list. */
  private static class Node<T> {
    /**
     * The element stored in this node. `null` is used to indicate the 
     * empty nodes at the start and end of the list.
     */
    T data;

    /**
     * A reference to the next node in the list. May only be null if `data == null`.
     */
    Node<T> next;

    /**
     * A reference to the previous node in the list. May only be null if `data == null`.
     */
    Node<T> prev;

    /**
     * Constructs a new Node object with the given `data`, `prev`, and `next` references.
     */
    Node(T data, Node<T> prev, Node<T> next) {
      this.data = data;
      this.prev = prev;
      this.next = next;
    }
  }

  /** The first (empty) node in this list. */
  private final Node<T> head;

  /** The final (empty) node in this list. */
  private final Node<T> tail;

  /**
   * The current size of this list, equal to the number of `Node`s in the chain 
   * starting at (and excluding) `head` and ending at (and excluding) `tail`.
   */
  private int size;

  /**
   * Constructs a new, initially empty, DoublyLinkedList.
   */
  public DoublyLinkedList() {
    size = 0;
    head = new Node<>(null, null, null);
    tail = new Node<>(null, head, null);
    head.next = tail;
    assertInv();
  }
}

(a)

Write an assertInv() method to assert the class invariants.

(b)

Override each method defined in the CS2110List interface. You may find it useful to define private helper methods, such as nodeAtIndex().

(c)

What is the runtime of each method? How does storing this extra prev field improve the runtime of some methods?

Exercise 13.4: Circular Singly-Linked Lists

Another version of a linked list is a circular linked list. Instead of the last node's next field pointing to an "empty" tail node, it points to head. View below to see a node diagram of a circular linked list.

Study the partial class definition below for CircularLinkedList. Note that we are not keeping track of a tail node, and we are using an "empty" node for the head.

/** A circular singly-linked list. */
public class CircularLinkedList<T> implements CS2110List<T> {
  /** A node in this singly circular linked list. */
  private static class Node<T> {
    /**
     * The element stored in this node. `null` is used to indicate the 
     * "empty" node at the start of the list.
     */
    T data;

    /**
     * A reference to the next node in the list. May only be null if `data == null`.
     */
    Node<T> next;

    /**
     * Constructs a new Node object with the given `data` and `next` reference.
     */
    Node(T data, Node<T> next) {
      this.data = data;
      this.next = next;
    }
  }

  /** The first (empty) node in this list. */
  private final Node<T> head;

  /**
   * The current size of this list, equal to the number of `Node`s in the chain 
   * starting at (and excluding) `head` and ending at (and including) the `Node`
   * whose `next` field is `head`.
   */
  private int size;

  /**
   * Constructs a new, initially empty, CircularLinkedList.
   */
  public CircularLinkedList() {
    size = 0;
    head = new Node<>(null, null);
    head.next = head;
    assertInv();
  }
}

(a)

Write an assertInv() method to assert the class invariants.

(b)

Override each method defined in the CS2110List interface. Be wary of infinite loops. In a circular linked list, we cannot iterate through it by checking the next pointer anymore. Instead, leverage the size field to know when to terminate.

(c)

When might you want to model something as a CircularLinkedList?

Exercise 13.5: Arrays and Linked Lists

In database management systems, when a table is indexed with a B+ tree (trees will be introduced in an upcoming lecture), the leaf nodes are linked together to support efficient range and inequality queries. Each of these nodes is a page in the database, typically storing 4KB worth of data. We'll model this as an unrolled linked list, a data structure that mixes arrays and linked lists. Specifically, the data field in each node is an array. Consider the partial implementation below that leverages the SinglyLinkedList defined in the lecture notes.

/** An unrolled linked list. */
public class UnrolledLinkedList<T> implements CS2110List<T> {

  /** The backing linked list. The arrays in each node need not be full. */
  private SinglyLinkedList<T[]> nodes;

  /** The size of each array in each node. Requires `blockSize > 0`. */
  private int blockSize;
}

(a)

Draw a diagram for a sample unrolled linked list with blockSize=4. Use a node diagram to represent nodes and object diagrams for the arrays stored within each node.

(b)

Define a constructor for this class.

/** 
 * Constructs an empty UnrolledLinkedList with the given `blockSize`. 
 * Requires `blockSize > 0`.
 */
public UnrolledLinkedList(int blockSize) { ... }

(c)

When we add an element to an unrolled linked list, we add it to an existing node’s array if there is room. If there is no room, we add another node to the linked list, and add the element there. Since generic types are erased at runtime in Java, use the following code to create a T[] object.

@SuppressWarnings("unchecked")
T[] arr = (T[]) new Object[blockSize];

/**
 * Adds `elem` to the list in the first node without a full array. If all are full,
 * adds another node to the backing list, and adds it there.
 */
@Override
public void add(T elem) { ... }

(d)

Override the rest of the methods in CS2110List. When the specifications are ambiguous for this implementation, such as with get() or remove(), refine them. We also want to support removals. To remove an element, find the first time it appears in an array of a linked list node and set it to null.

Exercise 13.6: SortedLinkedList

A sorted linked list imposes an ascending sorted order on its nodes.

(a)

Define a new class SortedLinkedList<T> that implements CS2110List<T> with your own fields and class invariants.

(b)

Analyze the runtime of each of your methods. Does this class provide any immediate runtime benefits over SinglyLinkedList? When might a developer want to use a SortedLinkedList? Consider what (new) operations the SortedLinkedList may provide an efficient implementation for.

Exercise 13.7: Recursive Linked Lists

In a recursive linked list, instead of defining a nested Node class, we use the class itself as part of the fields. Consider the partial class below:

/** A recursive singly-linked list. */
public class RecursiveLinkedList<T> implements CS2110List<T> {
  /** 
   * The first element of the list. If the list is empty, `head == null`.
   */
  private T head;

  /**
   * The rest of the list after head. If the list is empty, `tail == null`.
   */
  private RecursiveLinkedList<T> tail;
}

(a)

Define a constructor and an assertInv() method for this class.

(b)

Override each method defined in the CS2110List interface. As the name of the class suggests, these methods can be implemented elegantly with recursion.

Exercise 13.8: Removing “Empty” Nodes

(a)

Implement a singly-linked list without the “empty” tail node. The last element’s node should have next == null.

(b)

Implement a doubly-linked list (see Exercise 13.3) without the two “empty” nodes at its ends.

Exercise 13.9: Binary Searching Linked List

Suppose that we have a sorted linked list. Can we improve the runtime of contains() from \( O(N) \) to \( O(\log N) \) using binary search? If so, implement a static method to do so. If not, explain why.

Exercise 13.10: LinkedListBag

In Exercise 12.3, we introduced the Bag ADT, which is an unordered collection that may contain duplicate elements. We implemented the ADT using a dynamic array. Define a new class LinkedListBag<T> that implements Bag<T> and is backed by a singly-linked list instead.
