17. Binary Search Trees

In the previous lecture, we introduced trees, a new abstract data type that connects its data in a branching structure. We focused most of our attention on binary trees, in which each node may have up to two child subtrees, its left and right subtrees. While this restriction provided us with some additional structure (allowing us to define iterators and other recursive methods in our base BinaryTree class), it still left too much ambiguity to use these binary trees as a practical dynamic data structure. We defined an immutable tree class that placed the onus of arranging the tree’s nodes on the client at the time of construction. However, immutability prevents the modification of the tree at a later point, when elements may need to be added, removed, or modified. Moreover, without any invariants on the structure of the tree, finding a particular element (one of the foundational tasks in any data structure) devolves into an exhaustive, element-by-element search.

Today, we will specialize binary trees to binary search trees (or BSTs) by imposing an additional invariant, which we’ll call the BST order invariant. Recall that binary search leveraged the fact that an array was sorted to shortcut the search for an element, eliminating the need to check every entry (as was necessary in linear search). The fact that the array was sorted allowed it to make inferences about the values in array cells it hadn’t checked using the values from the cells it had checked. In the same way, the BST order invariant will allow us to shortcut the search process in a tree.

Definition: Binary Search Tree

A binary search tree is a binary tree whose elements satisfy the following order invariant. For any subtree t with t.root = v in a BST, all of the elements in its left subtree are "less than or equal to" v and all elements in its right subtree are "greater than or equal to" v.

We can visualize this definition as follows:

An example of a BST storing Integers is shown below. Confirm for yourself that all of the subtrees of this tree satisfy the ordering invariant.

Notice that if a binary tree satisfies the BST order invariant, then an in-order traversal will visit its elements in sorted order. In the case of this tree, an in-order iterator yields the elements in the order

\[ 1, 3, 4, 7, 9, 10, 12, 12, 15, 17, 20. \]

The BST order invariant ensures that the smallest elements appear farthest left in the tree, and the largest elements appear farthest right. An in-order traversal visits all of the elements in the left subtree before the root element, so by the BST order invariant, all elements less than the root are visited before the root. Similarly, the in-order traversal visits all of the elements in the right subtree after the root element, so all elements greater than the root are visited after the root. Applying this reasoning recursively to every subtree, we conclude that the elements are visited in sorted order.
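
This argument can be checked on a small example. The sketch below is a stand-alone illustration, not the lecture's BinaryTree class: the `Node` class, the `InorderDemo` name, and the particular tree shape are all assumptions, chosen only so that the eleven values listed above are arranged in a way that satisfies the order invariant.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; not the lecture's BinaryTree class. */
class InorderDemo {
  /** A bare-bones tree node; `left` and `right` may be null. */
  static class Node {
    final int value;
    final Node left, right;

    Node(int value, Node left, Node right) {
      this.value = value;
      this.left = left;
      this.right = right;
    }
  }

  /** Appends the values of the tree rooted at `n` to `out` in in-order. */
  static void inorder(Node n, List<Integer> out) {
    if (n == null) {
      return;
    }
    inorder(n.left, out);  // everything "<=" n.value comes first
    out.add(n.value);
    inorder(n.right, out); // everything ">=" n.value comes last
  }

  /** Builds one (assumed) BST shape holding the eleven example values. */
  static Node exampleTree() {
    Node left = new Node(4, new Node(3, new Node(1, null, null), null),
                            new Node(7, null, new Node(9, null, null)));
    Node right = new Node(15, new Node(12, null, new Node(12, null, null)),
                              new Node(17, null, new Node(20, null, null)));
    return new Node(10, left, right);
  }

  /** Returns the in-order listing of the example tree. */
  static List<Integer> traverse() {
    List<Integer> out = new ArrayList<>();
    inorder(exampleTree(), out);
    return out;
  }

  public static void main(String[] args) {
    System.out.println(traverse()); // [1, 3, 4, 7, 9, 10, 12, 12, 15, 17, 20]
  }
}
```

Any other arrangement of these values that respects the invariant would print the same list; only the shape of the tree would differ.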

To see a benefit of the BST order invariant, suppose that we want to check whether 14 is an element of this tree. We can start by checking the root node (recall that this is the only node accessible in the client interface, so it is where our search must begin). The following animation walks through our search.

We see that we were able to carry out this search looking at only 4 elements in the tree, rather than all 11. The BST order invariant allowed us to “prune” off one “branch” (or subtree) of the search at each level, just as each element access in binary search eliminates half of the remaining array range from consideration.
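
To make the pruning concrete, here is a minimal sketch that counts how many nodes a search inspects. It is not the lecture's BST class: `SearchDemo`, `Node`, and `nodesInspected()` are hypothetical names, and the tree shape is an assumed arrangement of the eleven example values in which a search for 14 happens to inspect four nodes.

```java
/** Hypothetical stand-alone sketch of a pruned search; not the lecture's BST class. */
class SearchDemo {
  /** A bare-bones tree node; `left` and `right` may be null. */
  static class Node {
    final int value;
    final Node left, right;

    Node(int value, Node left, Node right) {
      this.value = value;
      this.left = left;
      this.right = right;
    }
  }

  /** Builds one (assumed) BST shape holding the eleven example values. */
  static Node exampleTree() {
    Node left = new Node(4, new Node(3, new Node(1, null, null), null),
                            new Node(7, null, new Node(9, null, null)));
    Node right = new Node(15, new Node(12, null, new Node(12, null, null)),
                              new Node(17, null, new Node(20, null, null)));
    return new Node(10, left, right);
  }

  /**
   * Returns the number of nodes inspected while searching the BST rooted at
   * `root` for `target`, stopping once it is found or the search runs off the tree.
   */
  static int nodesInspected(Node root, int target) {
    int inspected = 0;
    Node current = root;
    while (current != null) {
      inspected++;
      if (target == current.value) {
        break;
      }
      // The order invariant rules out one whole subtree at every step.
      current = (target < current.value) ? current.left : current.right;
    }
    return inspected;
  }

  public static void main(String[] args) {
    System.out.println(nodesInspected(exampleTree(), 14)); // 4 (inspects 10, 15, 12, 12)
    System.out.println(nodesInspected(exampleTree(), 7));  // 3 (inspects 10, 4, 7)
  }
}
```

The count is bounded by the number of levels in the tree rather than by the number of elements, which is the source of the savings.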

As we have seen multiple times throughout the course, imposing an additional constraint (or invariant), in this case the BST order condition, provides extra structure that improves our code. It makes our code easier to document, reason about, and test, provides a simpler interface to the client, and can lead to significant improvements in our code’s performance (i.e., runtime complexity). However, it also adds an additional burden to us, as the implementer; we must ensure that this invariant is maintained. In today’s lecture, we’ll walk through the complete design of a BST class and all of its methods along with a complexity analysis.

Comparing Objects

The definition of the BST order invariant tells us that elements in the left subtree must be “less than or equal to” the root element (and imposes a similar condition for the right subtree). This definition makes sense when the values in the tree are numbers, such as Integers or Doubles, but it is less clear for other types. Does this mean that BSTs are limited to only storing numeric types? What if a client wants to create a BST storing Strings? There is also a fairly natural order for Strings, the alphabetical order. We can consider a String that is alphabetically earlier to be “less than” a String that is alphabetically later. This is exactly what is done to sort the entries of a dictionary, and means we should be able to store Strings in a BST. Pushing further, what if a client wants to create a BST storing a type of a class that they defined? Since we don’t have access to the internal details of this class, it’s unclear how we could determine which of its instances are “greater” or “lesser” than others. In other words, we lack the ability to compare elements of a generic type. Does this mean that we are out of luck?

Fortunately not; this is a situation we are used to. We’ve identified a behavior, element comparison, that is required of the type stored in our BST, and we require the implementor of this type to supply its definition conforming to our specifications. This is the exact use case of an interface, in this case, the Comparable interface. Let’s take a look at this interface and some example implementing classes. Then we’ll see how to connect the Comparable interface to our BST class using generic type bounds.

The Comparable Interface

Java’s Comparable interface enables the comparison of two elements. It is a generic interface, with type parameter T (that will typically be the type implementing Comparable) and includes a single method compareTo() with the following specification:

Comparable.java

/**
 * Returns an integer whose sign designates the direction of the comparison 
 * between `this` and `other`. A negative integer is returned when `this` is 
 * "less than" `other`, a positive integer is returned when `this` is "greater 
 * than" `other`, and 0 is returned if `this` and `other` are equivalent under 
 * this comparison.
 */
int compareTo(T other);

When a class implements the Comparable interface, its objects can determine whether they are “less than” or “greater than” other objects from that class (using compareTo()). The int return type may look a bit weird at first, but it is a useful choice. A boolean will not work since we require at least three different return values (for less than, greater than, and equal). Permitting any int allows us to write many comparisons as a subtraction. For instance, to compare two int values a and b whose difference cannot overflow, we can simply return a - b; if a is the smaller number this difference will be negative, and if a is the larger number this difference will be positive. (When the operands can be arbitrary ints, the subtraction may overflow and produce the wrong sign, so library code such as Integer.compareTo() compares the values directly rather than subtracting them.)
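
The following sketch contrasts comparison by subtraction with the standard `Integer.compare()` method. The class and method names (`CompareIntsDemo`, `bySubtraction()`, `byLibrary()`) are made up for this illustration.

```java
/** Stand-alone sketch contrasting two ways to produce a comparison result. */
class CompareIntsDemo {
  /** Comparison by subtraction; only safe when `a - b` cannot overflow. */
  static int bySubtraction(int a, int b) {
    return a - b;
  }

  /** Comparison by the library method, which has the correct sign for all ints. */
  static int byLibrary(int a, int b) {
    return Integer.compare(a, b);
  }

  public static void main(String[] args) {
    System.out.println(bySubtraction(3, 8) < 0);                 // true
    System.out.println(byLibrary(Integer.MIN_VALUE, 1) < 0);     // true
    System.out.println(bySubtraction(Integer.MIN_VALUE, 1) < 0); // false: the difference wraps around
  }
}
```

Subtraction remains a convenient shortcut for values with a known small range, such as ages or array indices.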

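There are a few properties that are required of the compareTo() definition to ensure that it behaves like a proper ordering. The contract in the Java documentation asks that, for all x, y, and z: the sign of x.compareTo(y) is the opposite of the sign of y.compareTo(x); if x.compareTo(y) > 0 and y.compareTo(z) > 0, then x.compareTo(z) > 0; and if x.compareTo(y) == 0, then x.compareTo(z) and y.compareTo(z) have the same sign.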

The String class implements Comparable<String> using lexicographic (or alphabetical) order. Thus,

"hello".compareTo("goodbye")
would return a positive integer since “hello” is alphabetically after “goodbye” (it appears later in a dictionary).
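
A few more comparisons are sketched below; `StringOrderDemo` and `order()` are hypothetical names. One detail worth knowing is that String.compareTo() compares character codes, so every uppercase letter sorts before every lowercase letter, which differs from a printed dictionary.

```java
/** Stand-alone sketch of `String.compareTo()` on a few sample inputs. */
class StringOrderDemo {
  /** Returns -1, 0, or 1 according to the sign of `a.compareTo(b)`. */
  static int order(String a, String b) {
    return Integer.signum(a.compareTo(b));
  }

  public static void main(String[] args) {
    System.out.println(order("hello", "goodbye")); // 1: 'h' comes after 'g'
    System.out.println(order("apple", "apple"));   // 0: equal strings
    System.out.println(order("car", "carpet"));    // -1: a proper prefix comes first
    System.out.println(order("Zebra", "apple"));   // -1: 'Z' has a smaller code than 'a'
  }
}
```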

As another example, recall that earlier in the course, we defined a Point record class that modeled a point in the 2D Cartesian plane with double coordinates.

/** An immutable point in the 2D coordinate plane with `double` coordinates. */
public record Point(double x, double y) { ... }

Suppose that we want to compare Points based on their distance from the origin, where Points that are farther from the origin are “greater than” Points that are closer to the origin. We can achieve this by having our Point record class implement the Comparable interface with type parameter Point and defining the compareTo() method as follows.

/** An immutable point in the 2D coordinate plane with `double` coordinates. */
public record Point(double x, double y) implements Comparable<Point> {

  /** 
   * Compares `this` and `other` based on their distance from the origin, returning
   * a positive integer when `this` is farther from the origin, a negative integer
   * when `this` is closer to the origin, and 0 when `this` and `other` are 
   * equidistant from the origin. 
   */
  @Override
  public int compareTo(Point other) {
    double thisDistSquared = this.x * this.x + this.y * this.y;
    double otherDistSquared = other.x * other.x + other.y * other.y;
    return (int) Math.signum(thisDistSquared - otherDistSquared);
  }

  // ... additional methods
}

Here, the Math.signum() function returns 1.0 when its argument is positive, -1.0 when it is negative, and 0.0 when it is 0. Casting this result to int provides a nice way to convert a double into an int with the same sign.

Remark:

Technically, our definition can violate the required properties of compareTo() because of the imprecision of arithmetic operations on the floating-point representation of doubles. This issue falls beyond our scope, though.
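
One payoff of implementing Comparable is that library routines can use the ordering directly. The sketch below repeats the Point record inside a hypothetical wrapper class, `PointOrderDemo`, so that it compiles on its own, and sorts a few sample points with `Arrays.sort()`. The helper `sortedByDistance()` and the sample coordinates are assumptions made for illustration.

```java
import java.util.Arrays;

/** Stand-alone sketch; the nested record repeats the lecture's Point definition. */
class PointOrderDemo {
  record Point(double x, double y) implements Comparable<Point> {
    @Override
    public int compareTo(Point other) {
      double thisDistSquared = this.x * this.x + this.y * this.y;
      double otherDistSquared = other.x * other.x + other.y * other.y;
      return (int) Math.signum(thisDistSquared - otherDistSquared);
    }
  }

  /** Returns a copy of `points` sorted from nearest to farthest from the origin. */
  static Point[] sortedByDistance(Point... points) {
    Point[] copy = points.clone();
    Arrays.sort(copy); // relies on Point.compareTo()
    return copy;
  }

  public static void main(String[] args) {
    Point[] sorted = sortedByDistance(new Point(3, 4), new Point(1, 1), new Point(0, -2));
    // Squared distances are 25, 2, and 4, so (1, 1) is first and (3, 4) is last.
    System.out.println(Arrays.toString(sorted));
  }
}
```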

Generic Type Bounds

To enforce the BST order invariant, we’d like to ensure that the elements that we store in the BST are from a type that implements the Comparable interface. The way to achieve this in Java is with a generic type bound.

Definition: Generic Type Bound

A generic type bound is a constraint that the implementer can impose on a type parameter of their generic class which requires that it is a subtype or supertype of another type. Generic type bounds are statically checked by the compiler whenever their class is instantiated.

In this case, we’ll be defining a generic BST<T> class, a subclass of BinaryTree<T> from last lecture, that stores elements of type T in each of its nodes. We want to require that T is a subtype of Comparable<T> (which will allow us to compare an object of type T to another object of type T). Thus, T will be (a subclass of) a class that implements the Comparable interface. We add a type bound within the class declaration using the extends keyword within the angle brackets of the generic type declaration:

BST.java

/**
 * A binary search tree that can store any `Comparable` elements.
 */
public class BST<T extends Comparable<T>> extends BinaryTree<T> { ... }

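The syntax <T extends Comparable<T>> enforces that T is a subtype of Comparable<T>. Java also has a super keyword for supertype bounds, but it may only be used with wildcards; for example, the wildcard type <? super U> stands for some supertype of U. A declared type parameter such as T can only be given subtype bounds with extends.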

By imposing this subtype bound within the BST class, we are guaranteeing that objects of type T have a compareTo() method definition. Therefore, we may call compareTo() on these objects without needing a cast, as we will have satisfied the compile-time reference rule. Since generic type bounds are declared statically, they are something that can be enforced by the compiler. If our code does not satisfy a declared type bound, the compiler will report an error. For example, suppose we try to declare a BST<CheckingAccount> for the CheckingAccount class that we wrote a few lectures ago. CheckingAccount does not implement Comparable (we didn’t even know Comparable existed when we wrote this class), so we have violated the BST generic type bound. If we try to compile code with this declaration, we receive the error message:

Type parameter 'CheckingAccount' is not within its bound; should implement 
'java.lang.Comparable'
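
The same kind of bound can be placed on a generic method. As a minimal sketch (the class `BoundDemo` and its `max()` method are hypothetical, not part of the lecture code), the bound is what makes the call to compareTo() legal inside the method body:

```java
/** Stand-alone sketch of a generic method with a subtype bound. */
class BoundDemo {
  /** Returns the larger of `a` and `b` (or `a` if they compare as equal). */
  static <T extends Comparable<T>> T max(T a, T b) {
    return (a.compareTo(b) >= 0) ? a : b; // legal only because of the bound on T
  }

  public static void main(String[] args) {
    System.out.println(max(3, 8));             // 8
    System.out.println(max("hello", "goodbye")); // hello
    // max(new Object(), new Object());        // rejected by the compiler: Object is not Comparable
  }
}
```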

We can use the Comparable interface and a generic type bound to expand the static sorting methods that we wrote earlier in the course to work for any Comparable type. For example, the following code is a generic implementation of insertion sort.

/** Sorts the entries of `a` using the insertion sort algorithm. */
static <T extends Comparable<T>> void insertionSort(T[] a) {
  /* Loop invariant: a[..i) is sorted, a[i..] are unchanged. */
  for (int i = 0; i < a.length; i++) {
    insert(a, i);
  }
}

/**
 * Inserts entry `a[i]` into its sorted position in `a[..i)` such that `a[..i]` 
 * contains the same entries in sorted order. Requires that `0 <= i < a.length` 
 * and `a[..i)` is sorted.
 */
static <T extends Comparable<T>> void insert(T[] a, int i) {
  assert 0 <= i && i < a.length; // defensive programming
  int j = i;
  while (j > 0 && a[j - 1].compareTo(a[j]) > 0) {
    swap(a, j - 1, j);
    j--;
  }
}

/**
 * Swaps the entries `a[i]` and `a[j]`. Requires that `0 <= i < a.length` and 
 * `0 <= j < a.length`.
 */
static <T> void swap(T[] a, int i, int j) {
    T temp = a[i];
    a[i] = a[j];
    a[j] = temp;
}

We must include the type bounds in the generic type declarations for the static methods insert() and insertionSort(), since the loop guard within insert() calls the compareTo() method. The swap() method never compares elements, so its type parameter T needs no bound.
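
When testing a generic sort like this one, a generic check of the postcondition is handy. The helper below is a sketch under that assumption; `SortedCheckDemo` and `isSorted()` are hypothetical names and are not part of the lecture code.

```java
/** Stand-alone sketch of a generic "is this array sorted?" check. */
class SortedCheckDemo {
  /** Returns whether the entries of `a` are in non-decreasing order. */
  static <T extends Comparable<T>> boolean isSorted(T[] a) {
    for (int i = 1; i < a.length; i++) {
      if (a[i - 1].compareTo(a[i]) > 0) { // an adjacent pair is out of order
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isSorted(new String[] {"ant", "bee", "cat"})); // true
    System.out.println(isSorted(new Integer[] {3, 1, 2}));            // false
  }
}
```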

The Comparator Interface

As a more flexible alternative to the Comparable interface, Java provides a Comparator interface that also compares objects of a particular type. While we can view Comparable as an internal comparison mechanism (a Comparable object knows how to compare itself to another object of its type), a Comparator is an external comparison mechanism. A Comparator is a separate object that takes in two instances of a particular type and compare()s them. It is an interface that also declares a single required method.

Comparator.java

/**
 * Returns an integer whose sign designates the direction of the comparison 
 * between `o1` and `o2`. A negative integer is returned when `o1` is 
 * "less than" `o2`, a positive integer is returned when `o1` is "greater 
 * than" `o2`, and 0 is returned if `o1` and `o2` are equivalent under 
 * this comparison.
 */
int compare(T o1, T o2);

The same consistency properties are required of Comparators as Comparables. We can use a Comparator to define our own notion of order for a type. For example, we may be unhappy that Points are ordered by distance from the origin and instead wish to order them lexicographically (i.e., ordered using their first coordinate, with ties broken by their second coordinate). We can define a custom comparator PointLex to model this ordering.

PointLex.java

import java.util.Comparator;

/** A `Comparator` that models a lexicographic ordering of `Point`s. */
public class PointLex implements Comparator<Point> {
  /**
   * Compares `o1` and `o2` lexicographically, returning a negative integer when `o1` is
   * lexicographically earlier, a positive integer when `o2` is lexicographically earlier, 
   * and 0 when `o1` and `o2` are equal.
   */
  @Override
  public int compare(Point o1, Point o2) {
    int dx = (int) Math.signum(o1.x() - o2.x());
    return dx == 0 ? (int) Math.signum(o1.y() - o2.y()) : dx;
  }
}
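
A Comparator is typically handed to whatever code needs the ordering. The sketch below passes one to `Arrays.sort()`. To stay self-contained it declares its own small Point record and, instead of a named class, builds the lexicographic order from the standard `Comparator.comparingDouble()` and `thenComparingDouble()` factory methods; the names `ComparatorUseDemo`, `LEX`, and `sortedLex()` are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Stand-alone sketch of sorting with an external comparison. */
class ComparatorUseDemo {
  record Point(double x, double y) {}

  /** Orders by x-coordinate, breaking ties by y-coordinate. */
  static final Comparator<Point> LEX =
      Comparator.comparingDouble(Point::x).thenComparingDouble(Point::y);

  /** Returns a copy of `points` sorted lexicographically. */
  static Point[] sortedLex(Point... points) {
    Point[] copy = points.clone();
    Arrays.sort(copy, LEX); // the Comparator is supplied as a separate argument
    return copy;
  }

  public static void main(String[] args) {
    Point[] sorted = sortedLex(new Point(2, 1), new Point(1, 5), new Point(1, 2));
    System.out.println(Arrays.toString(sorted)); // (1, 2) first, then (1, 5), then (2, 1)
  }
}
```

Note that this Point does not implement Comparable at all; the ordering lives entirely in the Comparator.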

As a separate object, a Comparator will typically need to be stored as a field so that its compare() method can be accessed from within a class. A BST implementation that utilizes a Comparator would typically accept it as an argument to its constructor. Since the approach leveraging the Comparable interface is more straightforward, we will stick to it for this lecture. However, we’ve included an alternate BST implementation using a Comparator, named BSTComparator, in the lecture release code to model this design pattern.
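
To give a feel for the pattern, here is a deliberately simplified sketch of a tree that stores a Comparator in a field. It is not the BSTComparator class from the release code: the name `ComparatorTree`, its fields, and its naive recursive add() (which always sends duplicates to the right) are assumptions made for illustration.

```java
import java.util.Comparator;

/** Hypothetical sketch of a search tree ordered by a client-supplied Comparator. */
class ComparatorTree<T> {
  private final Comparator<T> cmp; // supplied once, at construction
  private T root;                  // null when this tree is empty
  private ComparatorTree<T> left;
  private ComparatorTree<T> right;

  ComparatorTree(Comparator<T> cmp) {
    this.cmp = cmp;
  }

  /** Adds `elem`, placing duplicates in the right subtree. */
  void add(T elem) {
    if (root == null) {
      root = elem;
      left = new ComparatorTree<>(cmp);  // children share the same ordering
      right = new ComparatorTree<>(cmp);
    } else if (cmp.compare(elem, root) < 0) {
      left.add(elem);
    } else {
      right.add(elem);
    }
  }

  /** Returns whether some stored element is equivalent to `elem` under `cmp`. */
  boolean contains(T elem) {
    if (root == null) {
      return false;
    }
    int c = cmp.compare(elem, root);
    if (c == 0) {
      return true;
    }
    return (c < 0) ? left.contains(elem) : right.contains(elem);
  }

  public static void main(String[] args) {
    ComparatorTree<String> t = new ComparatorTree<>(String.CASE_INSENSITIVE_ORDER);
    t.add("maple");
    t.add("Oak");
    System.out.println(t.contains("OAK")); // true: equivalent under the supplied ordering
  }
}
```

Because the ordering comes from the Comparator, the type parameter T needs no bound in this design.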

The BST Class

Now, we have all of the tools that we will need to implement a binary search tree class, BST. This class will extend the BinaryTree class that we developed in the previous lecture, so it will inherit methods such as size() and height() as well as support for printing and iterators. It will also include new methods that will allow a user to add, remove, and check for membership of elements within the tree.

Representing State

Let’s start with deciding how we will represent the state of the tree. As we discussed last lecture, we’ll want our BST class to have left and right fields with static type BST to enable us to define methods recursively. There is an added complication for our BST that we did not face with our ImmutableBinaryTree class. A BST is a mutable data structure; elements can be dynamically add()ed and remove()d from it. If all of the elements are remove()d (or if none have been add()ed), we’ll need a way to represent an empty tree. We’ll let a BST with a root = null represent this. It will also be convenient to adopt the following invariant in our design.

Every subtree in our BST is either empty (so has root = null, left = null, and right = null) or it has two non-null (but possibly empty) subtrees.

In other words, every leaf node corresponds to a subtree with two empty child subtrees, rather than having left = null and right = null. This is similar to our inclusion of an “empty” node at the end of our SinglyLinkedList implementation. We’ll see that this simplifies the definition of our BST methods. We’ll visualize these empty trees with unlabeled, dashed circles in our node diagrams.

Remark:

It may appear that this choice to add lots of empty subtrees at the bottom of our tree can greatly increase the amount of memory required to represent a BST. This is true. The amount of memory required will roughly double. In many cases, we're willing to incur this additional memory usage (which does not affect the space complexity of the BST) to simplify the definition of the operations. In very memory-constrained applications, one can select an alternative representation that does not create all of these empty trees. See Exercise 17.7 for more details.

This leads to the following specification of the BST fields.

BST.java

/** A binary search tree that can store any `Comparable` elements */
public class BST<T extends Comparable<T>> extends BinaryTree<T> {

  /**
    * The left subtree of this BST. All elements in `left` must be "<=" `root`.
    * Must be `null` if `root == null` and `!= null` if `root != null`.
    */
  protected BST<T> left;

  /**
    * The right subtree of this BST. All elements in `right` must be ">=" `root`.
    * Must be `null` if `root == null` and `!= null` if `root != null`.
    */
  protected BST<T> right;
}

The BST constructor should initialize the tree to be empty.

BST.java

/** Constructs an empty BST. */
public BST() {
  root = null;
  left = null;
  right = null;
}

To ensure the proper functionality of the iterators and other methods in our BinaryTree class, we’ll have the left() and right() methods return null rather than a reference to an empty subtree; this is another example where having methods left() and right() rather than inherited fields allows us to add a useful layer of encapsulation.

BST.java

@Override
protected BinaryTree<T> left() {
  // `left` itself is `null` when this BST is empty
  return (left == null || left.root == null) ? null : left;
}

@Override
protected BinaryTree<T> right() {
  // `right` itself is `null` when this BST is empty
  return (right == null || right.root == null) ? null : right;
}

We’ll also need to override the size() and height() methods to account for the possibility of an empty subtree. By convention, an empty subtree has height -1 so that adding an element gives a tree with height 0.

BST.java

/** Returns the number of elements stored in this BST. An empty BST stores 0 elements. */
@Override
public int size() {
    return (root == null) ? 0 : super.size();
}

/** Returns the height of this BST. An empty BST has height -1. */
@Override
public int height() {
  return (root == null) ? -1 : super.height();
}
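
To see why -1 is a convenient height for the empty tree, consider the usual recursive definitions written directly against the empty-subtree representation. The sketch below uses a hypothetical `MiniTree` class that stores Integers and does not extend BinaryTree, so its size() and height() are written out in full rather than delegating to a superclass.

```java
/** Hypothetical mini tree that mirrors the empty-subtree representation. */
class MiniTree {
  Integer root;   // null when this tree is empty
  MiniTree left;  // null exactly when root is null
  MiniTree right; // null exactly when root is null

  /** Returns a non-empty tree storing `value` with two empty children. */
  static MiniTree leaf(int value) {
    MiniTree t = new MiniTree();
    t.root = value;
    t.left = new MiniTree();
    t.right = new MiniTree();
    return t;
  }

  /** An empty tree stores 0 elements. */
  int size() {
    return (root == null) ? 0 : 1 + left.size() + right.size();
  }

  /** An empty tree has height -1, so a single element has height 1 + (-1) = 0. */
  int height() {
    return (root == null) ? -1 : 1 + Math.max(left.height(), right.height());
  }

  public static void main(String[] args) {
    MiniTree t = leaf(5);
    t.left = leaf(3);
    System.out.println(new MiniTree().height()); // -1
    System.out.println(leaf(5).height());        // 0
    System.out.println(t.size() + " " + t.height()); // 2 1
  }
}
```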

With all of this set-up out of the way, we are ready to add support for the three additional methods of a BST’s client interface, contains(), add(), and remove().

The find() Helper Method

A common subroutine for the add(), contains(), and remove() methods is the navigation down the binary tree that we saw in the earlier animation. When we add() a new element to the BST, we need to determine where in the tree it should be added. To check whether a BST contains() a node, we need to navigate to the location in the tree where this node would be. Similarly, we need to locate the node containing a particular element in order to remove() it from the tree. We can extract this subroutine into a private helper method, find() with the following specification.

BST.java

/**
 * Locates and returns a subtree whose root is `elem`, or the leaf child where
 * `elem` would be located if `elem` is not in this BST. Requires that 
 * `elem != null`.
 */
private BST<T> find(T elem) { ... }

Take some time to develop a recursive implementation of the find() method, taking inspiration from the reasoning in the above animation. Remember that T can be any Comparable<T> reference type, so you’ll need to make use of the compareTo() method. You may assume that our types have the property that x.compareTo(y) == 0 only when x.equals(y).

find() definition

BST.java

/**
 * Locates and returns a subtree whose root is `elem`, or the leaf child where
 * `elem` would be located if `elem` is not in this BST. Requires that 
 * `elem != null`.
 */
private BST<T> find(T elem) {
  assert elem != null; // defensive programming
  if (root == null || elem.compareTo(root) == 0) {
    return this;
  } else if (elem.compareTo(root) < 0) { // `elem < root`, recurse left
    return left.find(elem);
  } else { // `elem > root`, recurse right
    return right.find(elem);
  }
}

If either root == null (in which case our search has reached an empty tree, where the element belongs as its root) or elem.compareTo(root) == 0 (in which case we have found the subtree rooted at the element we are looking for), we can return this, the current subtree we are considering. Otherwise, we need to continue our search in one of the two child subtrees of this. We determine which subtree by comparing the target elem with root. In the case that the target elem is "less than" the root (which is true when elem.compareTo(root) < 0), we must continue our search in the left subtree by making the recursive call left.find(elem) and returning its result. Otherwise, if the target elem is "greater than" the root (which is true when elem.compareTo(root) > 0), we must continue our search in the right subtree by making the recursive call right.find(elem) and returning its result.
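
For comparison, the same navigation can be written as a loop. The sketch below is an alternative formulation, not the lecture's definition: `FindSketch` is a hypothetical stand-alone class with the same three fields, and `leaf()` is a helper added only so that a small tree can be built by hand. It also calls compareTo() once per subtree and reuses the result.

```java
/** Hypothetical stand-alone class illustrating an iterative `find()`. */
class FindSketch<T extends Comparable<T>> {
  T root;              // null when this subtree is empty
  FindSketch<T> left;  // non-null whenever root is non-null
  FindSketch<T> right; // non-null whenever root is non-null

  /** Returns a non-empty subtree storing `elem` with two empty children. */
  static <T extends Comparable<T>> FindSketch<T> leaf(T elem) {
    FindSketch<T> t = new FindSketch<>();
    t.root = elem;
    t.left = new FindSketch<>();
    t.right = new FindSketch<>();
    return t;
  }

  /**
   * Returns a subtree whose root is `elem`, or the empty subtree where `elem`
   * would be located if it is not present. Requires that `elem != null`.
   */
  FindSketch<T> find(T elem) {
    assert elem != null;
    FindSketch<T> current = this;
    while (current.root != null) {
      int c = elem.compareTo(current.root); // compare once, then branch on the sign
      if (c == 0) {
        return current;
      }
      current = (c < 0) ? current.left : current.right;
    }
    return current; // an empty subtree: this is where `elem` belongs
  }

  public static void main(String[] args) {
    FindSketch<Integer> t = leaf(10);
    t.left = leaf(4);
    t.right = leaf(15);
    System.out.println(t.find(15).root);           // 15
    System.out.println(t.find(7).root);            // null
    System.out.println(t.find(7) == t.left.right); // true
  }
}
```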

From its specification, we see that we can use the find() method to give a simple definition of a contains() method. In particular, the find() method returns an empty subtree (that is, a subtree with root == null) exactly when elem is not present in the tree. Therefore, we may define

BST.java

/**
 * Returns whether this BST contains `elem`. Requires that `elem != null`.
 */
public boolean contains(T elem) {
  assert elem != null;
  return find(elem).root != null;
}

Adding Elements

Next, let’s complete the definition of the add() method, which is used to insert a new element into the binary tree.

/**
 * Adds the given `elem` into this BST at a location that respects the BST order 
 * invariant. Requires that `elem != null`.
 */
public void add(T elem) { ... }

When we add a new element to the tree, we will always want to do this at one of the empty subtrees below a leaf, as this will prevent us from needing to change any of the existing elements. Therefore, we can think of the body of the add() method as two steps:

  1. Locate an appropriate empty subtree where elem can be added. We’ll store a reference to this empty subtree in a local variable BST<T> loc.
  2. Update the tree referenced by loc to store elem at its root.

The second step is the easier of the two, so let’s handle this first. We can store elem in the root of the loc subtree by re-assigning loc.root = elem. When we do this, we must restore the class invariant, which says that any non-empty subtree must have non-null left and right subtrees. We must construct two new empty subtrees to serve as these children, assigning loc.left = new BST<>() and loc.right = new BST<>().
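
The second step, together with a check of the invariant it must restore, can be sketched as follows. The class `AddStepSketch` and the methods `fill()` and `shapeInvariantHolds()` are hypothetical stand-ins, not part of the lecture's BST class.

```java
/** Hypothetical stand-alone class illustrating step 2 of `add()`. */
class AddStepSketch<T extends Comparable<T>> {
  T root;                 // null when this subtree is empty
  AddStepSketch<T> left;  // null exactly when root is null
  AddStepSketch<T> right; // null exactly when root is null

  /** Stores `elem` at the root of the empty subtree `loc`. */
  static <T extends Comparable<T>> void fill(AddStepSketch<T> loc, T elem) {
    assert loc.root == null; // step 1 must have located an empty subtree
    loc.root = elem;
    loc.left = new AddStepSketch<>();  // restore the invariant: a non-empty
    loc.right = new AddStepSketch<>(); // subtree has two non-null children
  }

  /** Returns whether every subtree is empty, or non-empty with two non-null children. */
  boolean shapeInvariantHolds() {
    if (root == null) {
      return left == null && right == null;
    }
    return left != null && right != null
        && left.shapeInvariantHolds() && right.shapeInvariantHolds();
  }

  public static void main(String[] args) {
    AddStepSketch<Integer> t = new AddStepSketch<>();
    fill(t, 8);
    System.out.println(t.shapeInvariantHolds()); // true
  }
}
```

Forgetting either of the two child assignments would leave a non-empty subtree with a null child, which the check above reports as a violation.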

For the first step, we can try to locate an appropriate spot for elem using our find() helper method. There are a few different cases that we must consider based on the return value of find().

Case 1: find(elem) returns an empty subtree (i.e., find().root == null)

From the find() specification, we see that elem must not already be in this tree, and the empty subtree that was returned is a child subtree of a leaf node where we can insert elem to obey the BST order invariant. For example, if we call add(5) on the BST<Integer> shown below, find(5) will return the red-shaded empty subtree. Step through the animation to see why.

This is the best case scenario, since we may directly assign loc = find(elem).

Case 2: find(elem) returns a non-empty subtree (i.e., find().root != null)

In this case, the find() specification tells us that the returned subtree has elem at its root, so we are adding another instance of elem to the tree. We’ll need to find an empty subtree lower down where we can put it. The BST order invariant allows this duplicate element to be in either subtree. We’ll (arbitrarily) choose to add it on the right. There are a further two sub-cases that we must consider.

Case 2a: The subtree returned by find(elem) has an empty right child

For example, if we call add(8) on the subtree shown below, find(8) will return the red shaded node, which falls into this case.

We can safely add elem as the right child of the node returned by find(elem), that is, set loc = find(elem).right.

Case 2b: The subtree returned by find(elem) has a non-empty right child

For example, if we call add(12) on the subtree shown below, find(12) will return the red shaded node, which falls into this case.

We no longer have the option to make the new 12 the right child of the old 12; 17 is already there. Instead, we’ll need to locate a spot in (the old) 12’s right subtree where the new 12 should go. By the BST order invariant, every element in this subtree is \(\geq 12\), meaning 12 is the smallest possible element that can be in this subtree. Thus, it should become the subtree’s leftmost element. We can achieve this by finding the current leftmost element in the subtree and making our new 12 the left child of this element (notice that this leftmost element must have an empty left subtree, otherwise it would not be the leftmost element).

We’ll call this leftmost element in the right subtree of (the old) 12 the successor of 12, since it is the next element after 12 that will be visited in an in-order traversal of this tree. We can write a private helper method successorDescendant() to locate the subtree rooted at this successor with the following specification.

BST.java

/**
 * Returns the node in this subtree that comes immediately after its root in
 * an in-order traversal. Requires that `right.root != null`, so this BST has
 * a non-empty right subtree.
 */
private BST<T> successorDescendant() { ... }

Take some time to complete the definition of this method. As its specification references an in-order traversal, its logic will be very similar to the cascadeLeft() method of our InorderIterator class.

successorDescendant() definition

BST.java

/**
 * Returns the node in this subtree that comes immediately after its root in
 * an in-order traversal. Requires that `right.root != null`, so this BST has
 * a non-empty right subtree.
 */
private BST<T> successorDescendant() {
  assert right.root != null; // defensive programming
  BST<T> current = right;
  // follow left links until we reach a node whose left subtree is empty
  while (current.left.root != null) {
    current = current.left;
  }
  return current;
}

We can locate the leftmost node in the right subtree by starting at right and repeatedly following left links. Here, we give an iterative solution that uses the local variable current to store our current location in the tree. We initialize current to right and stop the traversal at the first node whose left subtree is empty (i.e., current.left.root == null).

We can safely add elem as the left child of this successor, that is, set loc = find(elem).successorDescendant().left. In our example from above, the successorDescendant() of 12 is the subtree rooted at 15, so we’ll add (the new) 12 as its left child.

Putting these cases together, we arrive at the following definition for add().

BST.java

/**
 * Adds the given `elem` into this BST at a location that respects the BST order
 * invariant. Requires that `elem != null`.
 */
public void add(T elem) {
  assert elem != null; // defensive programming
  BST<T> loc = find(elem);
  if (loc.root != null) { // `elem` is already in the BST
    if (loc.right.root == null) {
      // `elem` doesn't have a right subtree; make new `elem` its right child.
      loc = loc.right;
    } else {
      // `elem` has a right subtree; make new `elem` the left child of its successor.
      loc = loc.successorDescendant().left;
    }
  }
  loc.root = elem;
  loc.left = new BST<>();
  loc.right = new BST<>();
}
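
To see the three insertion cases in action, here is a minimal, self-contained sketch. The class name SketchBST, the body of find(), and the inorder() helper are assumptions made for illustration (the notes specify find() elsewhere); only add() and successorDescendant() mirror the definitions above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the lecture's BST class, just enough to exercise add(). */
class SketchBST<T extends Comparable<T>> {
  T root;              // null exactly when this (sub)tree is empty
  SketchBST<T> left;   // non-null whenever root != null
  SketchBST<T> right;  // non-null whenever root != null

  /** Assumed find(): the subtree rooted at `elem`, or the empty subtree where it belongs. */
  SketchBST<T> find(T elem) {
    if (root == null) return this;
    int cmp = elem.compareTo(root);
    if (cmp == 0) return this;
    return cmp < 0 ? left.find(elem) : right.find(elem);
  }

  /** Leftmost node of the (non-empty) right subtree, as defined in the notes. */
  SketchBST<T> successorDescendant() {
    SketchBST<T> current = right;
    while (current.left.root != null) current = current.left;
    return current;
  }

  /** add() with the same case analysis as in the notes. */
  void add(T elem) {
    SketchBST<T> loc = find(elem);
    if (loc.root != null) { // duplicate: Case 2a or Case 2b
      loc = (loc.right.root == null) ? loc.right : loc.successorDescendant().left;
    }
    loc.root = elem;
    loc.left = new SketchBST<>();
    loc.right = new SketchBST<>();
  }

  /** Hypothetical helper: appends the elements to `out` in in-order sequence. */
  void inorder(List<T> out) {
    if (root == null) return;
    left.inorder(out);
    out.add(root);
    right.inorder(out);
  }
}

class AddDemo {
  public static void main(String[] args) {
    SketchBST<Integer> tree = new SketchBST<>();
    tree.add(12); // Case 1: empty tree
    tree.add(17); // Case 1: right child of 12
    tree.add(15); // Case 1: left child of 17
    tree.add(12); // Case 2b: successor of 12 is 15, so the new 12 goes to its left
    tree.add(17); // Case 2a: 17 has an empty right child
    List<Integer> out = new ArrayList<>();
    tree.inorder(out);
    System.out.println(out); // [12, 12, 15, 17, 17]
  }
}
```

Because the in-order traversal is still sorted after both duplicate insertions, the BST order invariant has been preserved in each case.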

Removing Elements

Lastly, we’ll define a remove() method for our BST with the following specification.

BST.java

/**
 * Removes the given `elem` from this BST. Requires that `elem != null` and
 * `contains(elem) == true`.
 */
public void remove(T elem) { ... }

We can use our find() helper method to locate the subtree rooted at the node that we wish to remove, which we’ll denote by loc.

BST<T> loc = find(elem);

We’ll again consider two different cases.

Case 1: loc has an empty right subtree (i.e., loc.right.root == null)

As an example, this case arises when we remove(8) from the following tree, so loc is the red shaded node.

We can’t directly delete the 8 node from this tree, as this would also disconnect the 6 node. However, we can slide up the subtree rooted at 6 to take the place of the subtree rooted at 8.

This operation is safe only because 8’s right subtree is empty, so no connections to other nodes are lost when we slide up 6. We’ll call this “sliding up” operation a “supplant”, since you can imagine that we are cutting off a branch of the tree and moving it upward to replace the deleted node. We can extract this supplant subroutine into a private helper method, supplantWith().

BST.java

/**
 * Replaces this tree with the given `other` tree.
 */
private void supplantWith(BST<T> other) {
  root = other.root;
  left = other.left;
  right = other.right;
}

Then, our removal operation is carried out by supplanting loc with its left subtree, loc.supplantWith(loc.left). Pictorially, we can see that this maintains the BST order invariant.

By the BST order invariant in the original tree, \(a \leq b\). Therefore, all of the nodes in the subtree that are \(\leq a\) are also \(\leq b\), meaning the tree after the supplantWith() call also satisfies the BST order invariant.

Note that Case 1 handles the possibility that loc is a leaf. When this is the case, it will be supplanted by its left subtree, which is an empty tree. This has the effect of removing loc and leaving the rest of the tree unchanged.

Case 2: loc has a non-empty right subtree (i.e., loc.right.root != null)

Now, our trick of “sliding up” the left subtree will no longer work. This will cause us to lose track of the nodes in loc.right. We need to find another way to “fill the gap” in the tree that is left when we remove loc. One solution is to move another element to this position (without carrying out a more heavy-handed supplant operation). Which node can safely go in this position? This node’s in-order successor can; the BST order invariant ensures that it is “\(\geq\)” every node in loc’s left subtree and “\(\leq\)” every other node in loc’s right subtree.

After we copy loc’s successorDescendant() to reside at the root of loc, we must remove it from its original position. This may create another “gap” in the tree; however, this one is easier to fill. Earlier, we established that loc’s successorDescendant() has an empty left subtree. By reasoning symmetric to Case 1 (with the roles of left and right exchanged), we can therefore supplant it with its right subtree.

This BST removal case is the most intricate procedure on our BST (and likely the most involved reasoning that we have seen thus far in the course). The following animation steps through these ideas.

Putting these cases together, we arrive at the following definition for remove().

BST.java

/**
 * Removes the given `elem` from this BST. Requires that `elem != null` and
 * `contains(elem) == true`.
 */
public void remove(T elem) {
  assert contains(elem); // defensive programming
  BST<T> loc = find(elem);
  if (loc.right.root == null) { // `elem` has an empty right subtree
    loc.supplantWith(loc.left); // supplant it with its left subtree
  } else { // `elem` has a non-empty right subtree; replace it with its successor
    BST<T> successor = loc.successorDescendant();
    loc.root = successor.root; // copy up value in successor
    // successor has an empty left subtree
    successor.supplantWith(successor.right); // supplant successor with its right subtree
  }
}
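
The following self-contained sketch exercises both removal cases. As before, the class name RemovalSketchBST, the body of find(), the simplified add() (which assumes no duplicates), and the inorder() helper are assumptions for illustration; supplantWith(), successorDescendant(), and remove() follow the definitions above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the lecture's BST class, just enough to exercise remove(). */
class RemovalSketchBST<T extends Comparable<T>> {
  T root;                     // null exactly when this (sub)tree is empty
  RemovalSketchBST<T> left;   // non-null whenever root != null
  RemovalSketchBST<T> right;  // non-null whenever root != null

  /** Assumed find(): the subtree rooted at `elem`, or the empty subtree where it belongs. */
  RemovalSketchBST<T> find(T elem) {
    if (root == null) return this;
    int cmp = elem.compareTo(root);
    if (cmp == 0) return this;
    return cmp < 0 ? left.find(elem) : right.find(elem);
  }

  /** Simplified insertion for building example trees; assumes `elem` is not yet present. */
  void add(T elem) {
    RemovalSketchBST<T> loc = find(elem);
    loc.root = elem;
    loc.left = new RemovalSketchBST<>();
    loc.right = new RemovalSketchBST<>();
  }

  RemovalSketchBST<T> successorDescendant() {
    RemovalSketchBST<T> current = right;
    while (current.left.root != null) current = current.left;
    return current;
  }

  void supplantWith(RemovalSketchBST<T> other) {
    root = other.root;
    left = other.left;
    right = other.right;
  }

  /** remove() with the same case analysis as in the notes. */
  void remove(T elem) {
    RemovalSketchBST<T> loc = find(elem);
    assert loc.root != null; // `elem` must be present
    if (loc.right.root == null) {
      loc.supplantWith(loc.left); // Case 1: slide up the left subtree
    } else {
      RemovalSketchBST<T> successor = loc.successorDescendant();
      loc.root = successor.root; // Case 2: copy up the successor's value
      successor.supplantWith(successor.right);
    }
  }

  /** Hypothetical helper: the elements of this tree in in-order sequence. */
  List<T> inorder() {
    List<T> out = new ArrayList<>();
    if (root != null) {
      out.addAll(left.inorder());
      out.add(root);
      out.addAll(right.inorder());
    }
    return out;
  }
}

class RemovalDemo {
  public static void main(String[] args) {
    RemovalSketchBST<Integer> tree = new RemovalSketchBST<>();
    for (int x : new int[] {10, 5, 15, 12, 20, 13}) tree.add(x);
    tree.remove(10); // Case 2: successor 12 is copied up, then supplanted by 13
    System.out.println(tree.inorder()); // [5, 12, 13, 15, 20]
    tree.remove(5);  // Case 1: leaf 5 is supplanted by its empty left subtree
    System.out.println(tree.inorder()); // [12, 13, 15, 20]
  }
}
```

Note that remove(10) never unlinks the root node itself; it overwrites the root's value with 12 and then fills the gap left at 12's old position.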

Complexity Analysis

To complete our discussion of binary search trees, let’s consider the complexities of the operations that we defined. Throughout, we’ll let \(N\) represent the size of the BST and \(H\) represent its height. As we already noted above, our BST state representation requires memory that grows linearly in \(N\). Now, let’s consider the work done by each of the methods that we defined.

find(): This method is recursive, so we must reason both about the non-recursive work that is done by a single invocation and its recursive call structure. The work done in find() consists of checking whether the root node is null and comparing the elem to root at most twice. We’ll assume that this invocation of compareTo() runs in constant time (its runtime will not depend on the size of the BST), so the non-recursive work in find() has \(O(1)\) complexity. Notice that the recursive calls are always made on a subtree rooted one level deeper in the tree (we either call left.find() or right.find()), and we reach a base case if we ever reach an empty tree just below a leaf. Therefore, the total number of calls is \(O(H)\), so the time complexity of find() is \(O(H)\). Additionally, the allocation of the call frames results in an \(O(H)\) space complexity for find(). We can reduce the space complexity to \(O(1)\) by re-implementing find() using a while loop.

contains(): The work done in contains() is dominated by the call to find(). Therefore, contains() has \(O(H)\) time and space complexities.

successorDescendant(): This method performs a traversal down the left branches of its right subtree. The number of iterations of its while loop is upper bounded by the number of levels in the tree, and a constant amount of work is done outside of the loop and in each loop iteration. Therefore, the time complexity of successorDescendant() is \(O(H)\). Its space complexity is \(O(1)\) to account for its single local variable current. In this case, we attained a constant space complexity using an iterative implementation.

add(): The work done in add() is dominated by the call to find() and the potential call to successorDescendant(). Outside of this, we do a constant amount of work to check which case we are in and to actually add elem once we determine in which loc subtree it belongs. Therefore, add() has \(O(H)\) time and space complexities.

supplantWith(): This method consists of three assignment statements, so has \(O(1)\) time and space complexities.

remove(): Similar to add(), the work done in remove() is dominated by the call to find() and the potential call to successorDescendant(). Outside of this, we do a constant amount of work to check which case we are in and for the call to supplantWith(). Therefore, remove() has \(O(H)\) time and space complexities.

Note that all of the space complexities can be reduced to \(O(1)\) by avoiding recursion (see Exercise 17.9). We expressed all of these complexities in terms of the tree’s height \(H\). Can we also express this dependence in terms of the size \(N\)? Asked differently, what is the worst-case (upper) bound on the height of a binary tree expressed as a function of its size \(N\)? What situation would lead to this case? Take some time to think about it before revealing the answer below.

What is the worst-case height of a BST?

In the worst case, the height of a tree can grow linearly in its size when the tree has exactly one (non-empty) node per level. This situation can arise if the elements are added to the BST in sorted order. For example, if we added the elements \(1,2,3,4,5,6,7\) to a BST<Integer> in that order, we'd end up with the following picture:
Each time that we add() an element, it will become the right child of the previous leaf in the tree. Our BST *degenerates*, resembling a linked list.
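
The effect of insertion order on height can be checked with a small experiment. This is a sketch under stated assumptions: HeightDemoTree is a hypothetical, simplified integer-only tree (smaller elements go left, all others go right), not the BST class from these notes, and height() counts a single node as height 0 and an empty tree as height -1.

```java
/** Hypothetical, simplified integer BST used only to measure heights. */
class HeightDemoTree {
  Integer root;          // null exactly when this (sub)tree is empty
  HeightDemoTree left;
  HeightDemoTree right;

  /** Simplified insertion: smaller elements go left, all others go right. */
  void add(int elem) {
    if (root == null) {
      root = elem;
      left = new HeightDemoTree();
      right = new HeightDemoTree();
    } else if (elem < root) {
      left.add(elem);
    } else {
      right.add(elem);
    }
  }

  /** Height not counting empty subtrees: -1 if empty, 0 for a single node. */
  int height() {
    return root == null ? -1 : 1 + Math.max(left.height(), right.height());
  }

  /** Builds a tree by adding `elems` in the given order and returns its height. */
  static int heightAfterAdding(int... elems) {
    HeightDemoTree tree = new HeightDemoTree();
    for (int e : elems) tree.add(e);
    return tree.height();
  }

  public static void main(String[] args) {
    System.out.println(heightAfterAdding(1, 2, 3, 4, 5, 6, 7)); // 6: one node per level
    System.out.println(heightAfterAdding(4, 2, 6, 1, 3, 5, 7)); // 2: every level is full
  }
}
```

The same seven elements produce a height of \(N - 1 = 6\) when added in sorted order, but a height of only 2 when added in an order that fills each level.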

This is not so great, as it does not guarantee any performance improvement over a list. We motivated BSTs as a data structure that can achieve the \(O(N)\) to \(O(\log N )\) performance improvement that we get from binary search, so we’d like a way to formally guarantee this. Balanced binary search trees provide this guarantee.

Balanced Trees

What is the best-case (lower) bound we can derive for the height \(H\) of a binary tree in terms of its size \(N\)? This bound is achieved when every level of the tree (except possibly the lowest) is full, meaning it contains as many nodes as possible. One example of such a tree is the BST we get from adding \(4,2,6,1,3,5,7\) in that order:

A binary tree with height \(H\) (not counting the empty subtrees) will have 1 node at depth 0, at most 2 nodes at depth 1, at most 4 nodes at depth 2, etc. This pattern of doubling continues until the tree has at most \(2^H\) nodes at depth \(H\). In total, we may bound

\[ N = \sum_{d=0}^{H} \#\,\textrm{nodes at depth } d \leq \sum_{d=0}^{H} 2^d = 2^{H+1} - 1. \]

Rearranging this, we find that \(H \geq \log_2\big( \frac{N+1}{2}\big)\), or \(H = \Omega(\log N)\). Here, “big Omega” notation expresses an asymptotic lower bound (just as the “big O” notation we have used throughout the course expresses an asymptotic upper bound). In other words, the height of a binary tree grows at least logarithmically in its size. A BST whose height is guaranteed to match this lower bound asymptotically is said to be balanced.
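
As a quick sanity check of this bound (a worked example added for illustration, with the sizes \(N = 7\) and \(N = 10^6\) chosen arbitrarily), we can plug in concrete values:

\[ N = 7: \quad H \geq \log_2\big( \tfrac{7+1}{2} \big) = \log_2 4 = 2, \]

\[ N = 10^6: \quad H \geq \log_2\big( \tfrac{10^6+1}{2} \big) \approx 18.93, \quad \textrm{so } H \geq 19. \]

The first line agrees with the seven-element tree above, whose height is exactly 2. The second shows that even a tree with a million elements can, in the best case, be searched by visiting about 20 nodes, since \(2^{19+1} - 1 = 1{,}048{,}575 \geq 10^6\).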

Definition: Balanced Binary Search Tree

A binary search tree of size \(N\) is balanced if its height is guaranteed to be \(O(\log N)\).

A self-balancing binary search tree does extra work to make sure that the balance invariant is maintained after each call to add() or remove(). To maintain this invariant, some elements may need to be moved to other locations in the tree (in a way that also preserves the BST order invariant). Doing this efficiently often requires storing auxiliary information within each of the nodes to help keep track of the state of the tree. There are different approaches for self-balancing BSTs, including AVL trees and red-black trees (which are discussed in detail in CS 3110). Both of these approaches perform only \(O(1)\) work per level during their re-balancing steps, so they can achieve an \(O(\log N)\) time complexity for all of their methods.
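
For a concrete point of comparison, the Java standard library provides java.util.TreeSet, which is backed by java.util.TreeMap; the TreeMap documentation describes it as a red-black tree implementation. The sketch below (the class name BalancedLibraryDemo and the helper sortedInsert() are hypothetical) adds elements in sorted order, which is the worst case for our unbalanced BST. Unlike the BST in these notes, a TreeSet has set semantics and does not store duplicates.

```java
import java.util.TreeSet;

/** Hypothetical demo of a library-provided self-balancing search tree. */
class BalancedLibraryDemo {
  /** Adds 1..n in sorted order, the worst-case order for an unbalanced BST. */
  static TreeSet<Integer> sortedInsert(int n) {
    TreeSet<Integer> set = new TreeSet<>();
    for (int i = 1; i <= n; i++) {
      set.add(i);
    }
    return set;
  }

  public static void main(String[] args) {
    TreeSet<Integer> set = sortedInsert(7);
    System.out.println(set.contains(5));  // true
    System.out.println(set.first());      // 1 (smallest element)
    System.out.println(set.last());       // 7 (largest element)
    System.out.println(set.add(5));       // false: duplicates are not stored
  }
}
```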

Main Takeaways:

  • A binary search tree is a binary tree with an additional order invariant: for any node \(v\) in a BST, all nodes in its left subtree are "\(\leq\)" \(v\) and all nodes in its right subtree are "\(\geq\)" \(v\).
  • Comparable and Comparator are two Java interfaces that allow us to define an ordering for a reference type. We can use generic type bounds to enforce that the entries in a BST are Comparable.
  • The BST order invariant enables us to traverse down a single path through the tree when looking for an element, rather than branching out along multiple paths. This reduces the runtime of its methods from linear in the tree's size to linear in its height.
  • A balanced binary tree has a height that is logarithmic in its size. Clever data structures can simultaneously preserve the balance and order invariants, resulting in BSTs with great performance.

Exercises

Exercise 17.1: Check Your Understanding
(a)
How would you write the statement "if String a comes before String b" in Java?
Consider the following binary search tree, where node \(x\) is the right child of the node with value \( 4 \).
(b)
Which of the following could be the value of \(x\)?
(c)
Suppose that the integers in the range \([1..n]\) were randomly inserted into both a BST and a SinglyLinkedList in the same order. Based on the implementations of these classes in the lecture notes, which of the following is true?
Exercise 17.2: Comparing Dates
Many programs need to reason about when things happen with dates. When we start representing dates using just three integers, year, month, and day, we quickly run into the practical question of how to determine an earlier date.
/** A date, comprising a year, month, and day. */
public class Date implements Comparable<Date> {
  /** The year. Requires `year >= 0`. */
  private int year;

  /** The month. Requires `1 <= month <= 12`. */
  private int month;

  /** The day. Requires that `day` is at least 1 and at most the number of days in `month` of `year`. */
  private int day;

  /**
   * Compares this Date object with another in chronological order. The comparison is
   * performed first by year, then by month, and finally by day. A date that occurs
   * earlier in time is considered "less than" one that occurs later.
   */
  @Override
  public int compareTo(Date other) { ... }
}
(a)
Implement compareTo().
(b)

Making Date comparable handles common cases of comparison, but what if clients want different ways to sort Dates? In this case, we allow the client to pass in a Comparator<Date>.

/** Sorts `dates` based on `cmp`. */
static void sort(CS2110List<Date> dates, Comparator<Date> cmp) { ... }
Invoke this method to sort CS2110List<Date> arr in reverse chronological order. Note that Comparator is a functional interface, so you can use a lambda expression for the second parameter.
(c)
Sometimes we want to sort people by their birthdays, but the year they were born doesn’t matter. In that case, we need a way to order Date objects by month and day only. Invoke the above method to sort CS2110List<Date> birthdays in this way.
Exercise 17.3: insertionSort()’s Type Bound
Notice that insertionSort() does not use compareTo(). Why must we still enforce the type bound on T?
Exercise 17.4: Recasting Sorting Algorithms
We showed how to modify insertion sort to sort reference types using generic type bounds. Modify the following sorting algorithms to sort a T[].
(a)
static <T extends Comparable<T>> void mergeSort(T[] a) { ... }
(b)
static <T extends Comparable<T>> void quickSort(T[] a) { ... }
(c)
static <T extends Comparable<T>> void selectionSort(T[] a) { ... }
Exercise 17.5: BST Order Invariant
Here we consider two weaker properties that we can impose on the nodes of a binary tree. For each of these properties, show that it is insufficient to guarantee the correctness of our BST methods by: (1) Drawing a tree that satisfies this property, (2) Tracing through an example invocation of the find() method that we defined during the lecture to show that it will not conform to its specifications.
(a)

Rather than imposing the order invariant on all subtrees, we only impose it on the root node:

If the root has value \(v\), then every element in the root’s left subtree is \(\leq v\) and every element in the root’s right subtree is \(\geq v\).

(b)

Rather than imposing the order invariant on all descendants, we only impose it on a node’s children:

Each node in the BST contains a value greater than or equal to its left child’s (if such a child exists) and less than or equal to its right child’s (if such a child exists).

Exercise 17.6: Verify BST
Given a BinaryTree<T> where T extends Comparable<T>, write a method that determines whether this tree satisfies the BST order invariant. It may be helpful to delegate to a private helper method with additional parameters.
/** Returns whether `tree` is a BST. */
public static <T extends Comparable<T>> boolean isBST(BinaryTree<T> tree) { ... }
Exercise 17.7: BST Without Empty Nodes
Implement a BST whose nodes all hold non-null values.
/** A binary search tree with no empty nodes. */
public class NonEmptyBST<T extends Comparable<T>> extends BinaryTree<T> { ... }
Exercise 17.8: Ancestry Linked List
Given a BST and a target value val, return the path from the root to val as a SinglyLinkedList.
/**
 * Returns the path from the root of `tree` to some instance of `val` in `tree`.
 * Requires `tree.contains(val) == true`.
 */
static <T extends Comparable<T>> SinglyLinkedList<T> getAncestry(BST<T> tree, T val) { ... }
Exercise 17.9: More BST Methods
Suppose that each of these methods is defined in the BST<T> class. Implement each of the following according to its specification. State the runtime of each method in terms of \(N\), the size of the tree, and \(H\), its height.
(a)
/**
 * Locates and returns a subtree whose root is `elem` iteratively, or the
 * leaf child where `elem` would be located if `elem` is not in this BST.
 * Requires that `elem != null`.
 */
private BST<T> findIterative(T elem) { ... }
(b)
/** Returns the number of nodes with value `elem`. */
public int frequencyOf(T elem) { ... }
(c)
/**
 * Returns the number of nodes in this tree whose value `n` satisfies both
 * `n.compareTo(left) >= 0` and `n.compareTo(right) <= 0`.
 */
public int range(T left, T right) { ... }
(d)
Can you implement frequencyOf() and range() without looking at all nodes in the BST?
(e)
/**
 * Locates and returns a subtree that has at least one child with value `elem`.
 * Returns `null` if the only node with value `elem` is the root.
 * Requires `contains(elem) == true`.
 */
public BST<T> predecessor(T elem) { ... }
Exercise 17.10: Sizing up BSTs
Our implementation of a binary search tree allows us to understand the relative ordering of subtrees. If we want to understand the absolute position of a node in the sorted order of the tree, we'd need to perform an in-order traversal over its elements. To improve this, we'll add a size field to our BST.
(a)
Add a size field with a reasonable specification to BST<T>. Modify the methods in the class to properly satisfy the new class invariant.
(b)

Not only would this speed up the size() method, but it would also improve other operations! To retrieve the \(i\)-th smallest value in the tree, we previously had to iterate (in order) over the nodes of the tree, resulting in an \(O(N)\) operation. With the size field, we can implement this method more efficiently. Implement get() with a time complexity strictly better than \(O(N)\).

/** Retrieves the i-th smallest node in this tree. Requires `1 <= i <= size()`. */
public BST<T> get(int i) { ... }

Exercise 17.11:
Suppose that NotFoundException is a subclass of Exception (so is a checked exception). Implement the following method within the BST class according to its specifications.
/**
 * Returns the largest element in this tree that is strictly less than `upperBound`.
 * Throws a `NotFoundException` if all values in this tree are at least `upperBound`.
 */
T maxBelow(T upperBound) throws NotFoundException { ... }