Binary Search Trees
A binary search tree (BST) is a binary tree with the following representation invariant:
For any node n, every node in the left subtree of n has a value less than n's value, and every node in the right subtree of n has a value greater than n's value.
We call that the BST invariant.
Here is code that implements a couple operations on a BST:
type 'a tree = Node of 'a * 'a tree * 'a tree | Leaf (** [mem x t] is [true] iff [x] is a member of [t]. *) let rec mem x = function | Leaf -> false | Node (y, l, r) -> if x < y then mem x l else if x > y then mem x r else true (** [insert x t] is [t] . *) let rec insert x = function | Leaf -> Node (x, Leaf, Leaf) | Node (y, l, r) as t -> if x < y then Node (y, insert x l, r) else if x > y then Node (y, l, insert x r) else t
What is the running time of those operations? Since
insert is just a
mem with an extra constant-time node creation, we focus on the
The running time of
mem is , where
is the height of the tree, because every recursive call descends
one level in the tree. What's the worst-case height of a
tree? It occurs with a tree of nodes all in a single long
branch—imagine adding the numbers 1,2,3,4,5,6,7 in order into
the tree. So the worst-case running time of
mem is still
, where is the number of nodes in the tree.
What is a good shape for a tree that would allow for fast lookup? A perfect binary tree has the largest number of nodes for a given height , which is . Therefore , which is .
If a tree with nodes is kept balanced, its height is , which leads to a lookup operation running in time .
How can we keep a tree balanced? It can become unbalanced during element insertion or deletion. Most balanced tree schemes involve adding or deleting an element just like in a normal binary search tree, followed by some kind of tree surgery to rebalance the tree. Some examples of balanced binary search tree data structures include
- AVL trees (1962)
- 2-3 trees (1970's)
- Red-black trees (1970's)
Each of these ensures running time by enforcing a stronger invariant on the data structure than just the binary search tree invariant.