CS 312 Lecture 25: Priority Queues and Binary Heaps

Priority queues

Priority queues are a kind of queue in which the elements are dequeued in priority order rather than in order of arrival.

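A minimal priority-queue signature might look like the following (a sketch with illustrative names, not necessarily the original course code):

```sml
signature PRIORITY_QUEUE = sig
  type elem                      (* elements, with some fixed priority order *)
  type queue
  exception Empty
  val empty : queue
  val is_empty : queue -> bool
  val insert : elem * queue -> queue
  (* Return and remove a minimum element; raises Empty on an empty queue. *)
  val extract_min : queue -> elem * queue
end
```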

There are many ways to implement this signature. For example, we could implement it as a linked list, with O(n) performance for extracting the minimum element.

A better implementation would be to use a balanced binary tree such as an AVL tree, a red-black tree, or a splay tree. In a binary search tree, we can find the minimum element by simply walking down the left children as far as possible from the root. Extracting the minimum element requires deleting it from the tree, which is pretty simple because the minimum element can't have two children. This implementation gives O(lg n) time for both insert and extract_min, which is better for many applications.

In fact, we can tell that this is the best we can do in terms of asymptotic performance, because we can implement sorting using O(n) priority queue operations, and we know that comparison-based sorting takes Ω(n lg n) time in general. The idea is simply to insert all the elements to be sorted into the priority queue, and then use extract_min to pull them out in the right order. (See heapsort.sml)
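The sorting idea can be sketched in SML, assuming a priority-queue structure PQ with insert and extract_min operations (illustrative names, not the actual heapsort.sml):

```sml
(* Sort a list by inserting everything into a priority queue and then
 * repeatedly extracting the minimum.  With a heap-based queue this is
 * heapsort: n inserts and n extractions, each O(lg n). *)
fun pqsort (xs : PQ.elem list) : PQ.elem list =
  let
    val q = foldl PQ.insert PQ.empty xs
    fun drain q =
      if PQ.is_empty q then []
      else let val (x, q') = PQ.extract_min q
           in x :: drain q' end
  in
    drain q
  end
```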

If you want to use a balanced binary tree implementation off-the-shelf, try a splay tree, because tree elements are accessed with a lot of locality, and the splay tree will adapt to this access pattern. However, it turns out there is another data structure with the same asymptotic performance but even better constant factors.

Binary heaps

A binary heap (often just referred to as a heap) is a special kind of balanced binary tree.  The tree satisfies two invariants:

  1. The ordering invariant: the priority of each node is less than or
     equal to the priorities of its children. (So the minimum element
     is always at the root.)
  2. The shape invariant: the tree is complete; every level is full
     except possibly the bottom one, which is filled in from left to
     right.

Suppose the priorities are just numbers. Here is a possible heap:

              3
             / \
            /   \
           5     9
          / \   /
         12  6 10

Obviously we can find the minimum element in O(1) time. Extracting it while maintaining the heap invariant will take O(lg n) time. Inserting a new element and establishing the heap invariant will also take O(lg n) time.

So asymptotic performance is the same as for balanced binary search trees, but the constant factors are better for heaps. The key observation is that we can represent a heap as an array: the root is stored at index 0, and the children of the node at index i are at indices 2i+1 and 2i+2. This means that the array corresponding to the tree contains all the elements of the tree, read across row by row. The representation of the tree above is:

[3 5 9 12 6 10]

Given an element at index i, we can compute where the children are stored, and conversely we can go from a child at index j to its parent at index floor((j-1)/2). So we have a way to follow pointers from nodes to their parents and children, without actually representing the pointers!
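These index computations are one-liners (a small sketch):

```sml
(* Implicit tree navigation: no pointers, just arithmetic. *)
fun left i   = 2 * i + 1
fun right i  = 2 * i + 2
fun parent j = (j - 1) div 2   (* floor((j-1)/2); the root (j = 0) has no parent *)
```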

The rep invariant for heaps in this representation is actually simpler than when in tree form:

Rep invariant for heap a (the partial ordering property):

a[i] ≤ a[2i+1] whenever 2i+1 < n
a[i] ≤ a[2i+2] whenever 2i+2 < n
(where n is the number of elements)

Now let's see how to implement the priority queue operations using heaps:

insert

  1. Put the element at first missing leaf. (Extend array by one element.)
  2. Switch it with its parent if its parent is larger: "bubble up"
  3. Repeat #2 as necessary.
     

Example: inserting 4 into previous tree.

              3
             / \
            /   \
           5     9        [3 5 9 12 6 10 4]
          / \   / \
         12  6 10  4

              3
             / \
            /   \
           5     4        [3 5 4 12 6 10 9]
          / \   / \
         12  6 10  9

This operation requires only O(lg n) time -- the tree has depth
floor(lg n), and we do a bounded amount of work at each level.

extract_min

extract_min works by returning the element at the root. But removing the root would leave two disconnected subheaps behind.

The trick is this: copy the last leaf of the heap to the root, remove that leaf, and then push the new root down until the heap invariant is restored.

Original heap to delete top element from (leaves two subheaps)

              3
             / \
            /   \
           5     4        [3 5 4 12 6 10 9]
          / \   / \
         12  6 10  9

copy last leaf to root

              9
             / \
            /   \
           5     4        [9 5 4 12 6 10]
          / \   /
         12  6 10

"push down"

              4
             / \
            /   \
           5     9        [4 5 9 12 6 10]
          / \   /
         12  6 10

Again an O(lg n) operation.

SML heap code

The following code implements priority queues as binary heaps, using SML arrays.
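(The original listing is not preserved here. The following sketch fixes the element type to int for brevity, where the course code was presumably polymorphic over a comparison function; it implements the bubble-up and push-down steps described above.)

```sml
structure Heap = struct
  exception Empty

  type heap = int array

  val empty : heap = Array.fromList []

  fun swap (a, i, j) =
    let val tmp = Array.sub (a, i) in
      Array.update (a, i, Array.sub (a, j));
      Array.update (a, j, tmp)
    end

  (* Restore the invariant upward from index i after an insertion. *)
  fun bubble_up (a, i) =
    if i > 0 andalso Array.sub (a, (i - 1) div 2) > Array.sub (a, i)
    then (swap (a, (i - 1) div 2, i); bubble_up (a, (i - 1) div 2))
    else ()

  (* Restore the invariant downward from index i in a heap of n elements. *)
  fun push_down (a, n, i) =
    let
      val l = 2 * i + 1
      val r = 2 * i + 2
      val m = if l < n andalso Array.sub (a, l) < Array.sub (a, i) then l else i
      val m = if r < n andalso Array.sub (a, r) < Array.sub (a, m) then r else m
    in
      if m <> i then (swap (a, i, m); push_down (a, n, m)) else ()
    end

  (* For simplicity these copy the array; a production version would
   * grow and shrink the array in place. *)
  fun insert (x, h : heap) : heap =
    let
      val n = Array.length h
      val a = Array.tabulate (n + 1, fn i => if i < n then Array.sub (h, i) else x)
    in
      bubble_up (a, n); a
    end

  fun extract_min (h : heap) : int * heap =
    let
      val n = Array.length h
      val _ = if n = 0 then raise Empty else ()
      (* copy the last leaf over the root, drop the last slot, push down *)
      val a = Array.tabulate (n - 1,
                fn i => if i = 0 then Array.sub (h, n - 1)
                        else Array.sub (h, i))
    in
      push_down (a, n - 1, 0);
      (Array.sub (h, 0), a)
    end
end
```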


Binomial and Fibonacci heaps

For some heap uses, we want additional operations. For finding shortest paths, we need to be able to increase the priority of an element already in the priority queue. This complicates the interface.

First, the existing signature does not acknowledge the possibility that the ordering on elements is not fixed. There are two ways to fix this: either parameterize the priority queue on two types (a priority and an element), or else have an interface in which the client notifies the priority queue of elements whose priority (and hence ordering relative to other elements) has changed.

Second, since the priority queue gives no fast way to find an element, the increase_priority call needs to take an argument that says where in the heap to find it (concretely, the array index). Binomial heaps and Fibonacci heaps are more elaborate heap structures designed with such operations in mind; in particular, a Fibonacci heap supports the priority-change operation in O(1) amortized time, which improves the asymptotic running time of Dijkstra's algorithm.
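One way to write such an interface, as a sketch (the names and the index-based handle are illustrative; a real design must also keep the client's handles up to date as the heap moves elements around):

```sml
signature MUTABLE_PQ = sig
  type elem
  type queue
  type loc = int   (* array index of an element in the heap *)
  val insert : queue * elem -> loc
  val extract_min : queue -> elem
  (* The element at this location became more urgent: re-establish the
   * invariant by bubbling it up from its current position. *)
  val increase_priority : queue * loc -> unit
end
```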

Huffman coding

Huffman coding is an elegant compression method that uses a priority queue.

Fixed-Length Codes

Suppose we want to compress a 100,000-byte data file that we know contains only the six letters A through F. Since we have only six distinct characters to encode, we can represent each one with three bits rather than the eight bits normally used to store characters:

Letter     A    B    C    D    E    F
Codeword   000  001  010  011  100  101

This fixed-length code gives us a compression ratio of 5/8 = 62.5%. Can we do better?

Variable-Length Codes

What if we knew the relative frequencies at which each letter occurred? It would be logical to assign shorter codes to the most frequent letters and save longer codes for the infrequent letters. For example, consider this code:

Letter         A    B    C    D    E     F
Frequency (K)  45   13   12   16   9     5
Codeword       0    101  100  111  1101  1100

Using this code, our file can be represented with

(45×1 + 13×3 + 12×3 + 16×3 + 9×4 + 5×4) × 1000 = 224,000 bits

or 28,000 bytes, which gives a compression ratio of 72%. In fact, this is an optimal character code for this file (which is not to say that the file is not further compressible by other means).

Prefix Codes

Notice that in our variable-length code, no codeword is a prefix of any other codeword. For example, we have a codeword 0, so no other codeword starts with 0. And both of our four-bit codewords start with 110, which is not itself a codeword. A code where no codeword is a prefix of any other is called a prefix code. Prefix codes are useful because they make a stream of bits unambiguous: we simply accumulate bits from the stream until we have completed a codeword. (Notice that encoding is simple regardless of whether our code is a prefix code: we just build a dictionary from letters to codewords, look up each letter we're trying to encode, and append its codeword to the output stream.) It turns out that prefix codes can always achieve the optimal compression for a character code, so we're not losing anything by restricting ourselves to this type of character code.

When we're decoding a stream of bits using a prefix code, what data structure might we want to use to help us determine whether we've read a whole codeword yet?

One convenient representation is to use a binary tree with the codewords stored in the leaves so that the bits determine the path to the leaf. This binary tree is a trie in which only the leaves map to letters. In our example, the codeword 1100 is found by starting at the root, moving down the right subtree twice and the left subtree twice:

      100
     /   \
    /     \
   /       \
  A         55
[45]      /    \
         /      \
       25        30
      /  \      /  \
     C    B   14    D
   [12] [13] /  \  [16]
            F    E
           [5]  [9]

Here we've labeled the leaves with their frequencies and the branches with the total frequencies of the leaves in their subtrees. You'll notice that this is a full binary tree: every nonleaf node has two children. This happens to be true of all optimal codes, so we can tell that our fixed-length code is suboptimal by examining its tree, which is clearly wasting a bit in representing E and F:

                 100
             /         \
            /           \
           /             \
          /               \
         86               14 
       /    \             /
      /      \           /
    58        28       14
   /  \      /  \     /  \
  A    B    C    D   E    F
[45] [13] [12] [16] [9]  [5]
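Decoding with such a trie can be sketched in SML (illustrative datatype and names; this assumes at least two characters, so the root is a Node):

```sml
(* Walk the code tree bit by bit; restart at the root each time a
 * leaf -- a complete codeword -- is reached. *)
datatype codetree = Leaf of char | Node of codetree * codetree

fun decode (root : codetree, bits : int list) : char list =
  let
    fun walk (Leaf c, bs) = c :: walk (root, bs)
      | walk (Node _, []) = []                   (* end of stream *)
      | walk (Node (l, r), b :: bs) =
          walk (if b = 0 then l else r, bs)
  in
    walk (root, bits)
  end
```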

Since we can restrict ourselves to full trees, we know that for an alphabet C, we will have a tree with exactly |C| leaves and |C|−1 internal nodes. Given a tree T corresponding to a prefix code, we also can compute the number of bits required to encode a file:

B(T) = ∑_{c ∈ C} f(c) · d_T(c)

where f(c) is the frequency of character c and d_T(c) is the depth of c in the tree (which is also the length of the codeword for c). We call B(T) the cost of the tree T.
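As a sanity check on the numbers above, the cost can be computed over a tree carrying frequencies at its leaves (a sketch; the datatype and names are illustrative):

```sml
datatype freqtree = FLeaf of int | FNode of freqtree * freqtree

(* B(T): sum of frequency times depth over all the leaves. *)
fun cost t =
  let
    fun go (FLeaf f, d) = f * d
      | go (FNode (l, r), d) = go (l, d + 1) + go (r, d + 1)
  in
    go (t, 0)
  end

(* The optimal tree from the example, frequencies in thousands: *)
val t = FNode (FLeaf 45,                               (* A *)
               FNode (FNode (FLeaf 12, FLeaf 13),      (* C, B *)
                      FNode (FNode (FLeaf 5, FLeaf 9), (* F, E *)
                             FLeaf 16)))               (* D *)
val b = cost t   (* 224, i.e. 224,000 bits *)
```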

Huffman's Algorithm

Huffman invented a simple algorithm for constructing such trees given the set of characters and their frequencies. Like Dijkstra's algorithm, it is a greedy algorithm: it makes choices that are locally optimal, and in this case that is enough to achieve a globally optimal solution.

The algorithm constructs the tree in a bottom-up way. Given a set of leaves containing the characters and their frequencies, we merge the current two subtrees with the smallest frequencies. We perform this merging by creating a parent node labeled with the sum of the frequencies of its two children. Then we repeat this process until we have performed |C|−1 mergings to produce a single tree.

As an example, use Huffman's algorithm to construct the tree for our input.

How can we implement Huffman's algorithm efficiently? The operation we need to perform repeatedly is extraction of the two subtrees with smallest frequencies, so we can use a priority queue. We can express this in ML as:
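(The original listing is not preserved here; the following is a sketch, assuming a priority-queue structure PQ over the code trees, ordered by the frequency stored at each root. The names are illustrative.)

```sml
datatype htree = HLeaf of char * int           (* character, frequency *)
               | HNode of htree * htree * int  (* children, combined frequency *)

fun freq (HLeaf (_, f)) = f
  | freq (HNode (_, _, f)) = f

(* Repeatedly merge the two least-frequent trees; after |C| - 1
 * merges a single tree remains. *)
fun huffman q =
  let val (t1, q) = PQ.extract_min q in
    if PQ.is_empty q then t1
    else
      let
        val (t2, q) = PQ.extract_min q
        val merged = HNode (t1, t2, freq t1 + freq t2)
      in
        huffman (PQ.insert (merged, q))
      end
  end
```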


We won't prove that the result is an optimal prefix tree, but why does this algorithm produce a valid and full prefix tree? We can see that every time we merge two subtrees, we're differentiating the codewords of all of their leaves by prepending a 0 to all the codewords of the left subtree and a 1 to all the codewords of the right subtree. And every non-leaf node has exactly two children by construction.

Let's analyze the running time of this algorithm if our alphabet has n characters. Building the initial queue takes time O(n log n) since each enqueue operation takes O(log n) time. Then we perform n−1 merges, each of which takes time O(log n). Thus Huffman's algorithm takes O(n log n) time.

Adaptive Huffman Coding

If we want to compress a file with our current approach, we have to scan through the whole file to tally the frequencies of each character. Then we use the Huffman algorithm to compute an optimal prefix tree, and we scan the file a second time, writing out the codewords of each character of the file. But that's not sufficient. Why? We also need to write out the prefix tree so that the decompression algorithm knows how to interpret the stream of bits.

So our algorithm has one major potential drawback: we need to scan the whole input file before we can build the prefix tree. For large files, this can take a long time. (Disk access is very slow compared to CPU cycle times.) And in some cases it may be unreasonable: we may have a long stream of data that we'd like to compress as it arrives, and accumulating all of it before compressing may not be an option. We'd like an algorithm that lets us compress a stream of data without first seeing all of it.

The solution is adaptive Huffman coding, which builds the prefix tree incrementally in such a way that the coding always is optimal for the sequence of characters already seen. We start with a tree that has a frequency of zero for each character. When we read an input character, we increment the frequency of that character (and the frequency in all branches above it). We then may have to modify the tree to maintain the invariant that the least frequent characters are at the greatest depths. Because the tree is constructed incrementally, the decoding algorithm simply can update its copy of the tree after every character is decoded, so we don't need to include the prefix tree along with the compressed data.