CS 312 Recitation 25
Priority Queues and Binary Heaps

Priority Queues

Priority queues are a kind of queue in which the elements are dequeued in priority order.

code/imp_prioq.sml
 
signature IMP_PRIOQ =
  sig
    (* A 'a prioq is a mutable priority queue.
     * Abstractly, it is a possibly empty sequence of
     * elements [a1,...,an] sorted in priority order.
     * The operations destructively update the data
     * structure. *)
    type 'a prioq

    (* Create a new, empty priority queue *)
    val create : ('a * 'a -> order) -> 'a prioq

    (* insert(q,a) inserts a into q in priority order. *)
    val insert : 'a prioq -> 'a -> unit

    (* extract_min(q) removes and returns the first
     * element in the queue. Checks whether the
     * queue is nonempty. *)
    val extract_min : 'a prioq -> 'a

    (* empty(p) is true iff p has no elements *) 
    val empty : 'a prioq -> bool
  end
 

There are many ways to implement this signature. For example, we could implement it as a linked list where the cells of the list are connected through refs so it can be updated imperatively:

code/list_prioq.sml
 
structure ListPrioq : IMP_PRIOQ =
  (* Represents the priority queue as a list ordered by key,
   * and min element at head. *)
  struct
    type 'a prioq = {compare: 'a * 'a -> order,
                     elements: 'a list ref}

    fun create (c:'a*'a->order) = {compare=c, elements=ref []}
    fun empty({compare, elements}: 'a prioq) = null(!elements)
    fun insert ({compare,elements}: 'a prioq) (x:'a): unit =
      let fun ins [] = [x]
            | ins (hd::tl) =
        (case compare(hd,x) of
           LESS => hd::(ins tl) | _ => x::(hd::tl))
      in
        elements := ins(!elements)
      end
    exception EmptyQueue
    fun extract_min ({compare,elements}:'a prioq):'a =
      case (!elements) of
        [] => raise EmptyQueue
      | hd::tl => (elements := tl; hd)
  end
 

What is the asymptotic performance of this implementation?

Another alternative implementation is to use red-black trees or another of the balanced search trees. For example, in red-black trees we can find the minimum element by simply walking down the left children all the way from the root. Extracting the minimum element requires deleting it from the tree; we haven't seen how to do this, but it's about twice as complicated as the insertion we've already seen. This implementation has better performance for many applications: both insert and extract_min take O(lg n) time.
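To make the "walk down the left children" idea concrete, here is a minimal sketch on a plain binary search tree. The datatype and the name find_min are assumptions made for illustration; colors and rebalancing are left out because they do not change how the minimum is located.

```sml
(* A plain binary search tree (hypothetical type, no red-black colors). *)
datatype 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* find_min t is SOME x where x is the leftmost element of t,
 * or NONE if t is empty. *)
fun find_min Leaf = NONE
  | find_min (Node (Leaf, x, _)) = SOME x
  | find_min (Node (l, _, _)) = find_min l

(* A small search tree containing 1, 3, 5 and 8. *)
val t = Node (Node (Node (Leaf, 1, Leaf), 3, Leaf), 5, Node (Leaf, 8, Leaf))
val smallest = find_min t   (* SOME 1 *)
```

The walk visits one node per level, so in a balanced tree it takes O(lg n) steps.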

In fact, we can tell that this is the best we can do in terms of asymptotic performance, because we can implement sorting using O(n) priority queue operations, and we know that comparison-based sorting takes Ω(n lg n) time in the worst case. If both operations were asymptotically faster than O(lg n), we could sort faster than that bound allows. The idea is simply to insert all the elements to be sorted into the priority queue, and then use extract_min to pull them out in the right order:

code/heapsort.sml
 
(* sort(lst, cmp) is a list containing the elements of lst in
 * sorted order according to cmp. As a side effect, it prints the
 * time spent on insertion and extraction. *)
fun sort(lst: 'a list, cmp: 'a*'a -> order):'a list =
  let
    val pq = create(cmp)
    val t1 = Time.now()
    val _ = foldl (fn(a:'a, b:unit) =>
                  insert pq a) () lst
    val t2 = Time.now()
    fun loop(): 'a list =
      if empty(pq) then []
      else let val a = extract_min(pq) in
        a::loop()
      end
    val result = loop()
    val t3 = Time.now()
  in
    print ("Insertion: " ^ Time.toString(Time.-(t2,t1)) ^ " sec\n");
    print ("Extraction: " ^ Time.toString(Time.-(t3,t2)) ^ " sec\n");
    print ("Total: " ^ Time.toString(Time.-(t3,t1)) ^ " sec\n");
    result
  end

(* The list [1,...,n] *)
fun one_to_n(n: int) =
  let fun m_to_n(m,n) =
    if m > n then []
      else m::(m_to_n(m+1,n))
  in
    m_to_n(1,n)
  end

fun heap_to_n(n) =
  let val h = create(Int.compare) in
    foldl (fn(a,b) => insert h a) () (one_to_n(n));
    h
  end

fun timeit(f: unit->'a):unit = let
  val time1 = Time.now()
  val _ = f()
  val time2 = Time.now()


in
  print ("\nTotal time = " ^ (Time.toString(Time.-(time2,time1)))
         ^ " seconds\n")
end

fun sorti(lst: int list) = sort(lst, Int.compare)

val _ = SMLofNJ.Internals.GC.messages false

 

Heaps

Although they have good asymptotic performance, it turns out that red-black trees are overkill for implementing priority queues: they are more complicated and slower than necessary. There is a simple, fast way to implement priority queues.

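A heap is a special kind of balanced binary tree. Sometimes it is called a binary heap to distinguish it from a memory heap. The tree satisfies two invariants:

  1. Shape: the tree is complete. Every level is full except possibly the last, which is filled from left to right.

  2. Partial ordering: the priority of every node is no greater than the priorities of its children, so the minimum element is at the root.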

Suppose the priorities are just numbers. Here is a possible heap:

              3
             / \
            /   \
           5     9
          / \   /
         12  6 10

Obviously we can find the minimum element in O(1) time. Extracting it while maintaining the heap invariant will take O(lg n) time. Inserting a new element and establishing the heap invariant will also take O(lg n) time. So asymptotic performance is the same as for red-black trees but constant factors are better for heaps.

The key observation is that we can represent a heap as an array.

The root of the tree is at location 0 in the array and the children of the node stored at position i are at locations 2i+1 and 2i+2. This means that the array corresponding to the tree contains all the elements of the tree, read across row by row. The representation of the tree above is:

[3 5 9 12 6 10]

Given an element at index i, we can compute where the children are stored, and conversely we can go from a child at index j to its parent at index floor((j-1)/2).
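The index arithmetic can be tried out directly on the example array. This is a small sketch; the function names follow the ones used in the heap code later in these notes, and the bindings five, twelve, six and nine are only there for illustration.

```sml
(* Index arithmetic for a heap stored in a 0-based array.
 * parent requires j > 0, since the root has no parent. *)
fun parent (j: int): int = (j - 1) div 2
fun left_child (i: int): int = 2 * i + 1
fun right_child (i: int): int = 2 * i + 2

(* The example heap, read across row by row. *)
val a = Array.fromList [3, 5, 9, 12, 6, 10]

(* The node at index 1 holds 5; its children hold 12 and 6. *)
val five = Array.sub (a, 1)
val twelve = Array.sub (a, left_child 1)
val six = Array.sub (a, right_child 1)

(* Going the other way: index 5 holds 10, and its parent at
 * index 2 holds 9. *)
val nine = Array.sub (a, parent 5)
```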

The rep invariant for heaps in this representation is actually simpler than when in tree form:

Rep invariant for heap a (the partial ordering property):

a[i] ≤ a[2i+1] if 2i+1 ≤ n-1, and a[i] ≤ a[2i+2] if 2i+2 ≤ n-1,
for 0 ≤ i ≤ floor((n-2)/2), where n is the number of elements in the heap
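The ordering property can equally be stated from the child's side: every element other than the root must be no smaller than its parent. The following is a minimal sketch of a checker written that way, assuming integer priorities; is_heap is a hypothetical helper and not part of the course code.

```sml
(* is_heap a is true iff a[(j-1) div 2] <= a[j] for every j in 1..n-1,
 * where n is the length of a. An empty or one-element array is a heap. *)
fun is_heap (a: int Array.array): bool =
  let
    val n = Array.length a
    fun ok (j: int): bool =
      j >= n orelse
      (Array.sub (a, (j - 1) div 2) <= Array.sub (a, j) andalso ok (j + 1))
  in
    ok 1
  end

val good = is_heap (Array.fromList [3, 5, 9, 12, 6, 10])  (* true *)
(* Here index 4 holds 4, which is smaller than its parent 5. *)
val bad = is_heap (Array.fromList [3, 5, 9, 12, 4, 10])   (* false *)
```

A check like this is handy as an assertion while debugging insert and extract_min.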

Now let's see how to implement the priority queue operations: 

insert

  1. Put the element at the first missing leaf. (Extend the array by one element.)

  2. Switch it with its parent if its parent is larger: "bubble up"

  3. Repeat #2 as necessary.

Example: inserting 4 into previous tree.

              3
             / \
            /   \
           5     9        [3 5 9 12 6 10 4]
          / \   / \
         12  6 10  4

              3
             / \
            /   \
           5     4        [3 5 4 12 6 10 9]
          / \   / \
         12  6 10  9

This operation requires only O(lg n) time: the tree has depth
floor(lg n), and we do a bounded amount of work on each level.
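The three insertion steps can be sketched on a bare integer array. This is only an illustration under simplifying assumptions: insert_sketch is a hypothetical name, priorities are plain ints, and the array is copied rather than updated in place so that the example stands alone.

```sml
(* insert_sketch (a, x) is a new heap array holding the elements of
 * the heap a together with x. *)
fun insert_sketch (a: int Array.array, x: int): int Array.array =
  let
    val n = Array.length a
    (* Step 1: extend the array by one slot and put x at the
     * first missing leaf. *)
    val b = Array.tabulate (n + 1, fn i => if i < n then Array.sub (a, i) else x)
    (* Steps 2 and 3: swap x with its parent while the parent is larger. *)
    fun bubble_up (pos: int): unit =
      if pos = 0 then ()
      else
        let
          val ppos = (pos - 1) div 2
          val p = Array.sub (b, ppos)
        in
          if p > x then
            (Array.update (b, pos, p);
             Array.update (b, ppos, x);
             bubble_up ppos)
          else ()
        end
  in
    bubble_up n;
    b
  end

(* Inserting 4 into the example heap, as in the pictures above. *)
val result =
  Array.foldr (op ::) []
    (insert_sketch (Array.fromList [3, 5, 9, 12, 6, 10], 4))
(* result = [3, 5, 4, 12, 6, 10, 9] *)
```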

extract_min

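extract_min works by returning the element at the root.

The trick is this: removing the root leaves two subheaps, so we copy the last leaf to the root and then push it down, swapping it with the smaller of its children, until the ordering property holds again.

Original heap, from which we delete the top element (leaving two subheaps)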

              3
             / \
            /   \
           5     4        [3 5 4 12 6 10 9]
          / \   / \
         12  6 10  9

copy last leaf to root

              9
             / \
            /   \
           5     4        [9 5 4 12 6 10]
          / \   /
         12  6 10

"push down"

              4
             / \
            /   \
           5     9        [4 5 9 12 6 10]
          / \   /
         12  6 10


Again an O(lg n) operation.
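The same walk-through can be written as a short sketch on a bare integer array. The name extract_min_sketch is hypothetical, priorities are assumed to be ints, and the array is copied rather than updated in place to keep the example self-contained.

```sml
(* extract_min_sketch a is (m, b), where m is the minimum of the heap a
 * and b is a heap array of the remaining elements.
 * Requires: a is nonempty. *)
fun extract_min_sketch (a: int Array.array): int * int Array.array =
  let
    val n = Array.length a - 1   (* size of the heap after removal *)
    val m = Array.sub (a, 0)
    val last = Array.sub (a, n)
    (* Copy the last leaf to the root and drop the last slot. *)
    val b = Array.tabulate (n, fn i => if i = 0 then last else Array.sub (a, i))
    fun push_down (pos: int): unit =
      let
        val l = 2 * pos + 1
        val r = 2 * pos + 2
      in
        if l >= n then ()   (* no children: done *)
        else
          let
            (* Pick the smaller child; the right child may not exist. *)
            val c = if r < n andalso Array.sub (b, r) < Array.sub (b, l) then r else l
            val cv = Array.sub (b, c)
          in
            if last > cv then
              (Array.update (b, pos, cv);
               Array.update (b, c, last);
               push_down c)
            else ()
          end
      end
  in
    push_down 0;
    (m, b)
  end

(* Extracting from the heap used in the pictures above. *)
val (m, rest) = extract_min_sketch (Array.fromList [3, 5, 4, 12, 6, 10, 9])
val rest_list = Array.foldr (op ::) [] rest
(* m = 3 and rest_list = [4, 5, 9, 12, 6, 10] *)
```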

We can sort using this implementation of priority queues.
How expensive is the sorting function built from this?

  n insertions, at O(lg n) cost, for O(n lg n) total
  n deletions, at O(lg n) cost, for O(n lg n)  total.

  Thus, O(n lg n) total cost.

It's called heapsort and it's a standard, reliable sorting algorithm.

If you have to sort by doing comparisons only, this is as fast as possible (up to a constant factor). There are plenty of other O(n lg n) algorithms with better properties in some cases; for example, merge sort is stable, while heapsort is not.

One last comment -- you might be worried about the fixed size for the array of values. The solution is just to use a resizable array abstraction (like Java Vectors), which you should be able to figure out how to build.
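One possible shape for such a resizable array is sketched below. The type 'a dynarray and the functions new_dynarray, push and get are hypothetical names chosen for this example. The sketch doubles the backing array whenever it fills up, which keeps the amortized cost of push constant.

```sml
(* A growable array: a ref to a backing array plus a count of used slots. *)
type 'a dynarray = {data: 'a option Array.array ref, size: int ref}

fun new_dynarray (): 'a dynarray =
  {data = ref (Array.array (1, NONE)), size = ref 0}

(* push (d, x) appends x to d, doubling the backing array when it is full. *)
fun push ({data, size}: 'a dynarray, x: 'a): unit =
  let
    val old = !data
    val cap = Array.length old
  in
    if !size = cap then
      data := Array.tabulate (2 * cap,
                fn i => if i < cap then Array.sub (old, i) else NONE)
    else ();
    Array.update (!data, !size, SOME x);
    size := !size + 1
  end

(* get (d, i) is the ith element of d; raises Subscript if i is out of range. *)
fun get ({data, size}: 'a dynarray, i: int): 'a =
  if i < 0 orelse i >= !size then raise Subscript
  else valOf (Array.sub (!data, i))

val d: int dynarray = new_dynarray ()
val _ = List.app (fn x => push (d, x)) [3, 5, 9, 12, 6]
val len = !(#size d)    (* 5 *)
val third = get (d, 2)  (* 9 *)
```

With something like this in place of the fixed-size array, insert would no longer need to raise FullHeap.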

code/heap.sml
 
structure Heap : IMP_PRIOQ =
  struct
    type 'a heap = {compare : 'a*'a->order,
                    next_avail: int ref,
                    values : 'a option Array.array
                    }
    type 'a prioq = 'a heap

(* We embed a binary tree in the array 'values', where the
 * left child of value i is at position 2*i+1 and the right
 * child of value i is at position 2*i+2.
 *
 * Invariants:
 *
 * (1) !next_avail is the next available position in the array
 * of values.
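 * (2) values[i] is SOME(v) (i.e., not NONE) for 0<=i<!next_avail.
 * (3) For each i with a child at position c < !next_avail, the value
 * at i is not GREATER than the value at c
 * (according to compare : 'a*'a->order) *)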

(* get_elt(values,p) is the pth element of values. Checks
 * that the value there is not NONE. *)
fun get_elt(values:'a option Array.array, p:int):'a =
  valOf(Array.sub(values,p))

val max_size = 500000
fun create(cmp: 'a*'a -> order):'a heap =
  {compare = cmp,
   next_avail = ref 0,
   values = Array.array(max_size,NONE)}
fun empty({compare,next_avail,values}:'a heap) = (!next_avail) = 0

exception FullHeap
exception InternalError
exception EmptyQueue

fun parent(n) = (n-1) div 2
fun left_child(n) = 2*n + 1
fun right_child(n) = 2*n + 2

(* Insert a new element "me" in the heap.  We do so by placing me
 * at a "leaf" (i.e., the first available slot) and then, to
 * maintain the invariants, bubbling me up until my parent is <= me
 * or I reach the root.  If there's no room left in the heap, then
 * we raise the exception FullHeap.
 *)
fun insert({compare,next_avail,values}:'a heap) (me:'a): unit =
  if (!next_avail) >= Array.length(values) then
    raise FullHeap
  else
    let fun bubble_up(my_pos:int):unit =
      (* no parent if position is 0 -- we're done *)
      if my_pos = 0 then ()
      else
        let (* else get the parent *)
          val parent_pos = parent(my_pos);
          val parent = get_elt(values, parent_pos)
        in
          (* compare my parent to me *)
          case compare(parent, me) of
            GREATER =>
              (* swap if parent > me and continue *)
              (Array.update(values,my_pos,SOME parent);
               Array.update(values,parent_pos,SOME me);
               bubble_up(parent_pos))
          | _ => () (* otherwise we're done *)
        end
        (* start off at the next available position *)
        val my_pos = !next_avail
    in
      next_avail := my_pos + 1;
      Array.update(values,my_pos,SOME me);
      (* and then bubble me up *)
      bubble_up(my_pos)
    end

(* Remove the least element in the heap and return it, raising
 * the exception EmptyQueue if the heap is empty.  To maintain
 * the invariants, we move the last leaf to the root and then start
 * pushing it down, swapping with the lesser of its children.
 *)
fun extract_min({compare,next_avail,values}:'a heap):'a =
  if (!next_avail) = 0 then raise EmptyQueue
  else (* first element in values is always the least *)
    let val result = get_elt(values,0)
      (* get the last element so that we can put it at position 0 *)
      val last_index = (!next_avail) - 1
      val last_elt = get_elt(values, last_index)
      (* min_child(p) is (c,v) where c is the position of the child
       * of p at which the smaller element is stored, and v is the
       * value at that position. Requires p has a child. *)
      fun min_child(my_pos): int*'a =
        let
          val left_pos = left_child(my_pos)
          val right_pos = right_child(my_pos)
          val left_val = get_elt(values, left_pos)
        in
          if right_pos >= last_index then (left_pos, left_val)
          else
            let val right_val = get_elt(values, right_pos) in
              case compare(left_val, right_val)
                of GREATER => (right_pos, right_val)
                 | _ => (left_pos, left_val)
            end
        end
      (* Push "me" down until I'm no longer greater than my
       * children. When swapping with a child, choose the
       * smaller of the two.
       * Requires: get_elt(values, my_pos) = my_val
       *)
      fun bubble_down(my_pos:int, my_val: 'a):unit =
        if left_child(my_pos) >= last_index then () (* done *)
        else let val (swap_pos, swap_val) = min_child(my_pos) in
          case compare(my_val, swap_val)
            of GREATER =>
              (Array.update(values,my_pos,SOME swap_val);
               Array.update(values,swap_pos,SOME my_val);
               bubble_down(swap_pos, my_val))
             | _ => () (* no swap needed *)
        end
    in
      Array.update(values,0,SOME last_elt);
      Array.update(values,last_index,NONE);
      next_avail := last_index;
      bubble_down(0, last_elt);
      result
    end
  end