Another important kind of mutable data structure that SML provides is the
array. Arrays generalize refs in that they are a sequence of memory locations,
where each location contains a different value. We can think of a ref cell as
an array of size 1. The type t array is in fact very similar to
the Java array type t[]. Here's a partial signature for the builtin
Array structure for SML. Note that you have to "open Array"
explicitly to use the operations or else write Array.foo
when you want to use the operation foo.
signature ARRAY = sig (* Overview: an 'a array is a mutable fixed-length sequence of * elements of type 'a. *) type 'a array (* array(n,x) is a new array of length n whose elements are * all equal to x. *) val array : int * 'a -> 'a array (* fromList(lst) is a new array containing the values in lst *) val fromList : 'a list -> 'a array exception Subscript (* indicates an out-of-bounds array index *) (* sub(a,i) is the ith element in a. If i is * out of bounds, raise Subscript *) val sub : 'a array * int -> 'a (* update(a,i,x) * Effects: Set the ith element of a to x * Raise Subscript if i is not a legal index into a *) val update : 'a array * int * 'a -> unit (* length(a) is the length of a *) val length : 'a array -> int ... end
See the SML documentation for more information on the operations available on arrays.
A binary heap (often just referred to as a heap) is a special kind of balanced binary tree. The tree satisfies two invariants:
Suppose the priorities are just numbers. Here is a possible heap:
3
/ \
/ \
5 9
/ \ /
12 6 10
Obviously we can find the minimum element in O(1)
time. Extracting it while maintaining the heap invariant will take
O(lg n) time.
Inserting a new element and establishing the heap invariant will also take
O(lg n) time. So asymptotic performance is the same as
for red-black trees but constant factors are better for heaps.
The key observation is that we can represent a heaps as an array.
The root of the tree is at location 0 in the array and the children of the node
stored at position i are at locations
2i+1 and 2i+2.
This means that the array corresponding to the tree contains all the elements of
tree, read across row by row. The representation of the tree above is:
[3 5 9 12 6 10]
Given an element at index i, we can compute
where the children are stored, and conversely we can go from a child at index
j to its parent at index
floor((j-1)/2).
The rep invariant for heaps in this representation is actually simpler than when
in tree form:
Rep invariant for heap a (the partial
ordering property):
a[i] ≤ a[2i+1] and a[i] ≤ a[2i+2]
for 1 ≤ i ≤ floor((n-1)/2)
Now let's see how to implement the priority queue operations using heaps:
Example: inserting 4 into previous tree.
3
/ \
/ \
5 9 [3 5 9 12 6 10 4]
/ \ / \
12 6 10 4
3
/ \
/ \
5 4 [3 5 4 12 6 10 9]
/ \ / \
12 6 10 9
This operation requires only O(lg n)
time -- the tree is depth
ceil(lg n) , and we do a bounded amount of work
on each level.
extract_min works by returning the element at the root.
The trick is this:
Original heap to delete top element from (leaves two subheaps)
3
/ \
/ \
5 4 [3 5 4 12 6 10 9]
/ \ / \
12 6 10 9
copy last leaf to root
9
/ \
/ \
5 4 [9 5 4 12 6 10]
/ \ /
12 6 10
"push down"
4
/ \
/ \
5 9 [9 5 4 12 6 10]
/ \ /
12 6 10
Again an O(lg n) operation.
The following code implements priority queues as binary heaps, using SML arrays.
Note the priority queue signature was covered in the previous lecture, and a
list implementation was shown. This is a better implementation using
binary heaps.
structure Heap : IMP_PRIOQ = struct type 'a heap = {compare : 'a*'a->order, next_avail: int ref, values : 'a option Array.array } type 'a prioq = 'a heap (* We embed a binary tree in the array 'values', where the * left child of value i is at position 2*i+1 and the right * child of value i is at position 2*i+2. * * Invariants: * * (1) !next_avail is the next available position in the array * of values. * (2) values[i] is SOME(v) (i.e., not NONE) for 0<=iorder) *) (* get_elt(p) is the pth element of a. Checks * that the value there is not NONE. *) fun get_elt(values:'a option Array.array, p:int):'a = valOf(Array.sub(values,p)) val max_size = 500000 fun create(cmp: 'a*'a -> order):'a heap = {compare = cmp, next_avail = ref 0, values = Array.array(max_size,NONE)} fun empty({compare,next_avail,values}:'a heap) = (!next_avail) = 0 exception FullHeap exception InternalError exception EmptyQueue fun parent(n) = (n-1) div 2 fun left_child(n) = 2*n + 1 fun right_child(n) = 2*n + 2 (* Insert a new element "me" in the heap. We do so by placing me * at a "leaf" (i.e., the first available slot) and then to * maintain the invariants, bubble me up until I'm <= all of my * parent(s). If there's no room left in the heap, then we raise * the exception FullHeap. *) fun insert({compare,next_avail,values}:'a heap) (me:'a): unit = if (!next_avail) >= Array.length(values) then raise FullHeap else let fun bubble_up(my_pos:int):unit = (* no parent if position is 0 -- we're done *) if my_pos = 0 then () else let (* else get the parent *) val parent_pos = parent(my_pos); val parent = get_elt(values, parent_pos) in (* compare my parent to me *) case compare(parent, me) of GREATER => (* swap if me <= parent and continue *) (Array.update(values,my_pos,SOME parent); Array.update(values,parent_pos,SOME me); bubble_up(parent_pos)) | _ => () (* otherwise we're done *) end (* start off at the next available position *) val my_pos = !next_avail in next_avail := my_pos + 1; Array.update(values,my_pos,SOME me); (* and then bubble me up *) bubble_up(my_pos) end exception EmptyQueue (* Remove the least element in the heap and return it, raising * the exception EmptyQueue if the heap is empty. To maintain * the invariants, we move a leaf to the root and then start * pushing it down, swapping with the lesser of its children. *) fun extract_min({compare,next_avail,values}:'a heap):'a = if (!next_avail) = 0 then raise EmptyQueue else (* first element in values is always the least *) let val result = get_elt(values,0) (* get the last element so that we can put it at position 0 *) val last_index = (!next_avail) - 1 val last_elt = get_elt(values, last_index) (* min_child(p) is (c,v) where c is the child of p at which * the minimum element is stored), and v is the value * at that position. Requires p has a child. *) fun min_child(my_pos): int*'a = let val left_pos = left_child(my_pos) val right_pos = right_child(my_pos) val left_val = get_elt(values, left_pos) in if right_pos >= last_index then (left_pos, left_val) else let val right_val = get_elt(values, right_pos) in case compare(left_val, right_val) of GREATER => (right_pos, right_val) | _ => (left_pos, left_val) end end (* Push "me" down until I'm no longer greater than my * children. When swapping with a child, choose the * smaller of the two. * Requires: get_elt(values, my_pos) = my_val *) fun bubble_down(my_pos:int, my_val: 'a):unit = if left_child(my_pos) >= last_index then () (* done *) else let val (swap_pos, swap_val) = min_child(my_pos) in case compare(my_val, swap_val) of GREATER => (Array.update(values,my_pos,SOME swap_val); Array.update(values,swap_pos,SOME my_val); bubble_down(swap_pos, my_val)) | _ => () (* no swap needed *) end in Array.update(values,0,SOME last_elt); Array.update(values,last_index,NONE); next_avail := last_index; bubble_down(0, last_elt); result end end