CS212 Notes for Lecture 15
March 14, 2000

Outline for the day:


Destructive list operations

Using set!, one can modify list structure. Recall that a list with 3 elements internally looks like this:

(define mylist (list 1 2 3))

     mylist: --------> o
                      / \
                     /   \
                    1     o
                         / \
                        /   \
                       2     o
                            / \
                           /   \
                          3    ()
The objects marked by "o" are just small blocks of memory containing two pointers, a head pointer and a tail pointer. These small blocks of memory are traditionally called cons cells.

One can change pointers with set! as follows. Suppose you wish to link a new element 4 into this list between 1 and 2. One can first create a list containing just 4:
(let ((new (list 4))) ...

         new: ------------> o
                           / \
                          /   \
                         4    ()
Now the tail pointer of this object, which currently points to the empty list, can be reset to point to where the tail pointer of the first cons cell of mylist points:
(set! (tail new) (tail mylist))

     new: ----------> o
                     / \
                    /   \
                   4     \
                          \
     mylist: --------> o  |
                      / \ |
                     /   \|
                    1     o
                         / \
                        /   \
                       2     o
                            / \
                           /   \
                          3    ()
Then the tail of mylist can be reset to point to new:
(set! (tail mylist) new)

     mylist: --------> o
                      / \
                     /   \
                    1     o <----- new
                         / \
                        /   \
                       4     o
                            / \
                           /   \
                          2     o
                               / \
                              /   \
                             3    ()
All together:
(define mylist (list 1 2 3))
(let ((new (list 4)))
  (set! (tail new) (tail mylist))
  (set! (tail mylist) new))

mylist ==> (1 4 2 3)
Note that when manipulating pointers, the order in which we do things is important! If we do the two set! statements in the opposite order, we get a circular list structure:
(define mylist (list 1 2 3))
(let ((new (list 4)))
  (set! (tail mylist) new)
  (set! (tail new) (tail mylist)))

mylist ==> (1 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 ...)
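
The splice above can also be sketched outside Scheme. The following is a minimal Python illustration, not the course's code: the Cell class and the make_list and take helpers are hypothetical names introduced here, and None stands in for the empty list. It shows both orders of the two pointer updates.

```python
class Cell:
    """A cons cell: a head (value) pointer and a tail (next cell) pointer."""
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


def make_list(*items):
    """Build a chain of cells; None plays the role of ()."""
    result = None
    for item in reversed(items):
        result = Cell(item, result)
    return result


def take(cell, limit):
    """Collect at most `limit` heads, so circular structures are safe to inspect."""
    out = []
    while cell is not None and len(out) < limit:
        out.append(cell.head)
        cell = cell.tail
    return out


# Correct order: point the new cell at the rest first, then link it in.
mylist = make_list(1, 2, 3)
new = make_list(4)
new.tail = mylist.tail
mylist.tail = new
good = take(mylist, 10)

# Opposite order: the new cell ends up pointing at itself.
cyclic = make_list(1, 2, 3)
new2 = make_list(4)
cyclic.tail = new2
new2.tail = cyclic.tail
bad = take(cyclic, 6)
```

The take helper stops after a fixed number of cells because walking the circular structure to its end would never terminate.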

Stacks and Queues

Data structures are the basic tools of the trade.  Choosing the right structures (and abstractions around them) can make the difference between an easy problem and a hard one.

Two basic data structures of computer science that show up everywhere, and that you have probably seen before, are stacks and queues.

A stack is a data structure supporting the following operations:

  (make-stack) -------- Make a new empty stack
  (push thing stack) -- Return a new stack with thing on top of `stack'.
  (pop stack)  -------- Return a stack like `stack', but without its top element.
  (top stack) --------- Return the top element of the stack.
  (empty? stack) ------ Is the stack empty?
They obey the contract:
  (top (push thing stack)) = thing
  (pop (push thing stack)) = stack
  (empty? (make-stack)) = #t
  (empty? (push thing stack)) = #f
We can implement a stack with lists:
  push = pair
  pop = tail
  top = head
  empty? = null?
This implementation is quite efficient:  the operations all take O(1) time, independent of the size of the stack.
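
As a rough illustration of the stack contract, here is a small Python sketch in which a stack is either None (empty) or a two-element tuple (top, rest). The function names mirror the operations above, but the tuple representation is an assumption made for this example, not the course's implementation.

```python
EMPTY = None  # plays the role of the empty list ()


def make_stack():
    return EMPTY


def push(thing, stack):
    # Like pair: O(1), and the old stack is shared, not copied.
    return (thing, stack)


def pop(stack):
    # Like tail.
    return stack[1]


def top(stack):
    # Like head.
    return stack[0]


def is_empty(stack):
    # Like null?.
    return stack is EMPTY


s = push(2, push(1, make_stack()))
```

Because push never modifies its argument, (pop (push thing stack)) gives back the very same stack object, which is exactly what the contract asks for.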

A queue supports operations similar to those of a stack, but they do slightly different things.  It's like a line of people at a cafeteria--you can join at the back end of the line, but you leave at the front end.


FIFO -- first in-first out (maintains order)

(make-queue) ---------- make a new empty queue
(insert thing queue) -- put thing at the tail end of the queue
(delete queue) -------- delete the head of the queue
(head queue) ---------- return the head of the queue
(empty? queue) -------- test emptiness
We could implement a queue as a list with the head (oldest element) at the front.
head = head
delete = tail
empty? = null?
but then to insert:
(method ((thing <object>) (q <list>)) (append q (list thing)))
which is O(n) time. We have to walk all the way down the list.  This is expensive.

We could also do it backwards--the head at the last element--but that wouldn't help either: insert would be O(1), but delete and head would be O(n). 

By being a little more clever, we can have both the head and  the tail available in constant time.

Keep pointers to both ends of the list.

Here's how to implement a queue with constant time insertions  and deletions.

Use a list structure with pointers to the beginning and end of the list. We'll read from the front of the list, and add to the rear.

(define <queue> <pair>)

(define (empty-queue? (q <queue>))
  (null? (head q)))

(define (make-queue) (cons '() '()))  ; a fresh pair on every call, safe to mutate

(define (queue-head (q <queue>))
  (if (empty-queue? q)
      (error "Cannot take head of empty queue")
      (head (head q))))

(define (insert x (q <queue>))
  (let ((new (list x)))
    (if (empty-queue? q)
        (set! (head q) new)
        (set! (tail (tail q)) new))
    (set! (tail q) new))
  q)

(define (delete (q <queue>))
  (if (empty-queue? q)
      (error "Cannot delete from empty queue")
      (set! (head q) (tail (head q))))
  q)
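
The same two-pointer idea can be sketched in Python. This is an illustrative version under assumed names (Cell, Queue, front, rear), not a translation of the handout: front points at the oldest cell and rear at the newest, so both ends are reachable in constant time.

```python
class Cell:
    """A cons cell with a value and a pointer to the next cell."""
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


class Queue:
    def __init__(self):
        self.front = None  # oldest cell, where we read and delete
        self.rear = None   # newest cell, where we insert

    def is_empty(self):
        return self.front is None

    def head(self):
        if self.is_empty():
            raise IndexError("Cannot take head of empty queue")
        return self.front.head

    def insert(self, x):
        new = Cell(x)
        if self.is_empty():
            self.front = new
        else:
            self.rear.tail = new  # link the new cell after the current last one
        self.rear = new
        return self

    def delete(self):
        if self.is_empty():
            raise IndexError("Cannot delete from empty queue")
        self.front = self.front.tail
        return self


q = Queue()
q.insert(1).insert(2).insert(3)
```

As in the Scheme version, emptiness is decided by the front pointer alone, so a stale rear pointer left behind after the last delete is harmless.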

Priority Queues

A priority queue is a data structure that maintains a collection of elements, each with a key, which is some value chosen from a totally ordered set (such as the natural numbers). A priority queue allows efficient insertion of a new element and extraction of the element with the minimum key. We will see an example of a priority queue in action in PS4.

Operations on a priority queue:

(make-prioq)      -- return empty structure
(insert! e pq)    -- put entry e in pq
(extract-min! pq) -- Remove the highest-priority entry (the one with the minimum key) from pq and return it

Priority Queues and Lists -- First Implementation

One possible implementation of a priority queue (by no means the only one) is a list in which the elements are ordered by increasing key value. To insert a new element, we find the appropriate place to insert it so as to maintain the order. The element with the minimum key is then always at the head of the list.

(defstruct <entry>
  (key <integer>)
  (data <top>))

(define <priority-queue> <list>)
We can make a new queue entry with key n and data d by calling
(make-entry n d)
Here are two ways to insert a new element, one which does not use destructive list operations, one which does.
(define (insert-1 (e <entry>) (q <priority-queue>))
  (cond ((null? q) (list e))
        ((< (entry-key e) (entry-key (head q))) (cons e q))
        (else (cons (head q) (insert-1 e (tail q))))))
As this recursive operation comes back out of the recursive calls, it allocates a new cons cell in the cons operation in the last line. The old cons cells are no longer in use and can be garbage-collected.

We can then insert a new entry e into a global priority queue *q* as follows:

(set! *q* (insert-1 e *q*))
Note that you cannot change the binding of *q* inside the insert-1 procedure if *q* is passed as a parameter, since (set! q ...) inside the procedure will change the local binding, not the global one.

Here is an alternative method that uses destructive list operations.

(define (insert-2 (e <entry>) (q <priority-queue>))
  ;first check if new element should go at head of list 
  (if (or (null? q) (< (entry-key e) (entry-key (head q))))
      (cons e q)
      ;if not, find element that it goes immediately after
      (letrec
        ((find-place (lambda ((q <priority-queue>))
           (if (or (null? (tail q)) (< (entry-key e) (entry-key (second q))))
               q
               (find-place (tail q))))))
        (let*
          ((pq (find-place q))
           (le (cons e (tail pq))))
          ;link new element in
          (set! (tail pq) le)
          ;return original list
          q))))
As above, to insert a new entry e into *q*, do
(set! *q* (insert-2 e *q*))
Version 2 has the minor advantage that it does not waste cons cells. This is the version you would use if programming in C, since C does not do garbage collection. However, since Scheme does garbage collection, this advantage is far outweighed by the simplicity of version 1.
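
To make the difference between the two versions concrete, here is a hedged Python sketch. The names Cell, insert_copy, insert_in_place, and keys are hypothetical, and entries are represented as (key, data) tuples for brevity. The first function rebuilds the cells in front of the insertion point; the second splices a single new cell into the existing chain.

```python
class Cell:
    """A cons cell with a value and a pointer to the next cell."""
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


def insert_copy(entry, q):
    """Non-destructive: allocates a new cell for each cell it passes."""
    if q is None or entry[0] < q.head[0]:
        return Cell(entry, q)
    return Cell(q.head, insert_copy(entry, q.tail))


def insert_in_place(entry, q):
    """Destructive: allocates one cell and relinks a single tail pointer."""
    if q is None or entry[0] < q.head[0]:
        return Cell(entry, q)
    place = q
    # Walk to the cell that the new entry goes immediately after.
    while place.tail is not None and not entry[0] < place.tail.head[0]:
        place = place.tail
    place.tail = Cell(entry, place.tail)
    return q


def keys(q):
    """List the keys in order, for inspection."""
    out = []
    while q is not None:
        out.append(q.head[0])
        q = q.tail
    return out


original = Cell((1, "a"), Cell((5, "b")))
copied = insert_copy((3, "c"), original)
after_copy = keys(original)  # the original chain is untouched
spliced = insert_in_place((3, "c"), original)
```

Both functions return the new head of the list, so in either case the caller should rebind its variable to the result, just as with (set! *q* ...) above.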

Don't confuse a priority queue with its implementation as a list. A priority queue is just the data abstraction specified by its contract. There are many ways to implement it; the list implementation above is just one. We will see a different implementation using lists and a more efficient implementation using heaps below.

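Analysis

With the sorted list, insert takes O(n) time in the worst case, since we may have to walk the whole list to find the right place. extract-min! is O(1), because the element with the minimum key is always at the head.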

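Priority Queues and Lists -- Second Implementation

The second implementation keeps the list unsorted: insert adds the new entry at the front in O(1) time, but extract-min! must search the whole list for the minimum key, which takes O(n) time.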
Which is better? There is not really any difference--with one you win on inserts but lose on extract-mins, with the other vice versa.
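
For comparison with the sorted list, a priority queue kept as an unsorted list might look like the following Python sketch. The function names and the (key, data) tuple representation are assumptions made for this example. Insertion does no searching at all, and extraction pays for it with a full scan.

```python
def insert_unsorted(entry, q):
    """O(1): add the entry without maintaining any order."""
    q.append(entry)
    return q


def extract_min_unsorted(q):
    """O(n): scan every entry for the smallest key, then remove it."""
    if not q:
        raise IndexError("Cannot extract from empty priority queue")
    best = min(range(len(q)), key=lambda i: q[i][0])
    return q.pop(best)


pq = []
insert_unsorted((5, "e"), pq)
insert_unsorted((2, "b"), pq)
insert_unsorted((7, "g"), pq)
first = extract_min_unsorted(pq)
```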


Side note: set! on parameters passed to functions doesn't always do what you want:

(define mylist '(1 2 3))
(define (alter1 (s <list>))
  (set! s '(4 5 6)) s)

Then

(alter1 mylist) ==> (4 5 6)
mylist ==> (1 2 3)
mylist doesn't change, because you are setting the binding of the parameter s, not the argument mylist.

BUT:

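(define (alter2 (l <list>))
  (set! (tail l) '(4 5 6)) l)

Then

(alter2 mylist) ==> (1 4 5 6)
mylist ==> (1 4 5 6)
Here mylist does change, because alter2 modifies the cons cell that both l and mylist point to, rather than rebinding l.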
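
The same distinction between rebinding a parameter and mutating the object it refers to appears in many languages. As a small Python analogy (not part of the course dialect), using Python lists in place of cons cells:

```python
def alter1(s):
    s = [4, 5, 6]  # rebinds the local name only; the caller's list is untouched
    return s


def alter2(l):
    l[1:] = [4, 5, 6]  # mutates the list object that the caller also sees
    return l


mylist = [1, 2, 3]
r1 = alter1(mylist)
snapshot = list(mylist)  # copy of mylist after alter1, before alter2
r2 = alter2(mylist)
```

After running this, r1 is a new list, snapshot still holds the original three elements, and r2 is the very same object as mylist, now changed.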

Warning!  


Priority Queues and Heaps

For the list implementations above, either insert! or extract-min! is O(n).

This is more expensive than it needs to be.  We can implement priority queues more efficiently with a heap: a tree containing data entries at the nodes such that

  1. The key of each node is no larger than keys of its children
  2. The node with minimum key is on top (root) of the tree (this property is actually a consequence of property 1).

Property 1 is called heap order.  Heaps will give us O(log n) time for both insert! and extract-min!.

Let's ignore data for a bit. Numbers shown are just keys (priorities)

     3
    / \
   /   \
  5     9
 / \   / \
12 6  10 15

Heaps are easily represented as vectors (one-dimensional arrays, like arrays in Java).

The root of the tree is at location 1 in the array, and the children of the node stored at position i are at locations 2i and 2i+1.

[3 5 9 12 6 10 15]

(Read across the tree, row by row.)

Heap order then translates to: for 1 <= i <= floor(n/2), where n is the number of elements,

A[i] <= A[2i]
A[i] <= A[2i+1]   (whenever 2i+1 <= n)


All operations on heaps will maintain heap order.
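
The heap-order condition can be checked mechanically. The following Python sketch assumes a list whose slot 0 is an unused placeholder, so that the root sits at index 1 as in the text; the function name is_heap_ordered is introduced here for illustration.

```python
def is_heap_ordered(a):
    """Check A[i] <= A[2i] and A[i] <= A[2i+1] for a 1-based heap.

    a[0] is an unused placeholder so that the root sits at index 1.
    """
    n = len(a) - 1
    for i in range(1, n // 2 + 1):
        if a[i] > a[2 * i]:
            return False
        # The right child exists only when 2i+1 <= n.
        if 2 * i + 1 <= n and a[i] > a[2 * i + 1]:
            return False
    return True
```

For instance, the vector [3 5 9 12 6 10 15] from the example above passes this check when stored in slots 1 through 7.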


Crash course on Scheme vectors:

(make-vector k)         -- space for k things, indices 0 to k-1
(make-vector k e)       -- same, but initialize all entries to e
(vector e1 ...)         -- evaluate e1 ... and put them into a vector (analogous to list)
(vector-ref vec i)      -- get i'th element in O(1) time
(vector-set! vec i e)   -- put e at location i of vec
(vector-length vec)     -- length of vec

vector-ref, vector-set!, and vector-length require constant time; the other operations require linear time (time proportional to the length of the vector), not counting the time required to evaluate the arguments.


We'll make our heaps with a fixed capacity: the size argument says how many elements the prioq can hold at once.  The prioq will use only part of this vector at any time.

(define <prioq> <vector>)

(define (make-prioq (size <integer>))
  (make-vector (+ size 1) 0))
prioq-insert! takes some work. For example, inserting 4:
       3
      / \
     /   \
    5     9    [3 5 9 12 6 10 15 4]
   / \   / \
  12  6 10 15
 /
4
        3
       / \
      /   \
     5     9    [3 5 9 4 6 10 15 12]
    / \   / \
   4   6 10 15
  /
12
        3
       / \
      /   \
     4     9    [3 4 9 5 6 10 15 12]
    / \   / \
   5   6 10 15
  /
12

This operation requires only O(log n) time -- the tree has depth floor(log2 n), and we do a constant amount of work on each level.
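
The bubbling-up shown in the diagrams can be sketched in Python as follows. This is an illustration under stated assumptions, not the handout's code: it uses a growable Python list instead of a fixed-capacity vector, keeps slot 0 unused so the root is at index 1, and the name heap_insert is hypothetical.

```python
def heap_insert(a, key):
    """Insert key into a 1-based heap a (a[0] unused) by bubbling it up."""
    a.append(key)          # put the new key in the next free slot
    i = len(a) - 1
    # Swap with the parent (at i // 2) while the key is smaller than the parent's.
    while i > 1 and a[i] < a[i // 2]:
        a[i], a[i // 2] = a[i // 2], a[i]
        i //= 2
    return a


heap = [None, 3, 5, 9, 12, 6, 10, 15]
heap_insert(heap, 4)
```

Running this on the example heap moves 4 past 12 and then past 5, which reproduces the final vector in the diagrams.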

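So the code on the handout does the following: it puts the new element in the first free slot at the end of the vector, then bubbles it up, swapping it with its parent until its key is no smaller than its parent's key.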

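extract-min! works by returning the element at the root and then restoring heap order.

Here's what the code does (see handout): it saves the root element, copies the last element of the vector into the root position, and bubbles that element down, swapping it with its smaller child until heap order is restored.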

Example: Delete the top element from

        3
       / \
      /   \
     4     9    [3 4 9 5 6 10 15 12]
    / \   / \
   5   6 10  15
  /
12
This leaves two subheaps.  Copy last element to root:
     12
    /  \
   /    \
  4      9    [12 4 9 5 6 10 15]
 / \    / \
5   6  10 15
Bubble it down:
     4
    / \
   /   \
 12     9    [4 12 9 5 6 10 15]
 / \   / \
5   6 10 15

      4
     / \
    /   \
   5     9    [4 5 9 12 6 10 15]
  / \   / \
12   6 10 15
Again an O(log n) operation because the tree is always balanced.
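
The bubbling-down in this example can be sketched in Python as well. As before, this is a hedged illustration rather than the handout's code: it assumes a growable list with slot 0 unused, and the name heap_extract_min is hypothetical.

```python
def heap_extract_min(a):
    """Remove and return the minimum of a 1-based heap a (a[0] unused)."""
    if len(a) < 2:
        raise IndexError("Cannot extract from empty heap")
    smallest = a[1]
    last = a.pop()         # remove the last element of the vector
    n = len(a) - 1         # number of elements remaining
    if n >= 1:
        a[1] = last        # copy the last element to the root
        i = 1
        while 2 * i <= n:
            child = 2 * i
            # Pick the smaller of the two children, if there are two.
            if child + 1 <= n and a[child + 1] < a[child]:
                child += 1
            if a[i] <= a[child]:
                break      # heap order restored
            a[i], a[child] = a[child], a[i]
            i = child
    return smallest


heap = [None, 3, 4, 9, 5, 6, 10, 15, 12]
minimum = heap_extract_min(heap)
```

On the example heap this returns 3, and 12 sinks past 4 and then past 5, matching the last diagram.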

 


Heapsort

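We can use a priority queue to sort n numbers: insert all n numbers into an empty priority queue, then call extract-min! n times. The numbers come out in increasing order.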

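With the first list implementation (sorted list), this would take O(n) time for each of the n inserts and O(1) time for each of the n extract-mins.

With the second list implementation (unsorted list), this would take O(1) time for each of the n inserts and O(n) time for each of the n extract-mins.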

In each case the total is O(n^2).

With the heap implementation, each of the n inserts and each of the n extract-mins takes O(log n) time.

Thus, O(n log n) total cost.

This algorithm is called heapsort, and it is a reliable, standard sorting method.

If you have to sort by doing comparisons only, this is as fast as possible (up to a constant factor).
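
Sorting with a priority queue can be sketched in a few lines of Python using the standard library's heapq module, which provides a binary min-heap on top of a plain list. The function name pq_sort is introduced here for illustration, and note that heapq stores its root at index 0 rather than index 1.

```python
import heapq


def pq_sort(numbers):
    """Sort by inserting everything into a heap, then extracting the minimum n times."""
    pq = []
    for x in numbers:
        heapq.heappush(pq, x)          # O(log n) per insert
    # O(log n) per extraction; len(pq) is read once, before any pops.
    return [heapq.heappop(pq) for _ in range(len(pq))]


result = pq_sort([5, 3, 9, 1, 3])
```

This version builds a separate heap and a separate output list; textbook heapsort usually works in place within a single array, which is a refinement not shown here.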


Today's concepts: