Outline for the day:
Data structures are the basic tools of the trade. Choosing the right structures (and abstractions around them)
makes the difference between easy and hard.
Using set!, one can modify list structure. Recall
that a list with 3 elements internally looks like this:
(define mylist (list 1 2 3))
mylist: --------> o
                 / \
                /   \
               1     o
                    / \
                   /   \
                  2     o
                       / \
                      /   \
                     3    ()
The objects marked by "o" are just small blocks of memory containing two pointers, a head pointer and a tail pointer. These small blocks
of memory are traditionally called cons cells.
One can change pointers with set! as follows. Suppose you wish to link a new element 4 into this list between 1 and 2. One can first
create a list containing just 4:
(let ((new (list 4))) ...
new: ------------> o
                  / \
                 /   \
                4    ()
Now the tail pointer of this object, which currently points to the empty list, can be reset to point to where the tail pointer of the
first cons cell of mylist points:
(set! (tail new) (tail mylist))
new: ----------> o
                / \
               /   \
              4     \
                     \
mylist: --------> o  |
                 / \ |
                /   \|
               1     o
                    / \
                   /   \
                  2     o
                       / \
                      /   \
                     3    ()
Then the tail of mylist can be reset to point to new:
(set! (tail mylist) new)
mylist: --------> o
                 / \
                /   \
               1     o <----- new
                    / \
                   /   \
                  4     o
                       / \
                      /   \
                     2     o
                          / \
                         /   \
                        3    ()
All together:
(define mylist (list 1 2 3))
(let ((new (list 4)))
  (set! (tail new) (tail mylist))
  (set! (tail mylist) new))
mylist ==> (1 4 2 3)
Note that when manipulating pointers, the order in which we do things is important! If we do the two
set! statements in the opposite order, the tail of mylist already points to new by the time we read it,
so the tail of new is set to new itself and we get a circular list structure:
(define mylist (list 1 2 3))
(let ((new (list 4)))
  (set! (tail mylist) new)
  (set! (tail new) (tail mylist)))
mylist ==> (1 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 ...)
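The effect of the two orderings can be reproduced in a small Python sketch. Python is used here only so the example can be run directly; the Cons class and the helpers make_list and take are hypothetical names introduced for illustration, with None standing in for the empty list.

```python
class Cons:
    """A cons cell: two pointers, a head and a tail (None plays the role of ())."""
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail

def make_list(*items):
    """Build a chain of cons cells holding the given items in order."""
    result = None
    for item in reversed(items):
        result = Cons(item, result)
    return result

def take(cell, n):
    """Collect at most n heads, so a circular structure can be inspected safely."""
    out = []
    while cell is not None and len(out) < n:
        out.append(cell.head)
        cell = cell.tail
    return out

# Correct order: point new at the rest of the list first, then link it in.
mylist = make_list(1, 2, 3)
new = make_list(4)
new.tail = mylist.tail
mylist.tail = new
good = take(mylist, 10)

# Opposite order: mylist.tail is already new when it is read,
# so new ends up pointing at itself.
mylist2 = make_list(1, 2, 3)
new2 = make_list(4)
mylist2.tail = new2
new2.tail = mylist2.tail
circular = take(mylist2, 6)
is_cycle = new2.tail is new2
```

In the second case the cells holding 2 and 3 are no longer reachable from mylist2, which is why only 4 repeats.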
Stacks and Queues
Two of the basic data structures of computer science show up everywhere, and you have probably seen them already:
A stack is a data structure supporting the following operations:
We could also do it backwards--the head at the last element--but
that wouldn't help either: insert would be O(1), but delete and head would be O(n).
By being a little more clever, we can have both the head and
the tail available in constant time.
Keep pointers to both ends of the list.
Here's how to implement a queue with constant time insertions
and deletions.
Use a list structure with pointers to the beginning and end of the
list. We'll read from the front of the list, and add to the rear.
A priority queue is a data structure that maintains a collection of
elements, each with a key, which is some value chosen from a totally
ordered set (such as the natural numbers). A priority queue allows
efficient insertion of a new element and extraction of the element
with the minimum key. We will see an example of a priority queue in action in
PS4.
Operations on a priority queue:
One possible implementation of a priority queue (by no means the only
one) is a list in which the elements are ordered by increasing key
value. To insert a new element, we find the appropriate place to
insert it so as to maintain the order. The element with the minimum
key is then always at the head of the list.
We can then insert a new entry e into a global priority queue *q* as
follows:
Here is an alternative method that uses destructive list operations.
Don't confuse a priority queue with its implementation as a list. A
priority queue is just the data abstraction specified by its contract.
There are many ways to implement it; the list implementation above
is just one. We will see a different implementation using lists and a more efficient implementation
using heaps below.
Which is better? There is not really any difference: with one you
win on inserts but lose on extract-mins, and with the other it is
the reverse. Side note: set! on parameters passed to functions
doesn't always do what you want:
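BUT: For the list implementations above, either insert! or extract-min! is O(n).
This is more expensive than it needs to be. We can implement priority queues more efficiently with a heap:
a tree containing data entries at the nodes such that
1. the key at each node is no larger than the keys of its children, and
2. the tree is balanced: every level is full except possibly the last, which is filled from the left.
Property 1 is called heap order. Heaps will give O(log n) time for both insert! and extract-min!.
Let's ignore data for a bit. The numbers shown are just keys (priorities).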
Heaps are easily represented as vectors, or one-dimensional arrays,
like arrays in Java. The root of the tree is at location 1 in the array, and the
children of the node stored at array position i are at
locations 2i and 2i+1. (Read across the tree, row by row.) Heap order then translates to:
A[i] <= A[2i]
Crash course on Scheme vectors:
vector-ref, vector-set!, and vector-length
require constant time; the other operations require linear time (time
proportional to the length of the vector), not counting the time required to
evaluate the arguments.
We'll make our heaps with a limited capacity telling how many
elements the prioq can hold at once.
Insertion requires only O(log n) time: the tree has depth
ceil(log n), and we do a constant amount of work on each level.
So the code on the handout does the following:
extract-min! works by returning the element at the root.
Here's what the code does (see handout):
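Example: Delete the top element from the heap [3 4 9 5 6 10 15 12].
We can use a priority queue to sort n numbers: insert all n numbers, then call extract-min! n times.
With the first list implementation (sorted list), this would take O(n) per insert and O(1) per extract-min!.
With the second list implementation (unsorted list), this would take O(1) per insert and O(n) per extract-min!.
In each case the total is O(n^2). With the heap implementation, each insert and each extract-min! takes O(log n).
Thus, O(n log n) total cost.
It's called heapsort, and it's a reliable standard sorting method.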
If you have to sort by doing comparisons only, this is as fast as
possible (up to a constant factor).
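As a rough illustration of sorting with a priority queue, here is a minimal Python sketch that uses the standard library's heapq module as the heap. The function name pq_sort is hypothetical, and this is not the handout's code: it only shows the insert-everything, then extract-min-repeatedly pattern.

```python
import heapq

def pq_sort(numbers):
    """Sort by n inserts followed by n extract-mins on a binary heap."""
    heap = []
    for x in numbers:
        heapq.heappush(heap, x)          # O(log n) per insert
    # O(log n) per extract-min; the smallest remaining key comes out first.
    return [heapq.heappop(heap) for _ in range(len(heap))]

result = pq_sort([5, 3, 9, 1, 4])
```

Each of the 2n heap operations costs O(log n), which gives the O(n log n) total.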
(make-stack) -------- Make a new empty stack.
(push thing stack) -- Return a new stack with thing on top of `stack'.
(pop stack) --------- Return a stack like `stack', but without its top element.
(top stack) --------- Return the top element of the stack.
(empty? stack) ------ Return #t if the stack has no elements, #f otherwise.
They obey the contract:
(top (push thing stack)) = thing
(pop (push thing stack)) = stack
(empty? (make-stack)) = #t
(empty? (push thing stack)) = #f
We can implement a stack with lists:
push = pair
pop = tail
top = head
empty? = null?
This implementation is quite efficient: the operations all take O(1) time, independent of the size of the stack.
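The list-based stack and its contract can be sketched in Python as follows. This is an illustrative translation under the assumption that a two-element tuple plays the role of a cons cell and None plays the role of the empty list; the name is_empty stands in for empty?.

```python
def make_stack():
    return None                 # the empty stack

def push(thing, stack):
    return (thing, stack)       # one new pair; the old stack is shared, not copied

def pop(stack):
    return stack[1]             # the rest of the stack

def top(stack):
    return stack[0]             # the most recently pushed element

def is_empty(stack):
    return stack is None

s = push(2, push(1, make_stack()))
```

Every operation touches a single pair, so each runs in O(1) time regardless of the stack's size, and (pop (push thing stack)) returns the very same stack object.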
A queue supports operations analogous to those of a stack, but they do slightly different
things. It's like a line of people at a cafeteria--you can join at the back end of the line, but you leave at the front end.
FIFO -- first in-first out (maintains order)
(make-queue) ---------- make a new empty queue
(insert thing queue) -- put thing at the tail end of the queue
(delete queue) -------- delete the head of the queue
(head queue) ---------- return the head of the queue
(empty? queue) -------- test emptiness
We could implement a queue as a list with the head (oldest element) at the front.
head = head
delete = tail
empty? = null?
but then to insert:
(method ((thing <object>) (q <list>)) (append q (list thing)))
which is O(n) time. We have to walk all the way down the list.
This is expensive.
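(define <queue> <pair>)
;; A queue is a pair: its head points to the first cons cell of the
;; element list, and its tail points to the last cons cell.
(define (empty-queue? (q <queue>))
  (null? (head q)))
;; Allocate a fresh pair on every call; a quoted constant would be
;; shared by all queues and must not be mutated.
(define (make-queue) (list '()))
(define (queue-head (q <queue>))
  (if (empty-queue? q)
      (error "Cannot take head of empty queue")
      (head (head q))))
(define (insert x (q <queue>))
  (let ((new (list x)))
    (if (empty-queue? q)
        (set! (head q) new)
        (set! (tail (tail q)) new))
    (set! (tail q) new))
  q)
(define (delete (q <queue>))
  (if (empty-queue? q)
      (error "Cannot delete from empty queue")
      (set! (head q) (tail (head q))))
  q)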
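The same two-pointer idea, sketched in Python for readers who want to run it. The class names Node and Queue and the attribute names front and rear are assumptions made for this illustration; the logic mirrors the code above, including the fact that the rear pointer is only meaningful while the queue is non-empty.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class Queue:
    """FIFO queue with pointers to both ends of a singly linked list."""
    def __init__(self):
        self.front = None   # oldest element; we read and delete here
        self.rear = None    # newest element; we insert after it

    def is_empty(self):
        return self.front is None

    def head(self):
        if self.is_empty():
            raise IndexError("Cannot take head of empty queue")
        return self.front.value

    def insert(self, x):
        node = Node(x)
        if self.is_empty():
            self.front = node        # first element is both front and rear
        else:
            self.rear.next = node    # link after the current last node
        self.rear = node
        return self

    def delete(self):
        if self.is_empty():
            raise IndexError("Cannot delete from empty queue")
        self.front = self.front.next # rear is reset by the next insert if needed
        return self

q = Queue()
q.insert(1).insert(2).insert(3)
first = q.head()
```

Neither insert nor delete walks the list, so both take constant time.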
Priority Queues
(make-prioq) -- return empty structure
(insert! e pq) -- put entry e in pq
(extract-min! pq) -- Remove the highest-priority entry in pq (the one with the minimum key) and return it
Priority Queues and Lists -- First Implementation
(defstruct <entry>
(key <integer>)
(data <top>))
(define <priority-queue> <list>)
We can make a new queue entry with key n and data d by calling
(make-entry n d)
Here are two ways to insert a new element, one which does not use
destructive list operations, one which does.
(define (insert-1 (e <entry>) (q <priority-queue>))
  (cond ((null? q) (list e))
        ((< (entry-key e) (entry-key (head q))) (cons e q))
        (else (cons (head q) (insert-1 e (tail q))))))
As this recursive operation comes back out of the recursive calls, it
allocates a new cons cell in the cons call on the last line. The
old cons cells are no longer in use (provided nothing else still refers
to the old list) and can be garbage-collected.
(set! *q* (insert-1 e *q*))
Note that you cannot change the binding of *q* inside the insert-1
procedure if *q* is passed as a parameter, since (set! q ...) inside
the procedure will change the local binding, not the global one.
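A Python sketch of the non-destructive insertion may make the rebinding point concrete. It assumes entries are (key, data) tuples and that a list is a chain of (head, tail) tuples ending in None; the names insert_1 and to_list are illustrative only.

```python
def insert_1(entry, q):
    """Return a new sorted chain containing entry; q itself is not modified."""
    if q is None:
        return (entry, None)
    first, rest = q
    if entry[0] < first[0]:
        return (entry, q)                     # share the whole old chain
    return (first, insert_1(entry, rest))     # rebuild the prefix on the way back

def to_list(q):
    out = []
    while q is not None:
        out.append(q[0])
        q = q[1]
    return out

q = None
for e in [(5, "e"), (1, "a"), (3, "c")]:
    q = insert_1(e, q)        # the caller must rebind q to the returned chain

old = q
q2 = insert_1((2, "b"), q)    # old is untouched; q2 is a new chain
```

Because the function only returns a new chain, forgetting the assignment in the caller leaves the queue unchanged.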
(define (insert-2 (e <entry>) (q <priority-queue>))
  ;first check if new element should go at head of list
  (if (or (null? q) (< (entry-key e) (entry-key (head q))))
      (cons e q)
      ;if not, find element that it goes immediately after
      (letrec
          ((find-place (lambda ((q <priority-queue>))
             (if (or (null? (tail q)) (< (entry-key e) (entry-key (second q))))
                 q
                 (find-place (tail q))))))
        (let*
            ((pq (find-place q))
             (le (cons e (tail pq))))
          ;link new element in
          (set! (tail pq) le)
          ;return original list
          q))))
As above, to insert a new entry e into *q*, do
(set! *q* (insert-2 e *q*))
Version 2 has the minor advantage that it does not waste cons cells.
This is the version you would use if programming in C, since C does
not do garbage collection. However, since Scheme does garbage
collection, this advantage is far outweighed by the simplicity of
version 1.
Analysis
Priority Queues and Lists -- Second Implementation
(define mylist (list 1 2 3))
(define (alter1 (s <list>))
  (set! s '(4 5 6))
  s)
(alter1 mylist) ==> (4 5 6)
mylist ==> (1 2 3)
mylist doesn't change, because you are setting
the binding of the parameter s, not the argument mylist.
(define (alter2 (l <list>))
  (set! (tail l) '(4 5 6))
  l)
(alter2 mylist) ==> (1 4 5 6)
mylist ==> (1 4 5 6)
Here mylist does change, because alter2 modifies the cons cell that both l and mylist point to.
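The same distinction between rebinding a parameter and mutating the object it refers to shows up in most languages. A minimal Python sketch, with alter1 and alter2 written as loose analogues of the procedures above (slice assignment stands in for resetting the tail):

```python
def alter1(s):
    s = [4, 5, 6]        # rebinds the local name s only
    return s

def alter2(l):
    l[1:] = [4, 5, 6]    # mutates the list object that the caller also sees
    return l

mylist = [1, 2, 3]
r1 = alter1(mylist)
after1 = list(mylist)    # snapshot: mylist is unchanged by alter1
r2 = alter2(mylist)      # mylist itself is modified by alter2
```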
Warning!
Priority Queues and Heaps
        3
       / \
      /   \
     5     9
    / \   / \
  12   6 10  15
[3 5 9 12 6 10 15]
A[i] <= A[2i+1]
All operations on heaps will maintain heap order.
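The index arithmetic can be checked mechanically. The following Python sketch assumes the keys live in positions 1 through n of a list, with position 0 unused so that the children of position i sit at 2i and 2i+1; the function name is_heap is made up for this example.

```python
def is_heap(a, n):
    """Check heap order on a[1..n]; a[0] is unused so the root is at index 1."""
    for i in range(1, n + 1):
        for child in (2 * i, 2 * i + 1):
            if child <= n and a[i] > a[child]:
                return False
    return True

heap = [None, 3, 5, 9, 12, 6, 10, 15]          # the tree shown above
not_heap = [None, 3, 5, 9, 12, 6, 10, 15, 4]   # 4 placed below 12 breaks heap order
ok = is_heap(heap, 7)
broken = is_heap(not_heap, 8)
```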
(make-vector k) -- space for k things, indices 0 to k-1
(make-vector k e) -- same, but initialize all entries to e
(vector e1 ...) -- evaluate e1 ... and put them into a vector (analogous to list)
(vector-ref vec i) -- get i'th element in O(1) time
(vector-set! vec i e) -- put e at location i of vec
(vector-length vec) -- length of vec
(define <prioq> <vector>)
(define (make-prioq (size <integer>))
  (make-vector (+ size 1) 0))
prioq-insert! takes some work:
        3
       / \
      /   \
     5     9          [3 5 9 12 6 10 15 4]
    / \   / \
  12   6 10  15
  /
 4

        3
       / \
      /   \
     5     9          [3 5 9 4 6 10 15 12]
    / \   / \
   4   6 10  15
  /
12

        3
       / \
      /   \
     4     9          [3 4 9 5 6 10 15 12]
    / \   / \
   5   6 10  15
  /
12

        3
       / \
      /   \
     4     9          [3 4 9 5 6 10 15 12]
    / \   / \
   5   6 10  15
  /
12
This leaves two subheaps. Copy last element to root:
       12
       / \
      /   \
     4     9          [12 4 9 5 6 10 15]
    / \   / \
   5   6 10  15
Bubble it down:
        4
       / \
      /   \
    12     9          [4 12 9 5 6 10 15]
    / \   / \
   5   6 10  15

        4
       / \
      /   \
     5     9          [4 5 9 12 6 10 15]
    / \   / \
  12   6 10  15
Again an O(log n) operation because the tree is always balanced.
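The two heap operations walked through above can be sketched in Python as follows. This is not the handout's code: the class name Prioq, the explicit element count n, and the method names are assumptions made for illustration. The vector has one extra slot so that the root sits at index 1, as in make-prioq.

```python
class Prioq:
    def __init__(self, size):
        self.a = [0] * (size + 1)   # slot 0 unused; root lives at index 1
        self.n = 0                  # number of keys currently stored

    def insert(self, key):
        if self.n + 1 >= len(self.a):
            raise OverflowError("priority queue is full")
        self.n += 1
        i = self.n
        self.a[i] = key             # place the key in the next free leaf
        # Bubble up: swap with the parent while heap order is violated.
        while i > 1 and self.a[i] < self.a[i // 2]:
            self.a[i], self.a[i // 2] = self.a[i // 2], self.a[i]
            i //= 2

    def extract_min(self):
        if self.n == 0:
            raise IndexError("priority queue is empty")
        smallest = self.a[1]
        self.a[1] = self.a[self.n]  # copy the last element to the root
        self.n -= 1
        i = 1
        # Bubble down: swap with the smaller child while heap order is violated.
        while 2 * i <= self.n:
            child = 2 * i
            if child + 1 <= self.n and self.a[child + 1] < self.a[child]:
                child += 1
            if self.a[i] <= self.a[child]:
                break
            self.a[i], self.a[child] = self.a[child], self.a[i]
            i = child
        return smallest

    def keys(self):
        return self.a[1:self.n + 1]

pq = Prioq(8)
for k in [3, 5, 9, 12, 6, 10, 15]:
    pq.insert(k)
pq.insert(4)
after_insert = pq.keys()
smallest = pq.extract_min()
after_extract = pq.keys()
```

Both loops move one level per iteration, so each operation does work proportional to the depth of the tree.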
Heapsort
Today's concepts: