CS410, Summer 1998
Lecture 3 Outline
Dan Grossman

Goals:
  * To establish, once and for all, lists vs. arrays
  * Circularity
  * Queues
  * Double-ended Queues

Announcements:
  * See the web page for information on how to get email every time the
    web page is updated. This is an optional service provided by a
    fellow student.
  * Happy Canada Day.

Reading: CLR 11.1, 11.2

* What is an array?
  It is an object that holds a specified number of other objects of a
  given type. It supports 2 kinds of operations in O(1) time:
     get the i^th item (error if out of bounds)
     update the i^th item (error if out of bounds)
  These are so common that almost every language has them built in.
  How you would implement them -- grab a hunk of memory.
  Why you can't make your own in Java: No class you define can have an
  arbitrary number of objects without using arrays.
  Why Java (presumably) does new int[N] and new Foo[N] differently:
     * Sharing of objects (a change must be seen in all references)
     * Inheritance (an extension of Foo may take more space than a Foo)
  Why you can't make them longer or squeeze an element into the middle.
  What to do if they fill up -- error or resize.
  (This is presumably what Java's Vectors do -- you must know this to
  use them appropriately.)

* What is a (singly) linked list?
  It is a data structure that is either an empty list or an object that
  contains one other object and a reference to a (singly) linked list.
  (Notice how this is an inductive definition.)
  It supports these operations in O(1) time:
     is the list empty
     get/update the _first_ item (error if empty)
     get the reference to the rest of the list (error if empty)
  These are so common that many languages have them built in, and in
  the rest they're trivial to make.
  How you would implement them.
  Why you can't get the i^th item in O(1) time.
  They don't fill up.
  Why they take a constant factor more space than the corresponding
  array.
  Deleting a node is troublesome because we need the previous node to
  point to the next.
  If we are currently traversing the list, then we should just keep
  track of the previous pointer for this purpose. If we don't have this
  for some reason, and we already have a pointer to the node being
  deleted, then an O(1) solution is to mark the node as a zombie. Then
  all methods that traverse the list should check for zombies and
  delete them.

* Doubly linked list.
  Use only as necessary; learn to recognize when it is necessary.
  Not built into any language I know of, but still easy to make.
  They take a constant factor more space than a singly linked list, and
  most of the common operations take a constant factor more time.

  class DoublyLinkedListNode {
    Object val;
    DoublyLinkedListNode prev, next;
  }

  class DoublyLinkedList {
    DoublyLinkedListNode head;

    DoublyLinkedList () { head = null; }

    // insert obj at the front of the list
    void insert(Object obj) {
      DoublyLinkedListNode n = new DoublyLinkedListNode();
      n.val = obj;
      n.prev = null;
      n.next = head;          // link the new node to the old front
      if (head != null)
        head.prev = n;
      head = n;
    }

    // unlink n, which must be a node of this list
    void delete(DoublyLinkedListNode n) {
      if (n.prev != null)
        n.prev.next = n.next;
      else
        head = n.next;        // n was the first node
      if (n.next != null)
        n.next.prev = n.prev;
    }
  }

  Notice there is no need for zombie-like solutions, unlike in the
  singly linked list case.

* What about length/size: arrays remember their length in Java (they
  don't in C), so in Java it's O(1). For lists, it's generally
  recomputed in time O(n):

     int n = 0;
     while (l != null) {
       l = l.rest;
       n++;
     }

  We could keep size as a field in the List class, but this only works
  if there is no sharing between different List objects.

* Circularity is when a ListNode's rest pointer points back to
  somewhere in the same list! This would make the length code above go
  into an infinite loop.
  Circular lists are a space-efficient way for every element of a List
  to be able to follow pointers to every other element of the List --
  just have the last element point back to the first.
  How could we test whether a linked list is circular?
    * We could walk through the List and see if we ever reached the
      head again.
      However, this doesn't work for a list of this shape:

                     ___
                    /   \
          __________\___/

    * We could add a "marked" field to each node and see if we ever
      re-reached any node. This requires an extra field in our Node
      class, and so more space. Also, we have to remember to unmark
      everything when done or the next circle detection won't work
      right.
    * We could keep track of all visited nodes in a separate list and
      check against each of them as we walk through the list. This
      takes time O(n^2). [I didn't discuss this one in class.]
    * We could use the following cute trick:

        if (l == null || l.rest == null)
          return false;
        List slow = l, fast = l.rest;
        while (fast != slow) {
          if (fast.rest == null || fast.rest.rest == null)
            return false;
          fast = fast.rest.rest;
          slow = slow.rest;
        }
        return true;

      Analyzing this trick is a good exercise:
      Correctness: If the list is non-circular, fast will get to a null
        pointer and the code will return false. If the list is
        circular, eventually fast and slow will both be in the circle.
        After this happens, the distance fast must cover to catch up
        with slow will decrease by exactly 1 with each iteration. So
        fast and slow will eventually meet, and true will be returned.
      Efficiency: If the list is non-circular, fast will reach the end
        after about n/2 iterations of the O(1) loop body. If the list
        is circular, fast and slow will be in the circle after fewer
        than n iterations. Fast catching slow will also take fewer than
        n iterations. So in all cases the algorithm is O(n).

* ADT -- Stacks (lots of uses)
  Linked lists and arrays both do just fine and expose the trade-offs.
  Push and Pop are O(1) except in the array resizing case.
  We did this on Monday.

* ADT -- Queues (lots of uses -- think standing in line)
  A Queue has the following operations:
    * enqueue: put an object in
    * dequeue: remove the object that was least recently enqueued and
      has never been dequeued.
  Arrays work fine, just be careful to get the wrap-around right.
  Linked lists work fine; you might be tricked into thinking you need a
  doubly linked list, and you'd be wrong.
  You do need an end-pointer though!

  class ArrayQueue {
    private int head, tail;
    private Object[] theQueue;

    public ArrayQueue(int maxSize) { // assume no resizing
      head = tail = 0;
      // one extra element makes the code easier
      theQueue = new Object[maxSize+1];
    }
    public void enqueue(Object obj) {
      int next = (tail + 1) % theQueue.length; // or expand it out w/ an if
      if (next == head) {
        // error -- queue is full
        // just to make life easy we waste one space
        throw new IllegalStateException("queue is full");
      }
      theQueue[tail] = obj;
      tail = next;
    }
    public Object dequeue() {
      if (head == tail) {
        // error -- queue is empty
        throw new IllegalStateException("queue is empty");
      }
      Object ans = theQueue[head];
      head = (head + 1) % theQueue.length;
      return ans;
    }
    public boolean isEmpty() {
      return (head == tail);
    }
  }

  class ListQueue {
    private List head, tail;  // tail is meaningful only when head != null

    public ListQueue() { head = null; }

    public void enqueue(Object obj) {
      List l = new List();
      l.val = obj;
      l.rest = null;
      if (head == null) {
        head = tail = l;
      } else {
        tail.rest = l;
        tail = l;             // the new node is now the end of the list
      }
    }
    public Object dequeue() {
      if (head == null) {
        // error -- queue is empty
        throw new IllegalStateException("queue is empty");
      }
      Object ans = head.val;
      head = head.rest;
      return ans;
    }
    public boolean isEmpty() {
      return (head == null);
    }
  }

* ADT -- Double-ended Queues
  A double-ended queue is like a stack and a queue in one. We have
  operations insert, pop, and dequeue.
  Example uses:
    * Modeling how I write on the board: When I need to erase space, I
      either erase side issues in LIFO order or I erase the oldest
      "real" things.
    * A terrible grocery store practice of opening new checkout lines
      and taking people from the back of other lines.
  Arrays work fine, just like queues. The pop operation merely moves
  the tail index backwards instead of the head index forward.
  For lists, we really do need doubly-linked lists now. Otherwise
  either pop or dequeue (your choice) would take time O(n).
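
  To make the last point concrete, here is a minimal sketch of a
  double-ended queue on a doubly linked list. It is one possible
  arrangement, not the version from class: the names ListDeque and Node
  are made up for this example, and it assumes insert adds at the tail,
  pop removes from the tail, and dequeue removes from the head (matching
  the array description above). The prev pointer is what lets pop back
  the tail up in O(1).

```java
// Sketch only: ListDeque and Node are hypothetical names for this example.
class ListDeque {
    private static class Node {
        Object val;
        Node prev, next;
    }

    private Node head, tail; // head = oldest element, tail = newest element

    public boolean isEmpty() {
        return head == null;
    }

    // Add obj as the newest element: O(1).
    public void insert(Object obj) {
        Node n = new Node();
        n.val = obj;
        n.prev = tail;
        n.next = null;
        if (tail == null) {
            head = n;          // list was empty
        } else {
            tail.next = n;
        }
        tail = n;
    }

    // Remove and return the newest element (stack behavior): O(1).
    public Object pop() {
        if (tail == null) {
            throw new IllegalStateException("deque is empty");
        }
        Object ans = tail.val;
        tail = tail.prev;      // this step is what needs the prev pointer
        if (tail == null) {
            head = null;       // list became empty
        } else {
            tail.next = null;
        }
        return ans;
    }

    // Remove and return the oldest element (queue behavior): O(1).
    public Object dequeue() {
        if (head == null) {
            throw new IllegalStateException("deque is empty");
        }
        Object ans = head.val;
        head = head.next;
        if (head == null) {
            tail = null;       // list became empty
        } else {
            head.prev = null;
        }
        return ans;
    }
}
```

  With a singly linked list, pop would have to walk from the head to
  find the node before the tail, which is the O(n) cost mentioned above.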