CS410, Summer 1998
Lecture 6 Outline
Dan Grossman

Goals:
  * The importance of balanced binary trees
  * Red-black trees -- a way to keep BSTs balanced

Reading: CLR 14

* The BIG POINT: Last time we saw that a bunch of operations (max, min,
  predecessor, successor, insert, delete, lookup) can be made to run in
  O(h) on a BST, where h is the height of the tree. But n, the number of
  items in our data structure, is the real measure.

  How many nodes N(h) could there be in a tree of (maximum) height h?

      1*N(h-1) + 1  <=  N(h)  <=  2*N(h-1) + 1

  So by last week's hard work, we know instantly that N(h) is Omega(h)
  and O(2^h). That is, h is Omega(log n) and O(n). We really, really
  want it to be O(log n), guaranteed.

* If we had all the elements in advance, we could make an ideal tree by
  sorting, then taking recursive medians. In many applications this may
  be the way to go, but it doesn't work in the presence of dynamic
  inserts and deletes. What starts out balanced doesn't necessarily
  stay balanced.

* The plan:
  * Start "balanced enough" (see homework)
  * After an insert/delete, do O(log n) work to get back to balanced
    enough. O(log n) work => a walk up or down the tree doing O(1) work
    at each step.
  * Store extra information to get this to work.

* Balanced enough -- a red-black tree is a BST with the additional
  requirements that:
    1. every node is either red or black (i.e. one bit to keep track
       of this)
    2. every leaf is black => not really important
    3. if a node is red then its children are black
    4. the number of black nodes on every root-to-leaf path is equal
   (5. root is black => makes the code easier later)
  3 and 4 are the key ones.

  Claim: The lengths of all root-to-leaf paths are within a factor of
         two of each other.
         (Shortest is black,black,black,...
          Longest is red,black,red,black,...)
  Claim: Therefore, h is O(log n) (homework).
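A quick way to get a feel for properties 3, 4 and 5 is to write a checker for them. The following is only a minimal sketch in Python, not code from the course: the names Node, black_height and is_red_black are made up for illustration, and a null child is assumed to count as a black leaf (the CLR convention). It does not check the BST ordering itself.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"


@dataclass
class Node:
    # Hypothetical node layout: one key, one color bit, two children.
    key: int
    color: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def black_height(node):
    """Return the number of black nodes on every path from `node` down to a
    null leaf (the null leaf counts as black), or -1 if property 3 or 4
    fails somewhere in this subtree."""
    if node is None:
        return 1  # assumption: null leaves are black (property 2)
    lh = black_height(node.left)
    rh = black_height(node.right)
    if lh == -1 or rh == -1 or lh != rh:
        return -1  # property 4 fails: two paths see different black counts
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1  # property 3 fails: red node with a red child
        return lh
    return lh + 1


def is_red_black(root):
    """Check properties 3, 4 and 5 (not the BST ordering)."""
    if root is not None and root.color != BLACK:
        return False  # property 5 fails: root must be black
    return black_height(root) != -1
```

For example, a black root with two red children passes, while a black root whose red left child has a red child of its own fails on property 3.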
* Helper functions: Left-Rotate and Right-Rotate

  Idea: We can change the structure of the middle of the BST as follows
  and still have a BST (x,y are nodes; A,B,C are arbitrary subtrees;
  Right-rotate(y) goes left to right, Left-rotate(x) goes right to left):

          |                        |
          y                        x
         / \     <========>       / \
        x   C                    A   y
       / \                          / \
      A   B                        B   C

  Convince yourself this is legal: in both pictures the in-order
  sequence is A, x, B, y, C.

  Implementing it just requires being very careful with pointers:

  Right-rotate(y):   // assume y has a left child, else makes no sense
      x = y.left;
      Broot = x.right;      // A and C keep their parents; only B moves
      p = y.parent;

      y.left = Broot;
      if (Broot != null) Broot.parent = y;
      x.right = y;
      y.parent = x;
      x.parent = p;         // also right when p is null (x is the new root)
      if (p == null) root = x;
      else if (p.left == y) p.left = x;
      else p.right = x;

  Good -- 3 links changed and we did 6 assignments.
  Good -- this is O(1).
  Left-rotate is symmetric.

* Insert

  Do a normal insert and color the new node red. We are now possibly
  violating property 3, because the parent of the new node might be red.
  We will fix this in a pass to the root.

  6 cases (try in order, so in case i we can assume 0,...,i-1 aren't
  true). In the pictures, * is a black node and O is a red node.

  Case 0s:
    * Our parent is black -- then we're already done.
    * We're at root -- color ourselves black.
    (Note: in all remaining cases our parent is red, so the grandparent
     exists and must be black, else the original tree had a property 3
     violation.)

  Case 1: Uncle is red:
          change parent and uncle to black, change grandparent to red.
          Now property 3 may be violated higher in the tree!
          Recur with the grandparent.

            |                    |
            *                    O
           / \                  / \
          O   O     ====>      *   *
         / \ / \              / \ / \
        O                    O
       / \                  / \

          We have to recur because the great-grandparent might be red.

  Case 2: Left child of left child:
          make grandparent red, parent black,
          rotate grandparent right.
          (We know the uncle is black.)

              |                           |
             z *                         y *
            /   \                       /   \
         y O     * w     ======>     x O     O z
          / \    / \                  / \    / \
       x O   C  D   E                A   B  C   * w
        / \                                    / \
       A   B                                  D   E

          Convince yourself properties 3 and 4 are not violated.
          We're done -- no further recursion necessary.
  Case 3: Right child of left child:
          make self black, grandparent red,
          rotate parent left,
          rotate grandparent right.
          (We know the uncle is black.)

          |                       |                         |
         z *                     z *                       x *
        /   \                   /   \                     /   \
     y O     * w   ======>   x O     * w   =====>      y O     O z
      / \    / \              / \    / \                / \    / \
     A  x O D   E          y O   C  D   E              A   B  C   * w
        / \                 / \                                  / \
       B   C               A   B                                D   E

          Notice the second step is just case 2 with y instead of x!
          Convince yourself properties 3 and 4 are not violated.
          We're done -- no further recursion necessary.

  Case 4: Right child of right child:
          (just like 2 reversed: rotate grandparent left & color
           appropriately)

  Case 5: Left child of right child:
          (just like 3 reversed: rotate parent right, then grandparent
           left, etc.)

  Efficiency: Every case takes O(1), but case 1 requires recursion with
  the grandparent. Nonetheless, the height of the tree is O(log n), and
  each use of case 1 moves two levels up the tree. So the time is O(1)
  times (number of times we do case 1), and the product is O(log n).

  [Here we did an example of three insertions into a red-black tree.
   I've had enough fun with ASCII art for one day so I will omit the
   example here.]

* Delete

  Do normal BST delete. If we're deleting a node with two children, then
  we actually replace the node with the successor and delete the
  successor. In this case, the successor moves into the node's position
  but takes on the deleted node's color. So in all cases, we can just
  focus on how to delete a node with 0 or 1 children...

  If the node is red, we can just delete it as normal -- property 4 will
  still be satisfied.
  Also, if the node has a red child, we can just delete the node and
  change the child to black.

  So we just have to worry about the case where the node and both its
  children (possibly null) are black. We delete the node and replace it
  with a child. To maintain property 4, we place a token which counts
  as an extra black on the child used to replace the node.

  This token isn't legal though, so we need to get rid of it while
  maintaining property 4.

  There are many cases which will appear in tomorrow's lecture notes.
[We did some of them today, but I'll put them all together tomorrow.]
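
Returning to insert: the rotations and the six insert cases from this lecture can be put together in a short Python sketch. This is an illustration under stated assumptions, not the course's code. The names Node, RedBlackTree, _rotate and _fix are made up here, a null child is treated as black, duplicate keys go to the right, and the mirrored cases (2/4 and 3/5) are folded together with a left/right flag rather than written out twice.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED  # new nodes start out red
        self.left = self.right = self.parent = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, y, right):
        """right=True is Right-rotate(y), lifting y.left; False is the mirror."""
        x = y.left if right else y.right
        b = x.right if right else x.left  # subtree B, the only one that moves
        p = y.parent
        if right:
            y.left, x.right = b, y
        else:
            y.right, x.left = b, y
        if b is not None:
            b.parent = y
        y.parent, x.parent = x, p
        if p is None:
            self.root = x
        elif p.left is y:
            p.left = x
        else:
            p.right = x

    def insert(self, key):
        # Normal BST insert of a red node, then repair on the way up.
        parent, cur = None, self.root
        while cur is not None:
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        node = Node(key)
        node.parent = parent
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._fix(node)

    def _fix(self, node):
        while True:
            parent = node.parent
            if parent is None:  # case 0: we're at the root
                node.color = BLACK
                return
            if parent.color == BLACK:  # case 0: parent is black
                return
            grand = parent.parent  # exists and is black, since parent is red
            parent_is_left = grand.left is parent
            uncle = grand.right if parent_is_left else grand.left
            if uncle is not None and uncle.color == RED:  # case 1
                parent.color = uncle.color = BLACK
                grand.color = RED
                node = grand  # "recur" with the grandparent
                continue
            if (parent.left is node) != parent_is_left:  # cases 3 and 5
                self._rotate(parent, right=not parent_is_left)
                parent = node  # node now sits where its parent was
            # cases 2 and 4 (also the second step of cases 3 and 5)
            parent.color = BLACK
            grand.color = RED
            self._rotate(grand, right=parent_is_left)
            return
```

Inserting keys in sorted order, which would turn a plain BST into a linked list of height n, is a reasonable stress test for a sketch like this: the in-order walk should still come out sorted and the height should stay logarithmic.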