2-3 trees, B-trees
Notes by Dan Grossman

2-3 Trees:

These are a completely different way to achieve the all-important
O(log n) guarantee. (A 2-3 Tree is NOT a binary search tree.)

In a 2-3 tree,
 * All the data objects (keys and values) are at the leaves.
 * All leaves are at the same depth.
 * The leaves are in order left to right (but don't actually have
   pointers between them).
 * Every internal node has either 2 or 3 children, and stores the
   value of the smallest leaf in its subtree.

Claim: The depth of the tree is O(log n).

Proof: The tree is deepest when every internal node has 2 children.
In that case there are n leaves, n/2 nodes at the level above, n/4 at
the next level, then n/8, n/16, ..., 1. The sum is less than 2n (the
standard series 1 + 1/2 + 1/4 + 1/8 + ... converges to 2, and we have
n times part of that series). The tree is shallowest when every
internal node has 3 children. So the height is always between
log_3 n and log_2 n.

Search: Go to the child with the largest value that is less than or
equal to what we're looking for.

Min: go left (the key is at the root, but the value is at the
leftmost leaf!). Max: go right.

Predecessor: Go up until you have a closest left sibling, then find
the max of the tree at that sibling.
Successor: Go up until you have a closest right sibling, then find
the min of the tree at that sibling.
(Alternately, we could keep a doubly-linked list of the leaves. If
pred/succ are really the common operations, then keep only a
doubly-linked list. We will not need pred/succ in any other
operations like we did for red-black trees.)

Insert: Put the new leaf in the right place.
Problem: the parent might now have 4 children.

  x = parent;
  while (x has 4 children) {
    let x's children be A, B, C, D in that order
    make x_1 with 2 children A, B and x_2 with 2 children C, D
    remove x from its parent and add x_1 and x_2
      (if x was the root, make a new higher root)
    x = x.parent
  }

Note we might even change the root (increase the height). This
rebalancing is called splitting.

Delete: Take it out.
Problem: the parent might now have 1 child.

  x = parent;
  while (x has 1 child) {
    if x is the root, make the child the root
    if x has 3 closest nephews, move the closest one here; done
    if x has 2 closest nephews, move x's child there,
      delete x, and set x to x.parent
  }

This all works, and is thanks to your Dean of Engineering in 1970.

* Example: First notice that 2-3 trees are not unique for a set of
  keys. These are both legal 2-3 trees for the keys 8, 6, 7, 5, 3, 0, 9:

            0                         0
         /  |  \                   /  |  \
        0   5   7                 0   6   8
       / \ / \ /|\               /|\  /\  /\
      0  3 5 6 7 8 9            0 3 5 6 7 8 9

  Notice the leaves will always be in the same order though. The
  height may be different, although it isn't above.

  Insert 1 into the tree on the right:

              0                            0
           /  |  \        insert        /  |  \
          0   6   8       ======>      0    6   8
         /|\  /\  /\                 / | | \ /\  /\
        0 3 5 6 7 8 9               0 1 3 5 6 7 8 9

                        0                               0
        split        / | | \      split /            /     \
        =====>      0  3 6  8     new root          0       6
                   /\ /\ /\ /\    ========>        / \     / \
                  0 1 3 5 6 7 8 9                 0   3   6   8
                                                 / \ / \ / \ / \
                                                0  1 3 5 6 7 8 9

  (Note: It is NOT the case that we always have a binary tree after
  making a new root.)

  Insert 2 into result:

                           0
        insert          /     \
        ======>        0       6
                      / \     / \
                     0   3   6   8
                    /|\  /\  /\  /\
                   0 1 2 3 5 6 7 8 9

  Delete 9 from result:

                    0                             0
       delete    /     \         merge         /     \
       ======>  0       6        =====>       0       6
               / \     / \                   / \      |
              0   3   6   8                 0   3     6
             /|\  /\  /\  |                /|\  /\   /|\
            0 1 2 3 5 6 7 8               0 1 2 3 5 6 7 8

       merge /              0
       delete root       /  |  \
       ==========>      0   3   6
                       /|\  /\  /|\
                      0 1 2 3 5 6 7 8

  Delete 2 from result:

                       0
       delete       /  |  \
       ======>     0    3   6
                  / \   /\  /|\
                 0   1  3 5 6 7 8

* In fact 2-3-4 trees work just fine too. There's even a strange
  resemblance between red-black trees and 2-3-4 trees if you squint
  enough. This is a mathematical artifact of being balanced enough --
  do NOT confuse the two kinds of trees!

* What's the real trade-off?
  * No: "In a red-black tree, some nodes are near the top." (This is
    a pretty bogus argument, because only exponentially few of the
    nodes are near the top. It's an okay argument only if those are
    the ones accessed.)
  * Yes: 2-3 trees may be easier to code, but the constant factors
    (in terms of space and perhaps time) are worse.
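To make the "easier to code" claim concrete, here is a minimal,
illustrative Python sketch (not from the notes) of the leaf-based
structure, the search rule, and the split-on-insert loop described
above. All names are my own; delete/merge is omitted for brevity.

```python
# A minimal 2-3 tree sketch: keys live at the leaves, every node stores
# the minimum key of its subtree, and insert splits any node that ends
# up with 4 children (possibly growing a new root).

class Node:
    def __init__(self, children=None, key=None):
        self.children = children or []          # empty list => leaf
        self.key = key if key is not None else min(c.key for c in self.children)

    def is_leaf(self):
        return not self.children

def search(node, key):
    """Descend to the child with the largest min-key <= key."""
    while not node.is_leaf():
        nxt = node.children[0]                  # default: leftmost child
        for c in node.children[1:]:
            if c.key <= key:
                nxt = c
        node = nxt
    return node.key == key

def insert(root, key):
    """Insert key; returns the (possibly new) root."""
    if root is None:
        return Node(key=key)
    path, node = [], root
    while not node.is_leaf():                   # same descent as search
        path.append(node)
        nxt = node.children[0]
        for c in node.children[1:]:
            if c.key <= key:
                nxt = c
        node = nxt
    if node.key == key:
        return root                             # key already present
    new_leaf = Node(key=key)
    if not path:                                # root was a single leaf
        return Node(children=sorted([root, new_leaf], key=lambda n: n.key))
    parent = path[-1]
    parent.children.append(new_leaf)
    parent.children.sort(key=lambda n: n.key)
    # Walk back up, fixing min-keys and splitting 4-child nodes.
    for i in range(len(path) - 1, -1, -1):
        x = path[i]
        x.key = x.children[0].key
        if len(x.children) == 4:
            a, b, c, d = x.children
            x1, x2 = Node(children=[a, b]), Node(children=[c, d])
            if i == 0:                          # x was the root: grow tree
                return Node(children=[x1, x2])
            grand = path[i - 1]
            j = grand.children.index(x)
            grand.children[j:j + 1] = [x1, x2]  # replace x by x1, x2
    return root
```

Inserting 8, 6, 7, 5, 3, 0, 9 in that order yields exactly the
left-hand example tree (root children with minimums 0, 5, 7).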
  (Just remember we have as many as 2n nodes in our tree.)

B-trees

* There's a generalization of 2-3 trees called B-trees. It turns out
  everything still works just fine by making 2 => k and 3 => 2k (or
  2k-1 if you prefer). Now the height is between log_2k n and
  log_k n.

* Other minor changes:
  * Use an array of child pointers (for convenience, or maybe for
    binary search).
  * Move minimum values to the parent (to avoid following child
    pointers -- we could have done this for 2-3 trees too. In B-tree
    applications it's more important. We'll see why in a minute.)

Now search is log_2 k * log_k n, and insert and delete are basically
k log_k n.

What is k? It's chosen so one node fits exactly on one page of disk.
(This is also why we moved the minimum values to the parents.)
Computation is so much cheaper than bringing in pages from disk that
we should minimize the number of pages touched. (So things like
binary search on the array are completely irrelevant.) This is great
for B-trees -- we have the space, we just don't want to touch it.
Red-black trees don't generalize to k, and would touch more nearby
nodes when rebalancing. As a result, B-trees are very common in
databases.
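To see how "one node per disk page" determines k, here is a
back-of-the-envelope sketch in Python. The concrete numbers (4 KB
pages, 8-byte keys and child pointers) are illustrative assumptions,
not values from the notes.

```python
# Choosing k so a B-tree node fills one disk page, then seeing how
# shallow the tree gets. All sizes below are assumed for illustration.
import math

PAGE_BYTES = 4096      # one disk page (assumption)
KEY_BYTES  = 8         # size of one key stored in a node (assumption)
PTR_BYTES  = 8         # size of one child pointer (assumption)

# A node with up to 2k children stores about 2k (key, pointer) pairs,
# so pick the largest 2k that fits in a page.
max_children = PAGE_BYTES // (KEY_BYTES + PTR_BYTES)   # 2k = 256
k = max_children // 2                                  # k  = 128

def worst_case_height(n, k):
    """Height when every node is as empty as allowed (k children each),
    i.e. the log_k n end of the log_2k n .. log_k n range."""
    return math.ceil(math.log(n, k))

# A billion keys need only about 5 levels: a handful of page reads.
print(k, worst_case_height(10**9, k))
```

Compare with a balanced binary tree, where a billion keys would mean
about 30 levels -- and potentially 30 page reads.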