# Balanced Trees
In this lab, we will further explore binary search trees and red-black
trees. We'll also take a brief look at tree traversals. As an optional
but recommended exercise at the end, you can implement dictionaries
using red-black trees.
## Binary search trees
##### Exercise: draw BST [✭✭]
Draw binary search trees of height 3, 4, 5, 6, and 7
for the set {1, 4, 5, 10, 16, 17, 21}.
The height of a tree is the maximum number of nodes
on any path from its root to a leaf.
□
##### Exercise: search sequence [✭✭]
Suppose we have the numbers 1 through 1000 in a binary search
tree and want to search for the number 363. Which of the following
could *not* be the sequence of nodes examined?
A. 2, 252, 401, 398, 330, 344, 397, 363
B. 924, 220, 911, 244, 898, 258, 362, 363
C. 925, 202, 911, 240, 912, 245, 363
D. 2, 399, 387, 219, 266, 382, 381, 278, 363
E. 935, 278, 347, 621, 399, 392, 358, 363
□
##### Exercise: functorized BST [✭✭✭]
Our implementation of BSTs in lecture assumed that it was okay
to compare values using the built-in comparison operators `<`, `=`,
and `>`. But what if the client of the `Set` abstraction wanted to
use their own comparison operators? (e.g., to ignore case in strings,
or to have sets of records where only a single field of the record
was used for ordering.) Reimplement the `BstSet` abstraction as a
functor parameterized on a structure that enables client-provided
comparison operator(s), much like the [standard library `Set`][stdlib-set].
[stdlib-set]: http://caml.inria.fr/pub/docs/manual-ocaml/libref/Set.html
□
## Tree traversals
Suppose you wanted to convert a tree to a list. You'd have to
put the values stored in the tree in some order. Here are three
ways of doing that:
* *preorder*: each node's value appears in the list before the values of its
left then right subtrees.
* *inorder*: the values of the left subtree appear, then the value at the node,
then the values of the right subtree.
* *postorder*: the values of a node's left then right subtrees appear, followed by
the value at the node.
Here is code that implements those *traversals*, along with
some example applications:
```
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
let rec preorder = function
| Leaf -> []
| Node (l,v,r) -> [v] @ preorder l @ preorder r
let rec inorder = function
| Leaf -> []
| Node (l,v,r) -> inorder l @ [v] @ inorder r
let rec postorder = function
| Leaf -> []
| Node (l,v,r) -> postorder l @ postorder r @ [v]
let t =
Node(Node(Node(Leaf, 1, Leaf), 2, Node(Leaf, 3, Leaf)),
4,
Node(Node(Leaf, 5, Leaf), 6, Node(Leaf, 7, Leaf)))
(*
t is
4
/ \
2 6
/ \ / \
1 3 5 7
*)
let () = assert (preorder t = [4;2;1;3;6;5;7])
let () = assert (inorder t = [1;2;3;4;5;6;7])
let () = assert (postorder t = [1;3;2;5;7;6;4])
```
##### Exercise: efficient traversal [✭✭✭]
On unbalanced trees, the traversal functions above require quadratic
worst-case time (in the number of nodes), because of the `@` operator.
Re-implement the functions without `@`, and instead using `::`, such
that they perform exactly one cons per `Node` in the tree. Thus the
worst-case execution time will be linear. You will need to add an
additional accumulator argument to each function, much like with tail
recursion. (But your implementations won't actually be tail recursive.)
□
## Red-black trees
##### Exercise: RB draw complete [✭✭]
Draw the perfect binary tree on the values 1, 2, ..., 15.
Color the nodes in three different ways such that (i) each
way is a red-black tree (i.e., satisfies the red-black invariants),
and (ii) the three ways create trees with black heights of
2, 3, and 4, respectively. The *black height* of a tree
is the maximum number of black nodes along any path from its
root to a leaf.
□
##### Exercise: RB draw insert [✭✭]
Draw the red-black tree that results from inserting
the characters D A T A S T R U C T U R E into an empty tree.
Carry out the insertion algorithm yourself by hand, then check
your work with the implementation provided in lecture.
□
##### Exercise: RB rotate [✭✭✭]
Re-implement `insert` for red-black trees yourself, without any notes
except for [this picture of rotations](rbrot.jpg) [source: *Purely
Functional Data Structures*, Chris Okasaki, 1998]. Especially, challenge
yourself to implement the pattern-matching involved in `balance` based
solely on that picture.
□
##### Exercise: time performance [✭]
Using the lecture code, re-run the experiments shown in lecture.
That is, run `test 50_000` to find out how long the implementations
take on your own machine. You might first also try smaller workloads,
such as 1,000 or 10,000 elements.
□
##### Exercise: space performance [✭✭✭✭]
In lecture we saw a comparison of three `Set` implementations in
terms of their time performance on two workloads. But what about
their space performance—that is, how much memory they use?
Modify the lecture code to conduct such an experiment. Use
the [standard library `Gc` module][stdlib-gc] to get statistics
on memory usage. `Gc.minor_words` would be a good place to start.
[stdlib-gc]: http://caml.inria.fr/pub/docs/manual-ocaml/libref/Gc.html
□
## Additional exercises
##### Exercise: standard library set [✭✭, optional]
Read the [source code][stdlib-set-ml] of the standard library `Set` module.
Find the representation invariant for the balanced trees that it uses.
Which kind of tree does it most resemble: 2-3, AVL, or red-black?
[stdlib-set-ml]: https://github.com/ocaml/ocaml/blob/trunk/stdlib/set.ml
□
##### Exercise: dictionary [✭✭✭, optional but recommended]
We've so far used balanced trees to implement sets. But they can
also efficiently implement maps aka dictionaries. Just add
an additional "key" component to each `Node`: instead of one value,
now you have one key and one value at each node. For example,
```
type ('k,'v) tree =
| Leaf
| Node of ('k,'v) tree * 'k * 'v * ('k,'v) tree
```
Modify the red-black tree implementation of sets to produce an
implementation of dictionaries. Your implementation should
match this interface:
```
module type Dictionary = sig
type ('k,'v) t
val empty : ('k,'v) t
val mem : 'k -> ('k,'v) t -> bool
val insert : 'k -> 'v -> ('k,'v) t -> ('k,'v) t
val find : 'k -> ('k,'v) t -> 'v option
val bindings : ('k,'v) t -> ('k*'v) list
val of_list : ('k*'v) list -> ('k,'v) t
end
```
The names of the functions above are meant to suggest their
specifications, which you should write yourself.
□
* * *
**Acknowledgement:** this lab uses some problems from *Introduction
to Algorithms*, first edition, by Cormen, Leiserson, and Rivest, 1990.