All of the data structures we have covered in this course so far
are *functional*. You can access old versions of these data
structure, even when you update them. For example, you can cons an
element onto a list without affecting references to the original list.

We have seen efficient functional implementations for stacks, queues, and binary trees. Can we come up with an efficient implementation of functional arrays, too? It turns out that we can, but we will need to use imperative language features. Our array will be implemented using an OCaml array for efficiency, but we need to be careful that updates to the array do not affect previous versions.

Let's define a signature:

module type FUN_ARRAY = sig type 'a t (* make n x creates an array of length n such that each element is x *) val make : int -> 'a -> 'a t (* get a i returns element i in the array a. Requires: 0 <= i < length a *) val get : 'a t -> int -> 'a (* set a i x is the array a with x inserted at index i Requires: 0 <= i < length a *) val set : 'a t -> int -> 'a -> 'a t end

Let's define the `set`

function so that it makes a copy
of the array it was given, then updates the copy.

module FunArray : FUN_ARRAY = struct type 'a t = 'a array let make = Array.make let get = Array.get let set (a : 'a t) (i : int) (x : 'a) : 'a t = let a' = Array.copy a in a'.(i) <- x; a' endThis array is purely functional, but it's slow, because we make a new array every time call

`set`

.
Say we execute `FunArray.set a i x`

. Let's imperatively
set `a.(i) <- x`

. Then, let's update all of the old
references to `a`

so that they point to the old version of
`a`

.

How are we going to update the old references to `a`

?
One idea is to make `'a t`

a `ref`

. Every time
`FunArray.set`

is called, we change the value stored at
`a`

so that it has the same mappings as the original
array. Thus the old references to `a`

will still refer to
the old version of `a`

, even though the underlying array
is different.

type 'a t = 'a data ref and 'a data = Arr of 'a array | Diff of int * 'a * 'a t

An array can either be an `Arr(arr)`

, which has the
same mappings as the OCaml array `arr`

, or it can be a
`Diff(i, x, a')`

, which has all of the same mappings as the
functional array a', exception with the mapping `i -> x`

.
We then define the type `'a t`

so that it is a
`ref`

to a value of type `data`

.

Getting a value from such an array is simple.

let rec get (a : 'a t) (i : int) : 'a = match !a with Arr arr -> arr.(i) | Diff(j, x, a') -> if i = j then x else get a' iIf the array points to an

`Arr(arr)`

, then we just look up
the given index in `arr`

. If it points to a ```
Diff(j,
x, a')
```

, then we see if `j`

is the index we are
searching for. If it is, then we return `x`

, the value it
maps to.
Setting a value is more complicated

let set (a : 'a t) (i : int) (x : 'a) : 'a t = match !a with Arr arr -> let a' = ref (Arr arr) in a := Diff(i, arr.(i), a'); arr.(i) <- x; a' | Diff _ -> ref (Diff(i, x, a))

If `a`

points to a `Diff`

, then we return a
reference to a new `Diff`

which maps `i`

to
`x`

.

If `a`

points to an `Arr(arr)`

, then we
imperatively set `arr.(i) <- x`

. This operation invalidates
the old references to `a`

. These old references to
`a`

ought to refer to the old version of `a`

before the update. To get around this problem, we add a level of
indirection. We update the value stored at `a`

so that it
contains a `Diff`

preserving the old mapping of
`i`

.

`Diff`

nodes before
you reach the `Arr`

. There will be one of these
`Diff`

nodes per update.
We can improve performance when a user stops using the newest
version of the array and starts using an old version instead (for
example, in a backtracking algorithm). Every time a user sets or gets
an element of an array `a`

, we reverse the list of
`Diff`

nodes on the path to the `Arr`

so that
`a`

points directly to the `Arr`

. This operation
is known as *rerooting*.

(* Effects: reverses the list of Diff nodes along the path to the Arr *) let rec reroot (a : 'a t) : unit = match !a with Arr _ -> () | Diff(i, x, a') -> reroot a'; match !a' with Diff _ -> failwith "impossible" | Arr(arr) -> a := Arr(arr); a' := Diff(i, arr.(i), a); arr.(i) <- x

Every time the user calls `get`

or `set`

, we
reroot the tree. Note that this implementation is still inefficient
for some use cases. If a user frequently accesses many different
versions of an array, then rerooting will occur every time, and
performance will suffer. Another problem is that our definition of
`reroot`

is not tail recursive (it is possible to fix this
problem).

For more details, please see this paper on which this recitation is based. They use functional arrays to implement a functional disjoint set, a data structure that we will cover later in the course.

Here is our final implementation.

module FunArray : FUN_ARRAY = struct type 'a t = 'a data ref and 'a data = Arr of 'a array | Diff of int * 'a * 'a t let make (n : int) (x : 'a) : 'a t = ref (Arr (Array.make n x)) (* Effects: reverses the list of Diff nodes along the path to the Arr *) let rec reroot (a : 'a t) : unit = match !a with Arr _ -> () | Diff(i, x, a') -> reroot a'; match !a' with Diff _ -> failwith "impossible" | Arr(arr) -> a := Arr(arr); a' := Diff(i, arr.(i), a); arr.(i) <- x let rec get (a : 'a t) (i : int) : 'a = reroot a; match !a with Diff(j, x, a') -> failwith "impossible" | Arr arr -> arr.(i) let rec set (a : 'a t) (i : int) (x : 'a) : 'a t = reroot a; match !a with Arr arr -> let a' = ref (Arr arr) in a := Diff(i, arr.(i), a'); arr.(i) <- x; a' | Diff _ -> ref (Diff(i, x, a)) end