Mutable data types

Today we will practice with mutable fields, refs, and physical equality. We'll learn about arrays and loops. As a challenge exercise, you can implement doubly-linked lists.

Mutable fields and refs

Exercise: mutable fields [✭]

Define an OCaml record type to represent student names and GPAs. It should be possible to mutate the value of a student's GPA. Write an expression defining a student with name "Alice" and GPA 3.7. Then write an expression to mutate Alice's GPA to 4.0.

□

Exercise: refs [✭]

Give OCaml expressions that have the following types. Use utop to check your answers.

bool ref
int list ref
int ref list

□

Exercise: inc fun [✭]

Define a reference to a function as follows:

# let inc = ref (fun x -> x+1);;

Write code that uses inc to produce the value 3110.

□

Exercise: addition assignment [✭✭]

The C language and many languages derived from it, such as Java, has an addition assignment operator written a += b and meaning a = a + b. Implement such an operator in OCaml; its type should be int ref -> int -> unit. Here's some code to get you started:

let (+:=) x y = ...

And here's an example usage:

# let x = ref 0;;
# x +:= 3110;;
# !x
- : int = 3110

□

Equality and mutability

Recall that OCaml has two different notions of equality:

Physical equality (==) tests whether two mutable values (references, records, or arrays) have the same identity.
Structural equality (=) tests whether the current contents of a reference are equivalent.

The specification given by Pervasives.(==) has this to say about physical equality:

e1 == e2 tests for physical equality of e1 and e2. On mutable types such as references, arrays, ..., and records with mutable fields, e1 == e2 is true if and only if physical modification of e1 also affects e2. On non-mutable types, the behavior of ( == ) is implementation-dependent; however, it is guaranteed that e1 == e2 implies compare e1 e2 = 0.

Exercise: physical equality [✭✭]

Define x, y, and z as follows:

let x = ref 0
let y = x
let z = ref 0

Predict the value of the following series of expressions:

# x == y;;
# x == z;;
# x = y;;
# x = z;;
# x := 1;
# x = y;;
# x = z;;

Check your answers in utop.

□

Arrays

OCaml arrays generalize a single ref cell to a sequence of mutable values.

# let v = [| 0.; 1. |];;
v : float array = [| 0.; 1. |]

# v.(0);;
- : float = 0.

# v.(0) <- 5.;;
- : unit = ()

# v;;
- : float array = [| 5.; 1. |]

The Array module contains useful functions for creating and working with arrays.

For the next couple exercises, let's use the following type:

(* AF: the float array [| x1; ...; xn |] represents the 
 *     vector (x1, ..., xn) 
 * RI: the array is non-empty *)
type vector = float array

The Euclidean norm of an $n$ -dimensional vector $x = (x_1, \ldots, x_n)$ is written $|x|$ and is defined to be

$\sqrt{x_1^2 + \cdots + x_n^2}.$

Exercise: norm [✭✭]

Write a function norm : vector -> float that computes the Euclidean norm of a vector. Your function should not mutate the input array. Hint: although your first instinct is likely to reach for a loop, instead try to use Array.map and Array.fold_left or Array.fold_right.

□

Every vector can be normalized by dividing each component by $|x|$ ; this yields a vector with norm 1:

$\left(\frac{x_1}{|x|}, \ldots, \frac{x_n}{|x|}\right)$

Exercise: normalize [✭✭]

Write a function normalize : vector -> unit that normalizes a vector "in place" by mutating the input array. Here's a sample usage:

# let a = [|1.; 1.|];;
val a : float array = [|1.; 1.|]
# normalize a;;
- : unit = ()
# a;;
- : float array = [|0.7071...; 0.7071...|]

Hint: Array.iteri.

□

Loops

When working with arrays, it is often convenient to write traditional for or while loops. The "Loops" paragraph of section 6.7.2 "Control Structures" in the OCaml manual contains the syntax and semantics of loops. Here is a brief summary:

while e1 do e done
for x = e1 to e2 do e done
for x = e1 downto e2 do e done

Each of these three expressions evaluates the expression between do and done for each iteration of the loop; while loops terminate when e1 becomes false; for loops execute once for each integer from e1 to e2; for..to loops evaluate starting at e1 and incrementing x each iteration; for..downto loops evaluate starting at e1 and decrementing x each iteration. All three expressions evaluate to () after the termination of the loop. Because they always evaluate to (), they are less general than folds, maps, or recursive functions.

Loops are themselves not inherently mutable, but they are most often used in conjunction with mutable features like arrays—typically, e causes side effects.

Exercise: normalize loop [✭✭]

Modify your implementation of normalize to use one of the looping expressions.

□

Additional exercises

Exercise: norm loop [✭✭]

Modify your implementation of norm to use one of the looping expressions. Here is pseudocode for what you should do:

initialize norm to 0.0
loop through array
  add to norm the square of the current array component
return sqrt of norm

□

Exercise: imperative factorial [✭✭]

Write a factorial function that uses a loop expression and refs to compute the factorial of its input.

□

Here's a neat trick that's possible with refs: we can build recursive functions without ever using the keyword rec. Suppose we want to define a recursive function such as fact:

let rec fact n = 
  if n = 0 then 1 else n * fact (n-1)

Abstracting a little, that function has the following form:

let rec f x = 
  ... some code including a recursive call [f y] from some argument [y] ...

We can instead write the following:

let f0 = ref (fun x -> x)
let f x = ... replace [f y] with [!f0 y] ...
f0 := f

Now f will compute the same result as it did in the version where we defined it with rec. What's happening here is sometimes called "tying the recursive knot": we update the reference to f0 to point to f, such that when f dereferences f0, it gets itself back! The initial function to which we made f0 point (in this case the identity function) doesn't really matter; it's just there as a placeholder until we tie the knot.

Exercise: tie the knot [✭✭✭]

Define the factorial function without using any loops and without the rec keyword.

□

Exercise: init matrix [✭✭✭]

The array module contains two functions for creating an array: make and init. make creates an array and fills it with a default value, while init creates an array and uses a provided function to fill it in. The library also contains a function make_matrix for creating a two-dimensional array, but it does not contain an analogous init_matrix to create a matrix using a function for initialization. Write a function init_matrix : int -> int -> (int -> int -> 'a) -> 'a array array such that init_matrix n o f creates and returns an n by o matrix m with m.(i).(j) = f i j for all i and j in bounds. See the documentation for make_matrix for more information on the representation of matrices as arrays.

□

Challenge exercise: Doubly-linked lists

Consider the following types and functions for mutable doubly-linked lists:

(* An ['a node] is a node of a mutable doubly-linked list.
 * It contains a value of type ['a] and optionally has 
 * pointers to previous and/or next nodes. *)
type 'a node = {
  mutable prev : 'a node option;
  mutable next : 'a node option;
  value : 'a
}

(* An ['a dlist] is a mutable doubly-linked list with elements 
 * of type ['a].  It is possible to access the first and 
 * last elements in constant time.  
 * RI: The list does not contain any cycles. *)
type 'a dlist = {
  mutable first : 'a node option;
  mutable last : 'a node option;
}

(* [create_node v] is a node containing value [v] with
 * no links to other nodes. *)
let create_node v = {prev=None; next=None; value=v}

(* [empty_dlist ()] is an empty doubly-linked list. *)
let empty_dlist () = {first=None; last=None}

(* [create_dlist n] is a doubly-linked list containing
 * exactly one node, [n]. *)
let create_dlist (n: 'a node) : 'a dlist = {first=Some n; last=Some n}

(* [insert_first d n] mutates dlist [d] by
 * inserting node [n] as the first node. *)
let insert_first (d: 'a dlist) (n: 'a node) : unit =
  failwith "unimplemented"

(* [insert_last d n] mutates dlist [d] by
 * inserting node [n] as the last node. *)
let insert_last (d: 'a dlist) (n: 'a node) : unit =
  failwith "unimplemented"

(* [insert_after d n1 n2] mutates dlist [d] by
 * inserting node [n2] after node [n1]. *)
let insert_after (d: 'a dlist) (n1: 'a node) (n2: 'a node) : unit =
  failwith "unimplemented"

(* [insert_before d n1 n2] mutates dlist [d] by
 * inserting node [n2] before node [n1]. *)
let insert_before (d: 'a dlist) (n1: 'a node) (n2: 'a node) : unit =
  failwith "unimplemented"

(* [remove d n] mutates dlist [d] by removing node [n].
 * requires: [n] is a node of [d]. *)
let remove (d: 'a dlist) (n: 'a node) : unit =
  failwith "unimplemented"

(* [iter_forward d f] on a dlist [d] which has 
 * elements n1; n2; ... is (f n1); (f n2); ... *)
let iter_forward (d: 'a dlist) (f: 'a -> unit) : unit =
  failwith "unimplemented"

(* [iter_backward d f] on a dlist [d] which has 
 * elements n1; n2; ... is ...; (f n2); (f n1) *)
let iter_backward (d: 'a dlist) (f: 'a -> unit) : unit =
  failwith "unimplemented"

Exercise: doubly linked list [✭✭✭✭]

Complete the implementations above. Test your code.

Hint: draw pictures! Reasoning about mutable data structures is typically easier if you draw a picture.

□