# Modules and Printing
We'll start by learning about how printing works in OCaml.
Then we'll use the OCaml module system to implement some queues and
dictionaries. Finally, we'll see how compilation of modules interacts with utop and
learn some new utop directives related to modules.
## Printing
The `Pervasives` module has built-in printing functions for several of
the built-in primitive types: `print_char`, `print_int`, `print_string`,
and `print_float`. There's also a `print_endline` function, which is
like `print_string` but also outputs a newline.
##### Exercise: hello world [✭]
Type the following in utop:
```
# print_endline "Hello world!";;
# print_string "Hello world!";;
```
Notice the difference in output from each.
□
Let's look at the type of those functions:
```
# print_endline;;
- : string -> unit =
# print_string;;
- : string -> unit =
```
They both take a string as input and return a value of type `unit`,
which we haven't seen before. There is only one value of this type,
which is written `()` and is also pronounced "unit". So `unit` is like
`bool`, except there is one fewer value of type `unit` than there is of
`bool`. Unit is therefore used when you need to take an argument or
return a value, but there's no interesting value to pass or return. Unit
is often used when you're writing or using code that has side effects.
Printing is an example of a side effect: it changes the world and can't
be undone.
If you want to print one thing after another, you could sequence some
print functions using nested let expressions:
```
let () = print_endline "THIS" in
let () = print_endline "IS" in
print_endline "3110"
```
The `()` in `let () =` is a pattern, the unit pattern, which matches
only the unit value. The entire expression evaluates to whatever
`print_endline 3110` returns, which is the unit value `()`, hence the
entire expression has type `unit`. But the evaluation has the side
effect of printing `THIS IS 3110` over multiple lines of output.
The boilerplate of all the `let () = ... in` above is annoying to have
to write! We don't really care about matching the unit value; after
all, there's only one thing it could be. So there's a special syntax
that can be used to chain together multiple functions who return unit.
The expression `e1; e2` first evaluates `e1`, which should evaluate to
`()`, then discards that value, and evaluates `e2`. So we could rewrite
the above code as:
```
print_endline "THIS";
print_endline "IS";
print_endline "3110"
```
And that is far more idiomatic code.
If `e1` does not have type `unit`, then `e1;e2` will give a warning, because
you are discarding useful values. If that is truly your intent, you can call
the built-in function `ignore : 'a -> unit` to convert any value to `()`:
```
# 3; 5;;
Warning 10: this expression should have type unit.
- : int = 5
# ignore 3; 5;;
- : int = 5
```
##### Exercise: print int list rec [✭✭]
Write a function `print_int_list : int list -> unit` that
prints its input list, one number per line. For example,
`print_int_list [1;2;3]` should result in this output:
```
1
2
3
```
Here is some code to get you started:
```
let rec print_int_list = function
| [] -> ()
| h::t -> (* fill in here *);
print_int_list t
```
□
##### Exercise: print int list iter [✭✭]
Write a function `print_int_list' : int list -> unit` whose
specification is the same as `print_int_list`. Do not
use the keyword `rec` in your solution, but instead to use
the [List module][list] function `List.iter`.
Here is some code to get you started:
```
let print_int_list' lst =
List.iter (fun x -> (* fill in here *)) lst
```
[list]: http://caml.inria.fr/pub/docs/manual-ocaml/libref/List.html
□
## Queues
Create a file named `myqueue.ml`. In it, put the following code:
```
module ListQueue = struct
type 'a queue = 'a list
let empty = []
let is_empty q = (q = [])
let enqueue x q = q @ [x]
let peek = function
| [] -> failwith "Empty"
| x::_ -> x
let dequeue = function
| [] -> failwith "Empty"
| _::q -> q
end
```
Launch utop and `#use "myqueue.ml"`. You will then be able to
create and use `ListQueue`, for example:
```
# let q = ListQueue.enqueue 1 (ListQueue.enqueue 2 ListQueue.empty);;
# ListQueue.peek q;; (* returns 2, because 2 was enqueued first *)
```
##### Exercise: big list queue [✭✭]
Use the following code to create `ListQueue`'s of exponentially
increasing length: 10, 100, 1000, etc. How big of a queue can you create before
there is a noticeable delay? How big until there's a delay of at least 10 seconds?
(Note: you can abort utop computations with Ctrl-C.)
```
(* Creates a ListQueue filled with [n] elements. *)
let fill_listqueue n =
let rec loop n q =
if n=0 then q
else loop (n-1) (ListQueue.enqueue n q) in
loop n ListQueue.empty
```
□
Add the following code to `myqueue.ml`:
```
module TwoListQueue = struct
type 'a queue = {front:'a list; back:'a list}
let empty = {front=[]; back=[]}
let is_empty = function
| {front=[]; back=[]} -> true
| _ -> false
let norm = function
| {front=[]; back} -> {front=List.rev back; back=[]}
| q -> q
let enqueue x q = norm {q with back=x::q.back}
let peek = function
| {front=[]; _} -> None
| {front=x::_; _} -> Some x
let dequeue = function
| {front=[]; _} -> None
| {front=_::xs; back} -> Some (norm {front=xs; back})
end
```
##### Exercise: big two-list queue [✭✭]
Use the following function to again create queues of exponentially
increasing length:
```
let fill_twolistqueue n =
let rec loop n q =
if n=0 then q
else loop (n-1) (TwoListQueue.enqueue n q) in
loop n TwoListQueue.empty
```
Now how big of a queue can you create before there's a delay of at least 10 seconds?
□
If we compare `ListQueue` and `TwoListQueue`, it's hopefully
clear that `ListQueue` is a simple and correct implementation of
a queue data structure. It's probably far less clear that `TwoListQueue` is
a correct implementation! For more details about these two implementations,
look at the [code associated with this lecture and lab][mod-code].
One of the additional exercises at the end of this lab asks you to
explore the efficiencies of these two implementations a bit further.
[mod-code]: code.ml
## Dictionaries
A *dictionary* maps *keys* to *values*. This data structure typically
supports at least a lookup operation that allows you to find the value
with a corresponding key, an insert operation that lets you create a new
dictionary with an extra key included. And there needs to be a way of
creating an empty dictionary.
We might represent this in a `Dictionary` module type as follows:
```
module type Dictionary = sig
type ('k, 'v) t
(* The empty dictionary *)
val empty : ('k, 'v) t
(* [insert k v d] produces a new dictionary [d'] with the same mappings
* as [d] and also a mapping from [k] to [v], even if [k] was already
* mapped in [d]. *)
val insert : 'k -> 'v -> ('k,'v) t -> ('k,'v) t
(* [lookup k d] returns the value associated with [k] in [d].
* raises: [Not_found] if [k] is not mapped to any value in [d]. *)
val lookup : 'k -> ('k,'v) t -> 'v
end
```
Note how the type `Dictionary.t` is parameterized on two types, `'k` and `'v`,
which are written in parentheses and separated by commas. Although `('k,'v)`
might look like a pair of values, it is not: it is a syntax for writing
multiple type variables.
We have seen already in this class that an association list can be used
as a simple implementation of a dictionary. For example, here is an
association list that maps some well-known names to an approximation of
their numeric value:
```
[("pi", 3.14); ("e", 2.718); ("phi", 1.618)]
```
Let's try implementing the `Dictionary` module type with a module
called `AssocListDict`.
##### Exercise: AssocListDict [✭✭]
Put the code for `Dictionary` above into a file named `dict.ml`.
Put this code after it:
```
module AssocListDict = struct
type ('k, 'v) t = ('k * 'v) list
let empty = failwith "Unimplemented"
let insert k v d = failwith "Unimplemented"
let lookup k d = failwith "Unimplemented"
end
```
Finish the implementation of each of those three definitions.
For `lookup`, use the library function `List.assoc`.
□
##### Exercise: AssocListDict encapsulated [✭✭]
Launch utop and type:
```
# #use "dict.ml";;
# open AssocListDict;;
# let d = insert 1 "one" empty;;
```
The response you get should be:
```
val d : (int * string) list = [(1, "one")]
```
Now close utop. Change the first line of your implementation of
`AssocListDict` in `dict.ml` to the following:
```
module AssocListDict : Dictionary = struct
```
Restart utop and repeat the three phrases above (use,open,let).
What is the response you get now? Explain in your own words
why the response changed. Your answer should incorporate
"abstract type" and "encapsulation".
□
## Modules and utop
*There are several pragmatics with modules and the toplevel that you need
to understand to use them together effectively. The following sequence
of exercises, which should be done in order, is designed to help you
navigate some of the common pitfalls and confusions that come up.*
Compiling an OCaml file produces a module having the same name as the file (but
with the first letter capitalized). These compiled modules can be loaded into
the toplevel using `#load`.
##### Exercise: load [✭]
Create a file called `mods.ml`. Inside that file, put the following code:
```
let b = "bigred"
let inc x = x+1
module M = struct
let y = 42
end
```
At the command line, type `ocamlbuild mods.byte` to compile it. Type
`ls _build` to see the files that `ocamlbuild` produced. One of them
is `mods.cmo`: this is a compiled module object file,
aka bytecode.
Now run `utop`, and give it this directive (recall that the `#` character is
required in front of a directive, it is not part of the prompt):
```
# #directory "_build";;
```
That tells utop to add the `_build` directory to the path in which it
looks for compiled (and source) files.
Now give utop this directive:
```
# #load "mods.cmo";;
```
There is now a module named `Mods` available to be used. Try these
expressions in utop:
```
# Mods.b;;
# Mods.M.y;;
```
Now try these:
```
# inc;; (* will fail *)
# open Mods;;
# inc;;
```
Finish by exiting utop.
□
If you are doing a lot of testing of a particular module, it can be
annoying to have to type those directives (`#directory` and `#load`)
every time you start utop. The solution is to create a file
in the working directory and call that file `.ocamlinit`. Note
that the `.` at the front of that filename is required and makes
it a [hidden file][hidden] that won't appear in directory listings
unless explicitly requested (e.g., with `ls -a`). Everything
in `.ocamlinit` will be processed by utop when it loads.
##### Exercise: ocamlinit [✭]
Using your text editor, create a file named `.ocamlinit` in the same
directory as `mods.ml`. In that file, put the following:
```
#directory "_build";;
#load "mods.cmo";;
open Mods
```
Now restart utop. All the names defined in `Mods` will already be in scope.
For example, these will both succeed:
```
# inc;;
# M.y;;
```
[hidden]: https://en.wikipedia.org/wiki/Hidden_file_and_hidden_directory
□
If you are building code that depends on third-party libraries,
you can load those libraries with another directive:
##### Exercise: require [✭]
Add the following lines to the end of `mods.ml`:
```
open OUnit2
let test = "testb" >:: (fun _ -> assert_equal "bigred" b)
```
Try to recompile the module with `ocamlbuild mods.byte`. That
will fail, because you need to tell the build system to include
the third-party library OUnit. So try to recompile with
`ocamlbuild -pkg oUnit mods.byte`. That will succeed.
Now try to restart utop. If you look closely, there will be an
error message:
```
File ".ocamlinit", line 1:
Error: Reference to undefined global `OUnit2'
```
The problem is that the OUnit library hasn't been loaded into utop yet.
Type the following directive:
```
#require "oUnit";;
```
Now you can successfully load your own module without getting an error.
```
#load "mods.cmo";;
```
Quit utop. Add the `#require` directive to `.ocamlinit` anywhere before the
`#load` directive. Restart utop. Verify that `inc` is in scope.
□
When compiling a file, the build system automatically figures out which
other files it depends on, and recompiles those as necessary. The toplevel,
however, is not as sophisticated: you have to make sure to load all
the dependencies of a file, as we'll see in the next exercise.
##### Exercise: loads [✭]
Create a file `mods2.ml`. In it, put this code:
```
open Mods
let x = inc 0
```
Run `ocamlbuild -pkg oUnit mods2.byte`. Notice that you don't have to
name `mods.byte`, even though `mods2.ml` depends on the module `Mod`.
The build system is smart that way.
Go back to `.ocamlinit` and change it be to just the following:
```
#directory "_build";;
#require "oUnit";;
```
Restart utop. Try this directive:
```
# #load "mods2.cmo";;
Error: Reference to undefined global `Mods'
```
The toplevel did not automatically load the modules that `Mods2` depends upon.
You can do that manually:
```
# #load "mods.cmo";;
# #load "mods2.cmo";;
```
Or you can tell the toplevel to load `Mods2` and recursively to load everything
depends on:
```
# #load_rec "mods2.cmo";;
```
□
There is a big difference between `#load`-ing a compiled module file and `#use`-ing
an uncompiled source file. The former loads bytecode and makes it available for use.
For example, loading `mods.cmo` caused the `Mod` module to be available,
and we could access its members with expressions like `Mod.b`.
The latter (`#use`) is *textual inclusion*: it's like typing the contents of
the file directly into the toplevel. So using `mods.ml` does **not** cause
a `Mod` module to be available, and the definitions in the file
can be accessed directly, e.g., `b`. Let's try that out...
##### Exercise: load vs. use [✭]
Start a new toplevel, first making sure your `.ocamlinit` does not contain any
`#load` directives. We already know from the previous exercise that
we could load first `mods.cmo` then `mods2.cmo`. Let's try using the source
code:
```
# #use "mods.ml" (* will succeed *)
# b;;
val b : string = "bigred"
# #use "mods2.ml" (* will fail *)
Error: Reference to undefined global `Mods'
```
The problem is that `#use`-ing `mods.ml` did not introduce a module into scope.
Rather it directly put the definitions found in that source file into scope.
□
So when you're using the toplevel to experiment with your code, it's often
better to work with `#load`, because this accurately reflects how your modules
interact with each other and with the outside world, rather than `#use` them.
## Additional exercises
##### Exercise: queue efficiency [✭✭✭]
Compare the implementations of `enqueue` in `ListQueue` vs.
`TwoListQueue`. Explain in your own words why the efficiency of
`ListQueue.enqueue` is linear time in the length of the queue. *Hint:
consider the `@` operator.* Then explain why adding \\(n\\) elements to
the queue takes time that is quadratic in \\(n\\).
Now consider `TwoListQueue.enqueue`. Suppose that the queue is in a
state where it has never had any elements dequeued. Explain in your own
words why `TwoListQueue.enqueue` is constant time. Then explain why
adding \\(n\\) elements to the queue takes time that is linear in
\\(n\\).
*(Note: the enqueue and dequeue operations for `TwoListQueue` remain
constant time even after interleaving them, but showing why that is so
require the study of *amortized analysis*, which we will not cover here.)*
□
##### Exercise: binary search tree dictionary [✭✭✭]
Write a module `BstDict` that implements the `Dictionary`
module type using the `tree` type and `lookup` and `insert`
functions that you defined in the lab on variants.
□
##### Exercise: fraction [✭✭✭✭]
Write a module that implements the `Fraction` module type below:
```
module type Fraction = sig
(* A fraction is a rational number p/q, where q != 0.*)
type t
(* [make n d] is n/d. Requires d != 0. *)
val make : int -> int -> t
val numerator : t -> int
val denominator : t -> int
val toString : t -> string
val toReal : t -> float
val add : t -> t -> t
val mul : t -> t -> t
end
```
Your module should ensure these invariants for every value `v` of type `t`
that it returns from `make`, `add`, and `mul`:
1. `v` is in *[reduced form][irreducible]*
2. the denominator of `v` is positive
For the first invariant, you might find this implementation of Euclid's
algorithm to be helpful:
```
(* [gcd x y] is the greatest common divisor of [x] and [y].
* requires: [x] and [y] are positive.
*)
let rec gcd (x:int) (y:int) : int =
if x = 0 then y
else if (x < y) then gcd (y - x) x
else gcd y (x - y)
```
[irreducible]: https://en.wikipedia.org/wiki/Irreducible_fraction