## Part 1: Variables and scope
In lecture you learned that at the top level,
let x = 0;;
let y = 1;;
is really shorthand for
let x = 0 in
let y = 1 in
This is one of many examples where indentation or notation changes how you
think about a piece of code. Even though the two are equivalent, the first
*seems* more like a sequence of definitions, and thinking of it this way makes
it easier to reason about. It is also easier to look at the code and see where
each variable is defined.
For this reason, we **do not** indent the body of the let (the part after the
`in`). However, if we need to break a definition (the part after the `=` but
before the `in`) across multiple lines, we **do** indent those lines:
```
let x = if true
        then 1
        else 0 in
let y = x + x in
x + x
```
```
let f x =
  let helper x y =
    let t = x + y in
    let z = t * t + x + y in
    z * z in
  helper 3 x;;
f 17
```
This last snippet is an example of a very common pattern: by keeping the helper
function `helper` within the scope of `f`, we clearly tell the reader that
`helper` is only needed for `f`. Variables defined in small scopes are
preferable to variables defined in large scopes because they allow the
programmer to ignore them most of the time.
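As a further sketch of the same pattern, the example below uses hypothetical names (`sum_of_squares`, `square`) that are not part of the lab. It is meant only to show that a helper defined inside a function cannot be seen outside it.

```ocaml
(* Hypothetical example: `square` is a helper that only
   `sum_of_squares` needs, so it is defined inside it. *)
let sum_of_squares a b =
  let square n = n * n in
  square a + square b

let result = sum_of_squares 3 4

(* Uncommenting the next line would fail to compile with
   "Unbound value square", because `square` is out of scope here. *)
(* let oops = square 5 *)
```

A reader of the rest of the file never has to wonder what `square` means, because the name does not exist there.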
For these exercises, do not use the top level (except to check your work).
Start by breaking these expressions onto multiple lines and adding
indentation to help you reason about them.
**Exercise:** Give the values of the following expressions:
1. `let x = 0 in let x = 1 in x`
2. `let x = let x = 0 in x in x`
3. `let x = 0 in let f x = x in let x = 1 in f 2`
4. `let x = 0 in let f y = x in let x = 1 in f 2`
5. `let x = let y = 0 in y in let z = 1 in let w = 2 in x + y + z + w`
**Exercise:** Expressions 3 and 4 differ only in the names of variables.
Does this violate the Principle of Name Irrelevance? Why or why not?
## Part 2: Modules
As we saw above, limiting the scope of variables is good practice because it
allows programmers to ignore variables that aren't relevant. Like `let ... in`
expressions, modules are a mechanism for limiting the scope of names.
### Defining modules
A module is a collection of related variables with values. Modules are created
by placing a series of definitions in between `struct ... end`, and they are
given names using the `module` keyword. All modules have names that start with
uppercase letters.
```
module MyModule = struct
  let x = 3
  let f y = x + y
end
```
You can access the values within the module by prefacing them with the name of
the module and a `.`. You've already seen examples of this with the `Random`
module:
```
utop# Random.int;;
- : int -> int = <fun>
```
**Exercise:** Add `MyModule` to your working file for this recitation and
`#use` it. Write a function (outside of the module) that uses the values in
the `MyModule` module.
Just like values, modules have types (although they are called module types
instead of just types). Module types are bracketed by `sig ... end` and list
all of the definitions inside the module.
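The sketch below shows what a module type looks like next to the module it describes. The names `Greeting`, `GREETING`, and `Checked` are made up for illustration and do not come from the lab. Inside `sig ... end`, each entry is introduced with `val` and gives a name and its type rather than a definition.

```ocaml
(* A small hypothetical module. *)
module Greeting = struct
  let prefix = "hello, "
  let greet name = prefix ^ name
end

(* A module type listing the name and type of each definition
   in Greeting. *)
module type GREETING = sig
  val prefix : string
  val greet : string -> string
end

(* Ascribing the module type: the compiler checks that Greeting
   really provides everything GREETING promises. *)
module Checked : GREETING = Greeting

let message = Checked.greet "world"
```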
We will discuss module types in much more detail later. For now, the useful
thing to know about them is that you can print the type of a module in the
toplevel:
```
utop# module type T = module type of Random;;
```
**Exercise:** What is the module type of `MyModule`?
**Exercise:** Using just `utop`, find the string that is used to separate
file names. It's in the `File`something module, but you don't remember exactly
and your internet connection is down.
### Opening modules
As you've seen, you can access the contents of modules using a `.`. You can
also bring all of the definitions of a module into the current scope using
`open`:
```
utop# capitalize;;
Error: Unbound value capitalize
utop# String.capitalize;;
- : string -> string = <fun>
utop# open String;;
utop# capitalize;;
- : string -> string = <fun>
```
Opening a module is just like writing a let statement for each variable defined
in the module. `open String` does the same thing as
```
let length = String.length
let get = String.get
let set = String.set
...
let capitalize = String.capitalize
...
```
Occasionally, you may want to open a module in a limited scope (such as in the
body of a function). The syntax for this is
`let open M in ...`.
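Here is a minimal sketch of a local open, using the standard `List` module. The function name `count_and_reverse` is hypothetical. Inside the function, `length` and `rev` refer to `List.length` and `List.rev`; outside it, the prefix is still required.

```ocaml
(* `List` is opened only for the body of this function. *)
let count_and_reverse xs =
  let open List in
  (length xs, rev xs)

(* Out here the open is no longer in effect, so the module
   name has to be written explicitly. *)
let n = List.length [1; 2; 3]
```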
### Pervasives
There is a special module called `Pervasives` that is automatically `open`ed.
It contains the "built-in" functions and operators.
**Exercise:** In the toplevel, look at the list of variables in the `Pervasives`
module. Find the name of the function that prints out a string followed by
an end of line character.
### A note on style
You now know three ways to access a variable from a module. Directly:
```
String.capitalize
```
By making a short name for the module:
```
module S = String
...
S.capitalize
```
Or by opening the module:
```
open String
...
capitalize
```
**Exercise:** What are the advantages of each style? When would you use which?
### Compiling and loading modules
Compiling an OCaml file produces a module having the same name as the file (but
with the first letter capitalized). We use the `cs3110` tool to compile and
test your assignments. These compiled modules can be loaded into the toplevel
using `#load`.
**Exercise:** Create a file called `mods.ml`. Inside that file, define a
variable `x` of type int, two functions `f` and `g`, and a module called `M`
containing a variable `y`. At the command line, type `cs3110 compile mods.ml`
to compile it.
Before you can load a module compiled with the cs3110 tool, you have to set up
`utop` properly:
- You must specify a directory to load compiled files from using `#directory`.
`cs3110 compile` outputs compiled modules in the `_build` directory, so you
need to type `#directory "_build";;`
- You must require any third-party libraries. Modules compiled with `cs3110 compile`
depend on the `pa_ounit` library. You need to type `#require "pa_ounit";;`
**Exercise:** At the toplevel, type `#load "mods.d.cmo";;`
to load the file you just compiled.
As with everything else you type into utop, you will want to put these
`#require`, `#directory` and `#load` commands into another file that you will
`#use` from utop (to save yourself typing).
Notice that `x` and `y` are not defined (as they would be if you had `#use`d
`mods.ml`). However, there is a module `Mods` available; `Mods.x`, `Mods.f`
and `Mods.g` are defined.
**Exercise:** Access `y` from the top level.
**Exercise:** Use a single `open` statement in utop to bring `y` into scope.
When compiling a file `f.ml`, `cs3110 compile` automatically figures out which
other files `f.ml` depends on, and recompiles those as necessary.
**Exercise:** Create a file `mods2.ml`. In it, create a function `f` that
depends on `Mods.x`. Use `cs3110 compile` to compile it. Notice that only
`mods2.ml` was compiled. Now, modify `mods.ml`, and recompile `mods2.ml`.
Notice that both `mods.ml` and `mods2.ml` were recompiled.
**Exercise:** To see clearly the difference between `#load` and `#use`, start
a new toplevel, `#use` `mods.ml`, and then `#use` `mods2.ml`. Why doesn't this
work?
Unfortunately, `utop`'s `#load` isn't as smart as `cs3110 compile`: it does not
automatically load dependencies.
**Exercise:** Start a fresh toplevel. `#load "mods2.d.cmo"` (don't forget
`#require` and `#directory`). Notice the error message telling you that `Mods` is undefined.
Load `mods` and then reload `mods2`.
While you're using the toplevel to experiment with your code, it's better to
compile and use `#load`, because this accurately reflects how your modules
interact with each other and with the outside world (including our testing
suite).
As you've already discovered, it can be a pain to get the `#load` statements in
the right order. As mentioned in lab 2, it's a good practice to have a file
that you `#use` while you are working in `utop`. The `#load` statements for
the modules you are testing along with the `#require` and `#directory` commands
are good things to put in there.
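Such a file might look like the sketch below. The file name `init.ml` is an assumption made for illustration; the directives themselves are the ones used earlier in this lab. These lines are toplevel directives, so they only work when the file is `#use`d from `utop`, not when it is compiled.

```ocaml
(* init.ml (hypothetical name): in utop, run  #use "init.ml";;  *)
#directory "_build";;    (* where cs3110 compile puts compiled modules *)
#require "pa_ounit";;    (* library that compiled modules depend on *)
#load "mods.d.cmo";;     (* Mods first, because Mods2 depends on it *)
#load "mods2.d.cmo";;
```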
## Part 3: Printing
OCaml has a number of ways to print things. Pervasives has `print_` functions
for all of the built-in types: `print_int`, `print_string`, `print_char`, and
so on.
There is also the convenient `print_endline` function, which is like
`print_string`, but it breaks the line after printing.
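A short sketch of these functions in a file, under the assumption that each `let () = ...` line is just a way to run one printing call at the top level:

```ocaml
let () = print_int 3110
let () = print_char '\n'

(* print_string does not end the line... *)
let () = print_string "no line break after this"
(* ...so the line is ended explicitly here. *)
let () = print_newline ()

(* print_endline prints the string and ends the line in one call. *)
let () = print_endline "print_endline ends the line itself"
```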
The `Printf.printf` function is a very versatile and useful function for
printing. In the last lab, you used it to print octal and hexadecimal numbers,
but it can be used for much more.
The first argument to `printf` is a "format string": it looks like a normal
string, but it can contain placeholders that are replaced by the remaining
arguments to `printf`.
For example, the format string `"hello %s (version %d) \n"` has two placeholders:
the `%s` indicates that the next argument to `printf` should be a string, and
the `%d` indicates that the argument after that should be an integer, and will
be printed in decimal.
```
utop# Printf.printf "hello %s (version %d) \n";;
- : string -> int -> unit = <fun>
utop# Printf.printf "hello %s (version %d) \n" "world" 1;;
hello world (version 1)
- : unit = ()
```
See the documentation for the `Printf` module for full explanations of the
available placeholders.
**Exercise:** Fill in the format string in the following expression:
```
utop# Printf.printf "TODO" 95.0 true 5.0 false "ocaml";;
```
so that the output is
```
95% true
5% false
100% ocaml!
- : unit = ()
```
This can be done with only three spaces in the format string.
Curiously, the type of `printf` depends on the format string that is provided.
This is possible only because the OCaml language has special support for format
strings.
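To see this dependence concretely, the sketch below uses `Printf.sprintf`, which is not used elsewhere in this lab: it accepts the same format strings as `printf` but returns the resulting string instead of printing it. The names `show_pair` and `show_name` are hypothetical, and the type annotations are only there to make the two different types visible.

```ocaml
(* Two placeholders for integers: the result takes two ints. *)
let show_pair : int -> int -> string = Printf.sprintf "(%d, %d)"

(* One placeholder for a string: the result takes one string. *)
let show_name : string -> string = Printf.sprintf "name: %s"

let () = print_endline (show_pair 1 2)
let () = print_endline (show_name "ocaml")
```

Swapping the two annotations would make the file fail to type check, because each format string fixes the number and types of the remaining arguments.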
### Unit
The return type of all of the printing functions is a type called `unit`.
There is only one value of this type, which is written `()` and is also
pronounced "unit". Unit is used when you need to take an argument or return a
value, but there's no interesting value to pass or return.
Unit is often used when you're writing or using code that has side-effects.
Since you usually want to ignore values of type unit, there is a special syntax
for doing so. The expression `e1; e2` first evaluates `e1`, which should
produce `()`. It then discards that value, and evaluates `e2`.
Here is how `;` is often used:
```
let f _ =
  let x = 3 + 4 in
  print_string "x = ";
  print_int x;
  print_newline ();
  let y = x + x in
  Printf.printf "y = %i\n" y;
  Printf.printf "returning %i" (y + y + x);
  y + y + x
```
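One way to think about `e1; e2` is as a `let` that binds the unit value and then continues. The two hypothetical functions below are a sketch of that reading: they print the same line and return the same result. Both take `()` as their argument, since there is no interesting value to pass.

```ocaml
(* Sequencing with `;` *)
let with_semicolon () =
  print_endline "working";
  42

(* The same thing written with `let () = ... in` *)
let with_let () =
  let () = print_endline "working" in
  42
```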
If `e1` does not have type `unit`, then `e1;e2` will give a warning, because
you are discarding useful values. If that is truly your intent, you can call
the built-in function `ignore` to convert any value to `()`:
```
utop# 3; 5;;
Characters 0-1:
Warning 10: this expression should have type unit.
- : int = 5
utop# ignore 3; 5;;
- : int = 5
```
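A slightly more realistic sketch: `ignore` is typically applied to a function call whose result you do not need. The names `log_and_count` and `run` are hypothetical. Note the parentheses around the call, so that `ignore` receives the result of the call rather than the function itself.

```ocaml
(* Prints a message and returns its length. *)
let log_and_count msg =
  print_endline msg;
  String.length msg

let run () =
  (* We want the printing but not the length, so discard it. *)
  ignore (log_and_count "starting");
  print_endline "done"
```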
**Exercise:** Modify your recursive `rev_base` function from the last lab so
that it prints something on every recursive call.
### Welcome to the dark side
Printing is your first taste of a "side effect": you don't call printing
functions for the values they return, but for the other things they do while
they run.
Side effects break many of the promises we've made about the benefits of
functional programming.
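As a small sketch of what goes wrong, consider replacing a variable by the expression that defines it. Without side effects that never changes what a program does. With printing it does: in the hypothetical example below, `once` and `twice` have the same value, but computing them prints a different number of lines.

```ocaml
(* Returns 2 * x, and prints a line every time it is called. *)
let noisy_double x =
  print_endline "called";
  2 * x

(* Calls noisy_double one time: one line is printed. *)
let once =
  let d = noisy_double 5 in
  d + d

(* `d` replaced by its definition: same value, but two lines printed. *)
let twice = noisy_double 5 + noisy_double 5
```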
**Exercise:** To fully appreciate the power of the dark side, predict the
output of the following functions:
```
let lightning_storm _ =
  let f x y = print_endline "A"; x + y in
  (print_endline "B"; f) (print_endline "C"; 0) (print_endline "D"; 1)
```
```
let corrupt_padowan _ =
  let f x y = print_endline "A"; x + y in
  let g = print_endline "B"; f in
  g (print_endline "C"; 0) (print_endline "D"; 1)
```
Verify your answer. Feel the power of the dark side flow through you.