Recitation: Scope, Modules, and Printing

## Part 1: Variables and scope

In lecture you learned that at the top level, `let x = 0;; let y = 1;;` is really shorthand for `let x = 0 in let y = 1 in <everything else that follows>`.

This is one of many examples where indentation or notation changes how you think about a piece of code. Even though the two are equivalent, the first *seems* more like a sequence of definitions, and thinking of it this way makes it easier to reason about. It is also easier to look at the code and see where each variable is defined.

For this reason, we **do not** indent the body of the let (the part after the `in`). However, if we need to break a definition (the part after the `=` but before the `in`) across multiple lines, we **do** indent those lines:

```
let x =
  if true then 1
  else 0
in
let y = x + x in
x + x
```

```
let f x =
  let helper x y =
    let t = x + y in
    let z = t * t + x + y in
    z * z
  in
  helper 3 x;;

f 17
```

This last snippet is an example of a very common pattern: by keeping the helper function `helper` within the scope of `f`, we clearly tell the reader that `helper` is only needed for `f`. Variables defined in small scopes are preferable to variables defined in large scopes because they allow the programmer to ignore them most of the time.

For these exercises, do not use the top level (except to check your work). Start by breaking these expressions onto multiple lines and adding indentation to help you reason about them.

**Exercise:** Give the values of the following expressions:

1. `let x = 0 in let x = 1 in x`
2. `let x = let x = 0 in x in x`
3. `let x = 0 in let f x = x in let x = 1 in f 2`
4. `let x = 0 in let f y = x in let x = 1 in f 2`
5. `let x = let y = 0 in y in let z = 1 in let w = 2 in x + y + z + w`

**Exercise:** Do expressions 3 and 4 differ only in the names of variables? Does this violate the Principle of Name Irrelevance? Why or why not?

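
As one more illustration of the small-scope pattern described in this part, here is a minimal sketch. The names `sum_of_squares` and `square` are made up for this example. The helper is defined inside the only function that needs it, so nothing outside that function can refer to it.

```ocaml
let sum_of_squares a b =
  (* square is visible only inside the body of sum_of_squares *)
  let square n = n * n in
  square a + square b

(* 3 * 3 + 4 * 4 = 25 *)
let () = Printf.printf "%d\n" (sum_of_squares 3 4)

(* Mentioning square at this point would be rejected by the compiler,
   because its scope ended with the definition of sum_of_squares. *)
```
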
## Part 2: Modules

As we saw above, limiting the scope of variables is good practice because it allows programmers to ignore variables that aren't relevant. Like `let ... in` expressions, modules are a mechanism for limiting the scope of names.

### Defining modules

A module is a collection of related variables with values. Modules are created by placing a series of definitions in between `struct ... end`, and they are given names using the `module` keyword. All modules have names that start with uppercase letters.

```
module MyModule = struct
  let x = 3
  let f y = x + y
end
```

You can access the values within the module by prefacing them with the name of the module and a `.`. You've already seen examples of this with the `Random` module:

```
utop# Random.int;;
- : int -> int = <fun>
```

**Exercise:** Add `MyModule` to your working file for this recitation and `#use` it. Write a function (outside of the module) that uses the values in the `MyModule` module.

Just like values, modules have types (although they are called module types instead of just types). Module types are bracketed by `sig ... end` and list all of the definitions inside the module. We will discuss module types in much more detail later. For now, the useful thing to know about them is that you can print the type of a module in the toplevel:

```
utop# module type T = module type of Random;;
```

**Exercise:** What is the module type of `MyModule`?

**Exercise:** Using just `utop`, find the string that is used to separate file names. It's in the `File`something module, but you don't remember exactly and your internet connection is down.

### Opening modules

As you've seen, you can access the contents of modules using a `.`.
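
As a reminder of what qualified access looks like, here is a minimal sketch. The module `Greeting` and the function `shout` are hypothetical names invented for this example.

```ocaml
(* A hypothetical module, defined only for this example. *)
module Greeting = struct
  let prefix = "hello, "
  let greet name = prefix ^ name
end

(* Outside the module, its names need the "Greeting." qualifier. *)
let shout name = Greeting.greet name ^ "!"

let () = print_endline (shout "world")
```

Inside `Greeting`, `greet` can refer to `prefix` directly; outside it, a bare `greet` or `prefix` is unbound.
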
You can also bring all of the definitions of a module into the current scope using `open`:

```
utop# capitalize;;
Error: Unbound value capitalize
utop# String.capitalize;;
- : string -> string = <fun>
utop# open String;;
utop# capitalize;;
- : string -> string = <fun>
```

Opening a module is just like writing a let statement for each variable defined in the module. `open String` does the same thing as

```
let length = String.length
let get = String.get
let set = String.set
...
let capitalize = String.capitalize
...
```

Occasionally, you may want to open a module in a limited scope (such as in the body of a function). The syntax for this is `let open M in ...`.

### Pervasives

There is a special module called `Pervasives` that is automatically `open`ed. It contains the "built-in" functions and operators.

**Exercise:** In the toplevel, look at the list of variables in the `Pervasives` module. Find the name of the function that prints out a string followed by an end of line character.

### A note on style

You now know three ways to access a variable from a module. Directly:

```
String.capitalize
```

By making a short name for the module:

```
module S = String
...
S.capitalize
```

Or by opening the module:

```
open String
...
capitalize
```

**Exercise:** What are the advantages of each style? When would you use which?

### Compiling and loading modules

Compiling an OCaml file produces a module having the same name as the file (but with the first letter capitalized). We use the `cs3110` tool to compile and test your assignments. These compiled modules can be loaded into the toplevel using `#load`.

**Exercise:** Create a file called `mods.ml`. Inside that file, define a variable `x` of type int, two functions `f` and `g`, and a module called `M` containing a variable `y`. At the command line, type `cs3110 compile mods.ml` to compile it.

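
To connect compiled files with the access styles from earlier in this part, here is a hedged sketch. It assumes a hypothetical file `shapes.ml`; because the snippet has to be self-contained, the module that compiling that file would produce is written out by hand as `Shapes`. All names here are invented for illustration.

```ocaml
(* Stand-in for the module that compiling a hypothetical shapes.ml
   would produce: the file's top-level definitions, named Shapes. *)
module Shapes = struct
  let side = 3
  let area s = s * s
end

(* 1. Fully qualified access. *)
let a1 = Shapes.area Shapes.side

(* 2. A short alias for the module. *)
module S = Shapes
let a2 = S.area S.side

(* 3. A local open: area and side are unqualified only inside a3. *)
let a3 () =
  let open Shapes in
  area side

let () = Printf.printf "%d %d %d\n" a1 a2 (a3 ())
```

All three compute the same value; they differ only in how much of the module's namespace is brought into scope, and for how long.
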
Before you can load a module compiled with the `cs3110` tool, you have to set up `utop` properly:

- You must specify a directory to load compiled files from using `#directory`. `cs3110 compile` outputs compiled modules in the `_build` directory, so you need to type `#directory "_build";;`
- You must require any third-party libraries. Modules compiled with `cs3110 compile` depend on the `pa_ounit` library. You need to type `#require "pa_ounit";;`

**Exercise:** At the toplevel, type `#load "mods.d.cmo";;` to load the file you just compiled.

As with everything else you type into utop, you will want to put these `#require`, `#directory` and `#load` commands into another file that you will `#use` from utop (to save yourself typing).

Notice that `x` and `y` are not defined (as they would be if you had `#use`d `mods.ml`). However, there is a module `Mods` available; `Mods.x`, `Mods.f` and `Mods.g` are defined.

**Exercise:** Access `y` from the top level.

**Exercise:** Use a single `open` statement in utop to bring `y` into scope.

When compiling a file `f.ml`, `cs3110 compile` automatically figures out which other files `f.ml` depends on, and recompiles those as necessary.

**Exercise:** Create a file `mods2.ml`. In it, create a function `f` that depends on `Mods.x`. Use `cs3110 compile` to compile it. Notice that only `mods2.ml` was compiled. Now, modify `mods.ml`, and recompile `mods2.ml`. Notice that both `mods.ml` and `mods2.ml` were recompiled.

**Exercise:** To see clearly the difference between `#load` and `#use`, start a new toplevel, `#use` `mods.ml`, and then `#use` `mods2.ml`. Why doesn't this work?

Unfortunately, `utop`'s `#load` isn't as smart as `cs3110 compile`: it does not automatically load dependencies.

**Exercise:** Start a fresh toplevel. `#load "mods2.d.cmo"` (don't forget `#require` and `#directory`). Notice the error message telling you that `Mods` is undefined. Load `mods` and then reload `mods2`.

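
A helper file for these commands might look like the sketch below. The file name `setup.ml` is a made-up example; the directives themselves are the ones used in the exercises above, with `mods` loaded before `mods2` because `mods2` depends on it.

```ocaml
(* setup.ml (hypothetical name): run it from utop with  #use "setup.ml";;  *)
#directory "_build";;
#require "pa_ounit";;
#load "mods.d.cmo";;    (* dependencies first *)
#load "mods2.d.cmo";;
```
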
While you're using the toplevel to experiment with your code, it's better to compile and use `#load`, because this accurately reflects how your modules interact with each other and with the outside world (including our testing suite).

As you've already discovered, it can be a pain to get the `#load` statements in the right order. As mentioned in lab 2, it's a good practice to have a file that you `#use` while you are working in `utop`. The `#load` statements for the modules you are testing, along with the `#require` and `#directory` commands, are good things to put in there.

## Part 3: Printing

OCaml has a number of ways to print things. Pervasives has `print_` functions for all of the built-in types: `print_int`, `print_string`, `print_char`, and so on. There is also the convenient `print_endline` function, which is like `print_string`, but it breaks the line after printing.

The `Printf.printf` function is a very versatile and useful function for printing. In the last lab, you used it to print octal and hexadecimal numbers, but it can be used for much more.

The first argument to `printf` is a "format string": it looks like a normal string, but it can contain placeholders that are replaced by the remaining arguments to `printf`. For example, the format string `"hello %s (version %d) \n"` has two placeholders: the `%s` indicates that the next argument to `printf` should be a string, and the `%d` indicates that the argument after that should be an integer, which will be printed in decimal.

```
utop# Printf.printf "hello %s (version %d) \n";;
- : string -> int -> unit = <fun>
utop# Printf.printf "hello %s (version %d) \n" "world" 1;;
hello world (version 1)
- : unit = ()
```

See the documentation for the `Printf` module for full explanations of the available placeholders.

**Exercise:** Fill in the format string in the following expression:

```
utop# Printf.printf "TODO" 95.0 true 5.0 false "ocaml";;
```

so that the output is

```
95% true
5% false
100% ocaml!
- : unit = ()
```

This can be done with only three spaces in the format string.

Curiously, the type of `printf` depends on the format string that is provided. This is possible only because the OCaml language has special support for format strings.

### Unit

The return type of all of the printing functions is a type called `unit`. There is only one value of this type, which is written `()` and is also pronounced "unit". Unit is used when you need to take an argument or return a value, but there's no interesting value to pass or return. Unit is often used when you're writing or using code that has side effects.

Since you usually want to ignore values of type unit, there is a special syntax for doing so. The expression `e1; e2` first evaluates `e1`, which should produce `()`. It then discards that value, and evaluates `e2`. Here is how `;` is often used:

```
let f _ =
  let x = 3 + 4 in
  print_string "x = ";
  print_int x;
  print_newline ();
  let y = x + x in
  Printf.printf "y = %i\n" y;
  Printf.printf "returning %i" (y + y + x);
  y + y + x
```

If `e1` does not have type `unit`, then `e1; e2` will give a warning, because you are discarding useful values. If that is truly your intent, you can call the built-in function `ignore` to convert any value to `()`:

```
utop# 3; 5;;
Characters 0-1:
Warning 10: this expression should have type unit.
- : int = 5
utop# ignore 3; 5;;
- : int = 5
```

**Exercise:** Modify your recursive `rev_base` function from the last lab so that it prints something on every recursive call.

### Welcome to the dark side

Printing is your first taste of a "side effect": you don't call printing functions for the values they return, but for the other things they do while they run. Side effects break many of the promises we've made about the benefits of functional programming.

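
To make one of those broken promises concrete, here is a minimal sketch with made-up names `double` and `noisy_double`. Both functions return the same value, but only the second does something observable along the way, so calling it twice is no longer the same as calling it once and reusing the result.

```ocaml
(* Pure: the result depends only on the argument. *)
let double x = x + x

(* Same result, but it also prints every time it is called. *)
let noisy_double x =
  Printf.printf "doubling %d\n" x;
  double x

let () =
  (* Two calls: the message is printed twice. *)
  let a = noisy_double 2 + noisy_double 2 in
  (* One call, result reused: the message is printed once. *)
  let d = noisy_double 2 in
  let b = d + d in
  Printf.printf "a = %d, b = %d\n" a b
```

The values `a` and `b` are equal, yet the two ways of computing them produce different output.
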
**Exercise:** To fully appreciate the power of the dark side, predict the output of the following functions:

```
let lightning_storm _ =
  let f x y = print_endline "A"; x + y in
  (print_endline "B"; f) (print_endline "C"; 0) (print_endline "D"; 1)
```

```
let corrupt_padowan _ =
  let f x y = print_endline "A"; x + y in
  let g = print_endline "B"; f in
  g (print_endline "C"; 0) (print_endline "D"; 1)
```

Verify your answer. Feel the power of the dark side flow through you.