Lab: Functions, and Testing in the Toplevel

# Functions, and Testing in the Toplevel In the previous lab, you learned to use the OCaml toplevel and a text editor. Today, we'll combine those to write and test some functions. ## Reminder about lab guidelines Recall from the [lab guidlines][guide] that one and two star exercises should be worked when you come to them, and if you get stuck for more than a little while, you should ask about the problem in recitation. Three star exercises can be skipped at first then revisited when the rest of the lab is finished. Four and five star exercises are intended to provide additional challenge for those students who want it. [guide]: ../01-intro/lab-guidelines.html ## Testing code in the toplevel In addition to allowing you to define functions, the toplevel will also accept *directives* that are not OCaml code but rather tell the toplevel itself to do something. All directives begin with the `#` character. Perhaps the most common directive is `#use`, which loads all the code from a file into the toplevel, just as if you had typed the code from that file into the toplevel. ##### Exercise: use [✭✭] Create a file named `lab02.ml`. In that file put the following code: let inc x = x + 1 Start the toplevel. Try entering the following expression, and observe the error: # inc 3;; Error: Unbound value inc Hint: Did you mean incr? The error is because the toplevel does not yet know anything about a function named `inc`. Now issue the following directive to the toplevel: # #use "lab02.ml";; Note that the first `#` character above indicates the toplevel prompt to you. The second `#` character is one that you type to tell the toplevel that you are issuing a directive. Without that character, the toplevel would think that you are trying to apply a function named `use`. Now try again: # inc 3;; - : int = 4 Test the `inc` function on a couple more integers of your choice. Finally, exit the toplevel. &square; The final step in the exercise above ("exit the toplevel") is exceedingly important. The best workflow when using the toplevel to test functions is: * Edit some code in a file. * Load the code in the toplevel with `#use`. * Interactively test the code. * Exit the toplevel. Suppose you wanted to fix a bug in your code: it's tempting to not exit the toplevel, edit the file, and re-issue the `#use` directive into the same toplevel session. Resist that temptation. The "stale code" that was loaded from an earlier `#use` directive in the same session can cause surprising things to happen—surprising when you're first learning the language, anyway. So **always exit the toplevel before re-using a file.** ## Recursive functions In lecture, we saw that there are two ways of defining functions. Non-recursive functions are defined like this: let f x = ... Recursive functions are defined like this: let rec f x = ... The difference is just the `rec` keyword. ##### Exercise: fact [✭✭] In your file `lab02.ml`, try adding the following code to define the factorial function: let fact n = if n=0 then 1 else n * fact (n-1) Save the file, start a new toplevel session, and use the file. You'll get an error: # #use "lab02.ml";; Error: Unbound value fact The problem is that we omitted the `rec` keyword. Without it, `fact` is not in scope in its own body, so it can't be called recursively. Fix this error by adding the `rec` keyword. Don't forget to restart the toplevel before re-using the file. Interactively test your factorial function on a few integers. Note that, as in many languages, OCaml integers are not the "mathematical" integers but are limited to a fixed number of bits. The [manual][man] specifies that integers are at least 30 bits but might be wider. So if you test on large enough inputs, you might begin to see strange results. The problem is machine arithmetic, not OCaml. Also if you test on integers less than 0, e.g. `fact (-1)`, you'll get an error. The problem there is that our function assumes that its input is at least zero. We should really document a precondition for our function! [man]: http://caml.inria.fr/pub/docs/manual-ocaml/values.html#sec76 &square; ##### Exercise: fib [✭✭] Define a recursive function `fib : int -> int`, such that `fib n` is the `n`th number in the [Fibonacci sequence][fib], which is 1, 1, 2, 3, 5, 8, 13, ... That is: - `fib 1 = 1`, - `fib 2 = 1`, and - `fib n = fib (n-1) + fib (n-2)` for any `n > 2`. Test your function in the toplevel. [fib]: https://en.wikipedia.org/wiki/Fibonacci_number &square; ##### Exercise: fib fast [✭✭✭] How quickly does your implementation of `fib` compute the 50th Fibonacci number? If it computes nearly instantaneously, congratulations! But the recursive solution most people come up with at first will seem to hang indefinitely. The problem is that the obvious solution computes subproblems repeatedly. For example, computing `fib 5` requires computing both `fib 3` and `fib 4`, and if those are computed separately, a lot of work (an exponential amount, in fact) is being redone. Create a function `fib_fast` that requires only a linear amount of work. *Hint:* write a recursive helper function `h : int -> int -> int -> int`, where `h n pp p` is defined as follows: - `h 1 pp p = p`, and - `h n pp p = h (n-1) p (pp+p)` for any `n > 1`. The idea of `h` is that it assumes the previous two Fibonacci numbers were `pp` and `p`, then computes forward `n` more numbers. Hence, `fib n = h n 0 1` for any `n > 0`. What is the first value of `n` for which `fib_fast n` is negative, indicating that integer overflow occurred? &square; ## Polymorphic functions The simplest function is the *identity* function `fun x -> x`. What is its type? We can enter it into the toplevel to find out: ``` # fun x -> x;; - : 'a -> 'a = <fun> ``` *(You will note that the syntax highlighting above gets wonky. The Google Code Prettifier, which we use on this website, unfortunately does the wrong thing with single quotes here: it treats them as surrounding a character literal, even though they are not. If there are any wizard web programmers in the course, feel free to talk to Prof. Clarkson about how to fix this problem.)* The `'a` is a *type variable*: it stands for an unknown type, just like a regular variable stands for an unknown value. Type variables always begin with a single quote. Commonly used type variables include `'a`, `'b`, and `'c`, which OCaml programmers typically pronounce in Greek: alpha, beta, and gamma. ##### Exercise: id [✭] Add the identity function to your `lab02.ml` file: let id x = x Try testing it on several types of values—an integer, a string, a Boolean, etc. &square; Because you can apply that function to many types of values, it is a *polymorphic* function. (*poly* = many, *morph* = form) ##### Exercise: poly types [✭✭✭] What is the type of each of the functions below? You can ask the toplevel to check your answers. let f x = if x then x else x let g x y = if y then x else x let h x y z = if x then y else z let i x y z = if x then y else y &square; ## Labeled and optional arguments Usually the type and name of a function give you a pretty good idea of what the arguments should be. However, for functions with many arguments (especially arguments of the same type), it can be useful to label them. For example, you might guess that the function `String.sub` returns a substring of the given string (and you would be correct). You could type in `String.sub` to find its type: # String.sub;; - : string -> int -> int -> string But it's not clear from the type how to use it—you're forced to consult the documentation. OCaml supports labeled arguments to functions. You can declare this kind of function using the following syntax: # let f ~name1:arg1 ~name2:arg2 = arg1 + arg2;; val f : name1:int -> name2:int -> int = <fun> This function can be called by passing the labeled arguments in either order: f ~name2:3 ~name1:4;; Labels for arguments are often the same as the variable names for them. OCaml provides a shorthand for this case. The following are equivalent: let f ~name1:name1 ~name2:name2 = name1+name2 let f ~name1 ~name2 = name1 + name2 Use of labeled arguments is a largely matter of taste: they convey extra information, but they can also add clutter to types. If you need to write both a labeled argument and an explicit type annotation for it, here's the syntax for doing so: let f ~name1:(arg1:int) ~name2:(arg2:int) = arg1 + arg2 ##### Exercise: divide [✭✭] Write a function `divide : numerator:float -> denominator:float -> float`. Apply your function. &square; It is also possible to make some arguments optional: when called without the argument, a default value will be provided. To declare such a function, use the following syntax: # let f ?name:(arg1=8) arg2 = arg1 + arg2;; val f : ?name:int -> int -> int = <fun> You can then call a function with or without the argument: # f ~name:2 7;; - : int = 9 # f 7;; - : int = 15 ## Partial application We could define an addition function as follows: let add x y = x + y Try entering this rather similar function into the toplevel: let addx x = fun y -> x + y Function `addx` takes an integer `x` as input, and returns a *function* of type `int -> int` that will add `x` to whatever is passed to it. ##### Exercise: addx [✭✭] * What is the type of `addx`? *Hint: what does the toplevel say?* * If you define `let add5 = addx 5`, what is the type of `add5`? * What is the result of `add5 2`? * If you define `let add3 = add 3`, what is the type of `add3`? * What is the result of `add3 8`? &square; What you just did in defining `add3` is called *partial application*: you partially applied the function `add` to one argument, even though you normally would think of it as a multi-argument function. Why does this work? It's because the following three functions are *syntactically different* but *semantically equivalent*. That is, they are different ways of expressing the same computation: ``` let add x y = x+y let add x = fun y -> x+y let add = fun x -> (fun y -> x+y) ``` So `add` is really a function that takes an argument `x` and returns a function `(fun y -> x+y)`. Which leads us to a deep truth... ## Function associativity Are you ready for the truth? Here goes... **Every OCaml function takes exactly one argument.** Why? Consider `add`: although we can write it as `let add x y = x + y`, we know that's semantically equivalent to `let add = fun x -> (fun y -> x+y)`. And in general, ``` let f x1 x2 ... xn = e ``` is semantically equivalent to ``` let f = fun x1 -> (fun x2 -> (... (fun xn -> e)...)) ``` So even though you think of `f` as a function that takes `n` arguments, in reality it is a function that takes 1 argument, and returns a function. And the type of such a function t1 -> t2 -> t3 -> t4 really means the same as t1 -> (t2 -> (t3 -> t4))) That is, function types are *right associative*: there are implicit parentheses around function types, from right to left. The intuition here is that a function takes a single argument and returns a new function that expects the remaining arguments. Function application, on the other hand, is *left associative*: there are implicit parenthesis around function applications, from left to right. So e1 e2 e3 e4 really means the same as ((e1 e2) e3) e4 The intuition here is that the left-most expression grabs the next expression to its right as its single argument. ##### Exercise: associativity [✭✭] Which of the following produces an integer, which produces a function, and which produces an error? Decide on an answer, then check your answer in the toplevel. * `add 5 1` * `add 5` * `(add 5) 1` * `add (5 1)` &square; ## Operators The addition operator `+` has type `int->int->int`. It is normally written *infix*, e.g., `3+4`. By putting parentheses around it, we can make it a *prefix* operator: `(+) 3 4`. The same technique works for any built-in operator. But be careful of multiplication, which must be written `( * )` as infix, because `(*)` would be parsed as beginning a comment. We can even define our own new infix operators, as follows: let (^^) x y = max x y;; 2 ^^ 3 (* is 3 *) ##### Exercise: average [✭✭] Define an infix operator `$$` to compute the average of two floating-point numbers. For example, * `1.0 $$ 2.0 = 1.5` * `0. $$ 0. = 0.` &square;