Functions, and Testing in the Toplevel
In the previous lab, you learned to use the OCaml toplevel and a text editor. Today, we'll combine those to write and test some functions.
Reminder about lab guidelines
Recall from the lab guidlines that one and two star exercises should be worked when you come to them, and if you get stuck for more than a little while, you should ask about the problem in recitation. Three star exercises can be skipped at first then revisited when the rest of the lab is finished. Four and five star exercises are intended to provide additional challenge for those students who want it.
Testing code in the toplevel
In addition to allowing you to define functions, the toplevel will also accept
directives that are not OCaml code but rather tell the toplevel itself to do something.
All directives begin with the #
character. Perhaps the most common directive
is #use
, which loads all the code from a file into the toplevel, just as if you had
typed the code from that file into the toplevel.
Exercise: use [✭✭]
Create a file named lab02.ml
. In that file put the following code:
let inc x = x + 1
Start the toplevel. Try entering the following expression, and observe the error:
# inc 3;;
Error: Unbound value inc
Hint: Did you mean incr?
The error is because the toplevel does not yet know anything about
a function named inc
. Now issue the following directive to the toplevel:
# #use "lab02.ml";;
Note that the first #
character above indicates the toplevel prompt to you.
The second #
character is one that you type to tell the toplevel that you
are issuing a directive. Without that character, the toplevel would think
that you are trying to apply a function named use
.
Now try again:
# inc 3;;
- : int = 4
Test the inc
function on a couple more integers of your choice.
Finally, exit the toplevel.
□
The final step in the exercise above ("exit the toplevel") is exceedingly important. The best workflow when using the toplevel to test functions is:
- Edit some code in a file.
- Load the code in the toplevel with
#use
. - Interactively test the code.
- Exit the toplevel.
Suppose you wanted to fix a bug in your code: it's tempting to not exit
the toplevel, edit the file, and re-issue the #use
directive into the
same toplevel session. Resist that temptation. The "stale code" that
was loaded from an earlier #use
directive in the same session can
cause surprising things to happen—surprising when you're first
learning the language, anyway. So always exit the toplevel
before re-using a file.
Recursive functions
In lecture, we saw that there are two ways of defining functions. Non-recursive functions are defined like this:
let f x = ...
Recursive functions are defined like this:
let rec f x = ...
The difference is just the rec
keyword.
Exercise: fact [✭✭]
In your file lab02.ml
, try adding the following code to define the factorial function:
let fact n = if n=0 then 1 else n * fact (n-1)
Save the file, start a new toplevel session, and use the file. You'll get an error:
# #use "lab02.ml";;
Error: Unbound value fact
The problem is that we omitted the rec
keyword. Without it, fact
is not in scope
in its own body, so it can't be called recursively.
Fix this error by adding the rec
keyword. Don't forget to restart the toplevel
before re-using the file. Interactively test your factorial function on a few integers.
Note that, as in many languages, OCaml integers are not the
"mathematical" integers but are limited to a fixed number of bits. The
manual specifies that integers are at least 30 bits but might be
wider. So if you test on large enough inputs, you might begin to see
strange results. The problem is machine arithmetic, not OCaml. Also if
you test on integers less than 0, e.g. fact (-1)
, you'll get an error.
The problem there is that our function assumes that its input is at
least zero. We should really document a precondition for our function!
□
Exercise: fib [✭✭]
Define a recursive function fib : int -> int
, such
that fib n
is the n
th number in the Fibonacci sequence, which
is 1, 1, 2, 3, 5, 8, 13, ... That is:
fib 1 = 1
,fib 2 = 1
, andfib n = fib (n-1) + fib (n-2)
for anyn > 2
.
Test your function in the toplevel.
□
Exercise: fib fast [✭✭✭]
How quickly does your implementation of fib
compute the 50th Fibonacci number?
If it computes nearly instantaneously, congratulations! But the recursive solution
most people come up with at first will seem to hang indefinitely. The
problem is that the obvious solution computes subproblems repeatedly. For
example, computing fib 5
requires computing both fib 3
and fib 4
,
and if those are computed separately, a lot of work (an exponential amount, in fact)
is being redone.
Create a function fib_fast
that requires only a linear amount of
work. Hint: write a recursive helper function h : int -> int -> int -> int
,
where h n pp p
is defined as follows:
h 1 pp p = p
, andh n pp p = h (n-1) p (pp+p)
for anyn > 1
.
The idea of h
is that it assumes the previous two Fibonacci numbers were pp
and p
, then computes forward n
more numbers. Hence, fib n = h n 0 1
for
any n > 0
.
What is the first value of n
for which fib_fast n
is negative, indicating
that integer overflow occurred?
□
Polymorphic functions
The simplest function is the identity function fun x -> x
. What is
its type? We can enter it into the toplevel to find out:
# fun x -> x;;
- : 'a -> 'a = <fun>
(You will note that the syntax highlighting above gets wonky. The Google Code Prettifier, which we use on this website, unfortunately does the wrong thing with single quotes here: it treats them as surrounding a character literal, even though they are not. If there are any wizard web programmers in the course, feel free to talk to Prof. Clarkson about how to fix this problem.)
The 'a
is a type variable: it stands for an unknown type, just like
a regular variable stands for an unknown value. Type variables always
begin with a single quote. Commonly used type variables include 'a
, 'b
, and 'c
,
which OCaml programmers typically pronounce in Greek: alpha, beta, and gamma.
Exercise: id [✭]
Add the identity function to your lab02.ml
file:
let id x = x
Try testing it on several types of values—an integer, a string, a Boolean, etc.
□
Because you can apply that function to many types of values, it is a polymorphic function. (poly = many, morph = form)
Exercise: poly types [✭✭✭]
What is the type of each of the functions below? You can ask the toplevel to check your answers.
let f x = if x then x else x
let g x y = if y then x else x
let h x y z = if x then y else z
let i x y z = if x then y else y
□
Labeled and optional arguments
Usually the type and name of a function give you a pretty good idea of what the
arguments should be. However, for functions with many arguments (especially
arguments of the same type), it can be useful to label them. For example, you
might guess that the function String.sub
returns a substring of the given string
(and you would be correct). You could type in String.sub
to find its type:
# String.sub;;
- : string -> int -> int -> string
But it's not clear from the type how to use it—you're forced to consult the documentation.
OCaml supports labeled arguments to functions. You can declare this kind of function using the following syntax:
# let f ~name1:arg1 ~name2:arg2 = arg1 + arg2;;
val f : name1:int -> name2:int -> int = <fun>
This function can be called by passing the labeled arguments in either order:
f ~name2:3 ~name1:4;;
Labels for arguments are often the same as the variable names for them. OCaml provides a shorthand for this case. The following are equivalent:
let f ~name1:name1 ~name2:name2 = name1+name2
let f ~name1 ~name2 = name1 + name2
Use of labeled arguments is a largely matter of taste: they convey extra information, but they can also add clutter to types.
If you need to write both a labeled argument and an explicit type annotation for it, here's the syntax for doing so:
let f ~name1:(arg1:int) ~name2:(arg2:int) = arg1 + arg2
Exercise: divide [✭✭]
Write a function divide : numerator:float -> denominator:float -> float
. Apply your function.
□
It is also possible to make some arguments optional: when called without the argument, a default value will be provided. To declare such a function, use the following syntax:
# let f ?name:(arg1=8) arg2 = arg1 + arg2;;
val f : ?name:int -> int -> int = <fun>
You can then call a function with or without the argument:
# f ~name:2 7;;
- : int = 9
# f 7;;
- : int = 15
Partial application
We could define an addition function as follows:
let add x y = x + y
Try entering this rather similar function into the toplevel:
let addx x = fun y -> x + y
Function addx
takes an integer x
as input, and returns a function
of type int -> int
that will add x
to whatever is passed to it.
Exercise: addx [✭✭]
- What is the type of
addx
? Hint: what does the toplevel say? - If you define
let add5 = addx 5
, what is the type ofadd5
? - What is the result of
add5 2
? - If you define
let add3 = addx 3
, what is the type ofadd3
? - What is the result of
add3 8
?
□
What you just did in defining add3
is called partial application:
you partially applied the function add
to one argument, even though
you normally would think of it as a multi-argument function. Why does
this work? It's because the following three functions are
syntactically different but semantically equivalent. That is,
they are different ways of expressing the same computation:
let add x y = x+y
let add x = fun y -> x+y
let add = fun x -> (fun y -> x+y)
So add
is really a function that takes an argument x
and returns
a function (fun y -> x+y)
. Which leads us to a deep truth...
Function associativity
Are you ready for the truth? Here goes...
Every OCaml function takes exactly one argument.
Why? Consider add
: although we can write it as
let add x y = x + y
, we know that's semantically
equivalent to let add = fun x -> (fun y -> x+y)
.
And in general,
let f x1 x2 ... xn = e
is semantically equivalent to
let f =
fun x1 ->
(fun x2 ->
(...
(fun xn -> e)...))
So even though you think of f
as a function that takes n
arguments,
in reality it is a function that takes 1 argument, and returns
a function.
And the type of such a function
t1 -> t2 -> t3 -> t4
really means the same as
t1 -> (t2 -> (t3 -> t4)))
That is, function types are right associative: there are implicit parentheses around function types, from right to left. The intuition here is that a function takes a single argument and returns a new function that expects the remaining arguments.
Function application, on the other hand, is left associative: there are implicit parenthesis around function applications, from left to right. So
e1 e2 e3 e4
really means the same as
((e1 e2) e3) e4
The intuition here is that the left-most expression grabs the next expression to its right as its single argument.
Exercise: associativity [✭✭]
Which of the following produces an integer, which produces a function, and which produces an error? Decide on an answer, then check your answer in the toplevel.
add 5 1
add 5
(add 5) 1
add (5 1)
□
Operators
The addition operator +
has type int->int->int
. It is normally
written infix, e.g., 3+4
. By putting parentheses around it, we can
make it a prefix operator: (+) 3 4
. The same technique works for
any built-in operator. But be careful of multiplication, which must be
written ( * )
as infix, because (*)
would be parsed as beginning a
comment.
We can even define our own new infix operators, as follows:
let (^^) x y = max x y;;
2 ^^ 3 (* is 3 *)
Exercise: average [✭✭]
Define an infix operator $$
to compute the average of two
floating-point numbers. For example,
1.0 $$ 2.0 = 1.5
0. $$ 0. = 0.
□