Type Constraints
To gather the constraints for a definition, HM does the following:
Assign a preliminary type to every subexpression in the definition. For known operations and constants, such as
+
and3
, use the type that is already known for it. For anything else, use a new type variable that hasn't been used anywhere else.Use the "shape" of the expressions to generate constraints. For example, if an expression involves applying a function to an argument, then generate a constraint requiring the type of the argument to be the same as the function's input type.
We'll give some examples of this first, then we'll give the algorithms for doing it.
Example 1.
Here's an example utop interaction:
# let g x = 5 + x;;
val g : int -> int = <fun>
How did OCaml infer the type of g
here? Let's work it out.
First, let's rewrite g
syntactically to make our work a little easier:
let g = fun x -> ((+) 5) x
We've made the anonymous function explicit, and we've made the binary infix operator a prefix function application.
1. Assign preliminary types.
For each subexpression of fun x -> (+) 5 x
, including the entire
expression itself, we assign a preliminary type. We already know the
types of (+)
and 5
, because those are baked into the language
itself, but for everything else we "play dumb" and just invent a new
type variable for it. For now we will use uppercase letters to represent
those type variables, rather than the OCaml syntax for type variables
(e.g., 'a
).
Subexpression Preliminary type
------------------ --------------------
fun x -> ((+) 5) x R
x U
((+) 5) x S
((+) 5) T
(+) int -> (int -> int)
5 int
x V
2. Collect constraints.
Here are some observations we could make about the "shape" of subexpressions and some relationships among them:
Since function argument
x
has typeU
and function body((+) 5) x
has typeS
, it must be the case thatR
, the type of the anonymous function expression, satisfies the constraintR = U -> S
. That is, the type of the anonymous function is the type of its argument arrow the type of its body.Since function
((+) 5)
has typeT
and function application((+) 5) x
has typeS
, and since the argumentx
has typeV
, it must be the case thatT = V -> S
. That is, the type of the function being applied is the type of the argument it's being applied to arrow the type of the function application expression.Since function
(+)
has typeint -> (int -> int)
and function application(+) 5
has typeT
, and since the argument5
has typeint
, it must be the case thatint -> (int->int) = int -> T
. Once again, the type of the function being applied is the type of the argument it's being applied to arrow the type of the function application expression.Since
x
occurs with both typeU
andV
, it must be the case thatU = V
.
The set of constraints thus generated is:
U = V
R = U -> S
T = V -> S
int -> (int -> int) = int -> T
3. Solve constraints.
You can solve that system of equations easily. Starting from the last
constraint, we know T
must be int -> int
. Substituting that into the
second constraint, we get that int -> int
must equal V -> S
, hence
V = S = int
. Since U=V
, U
must also be int
. Substituting for S
and U
in the first constraint, we get that R = int -> int
. So the
inferred type of g
is int -> int
.
Example 2.
# let apply f x = f x;;
val apply : ('a -> 'b) -> 'a -> 'b = <fun>
Again we rewrite:
let apply = fun f -> (fun x -> f x)
1. Assign preliminary types.
Subexpression Preliminary type
----------------------- ------------------
fun f -> (fun x -> f x) R
f S
(fun x -> f x) T
x U
f x V
f S
x U
2. Collect constraints.
R = S -> T
, because of the anonymous function expression.T = U -> V
, because of the nested anonymous function expression.S = U -> V
, because of the function application.
3. Solve constraints.
Using the third constraint, and substituting for S
in the first
constraint, we have that R = (U -> V) -> T
. Using the second
constraint, and substituting for T
in the first constraint,
we have that R = (U -> V) -> (U -> V)
. There are no further
substitutions that can be made, so we're done solving the constraints.
If we now replace the preliminary type variables with actual OCaml
type variables, specifically U
with 'a
and V
with 'b
, we get that
the type of apply
is ('a -> 'b) -> ('a -> 'b)
, which is the same as
('a -> 'b) -> 'a -> 'b
.
Example 3.
# apply g 3;;
- : int = 8
We rewrite that as (apply g) 3
.
1. Assign preliminary types.
In this running example, the inference for g
and apply
has already
been done, so we can fill in their types as known, much like the type
of +
is already known.
Subexpression Preliminary type
------------- ------------------------------------------
(apply g) 3 R
(apply g) S
apply (U -> V) -> (U -> V)
g int -> int
3 int
2. Collect constraints.
S = int -> R
- `(U -> V) -> (U -> V) = (int -> int) -> S
3. Solve constraints.
Breaking down the last constraint, we have that U = V = int
, and
that S = U -> V
, hence S = int -> int
. Substituting that into
the first constraint, we have that int -> int = int -> R
. Therefore
R = int
, so the type of apply g 3
is int
.
Example 4.
# apply not false;;
- : bool = true
By essentially the same reasoning as in example 3, HM can infer that the
type of this expression is bool
. This illustrates the polymorphism of
apply
: because the type (U -> V) -> (U -> V)
of
apply
contains type variables, the function can be applied to any
arguments, so long as those arguments' types can be consistently
substituted for the type variables.