We now present an algorithm that generates constraints. This algorithm is a precise description of how constraint gathering works in the examples we discussed above. The algorithm is not exactly what HM does, because HM actually performs type checking at the same time as type inference. However, the resulting types are the same, and separating inference from checking hopefully will give you a clearer idea of how inference itself works.
The algorithm takes as input an expression
assume that every function
fun x -> e' in that expression has an
argument with a different name. (If not, our algorithm could make a
pre-pass to rename variables. This is feasible because of lexical scope.)
The output of the algorithm is a set of constraints.
The first thing the algorithm does is to assign unique preliminary
type variables, e.g.
- one to each defining occurrence of a variable, which could be as a function argument or a let binding, and
- one to each occurrence of each subexpression of
Call the type variable assigned to
x in the former clause
D(x), and call the type variable assigned to occurrence of a
e' in the latter clause
U(e'). The names of these
U stands for the use of an expression,
D stands for the definition of a variable name.
Next, the algorithm generates the following constraints:
- For integer constants
U(n) = int. This constraints follows from the type checking rule for integers, which says that every integer constant has type
int. Constraints for other types of constants are generated in a similar way.
- For variables
D(x) = U(x). This constraint follows from the type checking rule for variables, which says the type of a variable use (in this case,
U(x)) must be the same as the type at which that variable was defined (here,
- For function application
U(e1) = U(e2) -> U(e1 e2), as well as any constraints resulting from
e2. This constraint follows from the type checking rule for function application.
- For anonymous functions
fun x -> e:
U(fun x -> e) = D(x) -> U(e), as well as any constraints resulting from
e. This constraint follows from the type checking rule for anonymous functions.
- For let expressions
let x=e1 in e2:
U(let x=e1 in e2) = U(e2), as well as any constraints resulting from
e2. This constraint follows from the type checking rule for let expressions.
- Other expression forms: similar kinds of constraints likewise derived from the type checking rule for the expression form.
The result is a set of constraints, which is the output of the
algorithm. It's not too hard to implement this algorithm as a recursive
function over a tree representing the syntax of
fun x -> (fun y -> x), a type variable
associated with argument
S with argument
T is associated with the occurrence of
fun x -> (fun
y -> x), and
X with the occurrence of
(fun y -> x), and
the occurrence of
x. (Note that the names we've chosen for the type
variables are completely arbitrary.) The constraints generated are
R -> X, and
X = S -> Y, and
Y = R.