Type Systems
We have seen that types can be complex, and therefore so can type checking. So we would like to have a concise way of specifying how to do type checking. This is the role of a static semantics, which defines how to ascribe types to terms.
We saw earlier that we could implement type checking recursively as a method typeCheck on AST nodes, something like the following:
abstract class Expr {
    // Returns the type of this expression in context c.
    abstract Type typeCheck(Context c);
}
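Formally, we express the idea that t == e.typeCheck(c) with a typing judgment written $\Gamma \vdash e : t$, read "in context $\Gamma$, the expression $e$ has type $t$". A typing context $\Gamma$ is a finite map from variable names to types; it plays the role of the argument c.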
Inference rules
A type system is a set of types, plus a set of inference rules for deriving typing judgments; in other words, a type system includes a proof system for typing judgments.
An example of a typing rule is the following inference rule:
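$$\frac{\Gamma \vdash e_1 : \mathtt{int} \qquad \Gamma \vdash e_2 : \mathtt{int}}{\Gamma \vdash e_1 + e_2 : \mathtt{int}}\ \text{(Plus)}$$
The way to interpret this rule is this: if we can show that $e_1$ and $e_2$ both have type int in the context $\Gamma$, then we can conclude that $e_1 + e_2$ has type int in the same context.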
The judgment below the line is the conclusion. The judgments above the line are premises. In general, we may write additional conditions above the line that must be true to derive the conclusion; these non-judgment conditions are called side conditions. If a rule has no premises, we call it an axiom. On the side of the rule we sometimes write the name of the rule (Plus) so we can talk about it elsewhere.
Examples of axioms are the following. First, an axiom for the type of an integer literal $n$:
$$\frac{}{\Gamma \vdash n : \mathtt{int}}\ \text{(IntLit)}$$
Another axiom lets us derive the type of a variable by finding it in the current typing context. This rule has a side condition but no true premises, so it still counts as an axiom. Intuitively, axioms correspond to the terms for which the type checker does not need to make any recursive calls to type-check subterms.
$$\frac{\Gamma(x) = t}{\Gamma \vdash x : t}\ \text{(Var)}$$
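The correspondence between rules and recursive calls can be sketched in Java as follows. This is a minimal illustration, not the course's implementation: the enum Type, the map-backed Context, and the node classes IntLit, Var, and Plus are assumed names chosen to mirror the rules (IntLit), (Var), and (Plus).

```java
import java.util.HashMap;
import java.util.Map;

// All names here are illustrative assumptions that mirror the rule names.
enum Type { INT, BOOL }

class TypeError extends RuntimeException {
    TypeError(String message) { super(message); }
}

// A typing context: a finite map from variable names to types.
class Context {
    private final Map<String, Type> bindings = new HashMap<>();
    Context bind(String name, Type type) { bindings.put(name, type); return this; }
    Type lookup(String name) { return bindings.get(name); } // null if unbound
}

abstract class Expr {
    abstract Type typeCheck(Context c);
}

// (IntLit): an axiom, so there are no recursive calls.
class IntLit extends Expr {
    final int value;
    IntLit(int value) { this.value = value; }
    Type typeCheck(Context c) { return Type.INT; }
}

// (Var): an axiom whose side condition is a lookup in the context.
class Var extends Expr {
    final String name;
    Var(String name) { this.name = name; }
    Type typeCheck(Context c) {
        Type t = c.lookup(name);
        if (t == null) throw new TypeError("unbound variable " + name);
        return t;
    }
}

// (Plus): two premises, so two recursive calls.
class Plus extends Expr {
    final Expr left, right;
    Plus(Expr left, Expr right) { this.left = left; this.right = right; }
    Type typeCheck(Context c) {
        if (left.typeCheck(c) != Type.INT || right.typeCheck(c) != Type.INT)
            throw new TypeError("operands of + must be int");
        return Type.INT;
    }
}
```

With these definitions, checking new Plus(new Var("x"), new IntLit(2)) in a context that binds x to Type.INT returns Type.INT, while checking a variable that the context does not bind throws TypeError.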
An inference rule must express reasoning that is correct under all consistent substitutions of syntactic expressions (drawn from the correct set) for metavariables appearing in the inference rule. That is, an inference rule is implicitly universally quantified over its metavariables, such as $\Gamma$, $e_1$, $e_2$, $n$, $x$, and $t$ in the rules above.
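The job of a type checker is to determine whether the typing rules can be used to construct a derivation of a typing judgment for the given term. A derivation is a tree of instances of inference rules, showing how to start from axioms and derive the final judgment. For example, we can prove $x : \mathtt{int} \vdash x + 2 : \mathtt{int}$ with the following derivation:
$$\dfrac{\dfrac{}{x : \mathtt{int} \vdash x : \mathtt{int}}\ \text{(Var)} \qquad \dfrac{}{x : \mathtt{int} \vdash 2 : \mathtt{int}}\ \text{(IntLit)}}{x : \mathtt{int} \vdash x + 2 : \mathtt{int}}\ \text{(Plus)}$$
To see how we get this derivation, consider the use of the rule (Plus). We get the corresponding step in this derivation by applying the substitution $\{\Gamma \mapsto x : \mathtt{int},\ e_1 \mapsto x,\ e_2 \mapsto 2\}$ to the metavariables of the rule.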
Inference rules for an Eta-like language
We can also type-check statements in a language like Eta.
Statements don't return any interesting value, but we can think of
them as computing a value of unit type. A unit type is a type with
only one value. If a computation produces this value, it merely means
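that the computation terminated. The declaration void in Java, used as a return type of methods, is essentially a declaration of unit type. Here, we write 1 for the unit type. The typing judgment $\Gamma \vdash s : 1$ then says that the statement $s$ is well-typed in the context $\Gamma$.
We will add one more component to the typing judgment, to handle the fact that statements, notably variable declarations, can add new variables to the typing context. We write $\Gamma \vdash s : 1 \dashv \Gamma'$ to mean that $s$ is well-typed in the context $\Gamma$ and that the statements following $s$ are to be checked in the context $\Gamma'$.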
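In particular, the following rules describe how to type-check variable declarations while extending the typing context so that the declared variable is given the correct type:
$$\frac{x \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash (x : t) : 1 \dashv \Gamma, x : t} \qquad \frac{\Gamma \vdash e : t \qquad x \notin \mathrm{dom}(\Gamma)}{\Gamma \vdash (x : t = e) : 1 \dashv \Gamma, x : t}$$
In both rules the resulting context is $\Gamma, x : t$, that is, $\Gamma$ extended with the binding $x : t$.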
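One way to picture a judgment that also produces a context is a typeCheck method on statements that returns the context for whatever follows. The sketch below is an assumption-laden illustration: the immutable Context with an extend method, the classes VarDecl and VarInit, and the check that rejects redeclaring a name already in scope are all hypothetical choices, not taken from these notes.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative names only; none of these classes come from the notes.
enum Type { INT, BOOL }

class TypeError extends RuntimeException {
    TypeError(String message) { super(message); }
}

// Immutable context: extend() returns a new context and leaves this one alone.
final class Context {
    private final Map<String, Type> bindings;
    Context() { this(new HashMap<>()); }
    private Context(Map<String, Type> bindings) { this.bindings = bindings; }
    boolean contains(String name) { return bindings.containsKey(name); }
    Type lookup(String name) { return bindings.get(name); }
    Context extend(String name, Type type) {
        Map<String, Type> copy = new HashMap<>(bindings);
        copy.put(name, type);
        return new Context(copy);
    }
}

interface Expr {
    Type typeCheck(Context c);
}

abstract class Stmt {
    // Returns the context in which the following statement is checked.
    abstract Context typeCheck(Context c);
}

// Declaration without an initializer, e.g. "x: int".
class VarDecl extends Stmt {
    final String name;
    final Type declared;
    VarDecl(String name, Type declared) { this.name = name; this.declared = declared; }
    Context typeCheck(Context c) {
        // Assumed side condition: the name must not already be in scope.
        if (c.contains(name)) throw new TypeError(name + " is already declared");
        return c.extend(name, declared);
    }
}

// Declaration with an initializer, e.g. "x: int = e".
class VarInit extends Stmt {
    final String name;
    final Type declared;
    final Expr init;
    VarInit(String name, Type declared, Expr init) {
        this.name = name; this.declared = declared; this.init = init;
    }
    Context typeCheck(Context c) {
        // The initializer is checked in the old context, before name is bound.
        if (init.typeCheck(c) != declared)
            throw new TypeError("initializer of " + name + " has the wrong type");
        if (c.contains(name)) throw new TypeError(name + " is already declared");
        return c.extend(name, declared);
    }
}
```

Returning a fresh context rather than mutating the old one keeps the input context usable afterwards, which is convenient when declarations inside a nested statement must not leak out of it.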
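Now we can write rules for type-checking if and while, along with other language constructs:
$$\frac{\Gamma \vdash e : \mathtt{bool} \qquad \Gamma \vdash s_1 : 1 \dashv \Gamma_1 \qquad \Gamma \vdash s_2 : 1 \dashv \Gamma_2}{\Gamma \vdash \mathtt{if}\ (e)\ s_1\ \mathtt{else}\ s_2 : 1 \dashv \Gamma}\ \text{(If)}$$
$$\frac{\Gamma \vdash e : \mathtt{bool} \qquad \Gamma \vdash s : 1 \dashv \Gamma'}{\Gamma \vdash \mathtt{while}\ (e)\ s : 1 \dashv \Gamma}\ \text{(While)}$$
$$\frac{\Gamma \vdash s_1 : 1 \dashv \Gamma_1 \qquad \Gamma_1 \vdash s_2 : 1 \dashv \Gamma_2}{\Gamma \vdash s_1; s_2 : 1 \dashv \Gamma_2}\ \text{(Seq)}$$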
Implementing a type checker
A key property of these rules is that they are syntax-directed: given a statement, we know which rule must be used to derive the typing judgment for the statement. This means that we can implement a type checker as a simple recursive traversal over the AST. If the rules were not syntax-directed, we might have to search for a derivation, which could take time exponential in the height of the derivation.
For example, consider the rule (If). We can implement type checking of this statement as a method typeCheck that recursively invokes the same method on subexpressions, to satisfy premises. Side conditions are checked by non-recursive tests.
class If extends Stmt {
    Expr guard;
    Stmt consequent, alternative;

    void typeCheck(Context c) {
        Type tg = guard.typeCheck(c); // premise 1
        if (!tg.equals(boolType))
            throw new TypeError("guard must be boolean", guard.position());
        consequent.typeCheck(c); // premise 2
        alternative.typeCheck(c); // premise 3
    }
}
Top-level context
We need a top-level context that includes bindings for all of the functions in the program. In an object-oriented language, it would also map each class name to some representation of the class. If we assume that the program is a sequence of function declarations, the top-level context binds each function name to the function type given by its signature.
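A common way to build such a context is a first pass over the declarations that records only signatures, before any body is checked. The sketch below assumes hypothetical classes FunType and FunDecl, represents types as plain strings for brevity, and treats a duplicate function name as an error; none of these details come from the notes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical representation of a function signature; types are just strings here.
final class FunType {
    final List<String> paramTypes;
    final String resultType;
    FunType(List<String> paramTypes, String resultType) {
        this.paramTypes = paramTypes;
        this.resultType = resultType;
    }
}

// Hypothetical AST node for a top-level function declaration (body omitted).
final class FunDecl {
    final String name;
    final FunType type;
    FunDecl(String name, FunType type) { this.name = name; this.type = type; }
}

final class TopLevel {
    // First pass: collect every signature so that any body can refer to any function.
    static Map<String, FunType> buildContext(List<FunDecl> program) {
        Map<String, FunType> context = new LinkedHashMap<>();
        for (FunDecl d : program) {
            // putIfAbsent returns the previous binding, or null if there was none.
            if (context.putIfAbsent(d.name, d.type) != null)
                throw new IllegalStateException("duplicate function " + d.name);
        }
        return context;
    }
}
```

Because all signatures are collected before any body is examined, a second pass can check each body against a context that already knows every function, including ones declared later in the file.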
Of course, we also need to type-check function bodies to make sure
that they satisfy the contract implied by their signatures. To do
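this, we need to record somewhere in the typing context the expected return type of the function. One way to do that is to record the return type of the function under a special name $\rho$ that cannot clash with any program variable. A return statement is then type-checked as follows:
$$\frac{\Gamma(\rho) = 1}{\Gamma \vdash \mathtt{return} : 1 \dashv \Gamma} \qquad \frac{\Gamma(\rho) = t \qquad \Gamma \vdash e : t}{\Gamma \vdash \mathtt{return}\ e : 1 \dashv \Gamma}$$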
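The idea of storing the expected return type in the context can be sketched as below. The reserved key "$ret" is a hypothetical stand-in for the special name, chosen because it is not a legal source identifier in this sketch; the Return class and the UNIT member of Type are likewise assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative names only.
enum Type { INT, BOOL, UNIT }

class TypeError extends RuntimeException {
    TypeError(String message) { super(message); }
}

class Context {
    // Hypothetical reserved key under which the enclosing function's return type is stored.
    static final String RETURN_KEY = "$ret";
    private final Map<String, Type> bindings = new HashMap<>();
    Context bind(String name, Type type) { bindings.put(name, type); return this; }
    Type lookup(String name) { return bindings.get(name); } // null if unbound
}

interface Expr {
    Type typeCheck(Context c);
}

class Return {
    final Expr value; // null for a bare "return" with no expression
    Return(Expr value) { this.value = value; }

    void typeCheck(Context c) {
        Type expected = c.lookup(Context.RETURN_KEY);
        if (expected == null) throw new TypeError("return outside of a function");
        // A bare return is treated as returning the unit type.
        Type actual = (value == null) ? Type.UNIT : value.typeCheck(c);
        if (actual != expected)
            throw new TypeError("expected return type " + expected + " but found " + actual);
    }
}
```

Before checking a function body, the checker would bind Context.RETURN_KEY to the declared return type, so that every return statement nested anywhere in the body sees it through the ordinary context.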
One nice thing about type systems is that they let us clearly and concisely specify the job of semantic analysis. Another important use is that a formal type system allows us to prove that a statically typed language is strongly typed. However, showing you how to construct such a proof is a topic for a different course.
Reasoning about termination
Eta has some rules about return statements that we are not enforcing semantically, though they might already be enforced syntactically by the parser. One way to enforce them in the type system is to assign a second type to statements that do not terminate “normally”, in the sense that a statement following would never be executed. Obviously a return statement is such a statement, but so is an if statement whose branches both end in a return. Suppose we write 0 for the type of statements that always end in a return; statements of type 1 might end in a return, but need not.
Then we have new rules that can identify statements that don't pass control to a following statement:
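$$\frac{\Gamma(\rho) = t \qquad \Gamma \vdash e : t}{\Gamma \vdash \mathtt{return}\ e : 0 \dashv \Gamma} \qquad \frac{\Gamma \vdash e : \mathtt{bool} \qquad \Gamma \vdash s_1 : 0 \dashv \Gamma_1 \qquad \Gamma \vdash s_2 : 0 \dashv \Gamma_2}{\Gamma \vdash \mathtt{if}\ (e)\ s_1\ \mathtt{else}\ s_2 : 0 \dashv \Gamma}$$
Here $\rho$ is the special name under which the context records the return type of the enclosing function.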
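We prevent return from preceding a statement by modifying the Seq rule so that a statement cannot be preceded by a statement that does not terminate normally:
$$\frac{\Gamma \vdash s_1 : 1 \dashv \Gamma_1 \qquad \Gamma_1 \vdash s_2 : t \dashv \Gamma_2}{\Gamma \vdash s_1; s_2 : t \dashv \Gamma_2}\ \text{(Seq)}$$
The first statement must have type 1, while the type $t$ of the sequence, either 0 or 1, is the type of its last statement.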
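The two statement types can be tracked by having typeCheck return them. The following sketch deliberately leaves out contexts and guard expressions to focus on control flow; StmtType, Skip (a stand-in for any ordinary statement), and the simplified If, Seq, and Return classes are hypothetical names, with UNIT playing the role of 1 and VOID the role of 0.

```java
// Illustrative sketch: contexts and guards are omitted to isolate the 0/1 reasoning.
enum StmtType { UNIT, VOID } // UNIT is the type 1, VOID is the type 0

class TypeError extends RuntimeException {
    TypeError(String message) { super(message); }
}

abstract class Stmt {
    abstract StmtType typeCheck();
}

// A return never passes control to a following statement.
class Return extends Stmt {
    StmtType typeCheck() { return StmtType.VOID; }
}

// Stand-in for any statement that completes normally.
class Skip extends Stmt {
    StmtType typeCheck() { return StmtType.UNIT; }
}

class If extends Stmt {
    final Stmt consequent, alternative;
    If(Stmt consequent, Stmt alternative) {
        this.consequent = consequent;
        this.alternative = alternative;
    }
    StmtType typeCheck() {
        StmtType t1 = consequent.typeCheck();
        StmtType t2 = alternative.typeCheck();
        // Type 0 only if both branches are guaranteed to return.
        return (t1 == StmtType.VOID && t2 == StmtType.VOID) ? StmtType.VOID : StmtType.UNIT;
    }
}

class Seq extends Stmt {
    final Stmt first, second;
    Seq(Stmt first, Stmt second) { this.first = first; this.second = second; }
    StmtType typeCheck() {
        // The first statement must be able to pass control to the second.
        if (first.typeCheck() == StmtType.VOID)
            throw new TypeError("unreachable statement after return");
        return second.typeCheck();
    }
}
```

In this sketch, a Skip followed by a Return has type VOID, an If with a Return in only one branch has type UNIT, and a Return followed by anything is rejected.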
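We don't want a function body to fall off the end, so we require it to have type 0 (if the return type is not 1). Consider a function with parameter $x : t$, return type $t'$, and body $s$. The body is checked in the top-level context $\Gamma_0$ extended with the parameter and with the special name $\rho$ bound to the expected type of return, so the judgment obligation is $\Gamma_0, x : t, \rho : t' \vdash s : 0 \dashv \Gamma'$ for some context $\Gamma'$.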