Iota Static Semantics

NOTE: This page uses some dynamic fonts for some of the math symbols. If the turnstile symbol (`) looks more like a back tick (`) than this (|-), try downloading and installing the TrueType TeX math symbols font. Make sure dynamic fonts are turned on on your browser. Netscape users on Linux may experience problems, apparently because dynamic fonts are not implemented correctly.

This handout presents static semantics for the Iota programming language. The static semantics mirror the syntactic specifications given in that definition; the rules given here can be applied straightforwardly to the result of parsing using those specifications. The static semantics capture the essence of type-checking Iota.

Type contexts

The goal of the type checker is to determine whether a module M is well-typed. We will use "well-typed" and "well-formed" as synonyms, because any well-typed module in Iota is also well-formed: it is a legal module. To check whether a module is well-typed, we must check that all of the definitions in the module are well-typed, which requires showing that all expressions are well-typed. If we see the expression 2 + 2 in a program, we expect that it is well-typed and that it produces an integer (type int). We can expect that the expression 2 + true is not well-typed, at least in most programming languages. What about the expression 2 + x, where x is a variable name? We don't know whether the expression is well-typed unless we know what the type of x is. Because Iota is a statically-typed language, x must have a declared type that we can use to decide whether expressions using it are well-typed.

Therefore, the type-checking of various syntactic constructs (modules, definitions, statements, and expressions) is performed with respect to a type context that keeps track of the legal identifiers that can be used in the construct. A type context is an environment mapping identifiers to their types: a set of identifier-type bindings of the form x: T, where no identifier x occurs twice. Usually, a compiler implements a type context using a symbol table, which binds an identifier to other information as well (for example, its location in memory). However, keep in mind that a type context is not a mutable data structure; it is a mathematical abstraction that can be thought of as a formal description of the contents of a symbol table at a particular moment in time. We will write a context formally as a sequence of such bindings; for example, x₁:T₁, ..., x_n:T_nwhere each of the x_i are variable names and each of the T_i are types. An empty context (containing no bindings) is written as Ø. We will often denote a context as an extension to an existing context, with new bindings appended to it. For example, if the context A contains some set of bindings x₁:T₁, ..., x_n:T_n, then the context A, y:T_y contains the bindings x₁:T₁, ..., x_n:T_n, y:T_y assuming that y is not equal to any of the x_i. If y=x_i for some i21..n, then A, y:T_y has no meaning and is not a legal type context. Thus, the new bindings may not shadow any existing bindings. Thus, Iota differs from the Tiger language (Appel), where new bindings take precedence.

Type-checking modules

Now, suppose that we want to show that a module is well-typed. The statement, "module M is provably well-typed with respect to its interface I," is an example of a judgment, and we will write this judgment formally as ` M : I module, where M is the text of the module and I is the text of the interface. Note that if the module has no interface file, its interface I is the empty interface. The word module is used in the judgment to distinguish judgments about modules from ordinary type judgments and from other judgments that the static checker makes. To check a module, we first construct an type context in which all of the module uses definitions and the components of the module themselves appear. The construction of this initial module context is performed by the function module-ctxt, which is defined below. Similarly, we process the interface file for the module to create a corresponding context containing bindings for all the declarations in the module's interface. Suppose that U represents a single "use" declaration (of the form id.id or id = id.id) and D represents a single definition within a module. For a module to be type-safe, each of its definitions D_i must be type-safe in the context of the module. The module must also correctly implement its interface. These checks are captured in the following rule:

M = `uses` U₁...U_n D₁...D_mA_m = module-ctxt({U₁, ..., U_n}, {D₁, ..., D_m}) A_m` D_i defn ^(i21..m)interface-ctxt( I ) ² module-ctxt({}, {D₁, ..., D_m})
	(module)
` M : I module

In this rule, the word defn is placed after the variable D to indicate that this is a judgment that a definition is well-formed: the judgment A ` D defn means that the definition D is well-formed in the context A.

The function module-ctxt performs the gathering of information about variables and functions that are visible at the top level, and can be defined recursively as shown below. Here, the function interface-type(N, x) finds the type of the identifier x in the interface named N. It obeys the invariant that T = interface-type(N, x) means x:T 2 interface-ctxt(parse(N.ii)), where parse(F) represents the parsing of the Iota source file F as a module or interface. The context interface-ctxt(I) is computed similarly to module-ctxt.

Processing definitions:
module-ctxt({U₁, ..., U_n}, {D₁, ..., D_m_-1, id₀(id₁: T₁, id₂ : T₂, ..., id_n : T_n) = E}) =
    module-ctxt({U₁, ..., U_n}, {D₁, ..., D_m_-1}), id₀: T₁ × T₂ × ... × T_n!unit
module-ctxt({U₁, ..., U_n}, {D₁, ..., D_m_-1, id₀(id₁: T₁, id₂ : T₂, ..., id_n : T_n): T = E}) =
    module-ctxt({U₁, ..., U_n}, {D₁, ..., D_m_-1}), id₀: T₁ × T₂ × ... × T_n!T
module-ctxt({U₁, ..., U_n}, {D₁, ..., D_m_-1, id_m : T_m}) =
    module-ctxt({U₁, ..., U_n}, {D₁, ..., D_m_-1}), id_m: T_m
Processing uses declarations:
module-ctxt({U₁, ..., U_n-₁, id₁.id₂},{}) =
    module-ctxt({U_1, ..., U_n-₁},{}), id₂: interface-type(id₁, id₂)
module-ctxt({U₁, ..., U_n-₁, id₁ = id₂.id₃}, {}) =
    module-ctxt({U_1, ..., U_n-₁},{}), id₁: interface-type(id₂, id₃)
module-ctxt({},{}) = Ø

Type-checking definitions

There are two kinds of definitions in a module: variable definitions and function definitions. Variable definitions in the module are automatically considered to be safe, so we have this as an axiom: A ` id : T defn. Function definitions are less trivial. The rule for checking functions that return or do not return a value is as follows:

A, a₁ : T₁, ..., a_n : T_n, return : T_r ` E : T_r
	(fcn defn)
A ` id(a₁ : T₁, ..., a_n : T_n) : T_r = E defn

A, a₁ : T₁, ..., a_n : T_n, return : unit ` E : T_r
	(proc defn)
A ` id(a₁ : T₁, ..., a_n : T_n) = E defn

These rules say that to check the type safety of a function definition, we check that the body of the function, E, has the expected type T_r that is the declared return type of the function. The second rule describes the case of a function that does not return a value, where the expression E may have any type T_r, since its value, if any, is discarded. The type of the expression E is determined in an context that includes all of the argument variables a_i of the function, bound to their corresponding types T_i, which is necessary because E may mention these variables.

To support return statements, the type context may include a binding for the pseudo-identifier return; this binding keeps track of the return type of the function and is used to make sure that return statements within the function return the right type of value. The Iota type system has one special type, unit, which does not appear in the syntax of the language. The type unit is the type of expressions or statements that produce no value. You can think of unit as a type that has only one value -- sort of like a defective boolean that can only represent false. This one value represents the normal, terminating evaluation of an expression or statement. Although the type unit cannot be used in declarations, it is useful in type-checking Iota as a fiction for describing those expressions that lack a useful result value for their surrounding context to use.

There is one additional condition on function definitions not captured by these rules: a function named main must have the type array[string]!int, as described in the Iota language definition.

Type-checking expressions

As in lecture, the judgment for typing both expressions and statements has the same form: A ` E : T and A ` S : T, respectively. Here, E and S represent an arbitrary expression or statement, and T represents the type of the expression or statement's value. These judgments are known as typing judgments or typing assertions.

A ` string_constant : string
A ` integer_constant : int
A ` true : bool
A ` false : bool

Inference rules for the various binary and unary operations also are straightforward. The rules for the operators -, /, *, and % are like the first rule for +.

A ` E₁ : int
A ` E₂ : int (The rules for -, *, /, % are similar) A ` E₁ : string
A ` E₂ : string

A ` E₁ + E₂ : int

A ` E₁ + E₂ : string

A ` E₁ : bool	(The rule for the logical or operator (`\|`) is similar)
A ` E₂ : bool

A ` E₁ `&` E₂ : bool

T is not unit A ` E₁ : T A ` E₂ : T	A ` E₁ : T A ` E₂ : T
		(T must be int or string. The rule for `>` is similar)
A ` E₁ `==` E₂ : bool	A ` E₁ `<` E₂ : bool

A ` E : bool		A ` E : int

A ` `!` E : bool		A ` `-` E : int

A ` E : string		A ` E : array[T]

A ` `length` E : int		A ` `length` E : int

Identifiers representing variables are type-checked simply by looking them up in the type context. Array index expressions representing variables also are straightforward.

id:T 2 A	A ` E₁ : array[T] A ` E₂ : int	A ` E₁ : string A ` E₂ : int

A ` id : T	A ` E₁ [E₂ ] : T	A ` E₁ [E₂ ] : int

The special array constructor expression in Iota requires that the initialization expression E₂ has the appropriate type:

A ` E₁ : int A ` E₂ : T
	(new array)
A ` `new` T [ E₁ ] ( E₂ ) : array[T]

A function call is checked by ensuring that the types of each of the actual expressions supplied as arguments are the same as the declared types of the formal arguments. Note that this rule looks in the environment to find the signature of the function named id.

id : T₁ × T₂ × ... × T_n!T 2 A A ` E_i : T_i ^{(i 2 1..n)}
	(fcn call)
A ` id ( E₁,..., E_n ) : T

Type-checking statements

In the following rules, the letter S will be used to denote a single statement. First, let us consider rules associated with the lists of statements. The type of an empty list of statements is unit:

The type of a statement list is the same as the type of the final statement. Variable declarations in the sequence apply to all subsequent statements in the block.

A ` S : T

(singleton)

A ` ( S ) : T

A ` S₁ : T₁S₁ is not a declaration A ` ( S₂ ; ... ; S_n ) : T_n		A ` S₁ : T₁S₁ = id : T [ = E ] A, id : T ` ( S₂; ... ; S_n ) : T_n
	(block)		(decl block)
A ` ( S₁ ; ... ; S_n ) : T_n		A ` ( S₁ ; ... ; S_n ) : T_n

Now we will consider type-checking some more interesting statements. A statement that is a variable declaration has type unit if the variable is not initialized by the declaration, or the type (and value) of the variable if it is initialized:

There are two kinds of assignment statements: assignments to ordinary variables, and assignments to array elements:

T not a function type id: T 2 A A ` E : T		A ` E₁ : array[T] A ` E₂ : int A ` E₃ : T
	(assign)		(array assign)
A ` id = E : T		A ` E₁ [ E₂ ] = E₃ : T

Now let us consider Iota's statements for control flow. The if statement requires that its two arms (if present) have the same type in order for the statement to have a value:

A ` E : bool A ` S : T		A ` E : bool A ` S₁ : T A ` S₂ : T
	(if)		(if/else1)
A ` `if` ( E ) S : unit		A ` `if` ( E ) S₁ `else` S₂ : T

If the two arms have different types, the if statement has no value and is given type unit:

A ` E : bool A ` S₁ : T₁A ` S₂ : T₂T₁ not equal to T₂
	(if/else2)
A ` `if` ( E ) S₁ `else` S₂ : unit

A while statement always has the type unit, and requires that its test expression has type bool.

A ` E : bool A ` S : T
	(while)
A ` `while` ( E ) S : unit

		A ` E : T
	(decl)		(decl init)
A `id : T : unit		A ` id : T = E : T

return: unit 2 A		A ` E : T T is not unit return: T 2 A
	(return1)		(return2)
A ` `return` : unit		A ` `return` E : unit

CS 412/413 Spring 2001

Static Semantics for Iota

Type contexts

Type-checking modules

Type-checking definitions

Type-checking expressions

Type-checking statements