Subtype Polymorphism

Introduction

We now explore the concept of subtyping, one of the key features of object-oriented languages. Subtyping was first introduced in SIMULA, considered the first object-oriented programming language. Its inventors Ole-Johan Dahl and Kristen Nygaard later went on to win the Turing Award for their contribution to the field of object-oriented programming. SIMULA introduced a number of innovative features that have become the mainstay of modern OO languages including objects, subtyping and inheritance.

The concept of subtyping is closely tied to those of inheritance and polymorphism and offers a formal way of studying them. It is best illustrated by means of an example:

A Subtype Hierarchy

This is an example of a subtype hierarchy, which describes the relationship between different entities. In this case, the Student and Staff types are both subtypes of the Person type (alternatively, Person is the supertype of Student and Staff). Similarly, TA is a subtype of the Student and Person types, and so on. A subtype relationship can also be thought of in terms of subsets: the same hierarchy can be visualized with the help of the following Venn diagram:

Subtypes as subsets

The \(≤\) symbol is typically used to denote the subtype relationship. Thus, Staff \(≤\) Person, RA \(≤\) Student and so on. Sometimes the symbol \(\lt:\) is used instead to denote subtyping, but we will stick to \(≤\).

\( \newcommand{\Tr}[2]{{\cal #1}⟦#2⟧} \newcommand{\SB}[1]{⟦#1⟧} \newcommand{\lam}[2]{λ#1.#2} \newcommand{\ty}{\!:\!} \newcommand{\bnf}{\mid} \newcommand{\rulenm}[1]{({\mathrm{#1}})} \)

Subtyping as inclusion

The statement \(τ_1≤τ_2\) means that a \(τ_1\) can be used wherever a \(τ_2\) is expected. One way to think of this is in terms of sets of values corresponding to these types. Any value of type \(τ_1\) must also be a value of type \(τ_2\). Assuming that \(\Tr{T}{τ}\) is the set of elements of type \(τ\), we understand \(τ_1≤τ_2\) to mean \(\Tr{T}{τ_1} \subseteq \Tr{T}{τ_2}\).
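
The inclusion reading can be tried out directly in Java. The following is a minimal sketch, assuming a hypothetical set of classes that mirrors part of the hierarchy described above (the class bodies and the `greet` method are invented for illustration). Every TA value is also a Student value and a Person value, so it can be passed to a context that expects a Person.

```java
// Hypothetical classes mirroring part of the hierarchy in the text.
class Person {
    final String name;
    Person(String name) { this.name = name; }
    String describe() { return "person " + name; }
}

class Student extends Person {            // Student <= Person
    Student(String name) { super(name); }
    @Override String describe() { return "student " + name; }
}

class Staff extends Person {              // Staff <= Person
    Staff(String name) { super(name); }
    @Override String describe() { return "staff " + name; }
}

class TA extends Student {                // TA <= Student, hence TA <= Person
    TA(String name) { super(name); }
    @Override String describe() { return "TA " + name; }
}

class InclusionDemo {
    // A context that expects a Person; a value of any subtype is accepted.
    static String greet(Person p) {
        return "Hello, " + p.describe();
    }

    public static void main(String[] args) {
        TA ta = new TA("Ada");
        Person p = ta;                    // no cast needed: a TA is a Person
        System.out.println(greet(ta));
        System.out.println(greet(p));
    }
}
```

Java has no multiple inheritance of classes, so this sketch makes TA a subclass of Student only; any further subtype edges in the figure would need interfaces.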

Subtyping rules

The informal interpretation of the subtype relationship \(τ ≤τ' \) is that anything of type \(τ\) can be used in a context that expects something of type \(τ'\). This idea is formalized as the subsumption rule:

\[ \dfrac{Γ ⊢ e:τ \qquad τ ≤ τ'}{Γ ⊢ e:τ'} \;\rulenm{Sub} \]

Notice that the right premise in this rule is actually a side condition, relying on a separate, new judgment of the form \(τ_1≤τ_2\). We still have to define this judgment.

The subsumption rule is a perfectly well-defined typing rule, but it has one serious problem for practical application: it is not syntax-directed. Given a judgment to be derived, we can't tell whether to use the subsumption rule or a rule whose conclusion matches the syntax of the desired judgment.

The subtyping relationship is reflexive and transitive:

\[ τ ≤ τ \;\rulenm{Refl} \]
\[ \dfrac{τ_1 ≤ τ_2 \qquad τ_2 ≤ τ_3}{τ_1 ≤ τ_3} \;\rulenm{Trans} \]
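
One way to see what (Refl) and (Trans) contribute is to compute the relation they generate from a handful of declared facts. The sketch below is an illustration only: the class `SubtypeRelation` and its methods are hypothetical names, and named types are represented as plain strings. It decides \(τ_1 ≤ τ_2\) by searching upward through the declared supertypes, which amounts to taking the reflexive, transitive closure of the declarations.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SubtypeRelation {
    // Directly declared supertypes of each named type.
    private final Map<String, Set<String>> parents = new HashMap<>();

    // Record the declared fact sub <= sup.
    void declare(String sub, String sup) {
        parents.computeIfAbsent(sub, k -> new HashSet<>()).add(sup);
    }

    // Decide sub <= sup using only (Refl), (Trans), and the declared facts.
    boolean isSubtype(String sub, String sup) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(sub);
        while (!work.isEmpty()) {
            String t = work.pop();
            if (t.equals(sup)) {
                return true;              // (Refl) closes a chain of (Trans) steps
            }
            if (!seen.add(t)) {
                continue;                 // already explored this type
            }
            for (String p : parents.getOrDefault(t, Collections.<String>emptySet())) {
                work.push(p);             // one more (Trans) step upward
            }
        }
        return false;
    }
}
```

With the declarations TA ≤ Student and Student ≤ Person, the query `isSubtype("TA", "Person")` succeeds through one use of (Trans), while `isSubtype("Person", "TA")` fails.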

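Since the \(≤\) relation is both reflexive and transitive, it is a pre-order. In most cases, anti-symmetry holds as well, making the subtyping relation a partial order, but this is not always true. The subtype relationships governing the 1 and 0 types are interesting: every type is a subtype of the unit type 1, which therefore sits at the top of the ordering, and the empty type 0 is a subtype of every type, which puts it at the bottom:

\[ τ ≤ 1 \qquad\qquad 0 ≤ τ \]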

The type hierarchy thus looks as shown:

Type Hierarchy

Subtyping on product types (tuples)

The simplest kind of data structure is a pair (a 2-tuple), with the type \(τ_1 * τ_2\). It represents a pair \((v_1, v_2)\) where \(v_1\) is a \(τ_1\) and \(v_2\) is a \(τ_2\). The operations supported by a pair are to extract the first (left) component or the second (right). The most permissive sound subtyping rule for this type is quite intuitive:

\[ \dfrac{τ_1 ≤ τ_1' \qquad τ_2 ≤ τ_2'}{τ_1 * τ_2 ≤ τ_1' * τ_2'} \;\rulenm{PairSub} \]

This rule is covariant: the direction of subtyping in the arguments to the type constructor (\(*\)) is the same as the direction on the result of the constructor. In general, we say that a type constructor \(F\) has subtyping that is covariant in one of its arguments \(τ\) if whenever \(τ≤τ'\), it is the case that \(F(τ)≤F(τ')\).
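
Covariance of pairs is safe because a pair can only be read. The following Java sketch illustrates this under some assumptions: the `Pair` class, the `Person` and `Student` classes, and the `names` method are all hypothetical. Java's generic classes are invariant by default, so the sketch uses use-site wildcards (`? extends`) to obtain the covariant behavior that (PairSub) describes.

```java
class Person {
    String name() { return "person"; }
}

class Student extends Person {            // Student <= Person
    @Override String name() { return "student"; }
}

// An immutable pair: the only operations are reading the two components.
final class Pair<A, B> {
    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    A first() { return first; }
    B second() { return second; }
}

class PairDemo {
    // Accepts any Pair<A, B> with A <= Person and B <= Person,
    // which is the covariant behavior described by (PairSub).
    static String names(Pair<? extends Person, ? extends Person> p) {
        Person left = p.first();          // reading at the supertype is always safe
        Person right = p.second();
        return left.name() + "," + right.name();
    }
}
```

A `Pair<Student, Student>` can be passed to `names` without a cast, because nothing in the body of `names` can write a non-Student into the pair.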

Records

We are particularly interested in how to handle subtyping in the context of object-oriented languages. Record types correspond closely to object types and yield some useful insights.

A record is a collection of immutable named fields, each with its own type. We extend the grammar of \(e\) and \(τ\) to add support for record types:

\begin{align*} e & ::= \dots \bnf \{x_1 = e_1, \ldots, x_n = e_n\} \bnf e.x \\ v & ::= \dots \bnf \{x_1 = v_1, \ldots, x_n = v_n\} \\ τ & ::= \dots \bnf \{x_1:τ_1, \ldots, x_n:τ_n\} \end{align*}

with the following typing rules:

\[ \dfrac{Γ \vdash e_i\ty τ_i~~^{(\forall i∈1..n)}}{Γ \vdash \{x_1=e_1, \ldots, x_n=e_n\}:\{x_1\ty τ_1, \ldots, x_n\ty τ_n\}} \;\rulenm{Tuple} \]
\[ \dfrac{Γ \vdash e:\{x_1\ty τ_1, \ldots, x_i\ty τ_i, \ldots, x_n\ty τ_n\}}{Γ \vdash e.x_i:τ_i} \;\rulenm{Select} \]

What we can see from this is that the record type \(\{x_1\ty τ_1, \ldots, x_n\tyτ_n\}\) acts a lot like the product type \(τ_1*\dots*τ_n\). Records can be viewed as tagged product types of arbitrary length. This suggests that subtyping on records should behave like subtyping on products.

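There are actually three reasonable subtyping rules for records. Depth subtyping makes a record type covariant in the type of each field, exactly as for products; this is sound because the fields are immutable. Width subtyping makes a record type with more fields a subtype of one with fewer, because code that expects the smaller record never selects the extra fields. Permutation subtyping makes the order in which the fields are listed irrelevant.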

The depth and width subtyping rules for records can be combined to yield a single equivalent rule that handles all transitive applications of both rules:
\[ \dfrac{m ≤ n \qquad τ_i ≤ τ_i'~~^{(\forall i∈1..m)}}{\{x_1:τ_1, \ldots, x_n:τ_n\} ≤ \{x_1:τ_1', \ldots, x_m:τ_m'\}} \;\rulenm{Width} \]
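
The combined rule translates almost line for line into a checker. The sketch below is a simplified illustration, not a full implementation: the names `RecordSubtyping`, `Field`, `baseSub`, and `recordSub` are hypothetical, field types are restricted to base type names, and the base relation is assumed to contain only reflexivity and Student ≤ Person. Like the rule, it compares fields position by position and does not implement reordering of fields.

```java
import java.util.List;

class RecordSubtyping {
    // One field "name : type"; field types are base type names for simplicity.
    static final class Field {
        final String name;
        final String type;

        Field(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // Assumed base relation: reflexivity plus the single fact Student <= Person.
    static boolean baseSub(String a, String b) {
        return a.equals(b) || (a.equals("Student") && b.equals("Person"));
    }

    // Checks {x_1:t_1, ..., x_n:t_n} <= {x_1:t_1', ..., x_m:t_m'}.
    static boolean recordSub(List<Field> sub, List<Field> sup) {
        if (sup.size() > sub.size()) {
            return false;                       // premise m <= n fails
        }
        for (int i = 0; i < sup.size(); i++) {
            Field f = sub.get(i);
            Field g = sup.get(i);
            if (!f.name.equals(g.name)) {
                return false;                   // the i-th fields must have the same name
            }
            if (!baseSub(f.type, g.type)) {
                return false;                   // premise t_i <= t_i' fails
            }
        }
        return true;                            // extra trailing fields of sub are ignored
    }
}
```

For instance, under these assumptions the record type {owner: Student, id: int} is accepted as a subtype of {owner: Person}, using depth subtyping on the first field and width subtyping to drop the second.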

Function subtyping

Based on the subtyping rules we have encountered up to this point, our first impulse is perhaps to write down something like the following to describe the subtyping relation for function types:

\[ \dfrac{τ_1 ≤ τ_1' \qquad τ_2 ≤ τ_2'}{τ_1 → τ_2 ≤ τ_1' → τ_2'} \;\rulenm{BrokenFunSub} \]

However, this rule is incorrect, despite being adopted by the Eiffel programming language and, more recently, TypeScript! To see why, consider the following code snippet:

f: τ1→τ2 = f1
f′: τ1′→τ2′ = f
v: τ1′ = ...
f′(v)

In the example above, since the type of f is a subtype of the type of f′ (according to the broken rule), we should be able to use f where f′ is expected. Therefore we should be able to call f′(v), which actually invokes f on v. But f expects an input of type τ1 and instead gets an input of type τ1′, so it must be possible to use a τ1′ wherever a τ1 is expected. This in fact implies that we should have τ1′ ≤ τ1 instead of τ1 ≤ τ1′ as given.

Function subtyping

We can derive the correct subtyping rule for functions by thinking about the figure above. The outer box represents a context expecting a function of type \(τ_1'→τ_2'\). The inner box is a function of type \(τ_1→τ_2\). The arrows show the direction of flow of information. So arguments from the outside of type \(τ_1'\) are passed to a function expecting \(τ_1\). Therefore we need \(τ_1'≤τ_1\). Results of type \(τ_2\) are returned to a context expecting \(τ_2'\). Therefore we need \(τ_2 ≤ τ_2'\). The rule is:

\[ \dfrac{τ_1' ≤ τ_1 \qquad τ_2 ≤ τ_2'}{τ_1→τ_2 ≤ τ_1'→τ_2'} \;\rulenm{FunSub} \]

The function subtyping rule is our first example of contravariance in subtyping—the direction of the subtyping relation is reversed in the premise for the first argument (\(τ_1\)).
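
Java's `java.util.function.Function` interface can be used to experiment with (FunSub). The following is a sketch under stated assumptions: the classes `Person`, `Student`, and `TA` and the names `promote` and `apply` are hypothetical. Because Java generics are invariant, the two variances have to be written out with wildcards: `? super Student` for the contravariant argument and `? extends Person` for the covariant result.

```java
import java.util.function.Function;

class Person {
    String name() { return "person"; }
}

class Student extends Person {            // Student <= Person
    @Override String name() { return "student"; }
}

class TA extends Student {                // TA <= Student
    @Override String name() { return "TA"; }
}

class FunSubDemo {
    // A function of type Person -> TA: it accepts any Person and returns a TA.
    static final Function<Person, TA> promote = p -> new TA();

    // A context expecting a function of type Student -> Person.
    // The argument type may be any supertype of Student (contravariant);
    // the result type may be any subtype of Person (covariant).
    static String apply(Function<? super Student, ? extends Person> f, Student s) {
        Person result = f.apply(s);       // s flows in, a Person flows out
        return result.name();
    }

    public static void main(String[] args) {
        // Person -> TA is used where Student -> Person is expected.
        System.out.println(apply(promote, new Student()));
    }
}
```

Passing a `Function<TA, Person>` to `apply` is rejected by the Java compiler, because TA is not a supertype of Student: such a function could be handed a Student that is not a TA.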

The rule for function subtyping determines how object-oriented languages can soundly permit subclasses to override the types of methods. If we write a declaration C extends D in a Java program, for example, it had better be the case that the types of methods in C are subtypes of the corresponding types in D. Checking this is known as checking the conformance of C with D.

It is sound for object-oriented languages to check conformance by using a more restrictive rule than (FunSub), and Java does: as of Java 1.5, it allows the return types of methods to be refined covariantly in subclasses (the \(τ_2≤τ_2'\) part of the rule above), which is sound. Java doesn't, however, permit contravariant generalization of method arguments, probably because it would interact poorly with the Java overloading mechanism.
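
The difference between the two halves of the rule can be observed in a small Java experiment. The classes below (`Base`, `Derived`, `ConformanceDemo`, and their methods) are hypothetical and exist only for illustration. Refining the return type yields a genuine override, whereas generalizing the parameter type silently declares a separate overloaded method.

```java
class Person { }

class Student extends Person { }          // Student <= Person

class Base {
    Person find(Student key) { return new Person(); }
    String visit(Student s) { return "Base.visit(Student)"; }
}

class Derived extends Base {
    // Covariant return type: Student <= Person, so this overrides Base.find.
    @Override
    Student find(Student key) { return key; }

    // The parameter type is generalized from Student to Person. Java treats this
    // as a new overload, not an override; adding @Override here would be a
    // compile-time error.
    String visit(Person p) { return "Derived.visit(Person)"; }
}

class ConformanceDemo {
    // Calls visit through a reference whose static type is Base.
    static String viaBase(Base b, Student s) {
        return b.visit(s);
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        Student s = d.find(new Student());            // no cast needed
        System.out.println(viaBase(d, s));            // runs Base.visit(Student)
        System.out.println(d.visit(new Person()));    // runs Derived.visit(Person)
    }
}
```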

It took a surprisingly long time for everyone to agree on the right subtyping rule for functions. The broken rule (BrokenFunSub) was actually used for conformance checking in the language Eiffel. Run-time type-checking had to be added later to make the language type-safe. More recent work on family inheritance mechanisms such as virtual classes and nested inheritance shows how to soundly permit some of the covariant overriding that the Eiffel designers wanted.

Subtyping rules for arrays

What about subtyping on array types?

\begin{eqnarray*} τ & ::= & \ldots \bnf τ\,\texttt{[]} \end{eqnarray*}

For the subtyping rule, our first impulse, especially if we have been doing a lot of Java programming, might be to write down a covariant rule:

\[ \dfrac{τ_1 ≤ τ_2}{τ_1\,\texttt{[]} ≤ τ_2\,\texttt{[]}} \;\rulenm{JavaArraySub} \]

However, this rule is again incorrect. To see why, consider the following example:

x: TA[] = new TA[1]
y: Person[] = x
y[0] = undergrad1
x[0].gradeAssignment() // something that only TAs can do!

Even though this code type-checks with the given subtyping rule for array types, it will cause a run-time error, because in the last line x[0] does not evaluate to a TA. To avoid this problem, the subtyping relation on array types should be invariant in \(τ\): the only subtyping relationship between array types is the reflexive relationship that already holds for all types.

Since the equivalent code type-checks in Java, you might be wondering how Java can get away with a covariant subtyping rule. The answer is that Java uses extra run-time checking. At the third line, the assignment to y[0] will fail with a run-time exception: ArrayStoreException. The extra run-time checking not only makes code less robust but also imposes a significant run-time cost on use of arrays.
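
The run-time check is easy to observe. The following Java sketch uses hypothetical classes `Person`, `Student`, and `TA` and a hypothetical helper `storeIsRejected`; it performs the same sequence of steps as the example above and reports whether the store into the array was rejected.

```java
class Person { }

class Student extends Person { }          // Student <= Person

class TA extends Student {                // TA <= Student
    String gradeAssignment() { return "graded"; }
}

class ArrayCovarianceDemo {
    // Returns true if the store through the Person[] alias is rejected at run time.
    static boolean storeIsRejected() {
        TA[] x = new TA[1];
        Person[] y = x;                   // accepted statically: Java arrays are covariant
        try {
            y[0] = new Student();         // a Student that is not a TA
            return false;                 // not reached: the store check fails first
        } catch (ArrayStoreException e) {
            return true;                  // the JVM checked the array's element type
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected());
    }
}
```

The check happens on the store because the array object remembers that it was created as a `TA[]`, regardless of the static type of the reference used to access it.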

Type checking with subtyping

Since the subsumption rule is not syntax-directed, including it directly in the type checker might seem to make type checking very expensive. At each call to the type checker, the syntax doesn't tell us whether to use subsumption or not, suggesting that we might have to do an exponential-time search. Fortunately, such a search is normally not needed.

Instead, we fold all possible uses of subsumption into each of the ordinary syntax-directed typing rules. This is made possible by designing each rule so that it produces a principal type: a best possible type that loses no information. In the case of subtyping, the principal type should be a subtype of all other types the expression might have. We then adjust all typing rule premises so that they do not require subexpressions to have the exact type expected, but instead any subtype of the expected type.

For example, let us consider the rule for assigning to an array element in Xi:
\[ \dfrac{Γ ⊢ e_1 : τ\texttt{[]} \qquad Γ ⊢ e_2 : \texttt{int} \qquad Γ ⊢ e_3 : τ}{Γ ⊢ e_1\texttt{[}e_2\texttt{]} = e_3 : 1 ⊣ Γ} \;\rulenm{ArrAssign} \]

In the presence of subtyping, the type derived for \(e_3\) by the syntax-directed rule might be a subtype of the desired type \(τ\). Therefore, we replace that premise with the premises of the corresponding instance of the subsumption rule:
\[ \dfrac{Γ ⊢ e_1 : τ\texttt{[]} \qquad Γ ⊢ e_2 : \texttt{int} \qquad Γ ⊢ e_3 : τ_3 \qquad τ_3 ≤ τ}{Γ ⊢ e_1\texttt{[}e_2\texttt{]} = e_3 : 1 ⊣ Γ} \;\rulenm{ArrAssignSub} \]

This recipe works well for most typing rules. One issue that can come up is when two premises are required to produce the same type. For example, suppose that we wanted a “ternary expression” à la C or Java. In the absence of subtyping, the rule is straightforward:
\[ \dfrac{Γ ⊢ e_1 : \texttt{bool} \qquad Γ ⊢ e_2 : τ \qquad Γ ⊢ e_3 : τ}{Γ ⊢ (e_1 \texttt{?} e_2 \texttt{:} e_3) : τ} \;\rulenm{Ternary} \]

Now imagine using subsumption to derive the types of both \(e_2\) and \(e_3\).
\[ \dfrac{Γ ⊢ e_1 : \texttt{bool} \qquad \dfrac{Γ ⊢ e_2 : τ_2 \quad τ_2 ≤ τ}{Γ ⊢ e_2 : τ} \qquad \dfrac{Γ ⊢ e_3 : τ_3 \quad τ_3 ≤ τ}{Γ ⊢ e_3 : τ}}{Γ ⊢ (e_1 \texttt{?} e_2 \texttt{:} e_3) : τ} \;\rulenm{Ternary} \]

Unlike with the previous example, we cannot just read out a new syntax-directed rule from the five premises at the top, because they do not say how to choose \(τ\). However, recall that we want the rule to derive the principal type of the expression: that is, the least \(τ\) that is above both \(τ_2\) and \(τ_3\). We can express this as an additional premise using the least upper bound (join) operator \(⊔\):
\[ \dfrac{Γ ⊢ e_1 : \texttt{bool} \qquad Γ ⊢ e_2 : τ_2 \qquad Γ ⊢ e_3 : τ_3 \qquad τ = τ_2 ⊔ τ_3}{Γ ⊢ (e_1 \texttt{?} e_2 \texttt{:} e_3) : τ} \;\rulenm{Ternary} \]

Taking the least upper bound is not necessarily possible for an arbitrary subtyping relation. For example, in Java two classes C1 and C2 can each implement two interfaces I1 and I2. Both I1 and I2 are upper bounds for the two classes but neither one is more precise than the other. In this case the type checker may do some approximation by coarsening the result type upward in the subtyping order. Alternatively it may introduce a new type constructor for constructing the least upper bound of two types—at the cost of new complexity elsewhere in the type system and type checker.
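
When the subtyping order is a tree, as it is for Java classes if interfaces are ignored, the join always exists and is easy to compute. The sketch below is an illustration under that assumption: `JoinDemo.join` and the classes `Person`, `Student`, `Staff`, and `TA` are hypothetical names, and both arguments are assumed to be class types (not interfaces or primitive types). It climbs the superclass chain of the first type until it reaches a supertype of the second.

```java
class Person { }

class Student extends Person { }          // Student <= Person

class Staff extends Person { }            // Staff <= Person

class TA extends Student { }              // TA <= Student

class JoinDemo {
    // Least upper bound of two class types in the single-inheritance class tree.
    // Interfaces are deliberately ignored, so the ambiguity described in the
    // text cannot arise here.
    static Class<?> join(Class<?> a, Class<?> b) {
        Class<?> c = a;
        while (!c.isAssignableFrom(b)) {  // stop once c is a supertype of b
            c = c.getSuperclass();        // otherwise climb one level
        }
        return c;                         // for class types, Object is reached at worst
    }

    public static void main(String[] args) {
        System.out.println(join(TA.class, Staff.class).getSimpleName());
    }
}
```

Under this scheme, a ternary expression whose branches have types TA and Staff would be given type Person, and one whose branches have types TA and Student would be given type Student.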