Intro to Polymorphism

There are 3 basic kinds of polymorphism:

* Subtyping (or inclusion) polymorphism.
* Parametric polymorphism.
* Ad hoc polymorphism.

Subtyping arises when we can pass an object of type T1 to an operation (including a function or method) that expects a T2 and we don't get a type-error. Most modern languages, including Java and C#, have some notion of subtyping as the basic form of polymorphism.

Parametric polymorphism is a generalization of abstract data types, and languages such as ML and Haskell are notable for their inclusion of parametric polymorphism. Parametric polymorphism arises naturally for functions, such as the identity or list reversal, that manipulate objects in a way that's oblivious to the type. Unlike subtyping, it's possible to express input-output relationships in type signatures, which gives the caller more reasoning power. But subtype polymorphism tends to work better (or at least more easily) for dealing with heterogeneous data structures. Note, however, that you can simulate subtyping in a pretty straightforward way with parametric polymorphism, though it requires passing around coercion functions. You can also simulate parametric polymorphism in a language like Java, but it requires run-time type tests. So neither situation is ideal, and next-generation languages, including Java 1.5 and C#, include a notion of "generics" which generalizes both subtyping and parametric polymorphism.

Ad hoc polymorphism is everything else. Overloading is a good example. Something like "+" appears to work on both integers and reals, but note that we get different code for the different operations. Furthermore, "+" is only defined on some types, and there's no way to abstract over those types. Languages such as C++ provide a lot of support for ad hoc polymorphism through templates, but anyone who has used template libraries can tell you that this was not well thought through. Language researchers, particularly in the Haskell camp, have been working on better ways to support the power of ad hoc polymorphism (sometimes called polytypism), and this, coupled with notions of reflection, is a hot research topic. This is mostly because we don't have a clean handle on these concepts the same way we do subtyping and parametric polymorphism. The funny thing is that it's only in the last 15 years that we've gotten a good handle on those two!

---------------------------------------------------------------------
Subtyping:

This arises naturally if we think about the idea of "types as sets", for then "subtypes are subsets". More abstractly, the principle behind subtyping is that, if you can use a value of type T1 anywhere that you expect to have a value of type T2, then you ought to be able to treat T1 as a subtype of T2. For example, given record types:

    type point2d = {x:int, y:int}
    type point3d = {x:int, y:int, z:int}

then we ought to be able to pass a point3d to any function expecting a point2d, because all that function can do is select out the x or y components, and those operations are defined on point3d values as well as point2d values. If we think of point2d and point3d as interfaces (a la Java or C#), then it's natural to think that anything that implements the point3d interface also implements the point2d interface.

We can extend our little functional language with subtyping by adding one new rule called subsumption:

    G |- e : t1    t1 <= t2
    -----------------------
    G |- e : t2

The rule says that if e has type t1 and t1 is a subtype of t2, then e also has type t2.
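To make this concrete, here is a small SML sketch of the point2d/point3d example. SML itself has no record subtyping, so the step that the subsumption rule would perform silently has to be written out as an explicit function; the names dist2d and forget_z are made up for this illustration.

    type point2d = {x:int, y:int}
    type point3d = {x:int, y:int, z:int}

    (* An operation that only needs the x and y components. *)
    fun dist2d (p:point2d) : int = #x p * #x p + #y p * #y p

    (* An explicit witness for point3d <= point2d: forget the z component. *)
    fun forget_z (p:point3d) : point2d = {x = #x p, y = #y p}

    val p : point3d = {x=3, y=4, z=5}
    val d = dist2d (forget_z p)   (* = 25; with subsumption, dist2d p
                                     would type-check directly *)

The explicit forget_z also foreshadows the coercion-based view of subtyping discussed at the end of these notes.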
Of course, now we need to define what it means for one type to be a subtype of another. Remember, the principle to follow is that t1 <= t2 means that for any operation "op" you can perform on a t2 value, you should be able to perform "op" on a t1 value.

All subtyping frameworks yield a partial order, so they include rules for reflexivity and transitivity:

    t <= t

    t1 <= t2    t2 <= t3
    --------------------
    t1 <= t3

For something like (immutable) records, the subtyping rule might look like this:

    t1 <= t1'  ...  tn <= tn'
    --------------------------------------------------
    {l1:t1,...,ln:tn,ln+1:tn+1} <= {l1:t1',...,ln:tn'}

This rule says that a record type R1 is a subtype of a record type R2 if, for each l:t' in R2, there is a label l with some type t in R1, and furthermore, t <= t'. This rule is sound because the only operation we can apply to an R2 value is to extract a component using the #l operation. If #l is defined in R1, then the operation will succeed, and if it yields a value of type t, then since t <= t', subsumption lets us treat the result as a t'.

What about sums? Consider the datatypes:

    datatype t1 = C1 of int
    datatype t2 = C1 of int | C2 of bool
    datatype t3 = C1 of int | C2 of bool | C3 of real

The only operation on sums is to do a case on them. If I have a case that can handle all of the constructors of t3, then it's safe to pass in a t2 or a t1 value, because the set of constructors in those types is a subset of the set used in the definition of t3. So, for sums, the subtyping rule is the opposite -- the subtype must have fewer constructors:

    t1 <= t1'  ...  tn <= tn'
    ----------------------------------------------------------------------------
    [l1 of t1 | ... | ln of tn] <= [l1 of t1' | ... | ln of tn' | ln+1 of tn+1']

So, once again, sums are pleasingly dual to products. To see why this works out, consider the two definitions:

    datatype t1 = C1 of {x:int,y:int,z:int}
    datatype t2 = C1 of {x:int,y:int} | C2 of {b:bool}

According to the definition, t1 <= t2. Let us write a function that manipulates t2 objects:

    fun f (v:t2) =
      case v of
        C1 (r1:{x:int,y:int}) => #x r1 + #y r1
      | C2 (r2:{b:bool}) => 0

Is it safe to pass a t1 object to f? The answer is yes, because (a) f must have enough cases to cover all of the constructors in t1, since t2 has at least the constructors of t1, and (b) in any branch of the pattern match, f can only assume that the value carried by the constructor has the type given in t2, so it's safe for that value to actually have the type given in t1, as long as t1's carried type is a subtype of t2's for the given constructor. (I should remark that these rules only work for non-recursive datatypes.)

What about functions? The only operation on a function is to apply it to some argument. So, suppose I have a function:

    f : ({x:int,y:int} -> {x:int,y:int}) -> int

Suppose that I have another function:

    g : {x:int} -> {x:int,y:int,z:int}

Does it make sense to pass g to f? That is, what can go wrong if we do f(g)? Well, f might try to apply g to some argument. It thinks that g takes in {x:int,y:int} arguments, so it might pass an {x:int,y:int} value to g. Now g thinks that it's being given an argument of type {x:int}, so the only thing it can do is apply #x to the argument. That's okay, because conceptually we can use subtyping to treat the {x:int,y:int} value as if it's an {x:int} value. What about the result? f thinks that g returns {x:int,y:int}, so it might do #x or #y on the result. But g returns an {x:int,y:int,z:int} value. Again, this is okay, because the operations that f might do on g's result are all defined on it -- or, conceptually, we can apply subtyping to coerce g's result to {x:int,y:int} if you like. So it appears as if g's type is a subtype of the argument type of f. That is, we have:

    ({x:int} -> {x:int,y:int,z:int}) <= ({x:int,y:int} -> {x:int,y:int})

In general, the subtyping rule for functions is:

    t1' <= t1    t2 <= t2'
    --------------------------
    (t1 -> t2) <= (t1' -> t2')

So note that the argument types are *contra-variant* while the result types are *co-variant*. There are a lot of ways to see why contra-variance in the argument type is necessary. The example above demonstrates this, but so does drawing a Venn diagram of sets, subsets, and mappings from one set to another. Or, from a logical standpoint, we can think of (t1 -> t2) as similar to (not(t1) or t2) --- the negation on the argument suggests that we'll have to flip the subtyping around.
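Jumping ahead to the coercion view developed in the next section, contravariance has a pleasantly direct rendering in SML: a coercion between function types is built by pre-composing with an argument coercion (running in the flipped direction) and post-composing with a result coercion. The names cArg, cRes, and coerceFn are made up for this sketch, which reuses the f/g example above.

    (* Hand-written witnesses for {x:int,y:int} <= {x:int} and
       {x:int,y:int,z:int} <= {x:int,y:int}: *)
    fun cArg (r : {x:int, y:int}) : {x:int} = {x = #x r}
    fun cRes (r : {x:int, y:int, z:int}) : {x:int, y:int} =
        {x = #x r, y = #y r}

    (* Coercion witnessing the function rule: contravariant on the
       argument (cArg), covariant on the result (cRes). *)
    fun coerceFn g = cRes o g o cArg

Given g : {x:int} -> {x:int,y:int,z:int}, the value coerceFn g has type {x:int,y:int} -> {x:int,y:int}, which is exactly what f expects.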
-----------------------------------------------------------------------
Subtypes as Subsets vs. Coercions:

Earlier, I said that we can think of subtypes as "subsets", and this is the general principle that most languages follow. However, it requires that we have a uniform representation for values that lie in related types. For instance, in Java, all Objects are one word so that we can pass them around, stick them in data structures, etc. But that means that double values are not a subtype of Object. A *boxed* double value (e.g., Double) is a subtype of Object, and we can coerce a double to a Double. So, in some sense, double is a "coerceable subtype" of Object.

In fact, we can think of all of the subtyping rules as being witnessed by some coercion function. In particular, if we think of t1 <= t2 as a function C : t1 -> t2, then the subsumption rule becomes:

    G |- e : t1    C : t1 -> t2
    ---------------------------
    G |- C e : t2

The other coercions can be calculated as follows:

    \x:t.x : t -> t                  (reflexivity is the identity)

    C1 : t1 -> t2    C2 : t2 -> t3
    ------------------------------   (transitivity is composition)
    (C2 o C1) : t1 -> t3

Then record subtyping is witnessed by this coercion:

    C1 : t1 -> t1'  ...  Cn : tn -> tn'
    -------------------------------------------------------------------
    \r:{l1:t1,...,ln:tn,ln+1:tn+1}. {l1 = C1(#l1 r),...,ln = Cn(#ln r)}
      : {l1:t1,...,ln:tn,ln+1:tn+1} -> {l1:t1',...,ln:tn'}

Coercion-based subtyping is simpler from the perspective of the language designer and implementor, and it doesn't require us to have uniform representations for objects. But coercions aren't so attractive for the programmer. First, they cost time. Second, they don't work for mutable objects, since they fundamentally operate by making a copy of the object. (You can get around this by providing a level of indirection; that's what Java interfaces do.)

Another issue with coercions is that you must have some notion of coherence. This means that, no matter how we prove that the code type-checks, its meaning doesn't change. In particular, if I have a proof P1 that |- e : t and another proof P2 that |- e : t, then whether I use P1 or P2 to insert the coercions shouldn't matter: I should get the same value out at the end of the day.

Note that the coercions from double to Double in Java are not coherent. The problem is that if d is a double, then

    d == d

does not return the same thing as

    (Double)d == (Double)d

Similar problems arise in C. In general, (implicit) coercions turn out to be a good idea only in pure, functional languages. In imperative languages, it's almost always a disaster (witness the problems of arithmetic in C/C++).
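To see the coherence problem in miniature, suppose (hypothetically) that int <= real were witnessed by the SML coercion function real. Then two different typing derivations for the same source expression would insert the coercion in different places and compute different answers. SML makes no such implicit coercion; the sketch below just performs the two elaborations by hand.

    (* Derivation 1: type the division at int, then coerce the result. *)
    val a = real (13 div 2)      (* = 6.0 *)

    (* Derivation 2: coerce the operands, then divide at real. *)
    val b = real 13 / real 2     (* = 6.5 *)

This is exactly the flavor of trouble that implicit arithmetic conversions cause in C and C++.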