Covariance and contravariance – part 1: Arrays and lists

This post is the first in a series – for once I will split a large topic into a few small posts. The series is about covariance and contravariance, together known as variance.

Covariance and contravariance are terms I came across occasionally and never understood properly. Having put in the effort to work out what was going on, I thought I should write something that I hope will make them easier for other people to understand. There are lots of other resources you can turn to if this doesn’t make sense – I’ll refer to them at the end of this post.

I’ll be explaining them in the context of C#; I don’t know other languages well enough to be confident of explaining things using them. I guess that the concepts are fairly universal, but details will probably vary between languages.

Why should you care? Using variance means that you’re able to do things with variables and values of different types that polymorphism on its own wouldn’t allow. This generally means things are more flexible and life is easier. The nice thing is that the compiler and the runtime are still able to spot when you’re trying to do something stupid and stop you. A small but important detail is that this all involves reference types and not value types. That means that variance gives flexibility to how you use classes, but doesn’t give the equivalent flexibility to how you use structs.
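Here is a minimal sketch of the reference type / value type detail, using arrays because they are the only variant thing covered in this post. The class name ValueTypeVarianceDemo and the helper CanAssign are made up for illustration; Type.IsAssignableFrom is the standard reflection call for asking whether an assignment would be allowed.

```csharp
using System;

public static class ValueTypeVarianceDemo
{
    // Hypothetical helper: true when a value of type 'from' can be
    // assigned to a variable of type 'to'.
    public static bool CanAssign(Type to, Type from)
    {
        return to.IsAssignableFrom(from);
    }

    public static void Main()
    {
        // string is a reference type, so this assignment compiles.
        object[] objects = new string[] { "a", "b" };
        Console.WriteLine(objects.Length); // 2

        // int is a value type, so the equivalent assignment does not compile:
        // object[] boxes = new int[] { 1, 2 };

        Console.WriteLine(CanAssign(typeof(object[]), typeof(string[]))); // True
        Console.WriteLine(CanAssign(typeof(object[]), typeof(int[])));    // False
    }
}
```

The commented-out line is the interesting one: an int can be boxed into an object, but that is a conversion that changes the representation, so an int[] can't simply be treated as an object[].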

If there’s some existing variant code, you can just use it without doing anything out of the ordinary. If you want to create variant code, you will probably have to mark the generic type parameters of your interfaces or delegates with the modifiers in and/or out. (I’ll explain this in a later post.)

Sample code

Here is some code that compares a few related things. It relies on a pair of classes – Animal and Bird – which aren’t defined here. It doesn’t matter what they’re like other than that Bird is derived from Animal (as you might have guessed) and they both have parameterless constructors because I’m too lazy to provide arguments.

	// part 1
	Animal animal;
	Bird bird = new Bird();

	animal = bird;

	// part 2
	List<Animal> animalList;
	List<Bird> birdList = new List<Bird> { new Bird() };

	// this doesn't compile
	// animalList = birdList;

	// part 3
	animalList = new List<Animal>();
	animalList.Add(bird);

	// part 4
	Animal[] animalArray;
	Bird[] birdArray = new Bird[2];

	animalArray = birdArray;

Part 1 – Simple polymorphism

Polymorphism lets you assign an object of one class to a variable of any of its ancestor classes. A Bird will do everything that an Animal could do, so this is safe. Whether the variable will have access to the derived class’s behaviour depends on things like virtual methods. So far this relies on polymorphism but not variance.
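To make the point about virtual methods concrete, here is a small sketch. The members Describe and Kind are invented for this example, since the post doesn't say what Animal and Bird actually contain. Describe is virtual and overridden; Kind is non-virtual and merely hidden in Bird.

```csharp
using System;

// Hypothetical definitions, for illustration only.
public class Animal
{
    public virtual string Describe() { return "animal"; }
    public string Kind() { return "animal"; }
}

public class Bird : Animal
{
    public override string Describe() { return "bird"; }
    public new string Kind() { return "bird"; } // hides, doesn't override
}

public static class PolymorphismDemo
{
    public static void Main()
    {
        Bird bird = new Bird();
        Animal animal = bird; // same assignment as part 1

        Console.WriteLine(animal.Describe()); // bird   (virtual: runtime type decides)
        Console.WriteLine(animal.Kind());     // animal (non-virtual: variable type decides)
        Console.WriteLine(bird.Kind());       // bird
    }
}
```

Both variables refer to the same object; only the type of the variable you go through differs.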

Part 2 – Generic types cause problems

It might occasionally be nice to be able to assign a List<Bird> to a List<Animal>, which you might expect because Bird is derived from Animal. But that misses an important detail – while Bird is derived from Animal, List<Bird> is not derived from List<Animal>. This is the first surprise (a slightly unpleasant one).

Part 3 – Zooming in on list members

This is the second surprise – it’s a surprise given the problem in part 2. It’s important to notice here that we’re not moving whole lists around – we’re adding a single element to the list. When you add a single element to a list, you are effectively back to part 1 again. The role of the variable is taken by a single slot or element in the list, and the role of the value is taken by the thing you’re adding to the list / putting in that slot.
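Putting parts 2 and 3 together: you can't assign the whole list, but you can copy it one element at a time, because each Add is just the part 1 conversion again. This is only a sketch under my own naming (ListCopyDemo and ToAnimalList are made up), with empty stand-ins for Animal and Bird.

```csharp
using System;
using System.Collections.Generic;

// Empty stand-ins; the real classes can contain anything.
public class Animal { }
public class Bird : Animal { }

public static class ListCopyDemo
{
    // Hypothetical helper: builds a List<Animal> from a List<Bird>.
    public static List<Animal> ToAnimalList(List<Bird> birds)
    {
        List<Animal> animals = new List<Animal>();
        foreach (Bird bird in birds)
        {
            animals.Add(bird); // same conversion as "animal = bird" in part 1
        }
        return animals;
    }

    public static void Main()
    {
        List<Bird> birdList = new List<Bird> { new Bird(), new Bird() };
        List<Animal> animalList = ToAnimalList(birdList);

        Console.WriteLine(animalList.Count);      // 2
        Console.WriteLine(animalList[0] is Bird); // True
        // The new list holds the same objects, not copies of them.
        Console.WriteLine(object.ReferenceEquals(animalList[0], birdList[0])); // True
    }
}
```

Note that the result is a separate list: adding to animalList afterwards has no effect on birdList.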

Part 4 – Arrays are covariant

This is the third and final surprise for this post. Unlike with List<Bird> and List<Animal> in part 2, you can assign a Bird[] to an Animal[]. This is because arrays are covariant. (There’s probably a valid reason why generic lists aren’t, but I don’t know it.)
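Array covariance does come with a catch, which may be part of the answer: reading through the Animal[] variable is always fine, but every write is checked at run time against the array's real element type. The sketch below shows this. Fish is a second derived class I've invented for the example, and TryStore is a made-up helper; ArrayTypeMismatchException is the exception the runtime throws for a bad store.

```csharp
using System;

public class Animal { }
public class Bird : Animal { }
public class Fish : Animal { } // hypothetical second derived class

public static class ArrayCovarianceDemo
{
    // Hypothetical helper: tries to store 'value' and reports whether
    // the runtime accepted it.
    public static bool TryStore(Animal[] array, int index, Animal value)
    {
        try
        {
            array[index] = value;
            return true;
        }
        catch (ArrayTypeMismatchException)
        {
            // The array is really a Bird[], and 'value' wasn't a Bird.
            return false;
        }
    }

    public static void Main()
    {
        Bird[] birdArray = new Bird[2];
        Animal[] animalArray = birdArray; // same assignment as part 4

        Console.WriteLine(TryStore(animalArray, 0, new Bird())); // True
        Console.WriteLine(TryStore(animalArray, 1, new Fish())); // False
        Console.WriteLine(birdArray[1] == null);                 // True: the Fish never got in
    }
}
```

Both stores compile, because the compiler only sees an Animal[]. It's the runtime that stops the Fish from ending up in a Bird[].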

Variance and type size

Covariance gets around the problem in part 4, which is similar to the problem in part 2: Bird[] is not derived from Animal[]. Variance (i.e. both covariance and contravariance) relies on the concept of a type’s size, and how type A’s size compares to type B’s. The possible answers are: bigger, smaller, the same, not related.

This is a looser definition than the base / derived relationship between types:

Animal is the base type for the derived type Bird.
List<Animal> is not the base type for the type List<Bird>.
Animal[] is a bigger type than Bird[] (and so Bird[] is smaller than Animal[]).
Animal is not related to e.g. int.

Even though Animal[] is not the base type for Bird[], a Bird[] should be able to cope in the context where you want an Animal[]. All the array-related properties and behaviours of an Animal[], such as its length and reading the element at a particular index, will be provided by a Bird[]. For a single element in the array, a Bird is able to meet all the requirements of an Animal. (This logic is why I’m puzzled that List<Animal> is not bigger than List<Bird>.)

You can think of big / small in terms of sets of values. The set of all possible Animal[]s is bigger than, and contains, the set of all possible Bird[]s.
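One way to poke at these relationships is with reflection. In the sketch below I'm treating "A is bigger than or the same as B" as "a variable of type A will accept a value of type B", which is what Type.IsAssignableFrom reports. That mapping is my own reading of the idea, and the names TypeSizeDemo and IsBiggerOrSame are made up.

```csharp
using System;
using System.Collections.Generic;

public class Animal { }
public class Bird : Animal { }

public static class TypeSizeDemo
{
    // Hypothetical helper: true when a variable of type 'bigger'
    // accepts a value of type 'smaller'.
    public static bool IsBiggerOrSame(Type bigger, Type smaller)
    {
        return bigger.IsAssignableFrom(smaller);
    }

    public static void Main()
    {
        Console.WriteLine(IsBiggerOrSame(typeof(Animal), typeof(Bird)));             // True
        Console.WriteLine(IsBiggerOrSame(typeof(List<Animal>), typeof(List<Bird>))); // False
        Console.WriteLine(IsBiggerOrSame(typeof(Animal[]), typeof(Bird[])));         // True
        Console.WriteLine(IsBiggerOrSame(typeof(Animal), typeof(int)));              // False

        // Bird[] is smaller than Animal[], but not derived from it.
        Console.WriteLine(typeof(Bird[]).BaseType == typeof(Array));                 // True
    }
}
```

The last line is the point of the whole section: the base type of Bird[] is System.Array, not Animal[], yet the assignment is still allowed.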

Covered in later posts

So far I’ve introduced covariance, but only briefly. I haven’t done anything on contravariance. I’ll address these in later posts. So far I’ve shown only one area where variance applies – arrays. There are two other areas – delegates and interfaces – which will also wait till later posts.

Other resources

C# guru Eric Lippert’s blog has lots on variance.

C# guru Jon Skeet’s book covers variance.

There are videos by people like Greg Kalapos on variance.