Much of the job of communication is to pass on information to someone. When we design a user interface, it communicates on our behalf. When we write code, including test code, we communicate our purpose for the code to someone else (which could be a future version of ourselves). Sometimes we communicate more obviously and directly, for instance around a whiteboard or in a design document.
Often there are simple ways we can increase how much information we convey, for little or no extra effort, and in the rest of the article I’ll give a few examples of this. Yes, this article is going to suggest that you try to turn the information up to 11.
I recently phoned up a company that had an Interactive Voice Response (IVR) system – one of those “For X press 1, for Y press 2 …” machines. At one point it said “Please enter the password. This is the account holder’s date of birth as an 8-digit number, for example 01011980”. The last part was spoken as “oh one, oh one, nineteen eighty” – by the pauses and the grouping of digits into numbers it was clear that it was 01|01|1980.
This struck me as a missed opportunity to convey some information. First, let’s look at what information is in the example, based on the pattern matching it assumes users will do:
- 2 digits each for day and month, 4 digits for the year;
- Left pad day and month with 0;
- The year is last.
Information it doesn’t give:
- Is it DDMMYYYY or MMDDYYYY?
They had a free choice of their example, so they could equally have given e.g. 27011980. Compared to the previous example:
- It tells you it’s DDMMYYYY, because there aren’t 27 months in the year;
- It doesn’t tell you about left padding the DD with 0, but it can rely on the normal tendency for people to assume patterns hold (i.e. people will assume DD and MM work the same).
You might think that it’s obvious from the first example that it’s DDMMYYYY, but I refer you to a proper usability guru – Paul Boag – and what he says about cognitive load and why and how you should reduce it. If you can easily avoid having to make your users think, then why not do so?
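As a rough sketch of the difference between the two examples, the Python below checks which layouts an 8-digit example could be read as. The function name `plausible_layouts` is my own invention for illustration, and it only considers the two layouts discussed above.

```python
from datetime import date


def plausible_layouts(example: str) -> list[str]:
    """Return the layouts under which an 8-digit example is a real calendar date."""
    first, second, year = int(example[0:2]), int(example[2:4]), int(example[4:8])
    layouts = []
    # Try reading the first two pairs of digits as day/month, then as month/day.
    for name, day, month in (("DDMMYYYY", first, second), ("MMDDYYYY", second, first)):
        try:
            date(year, month, day)  # raises ValueError for e.g. month 27
        except ValueError:
            continue
        layouts.append(name)
    return layouts


print(plausible_layouts("01011980"))  # fits both layouts, so the reader has to guess
print(plausible_layouts("27011980"))  # fits only DDMMYYYY
```

An example that fits only one layout does the reader's thinking for them.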
I’ve already written about how you can make tests more vicious by not having all your test data use the id 1. As well as making the tests more vicious, this also makes them communicate more information.
- If your test data has only one row per table, and all of them use the id 1, then referring to an id of 1 means data from some table.
- If your test data has two or more rows per table, and each table starts its ids at 1, then referring to an id of 1 means the first row of data from some table.
- If your test data has two or more rows per table, and each table starts its ids at a unique number, then referring to an id of e.g. 3001 means the first row of data from this table.
You can see how adding a second row per table and changing how ids are chosen increases how much information is represented by just a number.
Note that I think that you shouldn’t have the various ids (1003, 2007 and so on) dotted through your code. You should refer to them via constants instead, to pass on even more information.
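Here is a minimal sketch of what that could look like in Python. The tables, names and id ranges are all hypothetical, and the in-memory dictionaries stand in for whatever test database you actually use.

```python
# Hypothetical test data: each table's ids start at a different number,
# and each id gets a constant that says which row it is.
CUSTOMER_ALICE_ID = 1001
CUSTOMER_BOB_ID = 1002
ORDER_ALICE_BOOK_ID = 2001
ORDER_BOB_LAMP_ID = 2002

CUSTOMERS = {CUSTOMER_ALICE_ID: "Alice", CUSTOMER_BOB_ID: "Bob"}
ORDERS = {
    ORDER_ALICE_BOOK_ID: {"customer_id": CUSTOMER_ALICE_ID, "item": "book"},
    ORDER_BOB_LAMP_ID: {"customer_id": CUSTOMER_BOB_ID, "item": "lamp"},
}


def customer_name_for_order(order_id: int) -> str:
    """Look up the name of the customer who placed the given order."""
    order = ORDERS[order_id]
    return CUSTOMERS[order["customer_id"]]


def test_order_belongs_to_bob():
    # The constant names say which row and which table are meant.
    assert customer_name_for_order(ORDER_BOB_LAMP_ID) == "Bob"


test_order_belongs_to_bob()
```

Because no id is shared between tables, passing a customer id where an order id is expected fails with a `KeyError` instead of quietly finding some other row.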
Examples in designs and documentation
Sometimes we need to give examples of simple maths – for instance, how much better one option is over another, or how charges and refunds flow from signing up to a service for only some of a month rather than the whole month.
The power produced by a wind turbine depends on the square of the blade length and on the cube of the wind speed. To help someone appreciate what this means you might say: if the blade length goes up by a factor of 2, then output goes up by a factor of 4.
This is true, but as 2 * 2 = 4 as well as 2 ^ 2 = 4, this isn’t as clear as it could be. The meaning might get buried under the doubling relationship. If instead you said: if the blade length goes up by a factor of 3, then output goes up by a factor of 9, then you avoid the doubling relationship. I realise that there’s tripling that might play the same role in this version as doubling did in the previous one. However, I think that tripling is something we do much less often than we do doubling, so tripling will get in the way of things less.
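To make the coincidence concrete, here is a small Python sketch. The function name `output_factor` is made up for illustration, and it assumes nothing beyond the relationship stated above (output scales with the square of blade length and the cube of wind speed).

```python
def output_factor(blade_factor: float, wind_factor: float = 1) -> float:
    """Relative change in output when blade length and wind speed are scaled."""
    return blade_factor ** 2 * wind_factor ** 3


for factor in (2, 3):
    squared = output_factor(factor)  # the relationship we want to convey
    doubled = factor * 2             # the misreading we want to rule out
    print(f"blade x{factor}: squaring gives x{squared}, doubling gives x{doubled}")
```

For a factor of 2 the two readings both give 4, so the example can't tell them apart. For a factor of 3 they give 9 and 6, so only the squared reading fits.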
Similarly, if you need to split a month up into a period where someone uses a service and a period where they don’t, it’s easy to default to the split being half and half. I usually try to avoid this, unless I really do want the example to be one where the two periods are of identical duration. If the reason why you’re going for a half and half split is to keep some maths simple, you can often still do this with other numbers. For instance, say that the month is April (30 days), the total charge is £60 (so that we’re avoiding the unwanted meaning that the charge is always equal to the number of days), and then the two periods are 1st-7th April and 8th-30th April. At £2 per day, this gives a £14 charge (and a £46 refund if one is due).
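The arithmetic can be sketched in Python as follows. The function name `split_charge` is hypothetical, amounts are in pence to avoid floating-point money, and rounding to the nearest penny is my assumption rather than a rule from any real billing system.

```python
def split_charge(total_pence: int, days_in_month: int, days_used: int) -> tuple[int, int]:
    """Return (charge, refund) in pence for a partly used month."""
    # Assumed policy: charge in proportion to days used, rounded to the nearest penny.
    charge = round(total_pence * days_used / days_in_month)
    refund = total_pence - charge
    return charge, refund


# The April example: 30 days, £60 in total, service used 1st-7th April.
print(split_charge(6000, 30, 7))

# The example to avoid: £30 over 30 days, split half and half.
print(split_charge(3000, 30, 15))
```

The first call gives £14 and £46, two numbers that can't be confused with each other or with the number of days. The second gives £15 and £15 for 15 days each, so charge, refund and duration all look alike.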
The things we’ve been talking about – code, design, documentation etc. – contain information, and they also contain other things that carry little to no information. (In text it’s the difference between words like the, with and an on one side and ostrich, fashionably and defenestrate on the other.) As well as trying to increase the amount of information, I think it’s also a good idea to try to increase the density of information – how much information there is compared to the size of the whole thing.
There are a couple of metaphors that might help you get what I mean. If you’re listening to an FM radio that’s not tuned properly, you’ll hear a mixture of the music you want (the signal) and hiss that you don’t want (the noise). There are a couple of things you could do. If you turn up the volume, then both the signal and the noise increase. You get more signal, but you also get more noise and so it’s probably no easier to hear what you want. If instead you tune the radio correctly then the signal will increase and the noise will decrease – you improve the signal to noise ratio.
In photography if you open the camera’s aperture wider then the depth of field decreases. This means that less of the world is in focus and more of it is blurry. One result of that is the viewer’s attention is constrained to the subject (the person or thing that’s important) and avoids the background.
In terms of code this is a long-winded way of saying Don’t Repeat Yourself (and don’t forget that your tests are code too).
When you’re writing code, you might be used to thinking about its effectiveness and efficiency in terms of meeting its requirements. (How many kinds of requests it can handle, and how much time or memory it takes to process a request, for instance.) I think it’s also a good idea to think about its effectiveness and efficiency in how it communicates its purpose to humans (which includes the future version of you). How much information does it convey, and how much other stuff is there diluting the information?
It’s not just code. Any time you’re trying to communicate, ask: how much information is there, and how much is it diluted? (Note that there are other things to think about too for communication that’s not coding – understanding your audience, structure, pacing, appropriate language, humour, intentional repetition to reinforce things and so on.)
It might be that you can increase the amount of information and the information density for little to no extra effort.
2 thoughts on “Turn the information up to 11”
Information can always be reduced from a point of surplus
year: 1970-2038 (wink wink)
This could easily be converted to a lossy format 01-01-1980, which may be unclear stand-alone without code or context. The same could perhaps be inferred or asserted for a few cases of 01-01-1980, but not the same way around.
❤ another fantastic article