The compounding value of information

Information is one of those things where sometimes the whole is greater than the sum of the parts. That is, you get extra value from combining bits of information, on top of the value from the separate bits of information on their own. I’ll illustrate this with an example to do with spies, but then concentrate on what this means for more computer-related things. This touches on broader issues in society, and how we want to organise things.

The efficient finding of secret agents by combining information

Imagine that there is a secret agent somewhere in the UK, and you are trying to find them. Fortunately for you, they are broadcasting a radio message, and you have access to listening stations that can tell how far away a message is from the station.

The Oxford station knows that the message is coming from 15 miles away. That means that it’s somewhere along the circumference of this circle:

A map showing a circle around Oxford, with radius 15 miles

This circle has a circumference of 94 miles, so people on the ground would have 94 miles to search to find the agent. The Swindon station knows the message is coming from 15 miles away too, which means that it’s somewhere along the circumference of this circle:

A map showing a circle around Swindon with radius of 15 miles

That’s a (mostly) different 94 miles. However, if these two bits of information were relayed to headquarters and combined, then things become much easier for the searchers. These two circles intersect at only two places. If the Basingstoke station knows the signal is 40.3 miles away and this bit of information is combined with the other two bits, then the message can be coming from only one place:

A map showing circles around Swindon, Oxford and Basingstoke, intersecting at RAF Brize Norton

This happens to be RAF Brize Norton.
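To make the circle-combining concrete, here is a minimal sketch in Python. It assumes a flat map with made-up coordinates in miles (these are not the real positions of Oxford, Swindon or Basingstoke), and the function names circle_intersections and locate are my own, purely for illustration.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Return the points where two circles cross (0, 1 or 2 points)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []  # same centre, too far apart, or one circle inside the other
    # a = distance from c1 to the chord joining the crossing points
    a = (r1**2 - r2**2 + d**2) / (2 * d)
    # h = half the length of that chord (0 when the circles only just touch)
    h = math.sqrt(max(r1**2 - a**2, 0.0))
    mx = x1 + a * (x2 - x1) / d
    my = y1 + a * (y2 - y1) / d
    if h == 0:
        return [(mx, my)]
    ox = h * (y2 - y1) / d
    oy = h * (x2 - x1) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]

def locate(stations, tolerance=1e-6):
    """stations: list of ((x, y), distance) pairs.

    The first two stations give the candidate points; any further
    stations are used to throw away candidates that don't fit.
    """
    (c1, r1), (c2, r2), *rest = stations
    candidates = circle_intersections(c1, r1, c2, r2)
    return [p for p in candidates
            if all(abs(math.dist(p, c) - r) <= tolerance for c, r in rest)]

# Hypothetical stations on a flat grid, distances in miles.
two_stations = [((0, 0), 15), ((24, 0), 15)]
three_stations = two_stations + [((12, -31), 40)]
print(locate(two_stations))    # two candidate points
print(locate(three_stations))  # [(12.0, 9.0)]
```

With two stations the search is down to two points rather than 94 miles of circumference, and the third station removes the remaining doubt.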

Being lucky with less information

Sometimes the location can be pinpointed with only two bits of information. If the circles around two stations only just touched, i.e. they touched at only one point rather than at two, then you wouldn’t need a third station to help you pinpoint the location.

Similarly, if the stations couldn’t tell you how far away the signal was, but could tell you which direction it was coming from, then most of the time you would need only two stations:

A map showing lines travelling out from Oxford and Swindon, that intersect at RAF Brize Norton

If the signal was coming from a point on the straight line that passes through both stations, then both stations would be pointing along that same line, and you would still need a third station to tell you where to look on it.
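The direction-based version can be sketched in the same way. This is a minimal sketch, assuming a flat map, made-up station positions, and directions given as (x, y) vectors; it treats each direction as an infinite line for simplicity, and bearing_fix is a hypothetical name of my own.

```python
def bearing_fix(p1, d1, p2, d2, eps=1e-12):
    """Find where the line from p1 along d1 meets the line from p2 along d2.

    Returns None when the two directions are parallel, which includes
    the unlucky case where the source lies on the line through both stations.
    """
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < eps:
        return None  # the lines never meet, or lie on top of each other
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve p1 + t*d1 = p2 + s*d2 for t
    t = (dx * d2[1] - dy * d2[0]) / cross
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# Usual case: two stations are enough.
print(bearing_fix((0, 0), (1, 1), (4, 0), (-1, 1)))  # (2.0, 2.0)

# Unlucky case: the stations point straight at each other.
print(bearing_fix((0, 0), (1, 0), (4, 0), (-1, 0)))  # None
```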

I mention all these because they illustrate a few general things to do with how valuable information is when you combine it. If you imagine a graph where the X axis shows the number of bits of information you combine, and the Y axis shows the resulting value to you:

The graph might climb gradually upwards to the right, rather than being flat for a while and then jumping up. The gradually climbing graph is when you can gradually home in on something (such as a location).
The shape of this graph can vary based on the kind of information you have, e.g. how many bits of information you need before the graph reaches 100% of what you want, such as having filtered things down to a single place.
Sometimes the shape of the graph is influenced by probabilities, i.e. luck. If you have stations giving you directions, then most of the time you need only two stations to know the location.

Making it more personal

I’ll change to something closer to home for most people, which is buying things from a supermarket. If a supermarket is able to analyse what you buy over a long enough time, e.g. via a loyalty card, then they might start to pick up bits of information such as:

Whether someone in your household has a gluten intolerance or allergy
If your household is kosher, halal, vegetarian or vegan
The rough size of your household
Whether you have children and / or pets
Whether you drink or are tee-total
How stressed your budget is – based on whether you buy value/own brand, Taste the Difference, or in the middle

Even if the loyalty card doesn’t mean they know your address, and you don’t get a home delivery, they can know a location that’s important to you – as the shop you usually use will probably be the shop nearest to somewhere like where you live or work.

They can probably have a clue as to when you go away, e.g. on holiday, as your regular pattern of visits will be broken.

Note that these will all be clues, rather than facts. Taking vegetarianism – the shopping could give false positives or false negatives. It could be that you aren’t a vegetarian, but buy all your meat from a butcher. Or it could be that you are a vegetarian, but buy meat as a favour for a house-bound neighbour.

Also, it isn’t new that what you buy passes on information to the shop. In the days of a grocer behind the counter, such as in Open All Hours, the grocer would know all this. But you would also get to know the grocer (not to the same level of detail, but it wouldn’t be a one way street of information). It’s similar to how people have always been able to see what my home looks like, but before Google Maps street view I would see them at the same time if they stood on the pavement outside. The balance of the relationship has changed.

Sometimes there’s no safety in numbers

I hope I’ve illustrated how even low grade information can help you glean a surprising amount about someone if you collect enough of it. So you might think that, if you hold information about people, you can avoid the risk of accidentally leaking useful or sensitive information by publishing information only about groups of people. The idea is that you can’t get useful enough information about any individual from the information about groups to which they belong. Imagine that you decide that the smallest group you can say anything about is a group of 12 people.

Under this rule, I can then say the following about a workplace:

The average salary of the 13 people who work there is £28,308
The average salary of the 12 people who work there full-time is £30,000

Even though both of these groups are big enough to pass the rule, I can still say with 100% confidence that the 1 part-time worker has a salary of £8,004:

The first bit of information tells me that the total salary of all 13 people = 13 x £28,308 = £368,004.
The second bit of information tells me that the total salary of the 12 full-time workers = 12 x £30,000 = £360,000.
Combining the two bits of information – there’s only one person different between the two groups (the part-time worker) and so the difference in total salary must be due to just them. The difference in total salary is £8,004.
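The arithmetic is short enough to check in a few lines of Python. The figures are the ones given above; the function and variable names are just mine for illustration.

```python
def total_from_mean(mean, count):
    """An average multiplied by the group size gives back the group total."""
    return mean * count

all_staff_total = total_from_mean(28_308, 13)   # all 13 people
full_time_total = total_from_mean(30_000, 12)   # the 12 full-time people

# The only person in the first group but not the second is the part-timer.
part_timer_salary = all_staff_total - full_time_total
print(part_timer_salary)  # 8004
```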

The problem is that even though the separate groups are big enough, by combining them I can infer the existence of a group that is small. This small group is small enough that any information I can deduce about it is sharply focussed on just a few people (in this case, only one person). These hidden small groups can be hard to spot.
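One way to hunt for hidden small groups is to compare every pair of groups you plan to publish. This is only a rough sketch, assuming you know exactly who is in each group; the function name and the group labels are hypothetical, and it catches only the simple case where one group sits entirely inside another.

```python
from itertools import permutations

MIN_GROUP_SIZE = 12  # the smallest group we allow ourselves to talk about

def hidden_small_groups(groups, min_size=MIN_GROUP_SIZE):
    """groups: dict mapping a group name to the set of its members' ids.

    Returns (bigger group, smaller group, leftover members) for every pair
    where subtracting one group from the other exposes too few people.
    """
    problems = []
    for (big_name, big), (small_name, small) in permutations(groups.items(), 2):
        leftover = big - small
        # small < big means "small is a proper subset of big"
        if small < big and len(leftover) < min_size:
            problems.append((big_name, small_name, leftover))
    return problems

published = {
    "all staff": set(range(13)),   # people 0..12
    "full-time": set(range(12)),   # people 0..11
}
print(hidden_small_groups(published))
# [('all staff', 'full-time', {12})]
```

Both groups pass the size rule on their own, but the check flags that together they single out person 12.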

Combining information across people

There are other ways in which information can be combined to make it more valuable. One way is where you have information about many individuals, and you have information about enough of them that the group becomes valuable. You can start to pull out general patterns about the group, and then use that pattern to predict things about new members of the group.

One example is shops such as supermarkets who have some kind of loyalty card, as mentioned above. Sometimes these can be tailored, e.g. the shop could offer a baby club, so that pregnant women get baby-related vouchers. As part of the baby club the shop might invite or require the woman to give her baby’s due date.

If you have the shopping history of women you know to be pregnant, and whose due date you know, you can then start to look for patterns. Free will is part of our decision making, but so are other things such as emotions, hormones etc. (Don’t go shopping when you’re hungry!) Some shops have discovered that there’s a general pattern to purchases across the stages of pregnancy, i.e. what you buy is influenced by which trimester you’re at.

So, you have established that, at least to some degree, there’s a pattern of purchases that follows the course of a woman’s pregnancy. Imagine that you notice that the purchases of a woman who isn’t in the baby club have started to match the part of the pattern for the beginning of a pregnancy. You don’t know for certain, but you have a clue that she might be pregnant. What if, next time you send her a book of vouchers, you include a few that a pregnant woman might find attractive?

Just in case you guess wrong, what if you have already reserved a few slots in the book of vouchers for what appear to be random vouchers? For instance, you send money-off vouchers for garden sheds, even if you’re not sure if the person has a garden. Then the person might think that this voucher for intensive skin lotion for baby bumps was just one of those random vouchers.

It’s important to remember that this is an instance of probabilities, rather than having simple yes/no answers. Your model plus a woman’s purchase history has increased the probability that she’s pregnant, rather than telling you that she definitely is.
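This kind of nudge to a probability is what Bayes' rule describes. Here is a minimal sketch with numbers I have invented purely for illustration (they don't come from any shop): assume 2% of shoppers are pregnant, that 60% of pregnant shoppers show a certain purchase pattern, and that 5% of other shoppers show it too.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: the probability of the hypothesis after seeing the evidence."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))

# All three numbers are made up for the sake of the example.
prior = 0.02               # chance a shopper is pregnant before looking at purchases
p_pattern_if_true = 0.60   # chance of seeing the pattern if she is pregnant
p_pattern_if_false = 0.05  # chance of seeing the pattern if she is not

updated = posterior(prior, p_pattern_if_true, p_pattern_if_false)
print(f"{updated:.0%}")
```

With these made-up inputs the probability rises from 2% to about 20%: much more likely than before, but still far more likely to be wrong than right.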

I’ve heard some people say that this is a hypothetical example, rather than something that has actually happened. I think that the point still stands, because everything I’ve described is plausible with information currently available to big shops.

So what?

This could all be described as modelling, i.e. nothing new. It’s trying to create a representation of part of the world based on data you’ve measured from it. It’s used in science, medicine, engineering and all kinds of other fields.

This is all true, but it doesn’t mean that there aren’t questions that are valid ones to ask.

The information, in the examples above and similar ones, is given by individuals. Did they have a choice in practice over whether to give this information or not? How informed was this choice? Can you withdraw your consent after you have given it? Is the model updated to remove your information’s contribution to it?

Who benefits from this modelling and who pays any costs? Who has power because of the modelling? Who is held accountable for the accuracy of the model, and its appropriate use? Who gets to define appropriate? How transparent is it all?

How do new entrants break into markets dominated by big players who have lots of information? Capitalism relies on choice and competition – does information act against that? How fair and efficient are these markets?

It’s too easy to see current technology, unlike past, inferior technology, as having benefits and no drawbacks. It’s easy to see the technology and not see the people around it. I’m not a Luddite (I’m a programmer, after all), but I like people more than I like machines.

It’s also too easy to think “this came out of a computer, so it must be right”. I mean “right” in both senses – right as in correct, but also right as in ethical. A computer might not make value judgements by itself, but it is still a tool in the hand of a human. The maker, owner and user of the tool still have responsibility for how the tool changes the world. I think it’s important that we have an information commissioner in the UK, but don’t feel the need for e.g. a paperclips commissioner. Information can represent power, and power needs to be matched by responsibility.

3 thoughts on “The compounding value of information”

lewiscowles says:

February 11, 2021 at 9:46 am

As usual, wonderful Bob. I particularly enjoyed the example of picking the salary out of the pile.

Do you have a position on use of data, or is it more of an area of abstract interest?

Bob says:

February 11, 2021 at 10:15 am

Thank you. My previous job was more data processing than my current one. We got data about individual people from two different bits of the UK government, on condition that we treated it carefully (various contracts defined what “carefully” meant, in detail). Things like removable encrypted hard drives that were removed overnight, plus the kind of checks I mention about groups being big enough.

I hope that my point comes across OK, which is that information can mean power. That power can be wielded wisely, for the common good, or less wisely. It’s not intrinsically a bad thing – we wouldn’t have the Coronavirus vaccines we now have without someone collecting and processing lots of data. It’s just that its power isn’t always obvious, because the visible signs are mostly just someone sitting at a laptop, and not someone sitting next to a huge pile of banknotes or a gun (or something else more obvious like that).

Bob says:

February 12, 2021 at 9:04 am

I forgot to mention, in the article and in my previous comment, that the example Lewis refers to isn’t my own work. It was part of some training given by the Office for National Statistics as part of the process to be allowed to use their Virtual Microdata Laboratory (https://www.ons.gov.uk/census/2011census/2011censusdata/censusmicrodata/securemicrodata).
