I was catching up with old episodes of the Radiolab podcast, and one on Coronavirus and numbers made me think about the risks to do with Coronavirus. That thought expanded out to risk more generally, and to how it can apply to software testing. UK Coronavirus numbers: I need to prefix all this with quite …

# Category: Stats

# K-means clustering and Voronoi maps

Introduction: This article came out of one of those realisations that two things I already knew were linked or even the same thing, like Clark Kent and Superman or regular expressions and finite state machines. It’s not a new realisation – a minimal amount of Googling showed that the link is in the first paragraph …

# Three ways to summarise data sets

Introduction: This article will talk about three ways to summarise a data set. It should be gentle stuff - two you're likely to know already, and the less-well-known one isn't tricky to understand. I'm talking about them together in one article to show how stats can often mean you need a toolbox with several stats …

# Stats: Through a Glass, Darkly

I used to think that stats worked like this: Unfortunately the real world isn’t like that. Instead it’s more like this: You don’t have direct access to the Glorious Truths because there is the Impenetrable Wall of Ignorance in the way. Your only hope is to punch through the wall with some Stats Machinery. One …

# How to mess up A/B tests

An excellent talk about A/B tests from someone who knows - Martin Goodson. My favourite part is an A/B test that found a 2.5% improvement in (sales) conversions between the two versions of the software being tested. Unfortunately there was a bug in the A/B testing framework, such that the old version was being tested …

# Statistics Without the Agonising Pain, and Statistics for Hackers

John Rauser, data scientist at Pinterest, has an excellent video called Statistics Without the Agonising Pain. It is less than 12 minutes long, and it explains a useful stats term (statistical significance) to people who can code but don't know stats. It does this very well! https://www.youtube.com/watch?v=5Dnw46eC-0o There is another video along similar lines, by Jake Vanderplas. It builds on …