Psychology, not technology, is the key to Google’s reliability

An excellent video by a Google Site Reliability Engineer, from Goto Conference 2017. What I liked in particular were three key points: Being honest that having operations act as border guards, who try to vet code changes against an increasingly long checklist before they go live, is a path to failure and frustration. Agreeing …

Trying to not get too ranty about documenting software architecture

This article gives my thoughts on a video about documenting software architecture: https://www.youtube.com/watch?v=kv8XedJTEww A summary of the video is: Domains other than software architecture, e.g. maps or electrical circuits, do a good job of capturing useful and important information in a way that communicates it well, mostly in pictures. Software architecture does …

Here be dragons: testing your error handling code

Who tests the error-handling parts of their code? You might want to start doing this after watching this very interesting video from Goto Conference 2016. Among other things, the speaker summarises a paper that investigates catastrophic failures in systems such as MapReduce and Cassandra. 58% of the catastrophic failures could have been prevented by testing …
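To make the idea of testing error-handling code concrete, here is a minimal sketch in Python. It is not taken from the video or the paper; the names `load_settings` and `failing_opener` are hypothetical, and the point is only that the `except` branch gets exercised by a test instead of being left to run for the first time in production.

```python
def load_settings(path, opener=open):
    """Return the text of a settings file, or "" if it cannot be read.

    `opener` is injectable so that a test can simulate a failure
    without needing a real broken disk or missing file.
    """
    try:
        with opener(path) as handle:
            return handle.read()
    except OSError:
        # The error path: the branch that is easy to leave untested.
        return ""


def failing_opener(path):
    """Hypothetical stand-in for open() that always fails."""
    raise OSError("simulated read failure")


def test_load_settings_survives_read_error():
    # Force the error path and check that the fallback value comes back.
    assert load_settings("settings.txt", opener=failing_opener) == ""


test_load_settings_survives_read_error()
```

Injecting the failure through a parameter is one design choice among several; patching `open` with `unittest.mock` would work too.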

Statistics Without the Agonising Pain, and Statistics for Hackers

John Rauser, data scientist at Pinterest, has an excellent video called Statistics Without the Agonising Pain. It is less than 12 minutes long, and it explains a useful stats term (statistical significance) to people who can code but don't know stats. It does this very well! https://www.youtube.com/watch?v=5Dnw46eC-0o There is another video along similar lines, by Jake Vanderplas. It builds on …
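For readers who can code, one simulation-based way to get at statistical significance is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed. The sketch below is an illustration under that assumption, not a transcript of either video; the function name and the sample numbers are made up.

```python
import random


def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Estimate a one-sided p-value for mean(group_a) > mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)

    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    at_least_as_extreme = 0
    for _ in range(trials):
        # Shuffling breaks any real link between label and value.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / trials


# Made-up measurements for two groups, purely for illustration.
treated = [10, 11, 12, 13, 14]
control = [1, 2, 3, 4, 5]
p = permutation_p_value(treated, control)
print(f"estimated p-value: {p}")
```

A small p-value here means that random relabelling rarely reproduces the observed gap, which is what "statistically significant" is meant to convey.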