I’m still slowly working my way through the back catalogue of the Software Engineering Radio podcast. One episode that I particularly liked is 277: Gil Tene on Tail Latency. It’s full of interesting, useful material that helps you think clearly about latency. For instance:
- How there’s more than one measure of latency (mean, median, 90th centile, 99th centile etc.).
- If your user’s request is served by more than one thing happening in parallel (e.g. downloading many separate JS and CSS files), how the latency distributions of the separate pieces combine to produce the latency the user observes (TL;DR: badly).
- Actions to take to try to cope with poor latency, what their consequences are, and when they’re appropriate.
- A gentle bit of queueing theory, to help you work out what’s actually important to you, so you can phrase your requirements / questions appropriately.
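
To make the first two points concrete, here is a rough sketch of my own (not something taken from the episode). It assumes the parallel requests are independent and that the page is only done when the slowest one finishes. The function names and the log-normal latency model are purely illustrative assumptions.

```python
import math
import random


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def prob_any_slow(n_requests, pct=99.0):
    """Chance that at least one of n independent requests is slower than the pct-th percentile."""
    return 1 - (pct / 100) ** n_requests


def simulate(n_parallel, trials=2000, seed=1):
    """Return (single-request latencies, page latencies) from a toy model.

    Each request's latency is drawn from a log-normal distribution (an
    assumption chosen only because it has a long tail). The page latency
    is the maximum over n_parallel requests.
    """
    rng = random.Random(seed)
    singles, pages = [], []
    for _ in range(trials):
        batch = [rng.lognormvariate(3.0, 0.5) for _ in range(n_parallel)]
        singles.append(batch[0])
        pages.append(max(batch))
    return singles, pages


if __name__ == "__main__":
    # With 100 parallel requests, roughly 63% of page loads hit the 99th centile of at least one.
    print(f"P(at least one slow request out of 100) = {prob_any_slow(100):.3f}")

    singles, pages = simulate(n_parallel=20)
    for name, data in (("single request", singles), ("whole page", pages)):
        print(name, "median:", round(percentile(data, 50), 1),
              "99th centile:", round(percentile(data, 99), 1))
```

The point of the toy model is that the median of the whole page ends up looking like the tail of a single request, which is why the mean or median of individual requests says so little about what the user actually experiences.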