I recently watched a video and listened to a podcast that form an interesting pair of opinions about performance. The video is Patterns for high-performance C# and the podcast is SE-Radio Episode 357: Adam Barr on Software Quality.
The podcast and video agree on two things: the system must behave correctly, and the user experience must be adequate (which includes the system’s performance as experienced by the user). Where they differ is the relative stress they place on optimising for machine performance vs. optimising for programmer performance.
Patterns for high-performance C#
To be fair to Federico Lois (the presenter), he stresses that there are things you should do first, at the bigger scales, before you resort to the wackier things he talks about to get high performance. First, make sure that the architecture is correct, so that you can scale things in sensible ways and have removed avoidable bottlenecks, e.g. by using caches. Second, make sure that the design is correct, so for instance you’re not using an O(n^4) algorithm when there’s an O(n log n) alternative.
However, his experience on RavenDB was that these weren’t enough. They needed to squeeze more performance out somehow. The gist of the approach is to make C# as much like C as possible. In a bit more detail, it’s things like trying to avoid using:
- async / await
- try / catch
- LINQ
- the garbage collector (so roll your own allocation)
- virtual method calls
- lambdas that capture context
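To make a couple of those items concrete, here is a minimal sketch of my own (it is not taken from the talk or from RavenDB). It counts the values above a threshold twice: once with LINQ and a lambda that captures `threshold`, and once with a plain loop. The class and method names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Linq;

public static class CountExamples
{
    // Readable version: LINQ plus a lambda that captures 'threshold'.
    // The capture means the compiler generates a closure object, and a
    // delegate is created for each call, so this allocates on the heap.
    public static int CountAboveLinq(int[] values, int threshold) =>
        values.Count(v => v > threshold);

    // "C-like" version: a plain loop over the array, with no delegate,
    // no closure and no enumerator, so no heap allocations.
    public static int CountAboveLoop(int[] values, int threshold)
    {
        int count = 0;
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] > threshold)
            {
                count++;
            }
        }
        return count;
    }
}
```

Both methods return the same answer. Whether the difference in allocations matters depends entirely on how hot the code path is, which is why measuring first is the sensible order of work.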
In comments on his video, he answers the question “Why don’t you just code in C / C++?” with “If we start with C#, we can go to extremes only where we need to, and elsewhere use C# which more people know.”
Software Quality podcast
The podcast leans much more heavily in the direction of having code that’s easy to modify in the future, which means it must be understandable. This is harder than it sounds, because there’s no single metric of understandability. For example, short methods mean each method is easy to understand, but they can make the end-to-end flow harder to work out. Numbers such as cyclomatic complexity aren’t reliable or comprehensive enough to serve as a measure of quality.
Some of the techniques that help understandability won’t hurt system performance – things like giving things good names. However, in some areas you are forced to make a choice between the two. While you can write opaque LINQ, you can also use it to express yourself concisely and clearly. It’s a way in which C# is made higher-level, i.e. closer to human thoughts and further from the level of silicon.
I found it interesting that Adam Barr seems to advocate throwing hardware at the problem quite a lot, rather than making the code less understandable. There are lots of other things he talks about, so I recommend the whole podcast.
Optimising – unequal alternatives
The techniques listed above (avoiding garbage collection etc.) will optimise for the performance of the machines running your code. Other techniques (well-named methods, well-structured code etc.) will optimise for the performance of the programmers working on the code. Those programmers will understand the code more quickly and better, which tends to result in future code being written more quickly with fewer bugs and hence less re-work.
The problem is it’s hard to do an apples-to-apples comparison of those alternatives.
The system’s performance can be measured in a meaningful way via a small set of unambiguous numbers (for instance mean and standard deviation of the number of requests per second, amount of CPU / other resources used per request etc.). These numbers can also be gathered relatively quickly. (I know that setting up reliable performance tests can be hard, but it will be days or weeks rather than months or years.)
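As a rough illustration of how little it takes to produce numbers like these, here is a minimal sketch that summarises a set of requests-per-second samples as a mean and a sample standard deviation. The class name, method name and the choice of the sample (n - 1) form are my assumptions, not something from the video or podcast.

```csharp
using System;
using System.Collections.Generic;

public static class ThroughputStats
{
    // Returns the mean and the sample standard deviation (dividing by
    // n - 1) of a set of requests-per-second measurements.
    public static (double Mean, double StdDev) Summarise(IReadOnlyList<double> requestsPerSecond)
    {
        int n = requestsPerSecond.Count;
        if (n < 2)
        {
            throw new ArgumentException("Need at least two samples.", nameof(requestsPerSecond));
        }

        double sum = 0;
        foreach (double sample in requestsPerSecond)
        {
            sum += sample;
        }
        double mean = sum / n;

        // Sum of squared deviations from the mean.
        double squaredDeviations = 0;
        foreach (double sample in requestsPerSecond)
        {
            double difference = sample - mean;
            squaredDeviations += difference * difference;
        }

        return (mean, Math.Sqrt(squaredDeviations / (n - 1)));
    }
}
```

Calling `ThroughputStats.Summarise(new double[] { 2, 4, 4, 4, 5, 5, 7, 9 })` gives a mean of 5. There is no comparably short function for programmer performance, which is the point of the comparison.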
The programmers’ performance is pretty much the opposite. It’s hard to measure, and the effects will be spread out over much longer periods of time than a system performance test. (The code you change now might not be modified again for two years, which is when its quality will make a difference, good or bad.)
It reminds me of other important but fuzzy values, like how satisfied your customers are. I don’t think that things like Net Promoter Score or Customer Satisfaction are sensible to apply to programmers and the code, because a programmer is both supplier and consumer of the code, so the relationship is a bit more complicated than the one between your organisation and its customers (at least most of the time). But I hope you get the idea: just because something is hard to measure doesn’t mean it’s not important.
It’s very tempting, particularly given the growth of tools that do nice things with numbers (analyse them, visualise them etc.), to lock onto numbers and let fuzzier things fade into the background. I think we should fight this temptation, and aim towards goals that we have agreed are important, not just easy to measure.