Thoughts on Performance Hints from Jeff
Today, Jeff Dean posted an article on Twitter sharing some hints about performance. I found it impressive and illuminating, so I want to write down some notes and impressions of it.
API design
This will be easier if your modules are deep (significant functionality accessed via a narrow interface).
For frequently called routines, sometimes it is useful to allow higher-level callers to pass in a data structure that they own or information that the called routine needs that the client already has. This can avoid the low-level routine being forced to allocate its own temporary data structure or recompute already-available information.
This one makes me think about the Actix request dispatcher design. The way handler functions receive something like an impl Into&lt;Request&gt; (I’m not 100% sure; I didn’t double-check while writing this) struck me as a good design when I played around with it years ago. Many implementations of the trait were generated by macros, which made it easy to perform a lot of optimizations internally.
In Go, I would need to decide whether to accept an interface or a specific request type. Generally, I think these are good, practical hints.
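To make the first hint concrete, here is a minimal Go sketch of what “let the caller pass in something it owns” can look like. The function and names are hypothetical, not from the article: a frequently called parser appends into a caller-owned slice instead of allocating a fresh temporary one on every call.

```go
package main

import "fmt"

// parseFields splits a comma-separated line into fields, appending into dst.
// Because the caller owns dst, this frequently called routine never has to
// allocate its own temporary slice. (Hypothetical example, not from the article.)
func parseFields(dst []string, line string) []string {
	start := 0
	for i := 0; i < len(line); i++ {
		if line[i] == ',' {
			dst = append(dst, line[start:i])
			start = i + 1
		}
	}
	return append(dst, line[start:])
}

func main() {
	// The caller reuses buf across iterations; buf[:0] keeps the capacity.
	buf := make([]string, 0, 16)
	for _, line := range []string{"a,b,c", "x,y"} {
		buf = parseFields(buf[:0], line)
		fmt.Println(buf)
	}
}
```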
Concurrency
A type may be either thread-compatible (synchronized externally) or thread-safe (synchronized internally). Most generally used types should be thread-compatible. This way callers who do not need thread-safety don’t pay for it.
However if the typical use of a type needs synchronization, prefer to move the synchronization inside the type. This allows the synchronization mechanism to be tweaked as necessary to improve performance (e.g., sharding to reduce contention) without affecting callers.
This part is extremely helpful too. I don’t write a lot of concurrent code, but I have recently been redesigning some synchronous code to be asynchronous. If we had anticipated the move to async and followed the hints above when the code was first written, the rewrite would have been much easier.
I bet following these hints results in better performance than other approaches.
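Here is a small Go sketch of the thread-compatible vs. thread-safe distinction, using two hypothetical counter types I made up for illustration: one does no locking at all, and one keeps a mutex inside the type so the locking strategy could later be changed (e.g., sharded to reduce contention) without touching any caller.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is thread-compatible: it does no locking itself, so callers that
// only use it from a single goroutine pay nothing for synchronization.
type Counter struct {
	n int
}

func (c *Counter) Add(delta int) { c.n += delta }
func (c *Counter) Value() int    { return c.n }

// SharedCounter is thread-safe: synchronization lives inside the type, so the
// mechanism (a single mutex here, sharding later if contention shows up) can
// be tweaked without affecting callers.
type SharedCounter struct {
	mu sync.Mutex
	n  int
}

func (c *SharedCounter) Add(delta int) {
	c.mu.Lock()
	c.n += delta
	c.mu.Unlock()
}

func (c *SharedCounter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

func main() {
	// Single-goroutine use: the plain Counter, no locking cost.
	var local Counter
	local.Add(1)

	// Shared use across goroutines: the thread-safe variant.
	var wg sync.WaitGroup
	shared := &SharedCounter{}
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				shared.Add(1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(local.Value(), shared.Value()) // 1 4000
}
```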
Algorithm
This section reminds me of the union-find algorithm I just learned for this year’s AOC Day 8. When I implemented union-find for my AOC library, I actually wrote it directly into the solution rather than keeping it general.
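For reference, here is a generic union-find sketch in Go with path compression and union by size; this is the textbook version, not the exact code from my AOC solution.

```go
package main

import "fmt"

// DSU is a disjoint set union (union-find) structure.
type DSU struct {
	parent []int
	size   []int
}

func NewDSU(n int) *DSU {
	d := &DSU{parent: make([]int, n), size: make([]int, n)}
	for i := range d.parent {
		d.parent[i] = i
		d.size[i] = 1
	}
	return d
}

// Find returns the representative of x, compressing the path as it goes.
func (d *DSU) Find(x int) int {
	for d.parent[x] != x {
		d.parent[x] = d.parent[d.parent[x]] // path halving
		x = d.parent[x]
	}
	return x
}

// Union merges the sets containing a and b; returns false if already merged.
func (d *DSU) Union(a, b int) bool {
	ra, rb := d.Find(a), d.Find(b)
	if ra == rb {
		return false
	}
	if d.size[ra] < d.size[rb] {
		ra, rb = rb, ra
	}
	d.parent[rb] = ra
	d.size[ra] += d.size[rb]
	return true
}

func main() {
	d := NewDSU(5)
	d.Union(0, 1)
	d.Union(3, 4)
	fmt.Println(d.Find(0) == d.Find(1)) // true
	fmt.Println(d.Find(1) == d.Find(3)) // false
}
```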
Also, I learned that there is a balance to strike between general-purpose and specialized code: if we can already see the whole picture, we can apply some fine-tuned optimizations.
L1/L2 Cache
I have read many articles about L1/L2 caches before. They are very interesting, even though I usually don’t deal with them directly in my work, and it’s always fascinating to see how much of a difference cache optimization can make.
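As a small, self-contained illustration (my own toy example, not from the article), the two functions below sum the same 4096×4096 matrix; the only difference is the traversal order, which determines how friendly the memory access pattern is to the L1/L2 caches. Both are O(n²), yet the column-major version is typically several times slower.

```go
package main

import (
	"fmt"
	"time"
)

const n = 4096

// sumRowMajor walks memory sequentially, so each cache line is fully used.
func sumRowMajor(a [][]int32) int64 {
	var s int64
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			s += int64(a[i][j])
		}
	}
	return s
}

// sumColMajor strides across rows, touching a new cache line on almost
// every access, which defeats the L1/L2 caches.
func sumColMajor(a [][]int32) int64 {
	var s int64
	for j := 0; j < n; j++ {
		for i := 0; i < n; i++ {
			s += int64(a[i][j])
		}
	}
	return s
}

func main() {
	a := make([][]int32, n)
	for i := range a {
		a[i] = make([]int32, n)
	}

	start := time.Now()
	sumRowMajor(a)
	fmt.Println("row-major:", time.Since(start))

	start = time.Now()
	sumColMajor(a)
	fmt.Println("col-major:", time.Since(start))
}
```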
Do not optimize too early?
I’ll just quote the article directly:
Knuth is often quoted out of context as saying premature optimization is the root of all evil. The full quote reads: “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” This document is about that critical 3%, and a more compelling quote, again from Knuth, reads:
The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today’s software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can’t debug or maintain their “optimized” programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn’t bother making such optimizations on a one-shot job, but when it’s a question of preparing quality programs, I don’t want to restrict myself to tools that deny me such efficiencies.
Many people will say “let’s write down the code in as simple a way as possible and deal with performance later when we can profile”. However, this approach is often wrong…
It reminds me of what I learned years ago from AOC:
Even a tiny optimization can lead to a huge improvement, even if the Big O complexity doesn’t change.
So, I think I should keep optimization in mind during my daily coding. If an optimization can be done immediately, I should do it.
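As a trivial illustration of a tiny optimization that doesn’t change the Big O (my own example, not from the article or AOC), preallocating a slice’s capacity in Go does the same O(n) work but avoids the repeated reallocations and copies that happen as the slice grows.

```go
package main

import "fmt"

// buildNaive grows the slice from zero capacity, so the backing array is
// reallocated and copied several times as it grows.
func buildNaive(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// buildPrealloc does the same O(n) work but reserves the capacity up front,
// a one-line change that removes the reallocations entirely.
func buildPrealloc(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

func main() {
	fmt.Println(len(buildNaive(1_000_000)), len(buildPrealloc(1_000_000)))
}
```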