First and foremost, the benchmark is entirely open, clear, and reproducible. All the code for this effort lives in a git repository dedicated to the purpose, dbtester, which also holds the test results and details how to run the tests. The benchmarking code was written to benchmark this specific kind of service, and was used to compare etcd, zookeeper, and consul. CoreOS did a tremendous service by making this public, and I hope it gave them a concrete dashboard for their development improvements while they iterated on etcd toward 3.0.
What Gyo-Ho Lee did within the benchmark is what makes this an amazing example: he reviewed the performance of the target against multiple dimensions. Too many benchmarks, especially ones presented in marketing materials, are simple graphs highlighting a single dimension, and are utterly opaque as to how they got there. The etcd3 benchmark compares etcd, zookeeper, and consul against multiple dimensions: memory, CPU, and disk IO. The raw data that backed the blog post is committed into the repo under “test-results”. The workload is reasonably representative (writing 1,000,000 keys to the backend), and the benchmark tracked time to complete, memory consumed, CPU consumed, and disk IO consumed during the process.
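To make the idea of measuring several dimensions in one run concrete, here is a minimal sketch in Python. It is not how dbtester works (I am not describing its code here); the backend is a hypothetical stand-in that appends to a local file, and the names `benchmark_writes`, `make_file_backend`, and `run_demo` are invented for illustration. It also only sees memory allocated by the Python process itself, whereas a real comparison would measure the server process.

```python
import os
import tempfile
import time
import tracemalloc


def benchmark_writes(put, n_keys, value=b"x" * 64):
    """Write n_keys pairs through put() and report several dimensions at once."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    bytes_written = 0
    for i in range(n_keys):
        key = f"key-{i:08d}".encode()
        # put() returns how many bytes it handed to the backend
        bytes_written += put(key, value)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "keys": n_keys,
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "peak_python_memory_bytes": peak,
        "bytes_written": bytes_written,
    }


def make_file_backend(path):
    """Hypothetical stand-in backend: an append-only log file, not a real key-value store."""
    log = open(path, "ab")

    def put(key, value):
        record = key + b"=" + value + b"\n"
        log.write(record)
        return len(record)

    return put, log.close


def run_demo(n_keys=10_000):
    """Run the benchmark against the stand-in backend in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        put, close = make_file_backend(os.path.join(tmp, "backend.log"))
        try:
            return benchmark_writes(put, n_keys)
        finally:
            close()


if __name__ == "__main__":
    for name, measured in run_demo().items():
        print(f"{name}: {measured}")
```

The point of returning all the measurements together is that a reader can see the trade-offs side by side, rather than a single flattering number.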
I haven’t looked at the code to see how reusable it might be. I would love to see more benchmarks with different actions, and a comparison to how this operates in production (in cluster mode), but these wishes are just variations on the theme, not a complaint about the work done so far.
As an industry, as we build more with containers, this kind of benchmarking is exactly what we need. We’re composing distributed services now more than ever, and knowing the operating characteristics of these systems or containers is as critical as any other correctness validation effort.