Wading back into the Cocoa Frameworks

I presented last night at Seattle Xcoders, for the first time in a number of years – the last time was when I was still working for Disney. The presentation was a rehash of the talk I did at the Swift Cloud Workshop a couple of months ago, sharing a bit of what I’ve learned from my bug fixing and “getting into swift” learning by working on the Swift Package Manager.

I have a few more fixes and bugs I want to tackle, but for the past two weeks I focused on getting back into the Cocoa frameworks; relearning and catching up. I was last using these libraries about 7 years ago. My oh my, the world there has seriously changed! There are some nice things, and some not so nice things – great technical advances and a couple of head-slappers.

First thing that’s obvious is the API surface area has grown enormously. All the pieces I worked with previously are there, and nearly double that as well with new libraries and frameworks. The frameworks are akin to tree rings too – you can see the generational aspects of when the APIs were first developed in the method signatures – some using NSError callbacks, some using blocks, and less consistency than you’d expect from a decade of evolution. Or perhaps exactly what you’d expect from 15+ years of organic growth and language evolution.

Alas the documentation hasn’t kept up with the API surface area – even from external publishers. Where I used to be able to know 90% of the key API structures from something like the Big Nerd Ranch’s iOS Programming guide, it only covers an introduction today – and misses entirely some of the things like Storyboards and anything in depth with Auto Layout. Apple’s documentation has even less structural organization around it, still embedded and (mostly) technically correct, but extremely limited. Without the sample code and WWDC videos they publish to tease it apart, I would be lost – as most of the documentation is written with the expectation that you already have the knowledge you’re seeking on how to use the frameworks.

I also find it immensely frustrating that you can’t download the various guides they publish to something like iBooks and read them offline or on the side with an iPad. There is contextual help in Xcode, but the screen real-estate is so cramped on my laptop that I’m often keeping an iPad on the side with some documentation (or that Big Nerd Ranch book propped open) while I’m digging and learning.

When Apple enabled their forums, I thought that might help with some of the gaps, but in reality it’s StackOverflow that’s made the difference. The Q&A site often includes sample code in detailed questions and answers from a community helping itself, and it isn’t uncommon for the answers there to include references back to the formal documentation. That linkage provides the “explanation” for that otherwise arcane but technically accurate wording in the Apple provided documentation.

And finally there’s Xcode – I still see the fragments of the separate Interface Builder in references, but it’s all a single app now. The debugging and diagnostic tools provided by Apple are absolutely amazing, especially some of the recent advances in visually displaying view and memory debugging, and the whole “Instruments” tool chain, which has been my gold standard for understanding how a process is operating since I started working on other platforms those 5-7 years ago. I keep trying to recreate those capabilities using whatever native tools I can find on the other platforms just to see what’s happening.

I am not as much a fan of the design tooling. The storyboard views and complex inspector setup, as well as the “fiddly magic” of knowing what to click, where to click, and how to link things together, aren’t well coordinated or even consistent. I think it represents the steepest learning curve to developing iOS apps, which is frustrating, because you can tell it’s trying to make it easier.

Also frustratingly, it’s not internally consistent from a composition point of view – so there are times when you just need to wipe out a whole view controller and recreate it rather than starting with something small and building up to what you want.

When iOS first came out, I built and taught some workshops with O’Reilly to get started on it – with the complexity of the APIs, development setup, and learning the technologies you need to use to get started today, I think it would take weeks for a person completely new to the Apple ecosystem. I’m not even sure a week-long bootcamp style setup would do more than introduce you to the barest minimum needed to get up and running.

I’m not at all surprised at the focus on technologies like React Native for leveraging web development knowledge into mobile development, although that’s got to make debugging a comparative nightmare. Not my interest, to be honest, because I wanted to see what was possible (and new) with the native capabilities. But I can see the appeal in leveraging knowledge you already have elsewhere.

spelunking swift package manager – Workspaces

I finished up my last bug fix and tackled another. SR-4261 took me deep into some new areas of the code base that I wasn’t familiar with. I did more spelunking to have a decent clue of what the code was doing before making the fix. The results of the spelunking are this post.

Many of the recent changes to SwiftPM have been about managing the workspace of a project – pinning dependencies, updating and editing dependencies, and handling that whole space of interactions. These are exposed as commands under swift package, such as swift package edit and swift package pin.

In the codebase, most of the high level logic for these commands resides in the target Workspace.

If you want a map of all the targets in the project, I included a graph of targets in my first round of spelunking swift package manager.

There are a number of interesting classes in Workspace, but the ones I focused on were related to manifests and managed dependencies: Manifest and ManagedDependency. These two are combined in Workspace in DependencyManifests, which represents SwiftPM’s knowledge of the workspace, its current state, and provides the means to manipulate the workspace with commands like edit and pin.

When swift package manager wants to manipulate the workspace, the logic starts out loading the current state. This pretty consistently starts with loading up the main targets by invoking loadRootManifests. This in turn uses the manifest loading logic and an interesting (newish mechanism to SwiftPM) piece: a DiagnosticsEngine. The DiagnosticsEngine collects errors and can emit interesting details for tooling wanting to provide more UI or feedback information.

loadRootManifests loads this from a list of AbsolutePath that gets passed in from the swift CLI – generally the current working directory of your project. In any case, loadRootManifests returns an array of Manifest, which is the key to loading up information about the rest of the workspace.

The next step is often loadDependencyManifests that returns an instance of DependencyManifests. This does the work of loading the dependencies needed to create the holistic view of the project to date and load the relevant state. Loading the ManagedDependencies leverages the class LoadableResult. LoadableResult is a generic class for loading persistence from a JSON file – Pins and ManagedDependencies are both loaded using it. In the case of ManagedDependencies, it loads from the file dependencies-state.json, which includes loading up the current state, current path, and relevant details about the repository. That file, along with validating the relevant repository exists (using validateEditedPackages) at the correct location on disk, also indicates any packages in the “edit” state.
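As a rough illustration of the “generic class for loading persistence from a JSON file” idea, here is a minimal sketch. It is not the LoadableResult implementation: it swaps in Foundation’s Codable as a stand-in, and the JSONStateFile type, the DependencyRecord fields, and the scratch file name are all hypothetical names made up for this example.

```swift
import Foundation

// Hypothetical record type; the real dependencies-state.json schema is not reproduced here.
struct DependencyRecord: Codable, Equatable {
    let name: String
    let path: String
    let isEdited: Bool
}

// Hypothetical generic loader, using Codable as a stand-in for SwiftPM's own JSON handling.
struct JSONStateFile<State: Codable> {
    let url: URL

    // Returns nil when no state has been persisted yet.
    func load() throws -> State? {
        guard FileManager.default.fileExists(atPath: url.path) else { return nil }
        let data = try Data(contentsOf: url)
        return try JSONDecoder().decode(State.self, from: data)
    }

    // Writes via a temporary file and rename, so readers never see a partial file.
    func save(_ state: State) throws {
        let data = try JSONEncoder().encode(state)
        try data.write(to: url, options: .atomic)
    }
}

// Usage: round-trip some state through a scratch file in the temp directory.
let stateURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("example-state-\(UUID().uuidString).json")
let stateFile = JSONStateFile<[DependencyRecord]>(url: stateURL)

let initialState = try stateFile.load()
print("state before first save:", initialState == nil ? "none" : "present")

try stateFile.save([DependencyRecord(name: "Example", path: "Packages/Example", isEdited: true)])
let reloadedState = try stateFile.load()
print("reloaded:", reloadedState?.first?.name ?? "no state")

try? FileManager.default.removeItem(at: stateURL)
```

The same generic type could back more than one kind of state file, which is the appeal of the pattern described above: the pins and the managed dependencies differ only in the type being decoded and the path being read.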

Pins work a bit differently, being stored in the source path. This is to allow them to be included in the project source in order to concretely specify versions or constraints to versions for each dependency. The ManagedDependencies are maintained in the working directory for builds, not expected to be in source control, and represent the state of things on your local machine.

The way that SwiftPM handles dependency resolution is by using a collection of RepositoryPackageConstraint and a constraint solver to resolveDependencies. The DependencyResolver is its own separate thing under the PackageGraph package. I have not yet dug into it. Most notably, the DependencyResolver will throw an error if it’s unable to resolve the constraints provided to it – and that’s key to the heart of the bug SR-4261, which is about adding in missing constraints for edited packages when invoking the pin command.

Using 3 Tiers of Continuous Integration

A bit over a year ago, I wrote out Six rules for setting up continuous integration, which received a fair bit of attention. One of the items I called out was the speed of tests, suggesting to keep it to around 15-20 minutes to encourage and promote developer velocity.

A few weeks ago, the team at SemaphoreCI used the word “proper” in a similar blog post, and suggested that a timing marker should be 10 minutes. This maps pretty directly to a new feature they’re promoting to review the speed of their CI system as it applies to your code.

I disagree a bit with the 10 minute assertion, only in that it is too simplistic. Rather than a straight “10 minute” rule, let me suggest a more comprehensive solution:

Tier your tests

Not all tests are equal, and you don’t want to treat them as such. I have found 3 tiers to be extremely effective, assuming you’re working on a system that can have some deep complexity to review.

The first tier: “smoke test”. Lots of projects have this concept and it is a good one to optimize for. This is where the 10 minute rule that SemaphoreCI is promoting is a fine rule of thumb. You want to verify as much of the most common paths of functionality that you can, within a time-boxed boundary – you pick the time. This is the same tier that I generally encourage for pull request feedback testing, and I try to include all unit testing into this tier, as well as some functional testing and even integration tests. If these tests don’t pass, I’ve generally assumed that a pull request isn’t acceptable for merge – which makes this tier a quality gate.

I also recommend you have a consistent and clearly documented means of letting a developer run all of these tests themselves, without having to resort to your CI system. The CI system provides a needed “source of truth”, but it behooves you to expose all the detail of what this tier does, and how it does it, so that you don’t run into blocks where a developer can’t reproduce the issue locally to debug and resolve it.

The second tier: “regression tests”. Once you acknowledge that timely feedback is critical and pick a marker, you may start to find some testing scenarios (especially in larger systems) that take longer to validate than the time you’ve allowed. Some you’ll include in the first tier, where you can fit things to the time box you’ve set – but the rest should live somewhere and get run at some point. These are often the corner cases, the regression tests, integration tests, upgrade tests, and so forth. In my experience, running these consistently (or even continuously) is valuable – so this is often the “nightly build & test” sequence. This is the tier that starts “after the fact” feedback, and as you’re building a CI system, you should consider how you want to handle it when something doesn’t pass these tests.

If you’re doing continuous deployment to a web service then I recommend you have this entire set “pass” prior to rolling out software from a staging environment to production. You can batch up commits that have been validated from the first tier, pull them all together, and then only promote to your live service once these have passed.

If you’re developing a service or library that someone else will install and use, then I recommend running these tests continuously on your master or release branch, and if any fail then consider what your process needs to accommodate: Do you want to freeze any additional commits until the tests are fixed? Do you want to revert the offending commits?  Or do you open a bug that you consider a “release blocker” that has to be resolved before your next release?

An aside here on “flakes”. The reason I recommend running the second tier of tests on a continuous basis is to keep a close eye on an unfortunate reality of complex systems and testing: Flakey Tests. Flakes are invaluable for feedback, and often a complete pain to track down. These are the tests that “mostly pass”, but don’t always return consistently. As you build into more asynchronous systems, these become more prevalent – from insufficient resources (such as CPU, Disk IO or Network IO on your test systems) to race conditions that only appear periodically. I recommend you take the feedback from this tier seriously, and collect information that allows you to identify flakey tests over time. Flakes can happen at any tier, and are the worst in the first tier. When I find a flakey test in the first tier, I evaluate if it should “stop the whole train” – freeze the commits until it’s resolved, or if I should move it into a second tier and open a bug. It’s up to you and your development process, but think about how you want to handle it and have a plan.

The third tier: “deep regression, matrix and performance tests”. You may not always have this tier, but if you have acceptance or release validation that takes an extended amount of time (such as over say a few hours) to complete, then consider shifting it back into another tier. This is also the tier where I tend to handle the time-consuming and complex matrixes when they apply. In my experience, if you’re testing across some matrix of configurations (be that software or hardware), the resources are generally constrained and the testing scenarios head towards asymptotic in terms of time. As a rule of thumb, I don’t include “release blockers” in this tier – it’s more about thoroughly describing your code. Benchmarks, matrix validations that wouldn’t block a release, and performance characterization all fall into this tier. I recommend if you have this tier that you run it prior to every major release, and if resources allow on a recurring periodic basis to enable you to build trends of system characterizations.

There’s a good argument for “duration testing” that sort of fits between the second and third tiers. If you have tests where you want to validate a system operating over an extended period of time, where you validate availability, recovery, and system resource consumption – like looking for memory leaks – then you might want to consider failing some of these tests as a release blocker. I’ve generally found that I can find memory leaks within a couple of hours, but if you’re working with a system that will be deployed where intervention is prohibitively expensive, then you might want to consider duration tests to validate operation and even chaos-monkey style recovery of services over longer periods of time. Slow memory leaks, pernicious deadlocks, and distributed system recovery failures are all types of tests that are immensely valuable, but take a long “wall clock” time to run.

Reporting on your tiers

As you build out your continuous integration environment, take the time to plan and implement reporting for your tiers as well. The first tier is generally easiest – it’s what most CI systems do with annotating in pull requests. The second and third tiers take more time and resources. You want to watch for flakey tests, collecting and analyzing failures. More complex open source systems look towards their release process to help corral this information – OpenStack uses Zuul (unfortunately rather OpenStack specific, but they’re apparently working on that), Kubernetes has Gubernator and TestGrid. When I was at Nebula, we invested in a service that collected and collated test results across all our tiers and reported not just the success/failure of the latest run but also a history of success/failure to help us spot those flakes I mentioned above.
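To make the “history of success/failure” idea concrete, here is a minimal sketch of spotting flakes from recorded results. The input shape (test name mapped to a list of pass/fail outcomes), the function name, and the five-run minimum are assumptions for illustration, not how any of the tools named above work. It also assumes the runs are against comparable code, so a mix of passes and failures points at the test rather than at a real regression.

```swift
// Hypothetical input shape: test name -> pass (true) / fail (false) outcomes from recent runs.
// Returns the failure rate for every test that both passed and failed in its history.
func flakyTests(in history: [String: [Bool]], minimumRuns: Int = 5) -> [String: Double] {
    var failureRates: [String: Double] = [:]
    for (name, outcomes) in history where outcomes.count >= minimumRuns {
        let failures = outcomes.filter { !$0 }.count
        // Always passing or always failing is consistent behaviour, not a flake.
        if failures > 0 && failures < outcomes.count {
            failureRates[name] = Double(failures) / Double(outcomes.count)
        }
    }
    return failureRates
}

// Made-up sample data for the example.
let history: [String: [Bool]] = [
    "testLogin": [true, true, true, true, true],
    "testUpload": [true, false, true, true, false, true, true, true],
    "testBrokenFeature": [false, false, false, false, false],
]

let flaky = flakyTests(in: history)
for (name, rate) in flaky.sorted(by: { $0.key < $1.key }) {
    print("\(name): failed \(Int((rate * 100).rounded()))% of runs")
}
// prints "testUpload: failed 25% of runs"
```

A consistently failing test such as testBrokenFeature is deliberately left out: that is a plain failure for the tier to report, while the flake list is the one to watch as a trend.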


Using docker to build and test SwiftPM

I sat down to claw my way through blog posts, examples, and docker’s documentation to come up with a way to build and test Swift Package Manager on Linux as well as the Mac.

There are a number of ways to accomplish this; I will just present one. You are welcome to use it or not and any variations on the theme should not be difficult to sort out.

First, you’ll need a docker image with the Linux swift toolchain installed. IBM has a swift docker image you can use, and recently another was announced and made available called ‘swiftdocker’, which is a Swift 3.0.2 release for Ubuntu 16.04. I riffed on IBM’s code and my previous notes for creating a swift development environment to build the latest master-branch toolchain into an Ubuntu 16.04 based image. If you want to follow along, you can snag the Dockerfile and related scripts from my vagrant-ubuntu-swift-dev github repo and build your own locally. The image is 1.39GB in size and named swiftpm-docker-1604.

SwiftPM is a bit of a special snowflake compared to other server-side swift software – in particular, it’s part of the toolchain itself, so working on it involves some bootstrapping so that you can get to the moral equivalent of swift build and swift test. Because of that setup, you leverage a script they created to run the bootstrapping: Utilities/bootstrap.

SwiftPM has also “moved ahead” a bit and leveraged newer capabilities – so if you want to build off the master branch, you’ll need the swift-3.1 toolchain, or the nightly snapshot release to do the lifting. The current 3.0.2 release won’t do the trick. The nightly snapshots are NOT released versions, so there is some measure of risk and potential breakage – it has been pretty good for me so far – and necessary for working on SwiftPM.

On to the command! To build and test swiftpm from a copy of source locally:

docker run -t --rm -v $(pwd):/data:rw -w /data swiftpm-docker-1604 \
/bin/bash -c "./Utilities/bootstrap && .build/debug/swift-test --parallel"

To break this down, since Docker’s somewhat insane about command-line options:

  • -t indicates to allocate a TTY for this process
  • --rm makes the results of what we do ephemeral, as opposed to updating the image
  • -v $(pwd):/data:rw is the local volume mount that makes the current local directory appear as /data within the image, and makes it Read/Write
  • -w /data leverages that volume mount to make /data the current working directory for any commands
  • swiftpm-docker-1604 is the name of the docker image that I make and update as needed
  • /bin/bash -c "..." is how I pass in multiple commands to be run, since I want to first run through the bootstrap, but then shift over to using .build/debug/swift-test --parallel for a little more speed (and a lot less output) in running the tests.

The /bin/bash -c "..." bits could easily be replaced with ./Utilities/bootstrap test to the same end effect, but a touch slower in overall execution time.

When this is done, it leaves the “build for Linux” executables (and intermediate files) in the .build directory, so if you try and run it locally on a Mac, it won’t be happy. To be “safe”, I recommend you clear the .build directory and rebuild it locally if you want to test on MacOS. It just so happens that ./Utilities/bootstrap clean does that.

A few numbers on how (roughly) long this takes – on my mid-2012 MacBook Air

  • example command above: 5 minutes 15.887 seconds
  • using /bin/bash -c "./Utilities/bootstrap test": 6 minutes, 15.853 seconds
  • a build using the same toolchain, but just MacOS locally: 4 minutes 13.558 seconds


Adding thread safety in Swift 3

One of the pieces that I’ve brushed up against recently, but didn’t understand in any great depth, were techniques for making various sections of code thread-safe. There are some excellent articles out there on it, so if you’re looking and found this, let me provide a few references:

In Jun 2016 Matt Gallagher wrote Mutexes and closure capture in Swift, which was oriented towards how to optimize beyond the “obvious advice”, and a great read for the more curious.

The “usual advice” referenced a StackOverflow question: What is the Swift equivalent to Objective-C’s @synchronized? that as a technique spans swift 2 and 3 – the gist being to leverage the libDispatch project, creating a dispatch queue to handle synchronization and access to shared data structures. I found another question+answer on StackOverflow to be a bit easier to read and understand: Adding items to Swift array across multiple threads causing issues…, and matched a bit more of the technique that you can spot in Swift Package Manager code.

One of the interesting quirks in Swift 3 is that let myQueue = DispatchQueue(label: "my queue") returns a serial queue, as opposed to a concurrent queue, which you can get by invoking DispatchQueue.global(). I’m not sure where in the formal documentation that default information appears – the guides I found to Grand Central Dispatch are all still written for C and Objective-C primarily, so the mapping of Swift libraries to those structures wasn’t at all obvious to me. I particularly liked the answer that described this detail, as it was one of the better descriptions that mapped to the Apple developer guides on concurrency (which desperately need to be updated to relate to Swift, IMO).
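A minimal sketch of the “usual advice” technique, assuming nothing beyond libDispatch: a private queue created with DispatchQueue(label:) guards a shared array, and every read and write goes through it. The SynchronizedArray type and the queue label are hypothetical names for this example, not code from the linked answers or from SwiftPM.

```swift
import Dispatch

// Hypothetical wrapper type, named here only for illustration.
final class SynchronizedArray<Element> {
    private var storage: [Element] = []
    // With no attributes, DispatchQueue(label:) creates a serial queue,
    // so blocks submitted to it run one at a time.
    private let queue = DispatchQueue(label: "example.synchronized-array")

    func append(_ element: Element) {
        // sync blocks the caller until the mutation has run on the queue.
        queue.sync { storage.append(element) }
    }

    var count: Int {
        return queue.sync { storage.count }
    }
}

let numbers = SynchronizedArray<Int>()

// concurrentPerform calls the closure from multiple threads and returns
// once every iteration has finished.
DispatchQueue.concurrentPerform(iterations: 1000) { index in
    numbers.append(index)
}

print(numbers.count) // prints 1000
```

Appending to a plain array from those same iterations would be a data race; routing every access through the one serial queue is what makes the count reliable. The closure capture that this pattern relies on is the overhead that the Mutexes piece sets out to avoid.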

Circling back to Matt Gallagher’s piece, the overhead of the scope capture in passing through the closure was more than he wanted to see – although it was fine for my needs. Nonetheless, his details are also available in github at CwlUtils, which has a number of other interesting pieces and tidbits in it that I’ll circle back to later and look at in depth.


spelunking Swift Package Manager

In the slowly building sequence of my swift dev diaries, I wrote about how to set up a swift development environment, and noted some details I gleaned from the SwiftPM slack channel about how to make a swift 3.0/3.1 binary “portable”. This likely will not be an issue in another year, as the plans for SwiftPM under swift 4 include being able to make statically compiled binaries.

Anyway, the focal point for all this was an excuse to learn and start seriously using swift, and since I’ve been a lot more server-side than not for the past years, I came at it from that direction. With the help of the guys writing swift package manager, I picked a few bugs and started working on them.

Turns out Swift Package Manager is a fairly complex beast. Like any other project, a seemingly simple bug leads to a lot of tendrils through the code. So this weekend, I decided I would dive in deep and share what I learned while spelunking.

I started from the bug SR-3275 (which was semi-resolved by the time I got there), but I decided to try and “make it better”. SwiftPM uses git and a number of other command line tools. Before it uses them, it would be nice to check to make sure they’re there, and if they’re not – to fail gracefully, or at least informatively. My first cut at this improved the error output and exposed me to some of the complexity under the covers. Git is used in several parts of SwiftPM (as are other tools). It has definitely grown organically with the code, so it’s not entirely consistent. SwiftPM includes multiple binary command-line tools, and several of them use Git to validate dependencies, check them out if needed, and so forth.

The package structure of SwiftPM itself is fairly complex, with multiple separate internal packages and dependencies. I made a map because it’s easier for me to understand things when I scrawl them out.


swiftpm is the PDF version (probably more readable) which I generated by taking the output of swift package --dump-package and converting it to a graphviz digraph, rendering it with my old friend OmniGraffle.
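The graphviz half of that conversion is small enough to sketch. This is not the script I used, and it does not parse the real --dump-package output: it assumes the target dependencies have already been pulled into a dictionary, and the dotGraph function name and the hand-written subset of edges are for illustration only.

```swift
// Hypothetical helper: renders a target -> dependencies map as a graphviz digraph.
func dotGraph(named name: String, dependencies: [String: [String]]) -> String {
    var lines = ["digraph \(name) {"]
    // Sort so the output is stable from run to run.
    for target in dependencies.keys.sorted() {
        for dependency in (dependencies[target] ?? []).sorted() {
            lines.append("    \"\(target)\" -> \"\(dependency)\";")
        }
    }
    lines.append("}")
    return lines.joined(separator: "\n")
}

// A hand-written subset of targets for illustration, not the full SwiftPM graph.
let sampleTargets: [String: [String]] = [
    "Commands": ["SourceControl"],
    "SourceControl": ["Basic", "Utility"],
    "Utility": ["Basic"],
]

let dot = dotGraph(named: "swiftpm", dependencies: sampleTargets)
print(dot)
```

The printed text can be saved to a .dot file and opened in any tool that reads graphviz files.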

The command line tools (swift build, swift package, etc) all use a common package Commands, which in turn relies on the various underlying pieces. Where I started was nicely encapsulated in the package SourceControl, but I soon realized that what I was fiddling with was a cross-cutting concern. The actual usage of external command line tools happens in Utility, the relevant component being Process.swift.

Utility has a close neighbor: Basic, and the two seem significantly overlapping to me. From what I’ve gathered, Utility is the “we’re not sure if it’s in the right place” grouping, and Basic contains the more stabilized, structured code. Process seems to be in the grey area between those two groupings, and is getting some additional love to stabilize it even as I’m writing this.

All the CLI tools use a base class called SwiftTool, which sets a common structure. SwiftTool has an init method that sets up CLI arguments and processes the ones provided into an Options object that can then get passed down in the relevant code that does the lifting. ArgumentParser and ArgumentBinder do most of that lifting. Each subclass of SwiftTool has its own runImpl method, using relevant underlying packages and dependencies. run invokes the relevant code in runImpl. If any errors get thrown, it deals with them (regardless of the CLI invoking) with a method called handle in Error. It is mostly printing some (hopefully useful) error message and then exiting with an error code.
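The run/runImpl split is easier to see in miniature. This is a sketch of the pattern only, not SwiftPM’s code: ExampleTool, ExampleError, and the two subclasses are hypothetical, argument parsing is left out, and run returns a status code where the real tools exit the process.

```swift
// Hypothetical base class sketching the run/runImpl split; not SwiftPM's SwiftTool.
class ExampleTool {
    // Subclasses override this with the actual work of the command.
    func runImpl() throws {
        fatalError("subclasses must override runImpl()")
    }

    // Shared wrapper: runs the command and turns any thrown error into
    // a printed message plus a nonzero status.
    func run() -> Int32 {
        do {
            try runImpl()
            return 0
        } catch {
            print("error: \(error)")
            return 1
        }
    }
}

// Hypothetical error used only by this example.
enum ExampleError: Error {
    case missingArgument(String)
}

final class FailingTool: ExampleTool {
    override func runImpl() throws {
        throw ExampleError.missingArgument("--path")
    }
}

final class SucceedingTool: ExampleTool {
    override func runImpl() throws {}
}

let failureStatus = FailingTool().run()
let successStatus = SucceedingTool().run()
print(failureStatus, successStatus) // prints "1 0" after the error message
```

Keeping the catch in one place is what lets every tool report failures the same way, whichever subclass threw.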

My goal is to check for required files – that they exist and are executable – and report consistently and clearly if they’re not. From what I’ve learned so far that seems to be best implemented by creating a relevant subclass (or enum declaration) of Swift.Error and throwing when things are awry, letting the _handle implementation in Commands.Error take care of printing things consistently.
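Here is a minimal sketch of that shape, using Foundation’s FileManager rather than SwiftPM’s own Filesystem code. The ToolCheckError enum, its cases, and the checkExecutable function are hypothetical names for this example; they are not what ended up in the pull requests.

```swift
import Foundation

// Hypothetical error type for illustration; these are not SwiftPM's error cases.
enum ToolCheckError: Error, CustomStringConvertible {
    case notFound(path: String)
    case notExecutable(path: String)

    var description: String {
        switch self {
        case .notFound(let path):
            return "required tool not found at \(path)"
        case .notExecutable(let path):
            return "required tool at \(path) is not executable"
        }
    }
}

// Throws a descriptive error unless something exists at the path and is executable.
func checkExecutable(atPath path: String) throws {
    let fileManager = FileManager.default
    guard fileManager.fileExists(atPath: path) else {
        throw ToolCheckError.notFound(path: path)
    }
    guard fileManager.isExecutableFile(atPath: path) else {
        throw ToolCheckError.notExecutable(path: path)
    }
}

// Usage: the first check is expected to pass on macOS and Linux,
// the second points at a path that should not exist.
do {
    try checkExecutable(atPath: "/bin/sh")
    try checkExecutable(atPath: "/no/such/tool")
} catch {
    print("error: \(error)")
}
```

Separating “not found” from “not executable” keeps the message specific, which was the point of the exercise: the person reading the error should know which of the two to go fix.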

So on to the file checking parts!

The “file exists” seemed to have two implementations, one in Basic.Filesystem and another in Basic.Pathshims.

After a bit more reading and chatting with Ankit, it became clear that Filesystem was the desired common way of doing things, and Pathshims the original “let’s just get this working” stuff. It is slowly migrating and stabilizing over to Filesystem. There are a few pieces of the Filesystem implementation that directly use Pathshims, so I expect a possible follow-up will be to collapse those together, and finally get rid of Pathshims entirely.

The “file is executable” check didn’t yet exist, although there were a couple of notes that it would be good to have. Similar checking was in UserToolchain, including searching environment paths for the executable. It seemed like it needed to be pulled out from there and moved to a more central location, so it could be used in any of the subclasses of SwiftTool.
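Searching environment paths for an executable can be sketched in a few lines. This is a Foundation-based illustration rather than the UserToolchain logic: findExecutable is a hypothetical helper, and it assumes a colon-separated search path in the style of PATH.

```swift
import Foundation

// Hypothetical helper: returns the first executable match for `name` in a
// colon-separated list of directories, or nil when there is none.
func findExecutable(named name: String, inSearchPath searchPath: String) -> String? {
    for directory in searchPath.split(separator: ":") {
        let candidate = "\(directory)/\(name)"
        // isExecutableFile only checks the execute permission, so a more
        // careful version would also rule out directories.
        if FileManager.default.isExecutableFile(atPath: candidate) {
            return candidate
        }
    }
    return nil
}

// Usage: look for git on the current PATH.
let searchPath = ProcessInfo.processInfo.environment["PATH"] ?? ""
if let git = findExecutable(named: "git", inSearchPath: searchPath) {
    print("found git at \(git)")
} else {
    print("git was not found on PATH")
}
```

Passing the search path in as a parameter, rather than reading the environment inside the function, keeps the helper easy to test with a fixed list of directories.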

At this point, I’ve put up a pull request to add in the executable check to Filesystem, and another to migrate the “search paths” logic into Utility. I need to tweak those and finish up adding the tests and comments, and then I’ll head back to adding in the checks to the various subclasses of SwiftTool that could use the error checking.


I took this panoramic over a month ago now, when the crocuses weren’t even opening yet. A pretty spectacular weekend, and the view of Lake Union and east to the Cascades was amazing.

It seemed like a damn good photo to start my sabbatical with. As I’m posting this, I’m just beginning a sabbatical. I’ve been wanting to do this for over a decade, and this year it’s feasible. Learning, Playing, and Travel are the top contenders for my time. I have been making lists for nearly 3 months about “what I could do” – I’ll never get to it all, but it makes a good target.

I’ve started some art classes, which I’m inordinately nervous about, but have been trying to work on for months now. I figured a full-out class would enforce some structure around what I’m doing, and hopefully lead to some interesting things. I have been loving the new camera on the iPhone 7 and posting photography experiments to facebook and twitter. It’ll be interesting to see if I can fuse a few different arts together.

I can’t even begin to think of not being a geek, so I am also contributing to a couple of open source efforts. I’m now a contributor to Apple’s swift project: just fixing a bug or two at this point, but it’s a start and gives me a reason to dig more into the language. I may do the same with Kubernetes a bit later, and expand on learning the Go language.

Travel is also in the plan – although like usual I’ll post more about where I’ve been than “where I am”, so don’t expect updates until after the fact for the travel. Lots of things to see on the horizon…

HOW TO: making a portable binary with swift

Over the weekend I was working with Vapor, trying it out and learning a bit about the libraries. Vapor leverages a library called LibreSSL to provide TLS to web services, so when you compile the project, you get a binary and a dynamic library that it uses.

The interesting part here is that if you move the directory that contains the “built bits”, the program ceases to function, reporting an error that it can’t find the dynamic library. You can see this in my simple test case, with the code from https://github.com/heckj/vaportst.

git clone https://github.com/heckj/vaportst
swift build -c release
mv .build/release newlocation
./newlocation/App version

then throws an error:

dyld: Library not loaded: /Users/heckj/src/vaportst/.build/release/libCLibreSSL.dylib
 Referenced from: /Users/heckj/src/vaportst/newlocation/App
 Reason: image not found
Abort trap: 6

It turns out that with Swift 3 (and Swift 3.1 that’s coming), the compiler adds the static path to the dynamic library, and there’s an interesting tool, called install_name_tool, that can modify that from a static path to a dynamic path.

Norio Nomura was kind enough to give me the exact syntax:

install_name_tool -change /Users/heckj/src/vaportst/.build/release/libCLibreSSL.dylib @rpath/libCLibreSSL.dylib .build/release/App
install_name_tool -add_rpath @executable_path .build/release/App

This tool is specific to MacOS, as the Linux dynamic loader works slightly differently. As long as the dynamic library is in the same directory as the binary, or the environment variable LD_LIBRARY_PATH is set to the directory containing the dynamic libraries, it’ll get loaded just fine on Linux.

Swift today doesn’t provide a means to create a statically linked binary (it is an open feature request: SR-648). It looks like that may be an option in the future as the comments in the bug show progress towards this goal. The whole issue of dynamic loading becomes moot, at the cost of larger binaries – but it is an incredible boon when dealing with containers and particularly looking towards running “server side swift”.

Kubernetes community crucible

I’ve been watching and lurking on the edges of the Kubernetes community for this past cycle of development. We are closing on the feature freezes for the 1.6 release, and it is fascinating to watch the community evolve.

These next several upcoming releases are a crucible for Kubernetes as a project and community. They have moved from Google to the CNCF, but the reality of the responsibility transition is still in progress. They made their first large moves and efforts to handle the explosive growth of interest in their project and the corresponding expansion of community. The Kubernetes 1.6 release was supposed to be a “few features, but mostly stability” after the heavy changes that led into 1.5, recognizing a lot of change has hit and stabilization is needed. This is true technically as well as for the community itself.

The work is going well: the SIGs are delegating responsibility pretty effectively and most everyone is working to pull things forward. That is not to say it is going perfectly. The crucible isn’t how it dealt with the growth, but how it deals with the failures and faults of the efforts to handle that growth. One highlight is that although 1.6 was supposed to be about stability, the test flakiness has risen again. The conversation in last week’s community meeting highlighted it, as well as discussed what could be done to shift back to reliability and stability as a key ingredient. Other issues include project-wide impacts that SIGs can have, the resolution to that being a lot of the community project and product manager focus over the past months.

This week’s DevOpsWeekly newsletter also highlighted some end user feedback: some enthusiastically pro-Kubernetes, another not so much. The two posts are sizzling plates of feedback goodness for the future of Kubernetes – explanations of what’s effective, what’s confusing, and what could be better – product management gold.

  • Thoughts on Kubernetes by Nelson Elhage is an “enthusiastic for the future of the project” account from a new user who clearly did his research and understood the system, pointing out rough edges and weak points on the user experience.
  • Why I’m leaving Kubernetes for Swarm by Jonathan Kosgei highlights the rougher-than-most edges that exist around the concept of “ingress” when Kubernetes is being used outside of the large cloud providers, and notes that Docker Swarm has stitched together a pretty effective end-user story and experience here.

I think the project is doing a lot of good things, and I have been impressed with the efforts, professionalism, and personalities of the team moving Kubernetes forward. There is a lot of passion and desire to see the right thing done, and a nice acknowledgement of our myopic tendencies at times when holistic or strategic thoughts are needed. I hope the flaky tests get sorted without as much pain as the last round, and the community grows from the combined efforts. I want to see them alloyed in this crucible and come out stronger for it.

Benchmarking etcd 3.0 – an excellent example of how to benchmark a service

Last week, the CoreOS team posted a benchmark review of etcd 3.0 on their blog. Gyu-Ho Lee was the author, and clearly the primary committer to the effort – and he did an amazing job.

First and foremost, the benchmark is entirely open, clear, and reproducible. All the code for this effort is in a git repository dedicated to the purpose: dbtester. The test results (also in the repository) and how to run the tests are all detailed. The benchmarking code was created for the purpose of benchmarking this specific kind of service, and was used to compare etcd, ZooKeeper, and Consul. CoreOS did a tremendous service making this public, and I hope it gave them a concrete dashboard for their development improvements while they iterated on etcd to 3.0.

What Gyu-Ho Lee did within the benchmark is what makes this an amazing example: he reviewed the performance of the target against multiple dimensions. Too many benchmarks, especially ones presented in marketing materials, are simple graphs highlighting a single dimension – and utterly opaque as to how they got there. The etcd3 benchmark reviews etcd itself, ZooKeeper, and Consul against multiple dimensions: memory, CPU, and disk IO. The raw data that backed the blog post is committed into the repo under “test-results”. The workload is reasonably representative (writing 1,000,000 keys to the backend), and the benchmark tracked time to complete, memory consumed, CPU consumed, and disk IO consumed during the process.
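To make the “multiple dimensions” idea concrete, here is a minimal sketch in Python. It is not the dbtester code and does not talk to etcd, ZooKeeper, or Consul; the function name benchmark_writes, the record layout, and the temp-file “backend” are all assumptions made up for illustration. It only shows the shape of a benchmark that records wall time, CPU time, peak memory, and bytes written for a single write workload, using nothing but the standard library.

```python
import os
import tempfile
import time
import tracemalloc


def benchmark_writes(num_keys, value_size=64):
    """Write num_keys records to a temp file (a stand-in backend, not a real
    key-value store) and report several dimensions of the run."""
    value = "x" * value_size
    index = {}          # toy in-memory index: key -> offset of its record
    bytes_written = 0   # disk IO dimension, counted by the benchmark itself

    tracemalloc.start()                 # memory dimension (Python allocations only)
    wall_start = time.perf_counter()    # elapsed-time dimension
    cpu_start = time.process_time()     # CPU-time dimension (user + system)

    with tempfile.NamedTemporaryFile(mode="wb") as backing_file:
        for i in range(num_keys):
            key = f"key-{i:08d}"
            record = f"{key}={value}\n".encode("ascii")
            index[key] = bytes_written
            backing_file.write(record)
            bytes_written += len(record)
        # Force the data out to disk so the timing includes the IO cost.
        backing_file.flush()
        os.fsync(backing_file.fileno())

    wall_seconds = time.perf_counter() - wall_start
    cpu_seconds = time.process_time() - cpu_start
    _, peak_memory = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "keys_written": len(index),
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "peak_memory_bytes": peak_memory,
        "bytes_written": bytes_written,
    }


if __name__ == "__main__":
    # Small run for illustration; the real benchmark wrote 1,000,000 keys.
    for name, measurement in benchmark_writes(10_000).items():
        print(f"{name}: {measurement}")
```

Reporting all of these numbers together for the same run, rather than a single headline figure, is the part worth copying. A real comparison would also need to sample the server process rather than the client, which this sketch does not attempt.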

I haven’t looked at the code to see how re-usable it might be – I would love to see more benchmarks with different actions, and a comparison to how this operates in production (in cluster mode) – but these wishes are just variations on the theme, and not a complaint about the work done so far.

As an industry, as we build more with containers, this kind of benchmarking is exactly what we need. We’re composing distributed services now more than ever, and knowing the qualities of how these systems or containers will operate is as critical as any other correctness validation effort.