Using 3 Tiers of Continuous Integration

A bit over a year ago, I wrote out Six rules for setting up continuous integration, which received a fair bit of attention. One of the items I called out was the speed of tests, suggesting keeping the test run to around 15-20 minutes to encourage and promote developer velocity.

A few weeks ago, the team at SemaphoreCI used the word “proper” in a similar blog post, and suggested that a timing marker should be 10 minutes. This maps pretty directly to a new feature they’re promoting to review the speed of their CI system as it applies to your code.

I disagree a bit with the 10 minute assertion, only in that it is too simplistic. Rather than a straight “10 minute” rule, let me suggest a more comprehensive solution:

Tier your tests

Not all tests are equal, and you don’t want to treat them as such. I have found 3 tiers to be extremely effective, assuming you’re working on a system that can have some deep complexity to review.

The first tier: “smoke test”. Lots of projects have this concept and it is a good one to optimize for. This is where the 10 minute rule that SemaphoreCI is promoting is a fine rule of thumb. You want to verify as many of the most common paths of functionality as you can, within a time-boxed boundary – you pick the time. This is the same tier that I generally encourage for pull request feedback testing, and I try to include all unit testing in this tier, as well as some functional testing and even integration tests. If these tests don’t pass, I’ve generally assumed that a pull request isn’t acceptable for merge – which makes this tier a quality gate.

I also recommend you have a consistent and clearly documented means of letting a developer run all of these tests themselves, without having to resort to your CI system. The CI system provides a needed “source of truth”, but it behooves you to expose all the detail of what this tier does, and how it does it, so that you don’t run into blocks where a developer can’t reproduce the issue locally to debug and resolve it.

The second tier: “regression tests”. Once you acknowledge that timely feedback is critical and pick a marker, you may start to find some testing scenarios (especially in larger systems) that take longer to validate than the time you’ve allowed. Some you’ll include in the first tier, where you can fit things to the time box you’ve set – but the rest should live somewhere and get run at some point. These are often the corner cases, the regression tests, integration tests, upgrade tests, and so forth. In my experience, running these consistently (or even continuously) is valuable – so this is often the “nightly build & test” sequence. This is the tier where “after the fact” feedback starts, and as you’re building a CI system, you should consider how you want to handle it when something doesn’t pass these tests.

If you’re doing continuous deployment to a web service then I recommend you have this entire set “pass” prior to rolling out software from a staging environment to production. You can batch up commits that have been validated from the first tier, pull them all together, and then only promote to your live service once these have passed.

If you’re developing a service or library that someone else will install and use, then I recommend running these tests continuously on your master or release branch, and if any fail then consider what your process needs to accommodate: Do you want to freeze any additional commits until the tests are fixed? Do you want to revert the offending commits?  Or do you open a bug that you consider a “release blocker” that has to be resolved before your next release?

An aside here on “flakes”. The reason I recommend running the second tier of tests on a continuous basis is to keep a close eye on an unfortunate reality of complex systems and testing: flakey tests. Flakes are invaluable for feedback, and often a complete pain to track down. These are the tests that “mostly pass”, but don’t always return consistently. As you build into more asynchronous systems, these become more prevalent – from insufficient resources (such as CPU, disk IO or network IO on your test systems) to race conditions that only appear periodically. I recommend you take the feedback from this tier seriously, and collect information that allows you to identify flakey tests over time. Flakes can happen at any tier, and are the worst in the first tier. When I find a flakey test in the first tier, I evaluate whether it should “stop the whole train” – freeze the commits until it’s resolved – or whether I should move it into the second tier and open a bug. It’s up to you and your development process, but think about how you want to handle it and have a plan.

The third tier: “deep regression, matrix and performance tests”. You may not always have this tier, but if you have acceptance or release validation that takes an extended amount of time (say, over a few hours) to complete, then consider shifting it back into another tier. This is also the tier where I tend to handle the time-consuming and complex matrices when they apply. In my experience, if you’re testing across some matrix of configurations (be that software or hardware), the resources are generally constrained and the testing scenarios head towards asymptotic in terms of time. As a rule of thumb, I don’t include “release blockers” in this tier – it’s more about thoroughly describing your code. Benchmarks, matrix validations that wouldn’t block a release, and performance characterization all fall into this tier. If you have this tier, I recommend you run it prior to every major release, and, if resources allow, on a recurring periodic basis to enable you to build trends of system characterizations.

There’s a good argument for “duration testing” that sort of fits between the second and third tiers. If you have tests where you want to validate a system operating over an extended period of time, where you validate availability, recovery, and system resource consumption – like looking for memory leaks – then you might want to consider failing some of these tests as a release blocker. I’ve generally found that I can find memory leaks within a couple of hours, but if you’re working with a system that will be deployed where intervention is prohibitively expensive, then you might want to consider duration tests to validate operation and even chaos-monkey style recovery of services over longer periods of time. Slow memory leaks, pernicious deadlocks, and distributed system recovery failures are all types of tests that are immensely valuable, but take a long “wall clock” time to run.

Reporting on your tiers

As you build out your continuous integration environment, take the time to plan and implement reporting for your tiers as well. The first tier is generally easiest – it’s what most CI systems do by annotating pull requests. The second and third tiers take more time and resources. You want to watch for flakey tests, collecting and analyzing failures. More complex open source systems look towards their release process to help corral this information – OpenStack uses Zuul (unfortunately rather OpenStack specific, but they’re apparently working on that), Kubernetes has Gubernator and TestGrid. When I was at Nebula, we invested in a service that collected and collated test results across all our tiers and reported not just the success/failure of the latest run but also a history of success/failure to help us spot those flakes I mentioned above.


Using docker to build and test SwiftPM

I sat down to claw my way through blog posts, examples, and docker’s documentation to come up with a way to build and test Swift Package Manager on Linux as well as the Mac.

There are a number of ways to accomplish this; I will just present one. You are welcome to use it or not and any variations on the theme should not be difficult to sort out.

First, you’ll need a docker image with the Linux swift toolchain installed. IBM has a swift docker image you can use, and recently another was announced and made available called ‘swiftdocker‘ which is a Swift 3.0.2 release for Ubuntu 16.04. I riffed on IBM’s code and my previous notes for creating a swift development environment to build the latest master-branch toolchain into an Ubuntu 16.04 based image. If you want to follow along, you can snag the Dockerfile and related scripts from my vagrant-ubuntu-swift-dev github repo and build your own locally. The image is 1.39GB in size and named swiftpm-docker-1604.

SwiftPM is a bit of a special snowflake compared to other server-side swift software – in particular, it’s part of the toolchain itself, so working on it involves some bootstrapping so that you can get to the moral equivalent of swift build and swift test. Because of that setup, you leverage a script they created to run the bootstrapping: Utilities/bootstrap.

SwiftPM has also “moved ahead” a bit and leveraged newer capabilities – so if you want to build off the master branch, you’ll need the swift-3.1 toolchain, or the nightly snapshot release to do the lifting. The current 3.0.2 release won’t do the trick. The nightly snapshots are NOT released versions, so there is some measure of risk and potential breakage – it has been pretty good for me so far – and necessary for working on SwiftPM.

On to the command! To build and test swiftpm from a copy of the source locally:

docker run -t --rm -v $(pwd):/data:rw -w /data swiftpm-docker-1604 \
/bin/bash -c "./Utilities/bootstrap && .build/debug/swift-test --parallel"

To break this down, since Docker’s somewhat insane about command-line options:

  • -t indicates to allocate a TTY for this process
  • --rm makes the results of what we do ephemeral, as opposed to updating the image
  • -v $(pwd):/data:rw is the local volume mount that makes the current local directory appear as /data within the image, and makes it Read/Write
  • -w /data leverages that volume mount to make /data the current working directory for any commands
  • swiftpm-docker-1604 is the name of the docker image that I make and update as needed
  • /bin/bash -c "..." is how I pass in multiple commands to be run, since I want to first run through the bootstrap, but then shift over to using .build/debug/swift-test --parallel for a little more speed (and a lot less output) in running the tests.

The /bin/bash -c "..." bits could easily be replaced with ./Utilities/bootstrap test to the same end effect, but a touch slower in overall execution time.

When this is done, it leaves the “build for Linux” executables (and intermediate files) in the .build directory, so if you try and run it locally on a Mac, it won’t be happy. To be “safe”, I recommend you clear the .build directory and rebuild it locally if you want to test on MacOS. It just so happens that ./Utilities/bootstrap clean does that.

A few numbers on how (roughly) long this takes – on my mid-2012 MacBook Air:

  • example command above: 5 minutes 15.887 seconds
  • using /bin/bash -c "./Utilities/bootstrap test": 6 minutes, 15.853 seconds
  • a build using the same toolchain, but just MacOS locally: 4 minutes 13.558 seconds


Adding thread safety in Swift 3

One of the pieces that I’ve brushed up against recently, but didn’t understand in any great depth, was techniques for making various sections of code thread-safe. There are some excellent articles out there on it, so if you’re looking and found this, let me provide a few references:

In June 2016 Matt Gallagher wrote Mutexes and closure capture in Swift, which was oriented towards how to optimize beyond the “obvious advice”, and a great read for the more curious.

The “usual advice” referenced a StackOverflow question, What is the Swift equivalent to Objective-C’s @synchronized?, a technique that spans Swift 2 and 3 – the gist being to leverage the libDispatch project, creating a dispatch queue to handle synchronization and access to shared data structures. I found another question+answer on StackOverflow to be a bit easier to read and understand: Adding items to Swift array across multiple threads causing issues…, and it matched a bit more of the technique that you can spot in Swift Package Manager code.

One of the interesting quirks in Swift 3 is that let myQueue = DispatchQueue(label: "my queue") returns a serial queue, as opposed to a concurrent queue, which you can get by invoking DispatchQueue.global(). I’m not sure where in the formal documentation that default appears – the guides I found to Grand Central Dispatch are all still written primarily for C and Objective-C, so the mapping of the Swift libraries to those structures wasn’t at all obvious to me. I particularly liked the answer that described this detail, as it was one of the better descriptions that mapped to the Apple developer guides on concurrency (which desperately need to be updated to relate to Swift, IMO).
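To make the technique concrete, here is a minimal sketch (my own type name and queue label, not code from any of the linked posts) of funneling all access to an array through one of those serial queues:

import Dispatch

// Minimal sketch of the serial-queue technique; the type name and label are mine.
final class SynchronizedArray<Element> {
    // DispatchQueue(label:) returns a *serial* queue by default.
    private let queue = DispatchQueue(label: "sync-array")
    private var storage: [Element] = []

    func append(_ element: Element) {
        // sync on a serial queue serializes all access to `storage`
        queue.sync { self.storage.append(element) }
    }

    var count: Int {
        return queue.sync { self.storage.count }
    }
}

Every read and write hops through the same serial queue, so the array is never touched from two threads at once – at the cost of the closure capture overhead that Matt Gallagher’s post digs into.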

Circling back to Matt Gallagher’s piece, the overhead of the scope capture in passing through the closure was more than he wanted to see – although it was fine for my needs. Nonetheless, his details are also available on github at CwlUtils, which has a number of other interesting pieces and tidbits in it that I’ll circle back to later and look at in depth.


spelunking Swift Package Manager

In the slowly building sequence of my swift dev diaries, I wrote about how to set up a swift development environment, and noted some details I gleaned from the SwiftPM slack channel about how to make a swift 3.0/3.1 binary “portable”. This likely will not be an issue in another year, as the plans for SwiftPM under swift 4 include being able to make statically compiled binaries.

Anyway, the focal point for all this was an excuse to learn and start seriously using swift, and since I’ve been a lot more server-side than not for the past years, I came at it from that direction. With the help of the guys writing swift package manager, I picked a few bugs and started working on them.

Turns out Swift Package Manager is a fairly complex beast. Like any other project, a seemingly simple bug leads to a lot of tendrils through the code. So this weekend, I decided I would dive in deep and share what I learned while spelunking.

I started from the bug SR-3275 (which was semi-resolved by the time I got there), but I decided to try and “make it better”. SwiftPM uses git and a number of other command line tools. Before it uses them, it would be nice to check that they’re there, and if they’re not – to fail gracefully, or at least informatively. My first cut at this improved the error output and exposed me to some of the complexity under the covers. Git is used in several parts of SwiftPM (as are other tools). It has definitely grown organically with the code, so it’s not entirely consistent. SwiftPM includes multiple binary command-line tools, and several of them use Git to validate dependencies, check them out if needed, and so forth.

The package structure of SwiftPM itself is fairly complex, with multiple separate internal packages and dependencies. I made a map because it’s easier for me to understand things when I scrawl them out.

[diagram: the swiftpm package dependency graph]

swiftpm is the PDF version (probably more readable) which I generated by taking the output of swift package --dump-package and converting it to a graphviz digraph, rendering it with my old friend OmniGraffle.

The command line tools (swift build, swift package, etc) all use a common package Commands, which in turn relies on the various underlying pieces. Where I started was nicely encapsulated in the package SourceControl, but I soon realized that what I was fiddling with was a cross-cutting concern. The actual usage of external command line tools happens in Utility, the relevant component being Process.swift.

Utility has a close neighbor: Basic, and the two seem significantly overlapping to me. From what I’ve gathered, Utility is the “we’re not sure if it’s in the right place” grouping, and Basic contains the more stabilized, structured code. Process seems to be in the grey area between those two groupings, and is getting some additional love to stabilize it even as I’m writing this.

All the CLI tools use a base class called SwiftTool, which sets a common structure. SwiftTool has an init method that sets up CLI arguments and processes the ones provided into an Options object that can then get passed down into the relevant code that does the lifting. ArgumentParser and ArgumentBinder do most of that lifting. Each subclass of SwiftTool has its own runImpl method, using relevant underlying packages and dependencies. run invokes the relevant code in runImpl. If any errors get thrown, it deals with them (regardless of which CLI was invoked) with a method called handle in Error, which mostly prints some (hopefully useful) error message and then exits with an error code.
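As a rough sketch of that shape – hypothetical names and a toy Options type, not SwiftPM’s actual declarations:

import Foundation

// A rough sketch of the run/runImpl structure described above.
class ToolSketch {
    struct Options {
        var verbose = false
    }
    let options: Options

    // The real SwiftTool parses arguments with ArgumentParser/ArgumentBinder;
    // this just mimics the idea of init producing an Options value.
    init(arguments: [String]) {
        var options = Options()
        options.verbose = arguments.contains("--verbose")
        self.options = options
    }

    // Each subclass (build, package, and so on) overrides this with the real work.
    func runImpl() throws {
        fatalError("subclasses override runImpl()")
    }

    // run() invokes runImpl() and funnels any thrown error through one handler,
    // which prints a message and exits with an error code.
    final func run() {
        do {
            try runImpl()
        } catch {
            print("error: \(error)")
            exit(1)
        }
    }
}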

My goal is to check for required files – that they exist and are executable – and report consistently and clearly if they’re not. From what I’ve learned so far, that seems to be best implemented by creating a relevant subclass (or enum declaration) of Swift.Error and throwing when things are awry, letting the _handle implementation in Commands.Error take care of printing things consistently.
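A minimal sketch of that pattern (the enum name and its cases are hypothetical, not SwiftPM’s actual declarations) would look something like:

// Hypothetical error type for the “required tool is missing” case; throwing it
// leaves the printing and the exit code to the central error handler.
enum RequiredToolError: Swift.Error {
    case missing(tool: String)
    case notExecutable(tool: String, path: String)
}

// Example of throwing: callers let the error propagate up to run(), where it
// gets printed consistently.
func requireGit(foundAt path: String?) throws {
    guard path != nil else {
        throw RequiredToolError.missing(tool: "git")
    }
}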

So on to the file checking parts!

The “file exists” check seemed to have two implementations, one in Basic.Filesystem and another in Basic.Pathshims.

After a bit more reading and chatting with Ankit, it became clear that Filesystem was the desired common way of doing things, and Pathshims the original “let’s just get this working” stuff. It is slowly migrating and stabilizing over to Filesystem. There are a few pieces of the Filesystem implementation that directly use Pathshims, so I expect a possible follow-up will be to collapse those together, and finally get rid of Pathshims entirely.

The “file is executable” check didn’t yet exist, although there were a couple of notes that it would be good to have. Similar checking was in UserToolchain, including searching environment paths for the executable. It seemed like it needed to be pulled out from there and moved to a more central location, so it could be used in any of the subclasses of SwiftTool.
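For illustration, the two checks I wanted map to calls like these – using Foundation’s FileManager here as a stand-in, rather than SwiftPM’s own Filesystem abstraction:

import Foundation

// Illustrative only: SwiftPM routes this through its Filesystem abstraction,
// but the underlying checks amount to the same two questions.
func isUsableExecutable(atPath path: String) -> Bool {
    let fm = FileManager.default
    // the “file exists” check
    guard fm.fileExists(atPath: path) else { return false }
    // the “file is executable” check
    return fm.isExecutableFile(atPath: path)
}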

At this point, I’ve put up a pull request to add in the executable check to Filesystem, and another to migrate the “search paths” logic into Utility. I need to tweak those and finish up adding the tests and comments, and then I’ll head back to adding in the checks to the various subclasses of SwiftTool that could use the error checking.

Sabbatical

I took this panoramic over a month ago now, when the crocuses weren’t even opening yet. A pretty spectacular weekend, and the view of Lake Union and east to the Cascades was amazing.

It seemed like a damn good photo to start my sabbatical with. As I’m posting this, I’m just beginning a sabbatical. I’ve been wanting to do this for over a decade, and this year it’s feasible. Learning, Playing, and Travel are the top contenders for my time. I have been making lists for nearly 3 months about “what I could do” – I’ll never get to it all, but it makes a good target.

I’ve started some art classes, which I’m inordinately nervous about, but have been trying to work on for months now. I figured a full-out class would enforce some structure around what I’m doing, and hopefully lead to some interesting things. I have been loving the new camera on the iPhone 7 and posting photography experiments to facebook and twitter. It’ll be interesting to see if I can fuse a few different arts together.

I can’t even begin to think of not being a geek, so I am also contributing to a couple of open source efforts. I’m now a contributor to Apple’s swift project: just fixing a bug or two at this point, but it’s a start and gives me a reason to dig more into the language. I may do the same with Kubernetes a bit later, and expand on learning the Go language.

Travel is also in the plan – although like usual I’ll post more about where I’ve been than “where I am”, so don’t expect updates until after the fact for the travel. Lots of things to see on the horizon…

HOW TO: making a portable binary with swift

Over the weekend I was working with Vapor, trying it out and learning a bit about the libraries. Vapor leverages a library called LibreSSL to provide TLS to web services, so when you compile the project, you get a binary and a dynamic library that it uses.

The interesting part here is that if you move the directory that contains the “built bits”, the program ceases to function, reporting an error that it can’t find the dynamic library. You can see this in my simple test case, with the code from https://github.com/heckj/vaportst.

git clone https://github.com/heckj/vaportst
swift build -c release
mv .build/release newlocation
./newlocation/App version

then throws an error:

dyld: Library not loaded: /Users/heckj/src/vaportst/.build/release/libCLibreSSL.dylib
 Referenced from: /Users/heckj/src/vaportst/newlocation/App
 Reason: image not found
Abort trap: 6

It turns out that with Swift 3 (and the Swift 3.1 that’s coming), the compiler adds the static path to the dynamic library, and there’s an interesting tool called install_name_tool that can modify that from a static path to a dynamic path.

Norio Nomura was kind enough to give me the exact syntax:

install_name_tool -change /Users/heckj/src/vaportst/.build/release/libCLibreSSL.dylib @rpath/libCLibreSSL.dylib .build/release/App
install_name_tool -add_rpath @executable_path .build/release/App

This tool is specific to MacOS, as the linux dynamic loader works slightly differently. As long as the dynamic library is in the same directory as the binary, or the environment variable LD_LIBRARY_PATH is set to the directory containing the dynamic libraries, it’ll get loaded just fine on Linux.

Swift today doesn’t provide a means to create a statically linked binary (it is an open feature request: SR-648). It looks like that may be an option in the future as the comments in the bug show progress towards this goal. The whole issue of dynamic loading becomes moot, at the cost of larger binaries – but it is an incredible boon when dealing with containers and particularly looking towards running “server side swift”.

Kubernetes community crucible

I’ve been watching and lurking on the edges of the Kubernetes community for this past cycle of development. We are closing on the feature freezes for the 1.6 release, and it is fascinating to watch the community evolve.

These next several upcoming releases are a crucible for Kubernetes as a project and community. They have moved from Google to the CNCF, but the reality of the responsibility transition is still in progress. They made their first large moves and efforts to handle the explosive growth of interest in their project and the corresponding expansion of community. The Kubernetes 1.6 release was supposed to be a “few features, but mostly stability” after the heavy changes that led into 1.5, recognizing a lot of change has hit and stabilization is needed. This is true technically as well as for the community itself.

The work is going well: the SIGs are delegating responsibility pretty effectively and most everyone is working to pull things forward. That is not to say it is going perfectly. The crucible isn’t how it dealt with the growth, but how it deals with the failures and faults of the efforts to handle that growth. One example: although 1.6 was supposed to be about stability, the test flakiness has risen again. The conversation in last week’s community meeting highlighted it, as well as discussed what could be done to shift back to reliability and stability as a key ingredient. Other issues include the project-wide impacts that SIGs can have; resolving that has been a lot of the focus of the community’s project and product managers over the past months.

This week’s DevOpsWeekly newsletter also highlighted some end user feedback: some enthusiastically pro-Kubernetes, another not so much. The two posts are sizzling plates of feedback goodness for the future of Kubernetes – explanations of what’s effective, what’s confusing, and what could be better – product management gold.

  • Thoughts on Kubernetes by Nelson Elhage is an “enthusiastic for the future of the project” account from a new user who clearly did his research and understood the system, pointing out rough edges and weak points on the user experience.
  • Why I’m leaving Kubernetes for Swarm by Jonathan Kosgei highlights the rougher-than-most edges that exist around the concept of “ingress” when Kubernetes is being used outside of the large cloud providers, highlighting that Docker swarm has stitched together a pretty effective end-user story and experience here.

I think the project is doing a lot of good things, and I have been impressed with the efforts, professionalism, and personalities of the team moving Kubernetes forward. There is a lot of passion and desire to see the right thing done, and a nice acknowledgement of our myopic tendencies at times when holistic or strategic thoughts are needed. I hope the flaky tests get sorted without as much pain as the last round, and the community grows from the combined efforts. I want to see them alloyed in this crucible and come out stronger for it.

Benchmarking etcd 3.0 – an excellent example of how to benchmark a service

Last week, the CoreOS team posted a benchmark review of etcd 3.0 on their blog. Gyo-Ho Lee was the author, and clearly the primary committer to the effort – and he did an amazing job.

First and foremost is that the benchmark is entirely open, clear, and reproducible. All the code for this effort is in a git repository dedicated to the purpose: dbtester, and the test results (also in the repository) and how to run the tests are all detailed. The benchmarking code was created for the purpose of benchmarking this specific kind of service, and used to compare etcd, zookeeper, and consul. CoreOS did a tremendous service making this public, and I hope it gave them a concrete dashboard for their development improvements while they iterated on etcd to 3.0.

What Gyo-Ho Lee did within the benchmark is what makes this an amazing example: he reviewed the performance of the target against multiple dimensions. Too many benchmarks, especially ones presented in marketing materials, are simple graphs highlighting a single dimension – and utterly opaque as to how they got there. The etcd3 benchmark reviews etcd itself, zookeeper, and consul against multiple dimensions: memory, CPU, and disk IO. The raw data that backed the blog post is committed into the repo under “test-results”. It is reasonably representative (writing 1,000,000 keys to the backend) and tracked time to complete, memory consumed, CPU consumed, and disk IO consumed during the process.

I haven’t looked at the code to see how re-usable it might be – I would love to see more benchmarks with different actions, and a comparison to how this operates in production (in cluster mode) – but these wishes are just variations on the theme, and not a complaint about the work done so far.

As an industry, as we build more with containers, this kind of benchmarking is exactly what we need. We’re composing distributed services now more than ever, and knowing the qualities of how these systems or containers will operate is as critical as any other correctness validation efforts.

How to measure your open source project

Before I get into the content, let me preface this by saying it is a WORK IN PROGRESS. I have been contributing and collaborating using open source for quite a while now, and my recent work is focused on helping people collaborate using open source as well as how to host and build open source projects.

I am a firm believer of “measure what you want to do” – to the point that even indirect measures can be helpful indicators. Most of this article isn’t something you want to react to on a day-to-day basis – this is about strategic health and engagement. A lot of the information takes time to gather, and even more time to show useful trends.

Some of this may be blindingly obvious, and hopefully a few ideas (or more) will be useful. These nuggets are what I’ve built up to help understand the “success” of an open source project I am involved with, or running. I did a number of google searches for similar topics when I was establishing my last few projects – there weren’t too many articles or hints out there like this, so hopefully this is a little “give back”.

There are two different – related but very separate – ways to look at your project in terms of “success”:

  • Consumption
  • Contribution

Consumption

Consumption is focused around “are folks effectively using this project”. The really awesome news in this space is that there’s been a lot of thinking about these kinds of measures – it’s the same question that anyone managing a product has been thinking about, so the same techniques to measure product success can be used.

If your project is a software library, then you will likely want to leverage public package management tooling that exists for the language of your choice: PyPi, CPAN, NPM, Swift Package Catalog, Go Search, etc. The trend of these package managers is heading towards relying on Github, Bitbucket, and Gitlab as a hosting location for the underlying data.

Some of the package indices provide some stats – npm, for example, can track downloads through NPM for the past day, week and month. A number of the package indices will also have some measure of popularity where they try to relate consumption by some metric (for example, http://pypi-ranking.info/alltime; NPM keeps track of how many dependents and stars a package has; and go-search tracks the number of stars on the underlying repo).

The underlying source system, like Github, often has some metrics you can look at – and may have the best detail. Github “stars” are a social indicator that’s useful to pay attention to, and any given repository has graphs that provide some basic analytics as well (although you need to be an admin on that repo to see them under the URI graphs/traffic). Github also recently introduced the concept of “dependents” in its graphs – although I’ve not seen it populated with any details as yet. Traffic includes simple line charts for the past two weeks covering:

  • Clones
  • Unique Cloners
  • Views
  • Unique Visitors

Documentation hosted on the Internet is another excellent place to get some metrics, and is a reasonable correlation to people using your library. If you’re hosting documentation for your project through either github pages or ReadTheDocs, you can embed a Google Analytics code so that you can get page views, unique users per month, and other standard website analysis from those sites.

I’ve often wished to know how often people were looking at just the README on the front page of a Github repository, without any further setup of docs or github pages, and Ilya Grigorik made a lovely solution in the form of ga-beacon (available on github). You can use this to host a project on Heroku (or your PaaS or IaaS of choice), and it’ll return an image and pass along the traffic to Google Analytics to get some of those basic details.

If your project is a more complete service, and most specifically if you have a UI or a reference implementation with a UI, that offers another possibility for collecting metrics. Libraries from a variety of languages can send data to Google Analytics, and with many web-based projects it’s as easy as embedding the Google Analytics code. This can get into a bit of a grey space – about what you’re monitoring and sending back. In the presentation entitled “Consider the Maintainer”, Nadia Eghbal cited a preference towards more information that I completely agree with – and if you create a demo/reference download that includes a UI, it seems quite reasonable to track metrics on that reference instance’s UI to know how many people are using it, and even what they’re using. Don’t confuse that with evidence about how people are using your project, though; stats from a reference UI are far more about people experimenting and exploring.

There is also StackOverflow (or one of the Stack Exchange variants) where your project might be discussed. If your project grows to the level of getting questions and answers on StackOverflow, it’s quite possible to start seeing tags either for your project, or for specifics of your project – at least if the project encourages Q&A there. You can get basic stats per tag – “viewed”, “active”, and “editors” – as well as the number of questions with that tag, which can be a reasonable representation of consumption as well.

Often well before your project is big enough to warrant a StackOverflow tag, Google will know about people looking for details about your project. Leverage Google Trends to search for your project’s name, and you can even compare it to related projects if you want to see how you’re shaping up against possible competition, although pinning it down by query terms can be a real dark art.

Contribution

Measuring contribution is a much trickier game, and often more indirect, than consumption. The obvious starting point is code contributions back to your project, but there are other aspects to consider: bug reports, code reviews, and just conversational aspects – helping people through a mailing list, IRC or Slack channel. When I was involved with OpenStack there was a lot of conversation about what it meant to be a contributor and how to acknowledge it (in that case, for the purposes of voting on technical leadership within the project). Out of those conversations, the Stackalytics website was created by Mirantis to report on contributions in quite a large number of dimensions: commits, Blueprints (OpenStack’s feature coordination document) drafted and completed, emails sent, bugs filed and resolved, reviews, and translations.

Mirantis expanded Stackalytics to cover a number of ancillary projects: Kubernetes, Ansible, Docker, Open vSwitch, and so on. The code to Stackalytics itself is open source, available on github at https://github.com/openstack/stackalytics. I’m not aware of other tooling that provides this same sort of collection and reporting – it looks fairly complex to set up and run, but if you want to track contribution metrics this is an option.

For my own use, the number of pull requests submitted per week or month has been interesting, as has the number of bug reports submitted per week or month. I found it useful to distinguish between “pull requests from the known crew” and external pull requests – trying to break apart a sense of “new folks” from an existing core. If your project has a fairly rigorous review process, then the time between creation and close or merging of a PR can also be interesting. Be a little careful with what you imply from it though, as causality for time delays is really hard to pin down and very dependent on your process for validating a PR.

There’s a grey area for a few metrics that lie somewhere between contribution and consumption – they don’t really assert one or the other cleanly, but are useful in trying to track the health of a project: mailing lists and chat channels. Examples include the number of subscribers to project mailing lists or forums, and the number of people subscribed to a project Slack or IRC channel. The conversation interfaces tend to be far more ephemeral – the numbers can vary up and down quite a bit, by minute and hour as much as over days or weeks – but within that ebb and flow you can gather some trends of growth or shrinkage. It doesn’t tell you about quality of content, but just that people are subscribed is a useful baseline.

Metrics I Found Most Useful

I’ve thrown out a lot of pieces and parts, so let me also share what I found most useful. In these cases, the metrics were all measured weekly and tracked over time:

  • Github repo “stars” count
  • Number of mailing list/forum subscribers
  • Number of chat channel subscribers
  • Number of pull requests submitted
  • Number of bugs/issues filed
  • Number of clones of the repository
  • Number of visitors of the repository
  • Number of “sessions” for a documentation website
  • Number of “users” for a documentation website
  • Number of “pageviews” for a documentation website
  • Google Trend for project name

I don’t have any software to help that collection – a lot of this is just my own manual collection and analysis to date.

If you have metrics that you’ve found immensely valuable, I’d love to know about them and what you found useful from them – leave me a comment! I’ll circle back to this after a while and update the list of metrics from feedback.

Updates:

Right after I posted this, I saw a tweet from Ben Balter hinting at some of the future advances that Github is making to help provide for Community Management.

And following on that, Bitergia poked me about their efforts to provide software development analytics, and in particular their own open source effort to provide software that collects metrics on what it means to collaborate: GrimoireLab.

I also spotted an interesting algorithm that was used to compare open source AI libraries based on github metrics:

Aggregate Popularity = (30*contrib + 10*issues + 5*forks)*0.001
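Transcribed directly into code (the weights come straight from the formula above; the parameter names are my reading of “contrib”, “issues”, and “forks”):

// Direct transcription of the aggregate popularity formula.
func aggregatePopularity(contrib: Double, issues: Double, forks: Double) -> Double {
    return (30 * contrib + 10 * issues + 5 * forks) * 0.001
}

// e.g. 120 contributors, 300 issues, 450 forks:
// (30*120 + 10*300 + 5*450) * 0.001 = 8.85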

Setting up a Swift Development Environment

Over the holidays I thought I should start learning something new – a new language or something – and I’ve been wanting to get back to doing some iOS development for a while, so it seemed to be a good time to start digging into swift. While I’m making all the mistakes and learning, I thought I’d chronicle the effort. I am taking a page from Brent Simmons’s Swift Dev Diaries, which I found immensely useful and informative. So this is the first of my swift dev diaries.

To give myself something concrete to work on, I thought I might help out with open source swift itself. The swift package manager caught my eye as a starting place. For getting started resources, we have the github repo http://github.com/apple/swift-package-manager, the mailing list swift-build-dev, and a bug tracker. And then on the mailing list, I saw there was also a slack channel, specifically for swiftpm (unofficial and all that).

The bug I decided to tackle as a starting point looked reasonably constrained: SR-3275 – originally a crash report, but updated to reflect a request for improved output about why a failure occurred. The bug report led me to want to verify my update on MacOS and Linux, which started this post – which includes a bit of “how to” to develop for Swift Package Manager on Linux as well. The same could be used for any “server side swift”.

There are instructions on how to set things up, but they’re slightly behind as of the latest development snapshot. I set up a Vagrant based development environment, starting with IBM-Swift’s initial server-side swift work, and updating it to make a development snapshot work. The work itself is under a branch called swift-dev-branch, updated to start with Ubuntu 16.04 and adding in a number of dependencies that aren’t in the swift.org documentation as yet.

If you want to try out the Vagrant setup, you can grab my Vagrantfile from git and give it a whirl – I’ll happily take updates. You’ll need to have Vagrant and Virtualbox pre-installed.

git clone https://github.com/heckj/vagrant-ubuntu-swift-dev
cd vagrant-ubuntu-swift-dev
git checkout swift-dev-branch
vagrant up && vagrant ssh

It will take a bit of time to build, downloading the base image and snapshot, but once built is pretty fast to work with. If for some reason you want to destroy it all and start from scratch:

vagrant destroy -f && vagrant box update && vagrant up

should do the trick, wiping the local VM and starting afresh.

As a proof of concept, I used it to run the swiftpm build and test sequence:

vagrant ssh:

mkdir -p ~/src
cd ~/src/
git clone https://github.com/apple/swift-package-manager swiftpm
cd ~/src/swiftpm
Utilities/bootstrap test

While I was poking around at this, Ankit provided a number of tips through the Slack channel and shared his development setup/aliases which utilize a Docker image. I have some of those files in my repository as well now, trying them out – but I don’t have it all worked out for an “easy flow” as yet. Perhaps a topic for a future swift dev diary entry.

I wanted to mention a blog post about Bhargav Gurlanka‘s SwiftPM development workflow, which was also useful. It focused on how to use rsync to replicate data between MacOS and a Virtualbox instance to run testing on both MacOS and Ubuntu/Linux based swift.

One of the more useful hints I got was to use the build product from bootstrap directly for testing, which can operate in parallel, and hence run faster.

.build/debug/swift-test --parallel

That speeds up a complete test run quite considerably. On my old laptop, it improved from 3 minutes 31 seconds via bootstrap to 1 minute 51 seconds – with a lot less visual output.

I also found quite a bit of benefit in using swift-test to list the tests and to run a specific test while I was poking at things:

.build/debug/swift-test -l
.build/debug/swift-test -s FunctionalTests.MiscellaneousTestCase/testCanKillSubprocessOnSigInt

I was doing a fair bit of that while digging through the code.

I haven’t quite got “using Xcode with a swift development toolchain snapshot” really nailed down as yet – I keep running into strange side issues that are hard to diagnose, but it’s definitely possible – I’ve just not been able to make it repeatably consistent.

The notes at https://github.com/apple/swift-package-manager#development are a good starting point, and from there you can use the environment variable “TOOLCHAINS” to let Xcode know to use your snapshot and go from there. The general pattern that I’ve used to date:

  1. download and install a swift development snapshot from https://swift.org/download/#releases
  2. export TOOLCHAINS=swift
  3. Utilities/bootstrap --generate-xcodeproj
  4. open SwiftPM.xcodeproj

and go from there. Xcode uses its own locations for build products, so make sure you invoke the build from Xcode “Build for Testing” prior to invoking any tests.

If Xcode was open before you invoked the “export TOOLCHAINS=swift” and then opened the project, you might want to close it and invoke it from the “open” command as I did above – that makes sure the environment variables are present and respected in Xcode while you’re working with the development toolchain.