Fifteen years ago, IBM introduced a term into the world of computing that’s stuck with me: Autonomic Computing. The concept fascinated me – first at the most basic level: simple recovery scenarios and how to write more robust services. That interest led to digging around in the technologies that enable seamless failover, and in more recent years into distributed systems and managing those systems – quorum and consensus protocols, how to develop both with and for them, and the technologies that are getting quite a bit of attention in some circles – managing sets of VMs or containers – to provide comprehensive services.
On a walk this morning, I was reflecting on those interests and how they have all been on a path to fully autonomic computing. The goal is self-managing services: services with human interfaces that require far less deep technical knowledge to get at the capabilities available. Often the source of that deep knowledge was me, so in some respects I’ve been trying to program myself out of a job for the past 15 years.
Ten years ago, many of my peers were focused on CFEngine, Puppet, and later Chef: “Software configuration management systems”. Data centers and NOCs were often looking for people “skilled with ITIL” and knowledgeable in “effective change management controls”, with an intrinsic goal of being the humans who provide the capabilities that autonomic services were aimed at providing. The technology has continued to advance – plenty of proprietary solutions I generally won’t detail, and quite a number of open source technologies that I will.
Juju first caught my attention, both with its horrifically stupid name and its powerful capabilities. It went beyond a single machine to represent the combinations of applications and back-ends that make up a holistic service, and to set that service up and run it. It was very Canonical, but also open source. More recently, SaltStack and Terraform continued this pattern with VMs as the base content – the unit of distribution – leveraging the rise of cloud computing. Many years before this, the atomic unit of delivery was an OS package, or maybe a tarball, JAR, or later WAR file, all super specific to the implementation of whatever OS or, in the case of JAR/WAR, language. Cloud services have finally brought the compute server (VM) as a commodity, disposable resource into the common vernacular, and Docker popularized taking that “down a step” to containers as the unit of deployment.
Marathon and Kubernetes are now providing service orchestration for containers, and while I personally use VMs most commonly, I think containers may be the better path forward, simply because I expect them to be cheaper in the long run. The cloud providers have been in this arena for a while – Heat in OpenStack is the obvious clone of Amazon CloudFormation – alongside a variety of startups and orchestration technologies that solve some of the point problems in the same space, and the whole hype-ball of “serverless”: small bits of computing responding to events, as an even greater level of possible efficiency.
Moving this onto physical devices that install into a small office, or even a home, is a tricky game, and the generalized technology isn’t there, although there are some clear wins. This is what Nutanix excels at – small cloud-services-in-a-box. Limited options, easy to set up, seamless to use, commodity price point.
Five years ago I was looking at this problem through the lens of an enterprise “service provider” – what many IT/web-ops shops have become: small internal service providers to their companies, frankly competing on some level with Amazon, Google, and Azure. I was still looking at the space in terms of “What would be an amazing ‘datacenter’ API that developers could leverage to run their services?”. “Where are the costs in running an Enterprise data center, and how can we reduce them?” was another common question. I thought then, and still tend to believe, that the ultimate expression of that would be something like Heroku, or its open source clone/private enterprise variant: Pivotal Cloud Foundry. Couple that kind of base platform with various logging and other administrative capabilities supporting your services, and you remove a tremendous amount of cost from the space of managing a datacenter – at least when applications can move onto it, and therein lies a huge portion of the crux. Most classic enterprise services can’t move like that, and many may never.
In the past several years, I’ve come to think a lot more about small installations of autonomic services: the small office with a local physical presence, running on bare metal, to be specific. In that kind of setting, something like Kubernetes or Marathon becomes really compelling – not in the large, crossing an entire datacenter, but in the small, focusing on a single service. Both of these go beyond the “setting up the structure” that Terraform does, and like a distributed init script or systemd unit, they actively monitor what they’ve started, at least on some level. Neither open source platform has really stitched everything together to the point of reacting seamlessly to service levels, but they’re getting pretty damned close. With these tools, you’re nearly at the point where you can have a single mechanism that creates a service, keeps it running, upgrades it when you have updates, scales up (or down) to some extent, and recovers from some failures.
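To make that “single mechanism” idea a little more concrete, here is a minimal sketch of the kind of control loop I have in mind. It is not how Kubernetes or Marathon are actually implemented; the `Service`, `Instance`, `Cluster`, and `reconcile` names are all hypothetical, and the “cluster” is just an in-memory list. The point is only the shape: compare the desired state to what is actually running, and take whatever actions close the gap.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Desired state for one service (hypothetical, for illustration)."""
    name: str
    version: str
    replicas: int


@dataclass
class Instance:
    """One running copy of the service."""
    version: str
    healthy: bool = True


@dataclass
class Cluster:
    """In-memory stand-in for the machines an orchestrator controls."""
    instances: list = field(default_factory=list)


def reconcile(desired: Service, cluster: Cluster) -> list:
    """Run one pass of the control loop and return the actions taken."""
    actions = []

    # Recover: drop instances that are reported unhealthy.
    healthy = [i for i in cluster.instances if i.healthy]
    for _ in range(len(cluster.instances) - len(healthy)):
        actions.append("removed unhealthy instance")
    cluster.instances = healthy

    # Upgrade: bring instances running a stale version up to date.
    for inst in cluster.instances:
        if inst.version != desired.version:
            inst.version = desired.version
            actions.append(f"upgraded instance to {desired.version}")

    # Scale up; this also covers initial creation and replacing failures.
    while len(cluster.instances) < desired.replicas:
        cluster.instances.append(Instance(desired.version))
        actions.append("started instance")

    # Scale down when more instances are running than requested.
    while len(cluster.instances) > desired.replicas:
        cluster.instances.pop()
        actions.append("stopped instance")

    return actions
```

Calling `reconcile(Service("web", "1.0", 3), Cluster())` on an empty cluster starts three instances, and calling it again with nothing changed does nothing. A real system would run a pass like this on a timer or in response to events, and the hard parts (health checks, rolling upgrades, deciding when to scale) are exactly the pieces that aren’t fully stitched together yet.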
We’re slowly getting close to fully autonomic services.