
Review of using Helm to package and host applications

The open source project Helm represents itself as a package manager for Kubernetes. It does that pretty darn well, having been attached to the project from the earliest days of Kubernetes, and continues to evolve alongside, and now a bit more separately from, the Kubernetes project.

Looking at Helm version 2, it provides three main values:

  • As a project, it coordinates and reinforces a number of “best practices” for describing and running software on Kubernetes, by housing a public collection of some of the most common/popular open source projects, packaged and “ready to use” within a Kubernetes cluster.
  • It provides a templating solution to generate the plethora of YAML files that make up the descriptions and definitions of Kubernetes resources.
  • It provides a “single command” tool for deploying one or more projects to an existing Kubernetes cluster.

The first two values have been the most meaningful to me, although with some definite caveats and gotchas. I expected the third to be more valuable than it has turned out to be. I still use the single-deploy-command regularly, although I question whether it is a crutch that will ultimately become a trouble point.

More in depth for each of these:

Package Repository

The single most powerful aspect of Helm isn’t the code itself, but what the community and its contributors have built. The charts collection is the de facto set of examples of how to organize, structure the inputs, and run software with the resource concepts of Kubernetes.

While it calls itself a package manager, there’s a gap if what you are expecting from it is a single installable binary package, the moral equivalent of an .rpm, .deb, .apk, or installer .exe file. You don’t get a single file; instead you get a set of configuration values alongside a set of templates. The configuration values are the defaults used with the templates to generate the descriptions of the Kubernetes resources. These default values can also be overridden when you invoke helm, which is a godsend for scripted deployment (such as CI/CD). The resource descriptions generated from the templates expect (and require) the actual software you’re running to be available from one of the public container registries, with gcr.io, Quay, and Docker Hub being the three most commonly referenced. The software you’ll actually be running, the container images, is (intentionally) not included within the Helm chart.
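To make the “defaults plus overrides” idea concrete, here is a rough sketch in Python of layering overrides on top of chart defaults. It is not Helm’s own code, only an illustration of the idea; the function name, the value keys, and the registry host are all made up for the example.

```python
from copy import deepcopy


def merge_values(defaults, overrides):
    """Return a new dict with overrides layered on top of defaults.

    Nested maps are merged key by key; anything else is replaced outright.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults, the sort of thing a chart's values file might hold.
defaults = {
    "image": {"repository": "docker.io/library/nginx", "tag": "1.15"},
    "replicaCount": 1,
}

# Hypothetical overrides a CI job might supply, here pointing at a private registry.
overrides = {
    "image": {"repository": "registry.internal.example/nginx"},
    "replicaCount": 3,
}

values = merge_values(defaults, overrides)
print(values)
```

Only the keys named in the overrides change; the image tag still comes from the defaults. Pointing the image repository somewhere else in this way is also how you would typically steer a public chart at a private registry, assuming the chart exposes that value.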

If you want to run your Kubernetes cluster where it can’t take advantage of the public internet and those resources, be aware and prepared. I would not be surprised to see caching proxies of those services develop, much like Artifactory and Nexus developed for the Maven build tooling. In fact, I would keep a close eye on Harbor (technically open source, but dominated by VMware) to see what might develop to help deploy in more isolated scenarios. It is not all that difficult to use private and local container repositories; just be aware that the public packages expect to use the public container repositories.

Pervasively embedded within the templates is a fairly robust and opinionated set of views on how to take advantage of Kubernetes. The content of the templates contains a ton of knowledge, but be aware it is not always consistently applied or exposed. Like many projects, it has learned from experience, so newer and more recently updated charts will often reflect different views of what is important and useful for deployment. Nonetheless, they provide a huge number of functional and effective patterns for running software.

These patterns are strongest where the features have existed and been stable within Kubernetes for a while: Pods, ReplicaSets, and the effective use of the sidecar pattern for bolting on ancillary functionality or concerns. They are weaker (or perhaps viewed differently: various levels and consistencies of workarounds were created) for some of the newer features in Kubernetes, including StatefulSets and even Deployments.

In some cases, early workarounds were so heavily embedded that they persisted long after the need had passed: the concept of Helm “deploying and managing state” filled the gap left by Kubernetes not having a solid Deployment concept earlier, and the whole world of custom resources and extending Kubernetes with operators overlaps with what Helm enabled with hooks. My perception is that both Kubernetes and Helm charts are struggling with how best to deploy these newer structures, which themselves often represent some operational knowledge or intended patterns.

Like their virtual-machine-focused brethren (Chef and Puppet) before them, the Helm community added testing and validation for their charts. The chart validation has expanded significantly in the past 18 months. Like any validation, it does not guarantee 100% effectiveness. Even so, I believe it is important to be willing to review what the chart is doing, and how it’s doing it, before using it. The instances of charts failing with newer versions of Kubernetes have decreased significantly, primarily due to the focus of the Helm community on recognizing this as a problem, exposing it, and resolving it when it occurs.

Templating

A bit of background: Kubernetes resources can be defined in either JSON or YAML, and are a declarative structure, a desired state of what you want to exist. These structures are loosely coupled, “tied together” with the concept of labels and selectors. This is both a blessing and a curse. It provides a lot of flexibility, but if you mistype or mismatch some of those connections, there can be very little (or no) checking, and it can be quite difficult to debug.
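As a small illustration of how quietly a mismatch can slip through, here is a sketch in Python of the simplest form of selector matching, where every key/value pair in the selector must appear in the labels. The resource names and labels are invented for the example, and it only covers simple equality matching, not the richer set-based selectors.

```python
def selector_matches(selector, labels):
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def selector_mismatches(selector, labels):
    """Map each unmatched selector key to (wanted value, value actually found)."""
    return {
        key: (value, labels.get(key))
        for key, value in selector.items()
        if labels.get(key) != value
    }


# Hypothetical Service selector and pod labels; note the typo in "tier".
service_selector = {"app": "web", "tier": "frontend"}
pod_labels = {"app": "web", "tier": "fronted"}

matched = selector_matches(service_selector, pod_labels)
problems = selector_mismatches(service_selector, pod_labels)
print(matched, problems)
```

Nothing in either manifest is invalid on its own, which is why this kind of mistake tends to show up as a Service that simply never routes to anything rather than as an error.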

In creating these resource manifests, you will often find yourself repeating the same information, sometimes many times, or explicitly using repetition to tie pieces together. That repetition and boilerplate overhead is ripe for the solution that developed: templating.

Helm uses (and exposes) Go’s template language, extended with functions from the Sprig library, to greater and lesser degrees. Having used the templating language, my opinion is that it is no better (or worse) than any other templating system. It shares many concepts with other templating systems, so if you are already familiar with one, picking up the one used by Helm may be awkward but really is not too bad.
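To show what templating buys you against that repetition, here is a minimal sketch using Python’s standard string.Template as a stand-in. This is not Go template syntax and not how Helm renders charts; it only illustrates one value being defined once and stamped into every place that has to agree. The manifest and the value names are made up.

```python
from string import Template

# A hypothetical Service manifest in which the same name must appear three times.
MANIFEST = Template("""\
apiVersion: v1
kind: Service
metadata:
  name: $name
  labels:
    app: $name
spec:
  selector:
    app: $name
  ports:
    - port: $port
""")

# The values are defined once and substituted everywhere they are referenced.
values = {"name": "web", "port": 8080}
rendered = MANIFEST.substitute(values)
print(rendered)
```

Because the name comes from a single value, the label and the selector cannot drift apart through a typo in one of them.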

There are variants in other projects that enable similar functionality to Helm (ksonnet, and the now mostly ice-boxed Fabric8). Even with competition, the network effects from Helm’s collection of charts make it very hard to compete. Most solutions in this space have to make a choice of how much of a language to build vs. how simple the templates are to use: a continuum between a fully fledged programming language and simple, targeted replacement of values. Helm’s choice adds in some language structures (concepts of making and using a variable, and transforming values), but holds back from the slippery slope into a fully custom language.

We will see if that holds true with Helm version 3’s development, which will be adding the language Lua into the mix, although it appears more for handling the deployment hooks aspect.

If you are a Node.js, Ruby, or Python developer looking at the charts, you may have more confusion around what the resource you’re trying to create should look like than any trouble with the templating language itself. The templating does nothing to encapsulate or simplify Kubernetes resources and how they may be used together (or not). Helm itself has two commands that have been lifesavers in learning the templating and using charts:

helm template

and

helm install --debug --dry-run

These two commands run the templating engine and dump (with slight variances in what they’re expecting) the rendered results. You still end up seeing (and having to deal with) the “wall of YAML” that is Kubernetes, but these two commands at least make it easy to see the results of the templates after they render.

Deployment

The exciting (yes, I get excited about weird things) capability of Helm to deploy my applications in a single command reinforced the concept of it being a package manager, but may ultimately be the biggest crutch of the solution.

As mentioned earlier, Helm “grew up” with Kubernetes, and was alongside the project from its earliest days, covering the gaps from the core of Kubernetes to the cold hard reality of “getting your stuff running” in a cluster. Helm’s concept of Tiller may be the earliest seed of an operator, as it installs itself into your cluster and then handles the deployment and management of the resources in its charge. This same capability has more recently been codified into custom resources and the operator pattern, and the simplest and most common use cases are now covered by the Deployment resource and its associated controller.

When Kubernetes finally enabled RBAC by default, Helm (and how Tiller was installed) illuminated a bit of a hole in how many people were using and deploying software. There was a lot of work exposing and thinking about how to properly secure Helm. Helm 3 will be removing Tiller from the concept of Helm, continuing to evolve with Kubernetes features.

You also don’t strictly need to use this capability of Helm, although it is darned alluring. As mentioned in the section on templating, you can render charts (and their templates) locally and use tools such as kubectl to apply the resulting resources to a cluster.
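As a rough sketch of that render-locally-then-apply approach, the Python below builds a helm template command, captures what it renders, and feeds that to kubectl apply on standard input. The helper names, chart path, and value key are invented for the example, and it assumes both helm and kubectl are on your PATH and already pointed at the right cluster.

```python
import subprocess


def build_pipeline(chart_path, overrides=None):
    """Return the (helm, kubectl) argument lists for a render-then-apply run."""
    helm_cmd = ["helm", "template", chart_path]
    for key, value in sorted((overrides or {}).items()):
        helm_cmd += ["--set", f"{key}={value}"]
    # "-f -" tells kubectl to read the manifests from standard input.
    kubectl_cmd = ["kubectl", "apply", "-f", "-"]
    return helm_cmd, kubectl_cmd


def render_and_apply(chart_path, overrides=None):
    """Render the chart locally, then hand the rendered YAML to kubectl."""
    helm_cmd, kubectl_cmd = build_pipeline(chart_path, overrides)
    rendered = subprocess.run(helm_cmd, check=True, capture_output=True).stdout
    subprocess.run(kubectl_cmd, input=rendered, check=True)


# Hypothetical chart directory and override; only the commands are built here.
helm_cmd, kubectl_cmd = build_pipeline("./mychart", {"replicaCount": 3})
print(" ".join(helm_cmd), "|", " ".join(kubectl_cmd))
```

The trade-off is that Helm never records a release when you go this route, so anything it would normally track for you becomes your own job.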

Having a single command that is easy to add into a script has been a godsend for continuous deployment scenarios. It is what powers GitLab’s “Auto DevOps” hosted continuous deployment. I use the deploy-with-a-single-command myself, and plan to continue to do so, but it comes with a price.

Helm likes to own the whole lifecycle of the release and does not expect or accommodate anything else interacting with the stuff it is managing. In many cases, this is completely fine, but in some cases it can be a pain in the butt. The most common scenario usually involves some manner of persistence, where you want to install and get the software running, and need it operational to do further configuration on how you want to use it. This could be anything from linking multiple instances of some software into a cluster, to running a database schema migration, to restoring prior backups.

Helm 2 has the concept of hooks to help with actions that happen repeatedly and consistently with every deployment or upgrade process. Helm 3 will be expanding on these concepts, with Lua as the scripting language to enable this functionality and potentially more, although I don’t yet know the specifics.
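For a sense of what a hook looks like in practice, here is a sketch that builds a Job manifest carrying Helm’s hook annotations, written as a Python dict and dumped as JSON (which Kubernetes accepts alongside YAML). The function name, release name, image, and command are all hypothetical; in a real chart this would be a template in the chart rather than Python.

```python
import json


def migration_hook_job(release_name, image, command):
    """Build a Job manifest that Helm would treat as a pre-upgrade hook."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": f"{release_name}-db-migrate",
            "annotations": {
                # Run before the release's resources are upgraded.
                "helm.sh/hook": "pre-upgrade",
                # Ordering relative to other hooks; weights are strings.
                "helm.sh/hook-weight": "0",
                # Remove the Job once it has completed successfully.
                "helm.sh/hook-delete-policy": "hook-succeeded",
            },
        },
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "migrate", "image": image, "command": command}
                    ],
                }
            }
        },
    }


# Hypothetical release name, image, and migration command.
job = migration_hook_job("myapp", "registry.example/myapp:1.2.3", ["./migrate"])
print(json.dumps(job, indent=2))
```

This fits the “every upgrade, the same way” case well. It is the occasional, state-dependent work that hooks handle poorly.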

I am personally conflicted on the inclusion of Lua and what it implies for Helm. While Lua is a lovely scripting language, and likely the best choice for what the developers decided they wanted to do, I think it may end up being a new barrier to adoption for developers outside of the Helm charts/Kubernetes space. Every developer that sits down to use Kubernetes comes with their own biases and comfort with scripting languages. They are often used to Python, Ruby, JavaScript, or any of a number of other languages. If Lua becomes an implicit requirement for them to use Helm to accommodate their own operational needs, I suspect that barrier will be significant. Because of this I am hesitant to be excited about the inclusion and focus on using Lua with Helm. What it will ultimately mean in terms of developer accessibility to using Helm and Kubernetes together is yet to be seen. I hope it won’t be an even larger and steeper learning curve.

For the scenario where you want to do periodic, but not consistent, interactions (such as backing up a database or doing a partial restore or recovery), you need to be very aware of the application and of where each of its components is in its lifecycle. In these scenarios, I have not found a terrific way of using Helm and its hooks to help solve those problems.

Kubernetes itself is only partially effective in this space, having the concept of Jobs for one-off batch style mechanics. However, Jobs can be darned awkward to use for things like a backup. While I used Jobs and continue to try to make them work, I often revert to using kubectl to directly interact with temporary pods to get the work done consistently.

With Helm, I struggled with creating Job resources that use the same ConfigMaps, Secrets, and variables that the charts use. Helm is crappy at doing this if you’re using the deploy-with-a-single-command mechanism. And as I mentioned earlier, Jobs can be an awkward fit with the use cases I am trying to accommodate. These scenarios are more “pets” and “one-off” needs where knowledge of the underlying systems and their current state is critical. It may be that operators will ultimately win out for these use cases, but they have a fair way to go yet.
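To illustrate the shape of the problem, here is a sketch of a one-off backup Job that pulls its environment from an existing ConfigMap and Secret, again built as a Python dict. Every name in it is hypothetical. The awkward part is exactly the assumption baked into the arguments: you have to know what the chart actually named the ConfigMap and Secret for your release.

```python
def backup_job(name, image, command, configmap_name, secret_name):
    """Build a one-off Job whose container reuses an existing ConfigMap and Secret."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "backup",
                            "image": image,
                            "command": command,
                            # Load every key from both sources as environment variables.
                            "envFrom": [
                                {"configMapRef": {"name": configmap_name}},
                                {"secretRef": {"name": secret_name}},
                            ],
                        }
                    ],
                }
            }
        },
    }


# Hypothetical names; they must match whatever the chart generated for the release.
backup = backup_job(
    name="myapp-backup-manual",
    image="registry.example/myapp-tools:1.2.3",
    command=["./backup.sh"],
    configmap_name="myapp-config",
    secret_name="myapp-credentials",
)
container = backup["spec"]["template"]["spec"]["containers"][0]
print(container["envFrom"])
```

A manifest like this lives outside the release, so keeping those names in step with the chart is manual work.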

At its heart, this deployment capability that I implicitly use many times a day also strikes me as the current edition of Helm’s weakest point, and I wonder if it is a crutch that I will ultimately need to replace.