I have a hypothesis on why private cloud is such a struggle:
Corporations are culturally ingrained not to accept loss of control or unpredictability in expenses, and their adoption of private cloud flies right in the face of that cultural bias.
Let me explain:
A huge portion of the saleable use cases for private cloud are all about dev/test, which is inherently unpredictable. The solution most enterprises use is to encyst that unpredictability down at the developer team level. The typical corporate pattern is to purchase hardware, depreciate it out over the requisite years, and repurpose the hell out of it (or slice it up into virtual environments).
When an IT organization tries to provide a cloud service, the first thing they come out with is “here’s the cost for hosting this” and “what’s the cost center you want to sign up for your account?” They’re a cost center too, so they sum up all the costs, maybe subsidize it a little, and pass it along. Even though a private cloud may be far cheaper for the overall corporation, the act of passing along that variable cost is the equivalent of tossing a hot potato into the budget cycle.
All of a sudden, the director of the development team using this service gets the hot potato of variable costs, and depending on how the IT organization is managing the funny money, it could easily be quite a bit higher than the cost of those depreciated desktops behind Larry’s desk, or the smallish investment of a VMware cluster for the dev team to utilize from a wiring closet.
The historical baggage that IT has to provide 100% reliable services just exacerbates this problem. It means that corporate private clouds are under a tremendous amount of scrutiny for uptime, in many cases far more than any public cloud. Making anything more reliable costs more-than-incrementally more money, and that cost in turn gets passed forward, making the variable costs even more volatile.
Developers are (slowly) getting culturally ingrained to deal with failures in their programming and creation of distributed systems, but corporate managers and directors certainly are not. Organizationally there’s little to no tolerance – and few corporations have sufficient scale to run the same concept of actual availability zones internally.
But most of all, there’s an allergic reaction to the variability of the cost, and as often as not a director locks down control of the IT resources provided, even with a private cloud. And suddenly you’re back to the delays of sending emails, asking for permission, and waiting for someone to pull the trigger to get you even a small virtual machine.
Others have done a good job explaining why a wait of even a few hours for a virtual machine is untenable, but let me give it a try as well:
Developers coming in from small companies or startups, or from their own experience developing with cloud resources, are instantly frustrated that they can’t get the resources they need, when they need them. It’s no longer a matter of convenience, but of efficiency.
Most enterprise applications, even simple ones, are distributed systems. They have been for ages; we just preferred to call them “client server”, “three tier”, or later “n-tier” during the ’90s and 2000s, before we really had cloud capabilities. The truth is that as you scale those applications up, they become what you’d expect to hear described as full-on distributed systems, and as you add capabilities integrating with other services (for example, local corporate authentication, logging services, and application metrics collection) you’re into full-bore distributed systems, perhaps even using modern SaaS for some components, with complex dependencies and all the fun that comes with it.
The implicit goal the developer has is to have an entire working copy of all the moving parts, and the capability of making changes to any one part of it while keeping the rest static. Being able to reset that environment – wipe it all away and build it up again – provides a tremendous amount of development efficiency, and many good developers do this as a hedge against the creep of making “just one more change” and missing the implications. The world of continuous integration and continuous deployment does exactly this. The build systems are being advanced to build up the scripts that destroy and rebuild the environment, and ideally can run multiple such environments side by side. You’re now at the point where you want to spin up, and destroy, these resources with every developer’s commit. Asking for even an hour’s “wait” for permission in these cases is just completely ludicrous.
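The destroy-and-rebuild pattern above can be sketched with today’s container tooling. This is a hypothetical docker-compose.yml for a small dev environment – the service names, build paths, and images are illustrative, not from any particular project – where the entire environment is torn down with `docker compose down -v` and rebuilt from scratch with `docker compose up -d`:

```yaml
# Hypothetical dev/test environment: a web service, a background
# worker, and a database, all declared in one disposable unit.
services:
  web:
    build: ./web            # illustrative path to the service under development
    ports:
      - "8080:8080"
    depends_on:
      - db
  worker:
    build: ./worker         # illustrative path; rebuilt alongside web
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: devonly   # throwaway credential for a disposable env
    volumes:
      - dbdata:/var/lib/postgresql/data
volumes:
  dbdata:   # removed by `docker compose down -v`, giving a clean reset
```

Change one service, keep the rest static, and when things drift, wipe it all away and bring it back – exactly the reset-as-a-hedge workflow described above.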
The developer response to this has been to leverage the resources they already have at hand, and that are under THEIR control, to get what they need done. Vagrant and VirtualBox for virtual machines, and then into the world of Docker with boot2docker, or vagrant-coreos and fleet, to create a world of a half-dozen to two dozen small services all interacting with each other. And it’s doable on their laptops, or a developer can cobble together a pretty decent-sized container cluster from the depreciated desktops behind Larry’s desk, and best of all – the developers can have their own playground, set it up and destroy it based on their own development needs.
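The Vagrant workflow mentioned above can be sketched in a minimal, hypothetical Vagrantfile (the box name and provisioning steps are placeholders). The property that matters is that the whole VM comes up with `vagrant up` and vanishes with `vagrant destroy` – no emails, no permission, no wait:

```ruby
# Hypothetical Vagrantfile for a disposable dev VM under the
# developer's own control.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"     # illustrative box name
  config.vm.provider "virtualbox" do |vb|
    vb.memory = 2048                    # small enough for a laptop
  end
  # Provisioning re-runs on rebuild, so the environment can be
  # recreated from scratch at any time.
  config.vm.provision "shell", inline: <<-SHELL
    apt-get update
    apt-get install -y docker.io        # illustrative tooling install
  SHELL
end
```

Multiply this across a handful of VMs, or swap the VMs for containers, and you have the half-dozen-to-two-dozen-service playground the paragraph describes, running entirely on hardware the developer already controls.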