Puppet is out, in comes Ansible

I manage and have managed thousands and thousands of systems during my career so far, and I continue to believe that a level of configuration management is required for any environment that consists of zero or more nodes.

To this end, I deliberately chose Puppet some 8 years ago. I’ve authored a couple of dozen modules, each according to a very narrow and strict principle:

Describe the desired state of your node.

This has meant that the node (an operating system instance) was told to install packages, place configuration files, start services, and so on, but not to create databases, buckets or queues: the state of the application is not the state of the node.

This also meant none of these modules ever attempted to encapsulate the option values available in configuration files, such as ‘httpd.conf’, by templating them and writing the modules to adapt to manifest changes. In my not so humble opinion, this would heavily and unfairly burden system administrators with learning the behaviour of a Puppet module and its templates, always limited in its options, rather than the behaviour of the application itself, with all options included.

I believed it was not at all unfair to assume the system administrators consuming my modules would know what they were doing. In fact, I require system administrators to know what they are doing, and what they know cannot be fitted into a module or a template. I decided to avoid requiring consumers of my modules to learn a layer of application logic inside the module in addition to learning the application being managed.

As a result, Puppet mostly ships entire files from sources, with the system administrator providing the correct contents (or the module providing the RPM default version), and Puppet doing the rest (i.e. “First install package, then deploy configuration, then start service, then configure to start on boot”).
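The ordering quoted above can be illustrated with a minimal Python sketch. It is not Puppet code and not taken from any of the modules mentioned; the step names and the state dictionary are hypothetical stand-ins, chosen only to show that converging on a desired state applies the missing steps in a fixed order and does nothing on a second run.

```python
# Hypothetical sketch: the step names and the state dictionary are
# illustrative assumptions, not part of any Puppet module.
ORDERED_STEPS = (
    "package_installed",
    "config_deployed",
    "service_running",
    "enabled_on_boot",
)


def converge(state):
    """Bring `state` to the desired state; return the steps that were applied."""
    applied = []
    for step in ORDERED_STEPS:
        if not state.get(step, False):
            state[step] = True  # stand-in for the real action on the node
            applied.append(step)
    return applied


# A node that already has the package, but nothing else.
node = {"package_installed": True}
first_run = converge(node)   # applies the three missing steps, in order
second_run = converge(node)  # nothing left to do
```

The second call returning an empty list is the point: the description is of a state, not of a procedure to be repeated.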

Over the years though, this has still built a fair amount of complexity into some of the modules. Compatibility from RHEL 4 onward to Fedora 23 comes to mind. Ever so slight differences between a subscribed RHEL system and a free-for-all CentOS box. Vastly varying system infrastructure and configuration between entire generations of Free Software.

Furthermore, I wrote a module for a Puppet server to be managed by Puppet. This module was subsequently extended to support satellite Puppet servers. It pulls your modules from Git. It is multi-tenant. It scales: the record is 7 Puppet servers for 24 tenants globally. It’s been a lot of work, and it’s also been satisfying. The necessity to be able to describe the desired state, and only ever the desired state, is what kept me from using other people’s Puppet modules, and from using other configuration management suites as well (with the exception of those written in Perl). Those have been the reasons to put in that work.

It motivated me to argue for an exception to the anti-vendorizing rule in the package review for mod_passenger. It has seen me troubleshoot some 5 generations of Ruby, and a plurality of gems.

I never allowed Puppet to perform standard operating procedures such as creating a MySQL database or taking a system out of rotation in HAProxy. After all, these are one-off interventions (albeit they may be repeated numerous times) and require a form of coordination between multiple systems — rather than the seemingly arbitrary “between now and $runinterval”.

What you use for your standard operating procedures can be debated. I use Ansible where I can, but admittedly do not have a lot of time to develop proper playbooks. A common routine is to take systems out of rotation, update software, run Puppet for configuration updates, await a reboot if the kernel or other base system libraries have received security updates, and bring the system back into rotation when it is alive again. This tends to span half a dozen systems, and requires orchestration of what action is taken where and when. Puppet is simply not up to the job.
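To make the shape of that routine concrete, here is a rough Python sketch of it as a strictly sequential procedure, one node at a time. It is not an Ansible playbook, and nothing in it talks to real systems: the Node class, the log messages, and the fake_update and fake_configure callbacks are all hypothetical stand-ins for whatever actually drains HAProxy, updates packages and runs Puppet.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a managed system."""
    name: str
    in_rotation: bool = True
    needs_reboot: bool = False
    log: List[str] = field(default_factory=list)


def rolling_maintenance(nodes, update, configure):
    """Run the maintenance routine on one node at a time, in order."""
    for node in nodes:
        node.in_rotation = False
        node.log.append("out of rotation")
        update(node)       # e.g. a package update; may flag a reboot
        node.log.append("software updated")
        configure(node)    # e.g. a configuration management run
        node.log.append("configuration applied")
        if node.needs_reboot:
            node.needs_reboot = False
            node.log.append("rebooted")
        node.in_rotation = True
        node.log.append("back in rotation")


def fake_update(node):
    # Assumption for the example: only web1 received a kernel update.
    node.needs_reboot = node.name == "web1"


def fake_configure(node):
    # Placeholder: a real callback would trigger the configuration run.
    pass


nodes = [Node("web1"), Node("web2")]
rolling_maintenance(nodes, fake_update, fake_configure)
```

What matters here is the coordination: each step happens at a known moment relative to the others and to the other nodes, which is exactly what an agent waking up at some point within its run interval cannot promise.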

However, this is today, and none of you are interested. What is slightly more interesting than today is … tomorrow.

Today we run long-lived nodes that require updates to their state while they live. Such a scenario suggests there is a boundary between a node’s desired state and whatever else you do to a system.

Sometimes, you might find base images get updated and running nodes get cycled with the new images. Some minimal post-configuration may be needed to get the runtime up-to-date with other (dynamic) aspects of the environment, but really it becomes a one- or two-shot delivery mechanism: the first shot is to provision and bootstrap your image, the second shot is to get it to run properly.

These very much sound like standard operating procedures to me, as the next iteration of a “desired state” for the environment causes you to cycle off the current images and on to the new ones.

If all you need to do to compose such an image is write “install application, copy application config from here to there, and make it start”, I’m very curious as to what you would need Java, Python, Ruby and PostgreSQL for. I’m intrigued by the necessity for compatibility breakage between clients and servers, and the tendency to drop support for otherwise current Java, Ruby and PostgreSQL versions.

I did mention tomorrow, and with that in mind I ask you to ponder: what do you think is the most efficient and effective way to describe the desired state for something as volatile as a Docker container?

We’ll see the Dockerfiles used to compose images become limited in their capacity to do anything serious. Think of them as a kickstart or preseed or autoyast, and you get the picture. After all, the composition of a Docker image is never the equivalent of a run-time environment, and there will remain a variety of assets and sensitivities that are available to a run-time environment only.

We’ll start to recognize that installing and running Java and Ruby crap inside a Docker container, to get off that second shot, is just a tiny little bit of a waste.

To summarize, I will end up managing nothing but hardware with Puppet, for the very limited purposes for which we have actual hardware (networking, virtualization, laptops and desktops), and with a very select number of operating systems and operating system versions (Red Hat Enterprise Linux and Fedora). Maybe not even that. We’ll see.

Currently, I have hundreds of systems to manage, and we use (read: have to use) Puppet v2, Puppet v3 and Puppet v4 to do so. We have 2 dashboards, 3 databases and 6 software repositories to provide systems with versions of Puppet, PuppetDB, Puppetboard, Puppet Dashboard, Ruby & Gems, and Passenger. This list does not include the build dependencies.

I hereby rest my case of predicting the future of Puppet: Obsolescence.

Soon. I think. No, I hope.

