Versioning Infrastructure
I’ve talked at length previously about the usefulness of ensuring that your environments can be easily spun up and torn down. Typically this means that they need to be represented as code, and that code should be stored in some sort of Source Control (Git is my personal preference). Obviously this is much easier with AWS (or other cloud providers) than it is with traditionally provisioned infrastructure, but you can at least control configuration and other artefacts when you are close to the iron.
We’ve come a long way on our journey to represent our environments as code, but there has been one hole that’s been nagging me for some time.
Versioning.
Our current environment pattern looks something like this:
- A repository called X.Environment, where X describes the component the environment is for.
- A series of Powershell scripts and CloudFormation templates that describe how to construct the environment.
- A series of TeamCity Build Configurations that allow anyone to Create and Delete named versions of the environment (sometimes there are also Clone and Migrate scripts to allow for copying and updating).
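To make the pattern a little more concrete, here is a rough sketch of the sort of entry point one of those Create Build Configurations might invoke. The file names, paths and parameters here are illustrative only, not the actual scripts.

```powershell
# Hypothetical sketch only: the shape of a Create script that a TeamCity
# Build Configuration might run. Names and paths are illustrative.
param
(
    [Parameter(Mandatory=$true)]
    [string]$environmentName
)

$here = Split-Path $script:MyInvocation.MyCommand.Path

# The common functions (including New-Environment, our wrapper around the
# AWS CFN tools) are dot-sourced from a shared scripts directory.
. "$here\scripts\common\Functions-Environment.ps1"

New-Environment -EnvironmentName $environmentName
```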
When an environment is created via a TeamCity Build Configuration, the appropriate commit in the repository is tagged, which gives some traceability as to where the environment configuration came from. Unfortunately, the environment itself (typically represented as a CloudFormation stack) is not tagged with the reverse mapping. There is currently no easy way for us to look at an environment and determine exactly which code created it and, more importantly, how many changes have been made to the underlying description since it was created.
Granted, this information is technically available using timestamps and other pieces of data, but piecing it together is a difficult, time-consuming, manual task, so it’s unlikely to be done with any regularity.
All of the TeamCity Build Configurations that I mentioned simply use the HEAD of the repository when they run. There is no concept of using an old Delete script or being able to (easily) spin up an old version of an environment for testing purposes.
The Best Version
The key to solving some of the problems above is to really immerse ourselves in the concept of treating the environment blueprint as code.
When dealing with code, you would never publish raw source straight from the repository, so why would we do that for the environment?
Instead, you compile (if you need to), you test, and then you package, creating a compact artefact that represents a validated copy of the code, ready to be used for whatever purpose you need (typically deployment). This artefact has a version associated with it (whatever versioning strategy you might use) which is traceable both ways (look at the repository, see the version, find the artefact; look at the artefact, see the version, go back to the repository).
Obviously, for a set of Powershell scripts and CloudFormation templates, there is no real compilation step. There is a testing step though (Powershell tests written using Pester) and there can easily be a packaging step, so we have all of the bits and pieces that we need in order to provide a versioned package, and then use that package whenever we need to perform environment operations.
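As an illustration of that testing step, a Pester test for the environment scripts might look something like this; the specific test (and the parameter on New-Environment) is hypothetical.

```powershell
# Illustrative only: a trivial Pester test validating one of the environment
# functions before everything gets packaged.
Describe "New-Environment" {
    It "throws when no environment name is supplied" {
        { New-Environment -EnvironmentName $null } | Should Throw
    }
}
```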
Versioning Details
As a general rule, I prefer not to encapsulate complicated build and test logic in TeamCity itself. Instead, I much prefer to have a self-contained script within the repository, which is then used both within TeamCity and whenever you need to build locally. This typically takes the form of a build.ps1 script file with a number of common inputs, and it leverages a number of common tools that I’m not going to go into any depth about. The output of the script is a versioned Nupkg file and some test results (so that TeamCity knows whether or not the build failed).
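For reference, a heavily simplified sketch of what such a build.ps1 might do is below. The real script has more inputs and leans on those common tools; every name and path here is illustrative.

```powershell
# Simplified, illustrative sketch of a self-contained build script.
param
(
    [int]$buildNumber = 0
)

$here = Split-Path $script:MyInvocation.MyCommand.Path
New-Item -ItemType Directory -Path "$here\build" -Force | Out-Null

# Run the Pester tests, writing results in a format TeamCity can consume.
$results = Invoke-Pester -Path "$here\tests" -OutputFile "$here\build\test-results.xml" -OutputFormat NUnitXml -PassThru
if ($results.FailedCount -gt 0) { throw "$($results.FailedCount) Pester test(s) failed." }

# Package everything listed in the nuspec into a versioned Nupkg.
$version = "1.0.$buildNumber"
& "$here\tools\nuget.exe" pack "$here\X.Environment.nuspec" -Version $version -OutputDirectory "$here\build"
```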
Adapting our environment repository pattern to build a NuGet package is fairly straightforward (similar to the way in which we handle Logstash: just package up all the files necessary to execute the scripts using a nuspec file). Voila, a self-contained package that can be used at a later date to spin up that particular version of the environment.
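The nuspec itself doesn’t need much; something along these lines would do, with the id, authors and paths being placeholders rather than our actual values.

```xml
<?xml version="1.0"?>
<package>
  <metadata>
    <id>X.Environment</id>
    <version>0.0.0</version>
    <authors>Placeholder</authors>
    <description>Scripts and CloudFormation templates for the X environment.</description>
  </metadata>
  <files>
    <file src="scripts\**\*" target="scripts" />
    <file src="templates\**\*" target="templates" />
  </files>
</package>
```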
The only difficult part here was the actual versioning of the environment itself.
Prior to this, when an environment was created it did not have any versioning information attached to it.
The easiest way to attach that information? Introduce a new common CloudFormation template parameter called EnvironmentVersion and make sure that it is populated when an environment is created. The CloudFormation stack is also tagged with the version, for easy lookup.
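Conceptually, the relevant part of environment creation ends up looking something like the sketch below, written against the raw AWS PowerShell cmdlets rather than our actual wrapper code.

```powershell
# Sketch only: pass the version in as a CloudFormation parameter and tag the
# stack with the same value, using the raw AWS PowerShell cmdlets.
$versionParameter = New-Object Amazon.CloudFormation.Model.Parameter
$versionParameter.ParameterKey = "EnvironmentVersion"
$versionParameter.ParameterValue = $environmentVersion

$versionTag = New-Object Amazon.CloudFormation.Model.Tag
$versionTag.Key = "EnvironmentVersion"
$versionTag.Value = $environmentVersion

New-CFNStack -StackName $stackName -TemplateBody $templateContent -Parameter $versionParameter -Tag $versionTag
```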
For backwards compatibility, I made the environment version optional when you execute the New-Environment Powershell cmdlet (which is our wrapper around the AWS CFN tools). If not specified it will default to something that looks like 0.0.YYDDDD.SSSSS, making it very obvious that the version was not specified correctly.
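The default itself is just generated from the current date and time; something like the following sketch (the exact construction in our scripts may differ).

```powershell
# Illustrative generation of the "obviously not a real version" default,
# following the 0.0.YYDDDD.SSSSS shape described above.
if ([string]::IsNullOrEmpty($environmentVersion))
{
    $now = [DateTime]::UtcNow
    $yearAndDay = "{0}{1}" -f $now.ToString("yy"), $now.DayOfYear.ToString("0000")
    $secondsSinceMidnight = ([int]$now.TimeOfDay.TotalSeconds).ToString("00000")
    $environmentVersion = "0.0.$yearAndDay.$secondsSinceMidnight"
}
```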
For the proper versioning inside an environment’s source code, I simply reused some code we already had for dealing with AssemblyInfo files. It might not be the best approach, but including an AssemblyInfo file (along with the appropriate Assembly attributes) inside the repository and then reading from that file during environment creation is easy enough, and consistency often beats optimality.
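Reading the version back out during environment creation is about as unsophisticated as you would expect; something like this sketch, where the file path and regex are illustrative rather than our actual implementation.

```powershell
# Illustrative: pull the version out of a shared AssemblyInfo-style file.
$assemblyInfoContent = Get-Content "$repositoryRoot\scripts\AssemblyInfo.cs" -Raw
if ($assemblyInfoContent -match 'AssemblyVersion\("([0-9\.]+)"\)')
{
    $environmentVersion = $matches[1]
}
```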
Improving Versioning
What I’ve described above is really just one step in a larger plan.
I would vastly prefer it if the mechanism for controlling which versions of an environment are present (and where) were delegated to Octopus Deploy, just like the rest of our deployable components.
With a little bit of extra effort, we should be able to create a release for an appropriately named Octopus project and then push to that project whenever a new version of the environment is available.
This would give excellent visibility into what versions of the environment are where, and also allow us to leverage something I have planned for helping us see just how different the version in environment X is from the version in environment Y.
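Mechanically, that would probably amount to something like the sketch below: push the environment package to the Octopus built-in feed and create a matching release via octo.exe. Server URL, API key and project name are placeholders, and this is a guess at the shape rather than anything we have actually built yet.

```powershell
# Sketch only: publish the environment package and create a matching release.
& "$here\tools\nuget.exe" push "$here\build\X.Environment.$version.nupkg" -ApiKey $octopusApiKey -Source "$octopusServerUrl/nuget/packages"
& "$here\tools\octo.exe" create-release --project "X Environment" --version $version --packageversion $version --server $octopusServerUrl --apiKey $octopusApiKey
```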
Ad-hoc environments will still need to be managed via TeamCity, but known environments (like CI, Staging and Production) should be manageable within Octopus.
Summary
I much prefer the versioned and packaged approach to environment management that I’ve outlined above. It seems much neater and allows for a lot of traceability and repeatability, something that was lacking when environments were being managed directly from HEAD.
It helps that it looks very similar to the way that we manage our code (both libraries and deployable components) so the pattern is already familiar and understandable.
You can see an example of what a versioned, packageable environment repository would look like here. Keep in mind that the common scripts inside that repository are not usually included directly like that. They are typically downloaded and installed via a bootstrapping process (using a NuGet package), but for this example I had to include them directly so that I didn’t have to bring along the rest of our build pipeline.
Speaking of the common scripts, unfortunately they are a constant reminder of my lack of knowledge about how to create reusable Powershell components. I’m hoping to restructure them into a number of separate modules with greater cohesion, but until then they are a bit unwieldy (just a massive chunk of scripts that are dot-sourced wherever they are needed).
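For the curious, the current pattern versus where I’d like to end up looks roughly like this (file and module names are made up).

```powershell
# Today: a pile of scripts dot-sourced wherever they are needed.
. "$here\scripts\common\Functions-Aws.ps1"
. "$here\scripts\common\Functions-Environment.ps1"

# Eventually: cohesive modules imported explicitly.
Import-Module "$here\modules\Environment\Environment.psm1"
```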
That would probably make a good blog post actually.
How to unpick a mess of Powershell that past you made.
Sometimes I hate past me.