AWS Account Switching with Ansible

I recently worked on a project involving multiple AWS accounts, with different projects and environments spread across those accounts in different combinations. Having opted to use Ansible to drive deployments, I looked at its built-in capabilities for account switching. It turns out you can easily inject credentials to authenticate as another IAM user, but only at the task (or perhaps block) level. This might seem flexible at first glance, but once you have to duplicate tasks, and therefore roles and even playbooks, to target different accounts, it quickly becomes unwieldy. That’s not even considering the insane amount of boilerplate you get when forced to specify credentials for each and every task. Perhaps the biggest blocker is that Ansible has no support for assuming IAM roles, a problem amplified by the fact that most of the core AWS modules still rely on boto2, which has patchy support for this at best and is unlikely to improve.

I spent some time digging in the boto2 and boto3 docs to find commonalities in authentication support, and eventually figured that I should be able to inject temporary credentials via environment variables. Thankfully even the session token issued with temporary credentials (such as when assuming a role) is supported in boto2, albeit with a different environment variable (boto2 reads AWS_SECURITY_TOKEN where boto3 uses AWS_SESSION_TOKEN). Now I just needed a way to obtain the credentials and set them before playbook execution.
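As a minimal sketch of that idea (not the code from the original post), exporting one set of temporary credentials could look like the following. The TemporaryCredentials tuple and the export_credentials helper are hypothetical names chosen for illustration; the environment variable names are the ones boto2 and boto3 read.

```python
import os
from collections import namedtuple

# Hypothetical container for one set of temporary credentials.
TemporaryCredentials = namedtuple(
    "TemporaryCredentials", ["access_key", "secret_key", "token"]
)


def export_credentials(creds, environ=None):
    """Write credentials into an environment mapping (os.environ by default)."""
    environ = os.environ if environ is None else environ
    environ["AWS_ACCESS_KEY_ID"] = creds.access_key
    environ["AWS_SECRET_ACCESS_KEY"] = creds.secret_key
    if creds.token:
        # boto3 looks for the session token under this name...
        environ["AWS_SESSION_TOKEN"] = creds.token
        # ...while boto2 expects the older name.
        environ["AWS_SECURITY_TOKEN"] = creds.token
    return environ
```

Setting both token variables means modules built on either library should pick up the same temporary credentials.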

My first pass was a wrapper script, making use of AWS CLI calls to STS and parsing out the required bits with jq. This worked, proving the concept, but lacked finesse and intelligence as you’d still need to purposely decide which role to assume before running a playbook.

What I really wanted was a way to automatically figure out which AWS account should be operated on, based on the project and/or environment being managed. Since I already have a fairly consistent approach to writing playbooks, where the environment and project are almost always provided as extra vars, this should be easy!

I’ve previously made use of Ansible vars plugins; this is a very underdocumented feature of Ansible that whilst primarily designed for injecting group/host vars from alternative sources, actually provides a really flexible entrypoint into a running Ansible process in which you can do whatever you want. The outputs of a vars plugin are host variables, but with a little cheekiness you can manipulate the environment - which happens to be where Boto and Boto3 look for credentials!

Vars plugins, however cool, are just plugins. There are inputs and outputs, but those do not include a way to inspect existing variables (either global or per-host) from within the plugin itself. Personally I find this a major shortcoming of this particular plugin architecture. However, since the required information is always passed as extra vars, I decided to parse the CLI arguments manually in the plugin to extract them, rather than relying on Ansible to do it.

Here’s how I went about it. Starting in the vars_plugins directory (relative to the playbooks), here is a skeleton plugin that runs but does not yet do anything useful.
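A skeleton along these lines might look like the sketch below. It assumes the older vars plugin interface (Ansible releases before 2.4), where Ansible instantiates a class named VarsModule and calls its methods; newer releases instead expect a subclass of BaseVarsPlugin implementing get_vars. The file name vars_plugins/aws.py is an assumption.

```python
# vars_plugins/aws.py (hypothetical file name)
# Sketch of a do-nothing vars plugin, assuming the pre-2.4 plugin interface.


class VarsModule(object):
    """Loaded by Ansible at startup; contributes no variables yet."""

    def __init__(self, inventory):
        # Ansible passes in the inventory object; keep it for later use.
        self.inventory = inventory

    def run(self, host, vault_password=None):
        # Extra variables for the given host; none so far.
        return {}

    def get_host_vars(self, host, vault_password=None):
        return {}

    def get_group_vars(self, group, vault_password=None):
        return {}
```

Because the class is instantiated early in the Ansible run, its constructor is a convenient place to hang any side effects, such as setting environment variables.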

We can extend this to parse the CLI arguments with argparse, making sure to use parse_known_args() so that we don’t have to duplicate the entire set of Ansible arguments.
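The parsing step can be sketched as follows. The parse_extra_vars helper is a hypothetical name, and this version only understands the key=value form of extra vars; the JSON and @file forms that Ansible also accepts are ignored here.

```python
import argparse
import sys


def parse_extra_vars(argv=None):
    """Collect key=value pairs passed via -e/--extra-vars into a dict."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-e", "--extra-vars", action="append", default=None)
    # parse_known_args() ignores every option we have not declared,
    # so the rest of the ansible-playbook command line passes through.
    args, _unknown = parser.parse_known_args(
        sys.argv[1:] if argv is None else argv
    )

    extra_vars = {}
    for item in args.extra_vars or []:
        # One -e may carry several space-separated pairs.
        for pair in item.split():
            if "=" in pair:
                key, value = pair.split("=", 1)
                extra_vars[key] = value
    return extra_vars
```

Passing add_help=False keeps this parser from intercepting -h before Ansible's own argument handling sees it.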

Now we have made available any extra vars in dictionary form, making it easy to figure out which environment and project we’re working on. We’ll run playbooks like this:

ansible-playbook do-a-thing.yml -e env=staging -e project=canon

Next, we’ll build up a configuration to specify which account should be used for different projects/environments. In my situation, the makeup was complex due to some projects having all environments in a single account and some accounts having more than one project, so I needed to model this in a reusable manner. This is the structure I came up with. aws_profiles is a dictionary where the keys are names of AWS CLI/SDK profiles (as configured in ~/.aws), and the values are dictionaries of extra vars to match on.

Parsing this took a bit of thought, and some rubber ducking on zatech, but I eventually figured it out. This could probably be leaner but it balances well in my opinion. We store this configuration in vars_plugins/aws.yml, where the plugin can easily read it.
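To make the structure and the matching concrete, here is a condensed sketch rather than the full plugin. The configuration is shown as the Python dictionary that loading vars_plugins/aws.yml would produce. All profile and project names other than ops, canon and staging are made up, and allowing a list of accepted values per key is an assumption about how one account might host several projects.

```python
# Hypothetical contents of vars_plugins/aws.yml, once loaded.
AWS_PROFILES = {
    "ops": {"project": "ops"},
    # Every environment of this project lives in a single account.
    "canon": {"project": "canon"},
    # One account hosting the staging environment of two projects.
    "shared-staging": {"env": "staging", "project": ["alpha", "beta"]},
}


def _accepts(expected, actual):
    """A spec value may be one accepted value or a list of them."""
    if isinstance(expected, (list, tuple)):
        return actual in expected
    return actual == expected


def match_profile(aws_profiles, extra_vars):
    """Return the first profile whose spec is fully satisfied, else None."""
    for profile, spec in aws_profiles.items():
        if spec and all(
            _accepts(expected, extra_vars.get(key))
            for key, expected in spec.items()
        ):
            return profile
    return None
```

With this layout, `-e env=staging -e project=canon` would resolve to the canon profile, because that spec only constrains the project.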

This got busy real quick, let’s break it down a little.

At line 54, we read the configuration file and store a config dictionary.

At line 55, we loop through each configured profile and instantiate a boto3 session for each one, saving the session objects as attributes on our module class.

At line 56, the magic happens. First we retrieve temporary credentials for each of the connected profiles (line 83); these include the usual access key ID and secret access key plus a session token. From line 85, we check for an environment variable ANSIBLE_AWS_PROFILE, a play on the usual AWS_PROFILE, which allows us to override the account selection when invoking Ansible. Should this not be specified, from line 90 we iterate over the profile specifications to determine whether the specified extra vars match any profile. If they do, default_profile is populated, and from line 103 we export the earlier acquired credentials using the usual AWS_* environment variables. From line 110, credentials for all profiles are exported with prefixed environment variable names to allow us to override them on a per-task basis.
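The override and export steps can be condensed into the sketch below, which is separate from the listing those line numbers refer to. The function names are hypothetical, each profile's credentials are represented as a plain (access_key, secret_key, token) tuple, and the `<PROFILE>_AWS_*` naming for the prefixed variables is an assumption, since the exact scheme is not spelled out here.

```python
import os


def select_profile(matched_profile, environ=None):
    """ANSIBLE_AWS_PROFILE, when set, wins over the extra-vars match."""
    environ = os.environ if environ is None else environ
    return environ.get("ANSIBLE_AWS_PROFILE") or matched_profile


def credential_variables(credentials, default_profile):
    """Build the environment variables to export.

    credentials maps profile name -> (access_key, secret_key, token).
    """
    variables = {}
    for profile, (access_key, secret_key, token) in credentials.items():
        # Assumed naming scheme: profile "ops" -> OPS_AWS_ACCESS_KEY_ID, etc.
        prefix = profile.upper().replace("-", "_") + "_"
        variables[prefix + "AWS_ACCESS_KEY_ID"] = access_key
        variables[prefix + "AWS_SECRET_ACCESS_KEY"] = secret_key
        variables[prefix + "AWS_SESSION_TOKEN"] = token

    if default_profile in credentials:
        access_key, secret_key, token = credentials[default_profile]
        variables["AWS_ACCESS_KEY_ID"] = access_key
        variables["AWS_SECRET_ACCESS_KEY"] = secret_key
        variables["AWS_SESSION_TOKEN"] = token   # read by boto3
        variables["AWS_SECURITY_TOKEN"] = token  # read by boto2
    return variables
```

In the plugin, the result would then be applied with something like `os.environ.update(credential_variables(creds, select_profile(matched)))`.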

This approach takes advantage of the fact that environment variables set here do propagate process-wide, and all Ansible modules running on the control host are able to see them, and will automatically use them to authenticate with AWS.

For specific tasks where you know you’ll always run that task for one specific account, you can reference the corresponding prefixed environment variables to specify credentials for the module. For example:

In this playbook, the ec2 task launches an instance in the account that was matched based on the env and project variables provided at runtime. The route53 task, however, always creates a corresponding DNS record for the instance using the ops AWS profile.

Wrapping up, I added all of this functionality and more to my Ansible AWS Vars Plugin, which you can grab from GitHub and use or modify as you see fit.