• Accessing AWS Services When Remoting

Having recently moved out of my place, where I had comparatively great Internet access complete with static IP addresses, I’m currently working mostly tethered to my phone. It turns out that good LTE service actually works pretty well for most things – even long-distance SSH – but sitting on a cellular network with a dynamic IP address can get really annoying.

Yes, I should be establishing a VPN to reach internal services. Yes, opening holes for services on the wider Internet is totally a bad idea. But I have other security measures too, and a VPN would be just one extra layer. And, surprise: tunnelling encrypted TCP inside another encrypted TCP connection over a cellular network doesn’t make for a great experience.

So I’m dialling straight in over SSH, and here’s the little script I knocked together to make that just a little bit nicer than logging into the AWS console and manually creating security exceptions for myself (requires Python 3.6, for f-strings).

    #!/usr/bin/env python3
    
    import argparse, boto3, botocore, os, sys
    import urllib.error, urllib.request
    from ipaddress import IPv4Address, IPv4Network
    
    class AuthError(Exception):
        pass
    
    def describe_rule(GroupId, IpPermissions):
        print(f"  Group ID: {GroupId}")
        print(f"  Port: {IpPermissions[0]['IpProtocol'].upper()}/{IpPermissions[0]['ToPort']}")
        print(f"  CIDR: {IpPermissions[0]['IpRanges'][0]['CidrIp']}")
        print(f"  Description: {IpPermissions[0]['IpRanges'][0]['Description']}")
    
    def main():
        name = os.environ.get('USER', 'unknown').capitalize()
        parser = argparse.ArgumentParser()
        parser.add_argument('-g', '--group', required=True, help='Name of security group to authorize')
        parser.add_argument('-p', '--profile', default='default', help='AWS profile to use')
        parser.add_argument('-r', '--region', default='', help='AWS region to use (defaults to profile setting)')
        parser.add_argument('-t', '--port', type=int, action='append', default=[], help='TCP port to allow (default: 22)')
        parser.add_argument('-d', '--description', default=name, help='Description for rule CIDR')
        parser.add_argument('-D', '--delete', action='store_true', help='Delete other rules with matching description')
        args = parser.parse_args()
    
        description = args.description
    
        try:
            session = boto3.session.Session(profile_name=args.profile)
        except botocore.exceptions.ProfileNotFound as e:
            raise AuthError(e)
    
        client_args = {}
        if args.region:
            client_args['region_name'] = args.region
        client = session.client('ec2', **client_args)
        groups = client.describe_security_groups(Filters=[{'Name': 'group-name', 'Values': [args.group]}])
        if not groups.get('SecurityGroups'):
            raise AuthError('Security group "{0}" not found'.format(args.group))
        elif len(groups['SecurityGroups']) > 1:
            raise AuthError('More than one security group found for "{0}":\n - {1}'.format(args.group, '\n - '.join(g['GroupName'] for g in groups['SecurityGroups'])))
    
        group = groups['SecurityGroups'][0]
        print('Found matching group: {}'.format(group['GroupName']))
    
        try:
            req = urllib.request.Request('https://ifconfig.co/ip', headers={'Accept': 'text/plain', 'User-Agent': 'curl/7.54.0'})
            res = urllib.request.urlopen(req)
            ip = res.read().decode('utf-8').strip()
        except urllib.error.HTTPError as e:
            raise AuthError('Could not determine public IP address, got {0} error when accessing ifconfig.co'.format(e.code))
        except urllib.error.URLError as e:
            raise AuthError('Could not reach ifconfig.co: {0}'.format(e.reason))
        cidr = ip + '/32'
        print('Determined current public IP: {}'.format(ip))
    
        ports = args.port or [22]
    
        for port in ports:
            # Skip this port if an existing rule already covers our current IP
            covering = next((iprange['CidrIp']
                             for perm in group['IpPermissions']
                             if perm['IpProtocol'] == 'tcp' and perm['FromPort'] <= port <= perm['ToPort']
                             for iprange in perm['IpRanges']
                             if IPv4Address(ip) in IPv4Network(iprange['CidrIp'])), None)
            if covering:
                print('{0} already authorized by {1}'.format(ip, covering))
                continue
    
            if args.delete:
                for perm in group['IpPermissions']:
                    if perm['IpProtocol'] == 'tcp' and perm['FromPort'] <= port and perm['ToPort'] >= port:
                        for iprange in perm['IpRanges']:
                            if 'Description' in iprange and iprange['Description'] == args.description:
                                old_rule = {
                                    'GroupId': group['GroupId'],
                                    'IpPermissions': [{
                                        'IpProtocol': perm['IpProtocol'],
                                        'FromPort': perm['FromPort'],
                                        'ToPort': perm['ToPort'],
                                        'IpRanges': [{
                                            'CidrIp': iprange['CidrIp'],
                                            'Description': iprange['Description'],
                                        }],
                                    }],
                                }
                                print('Deleting rule:')
                                describe_rule(**old_rule)
                                client.revoke_security_group_ingress(**old_rule)
    
            new_rule = {
                'GroupId': group['GroupId'],
                'IpPermissions': [{
                    'IpProtocol': 'tcp',
                    'FromPort': port,
                    'ToPort': port,
                    'IpRanges': [{
                        'CidrIp': cidr,
                        'Description': description,
                    }],
                }],
            }
            print('Creating rule:')
            describe_rule(**new_rule)
            client.authorize_security_group_ingress(**new_rule)
    
    
    if __name__ == "__main__":
        try:
            main()
        except AuthError as e:
            print(str(e), file=sys.stderr)
            sys.exit(1)
    
    
    # vim: set ft=python ts=4 sts=4 sw=4 et:

Run it like this:

    # Create a new rule in the employees security group
    authorize-aws -g employees
    
    # Create a new rule as above, and delete any existing rule with your name on it
    authorize-aws -g employees -D
    
    # As above, but using a different AWS profile than the default one
    authorize-aws -p acme -g employees -D
    
    # For Windows instances
    authorize-aws -g employees -t 3389
    
    # By default, the description contains your local user name, but can be overridden
    authorize-aws -g employees -d 'Carmen mobile'
  • Terraform: AWS ACM Certificates for Multiple Domains

My life got better when AWS introduced Certificate Manager, their service for issuing validated TLS certificates for consumption directly by other AWS services. You don’t get to download certificates issued by ACM to install on your own servers, but you can use them with your Elastic Load Balancers, CloudFront distributions and some other services, which removes the need to upload and manually renew certificates: ACM renews them automatically.

Closing the loop on automated certificates, however, was still difficult, since domain validation was done through verification emails. In November 2017, ACM started supporting DNS validation, which is especially great if your DNS resides in Route53. Looking to drive this combination with a single workflow, I turned to Terraform, and happily enough, it supports all the requisite services to make this happen. Let’s take a look.

    resource "aws_acm_certificate" "main" {
      domain_name = "example.net"
      subject_alternative_names = ["*.example.net"]
      validation_method = "DNS"
      tags {
        Name = "example.net"
        terraform = "true"
      }
    }
    
    data "aws_route53_record" "validation" {
      name = "example.net."
    }
    
    resource "aws_route53_record" "validation" {
      name = "${aws_acm_certificate.main.domain_validation_options[0].resource_record_name}"
      type = "${aws_acm_certificate.main.domain_validation_options[0].resource_record_type}"
      zone_id = "${data.aws_route53_zone.validation.zone_id}"
      records = ["${aws_acm_certificate.main.domain_validation_options[0].resource_record_value}"]
      ttl = 60
    }
    
    resource "aws_acm_certificate_validation" "main" {
      certificate_arn = "${aws_acm_certificate.main.arn}"
      validation_record_fqdns = ["${aws_route53_record.validation.*.fqdn}"]
    }

    In the basic workflow of a wildcard certificate for a single domain, Terraform first requests a certificate, then creates validation records in DNS using the zone it looked up, then goes back to ACM to request validation. Importantly, Terraform then waits for the validation to complete before continuing, a crucial point that makes it possible to immediately start using this certificate elsewhere with Terraform without racing against the validation process.
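
    It’s the aws_acm_certificate_validation resource that gives you this ordering: reference its certificate_arn attribute (rather than the certificate’s own ARN) from downstream resources and they will wait for validation to finish. Here’s a minimal sketch of such a consumer – the aws_lb and aws_lb_target_group resources are assumed to exist elsewhere in your configuration:

    resource "aws_lb_listener" "https" {
      load_balancer_arn = "${aws_lb.main.arn}"
      port = 443
      protocol = "HTTPS"
      ssl_policy = "ELBSecurityPolicy-2016-08"
      certificate_arn = "${aws_acm_certificate_validation.main.certificate_arn}"

      default_action {
        type = "forward"
        target_group_arn = "${aws_lb_target_group.main.arn}"
      }
    }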

    This is pretty great, but it’s not yet portable, and what if we want to exploit all 10 (yes, ten) subjectAlternativeNames that ACM offers us?

    I toyed with this for some time, getting angry and then sad, but eventually elated, at Terraform’s interpolation functions, until I came up with this (excerpt from a working Terraform module):

    variable "domain_names" { type = "list" }
    variable "zone_id" {}
    
    resource "aws_acm_certificate" "main" {
      domain_name = "${var.domain_names[0]}"
      subject_alternative_names = "${slice(var.domain_names, 1, length(var.domain_names))}"
      validation_method = "DNS"
      tags {
        Name = "${var.domain_names[0]}"
        terraform = "true"
      }
    }
    
    resource "aws_route53_record" "validation" {
      count = "${length(var.domain_names)}"
      name = "${lookup(aws_acm_certificate.main.domain_validation_options[count.index], "resource_record_name")}"
      type = "${lookup(aws_acm_certificate.main.domain_validation_options[count.index], "resource_record_type")}"
      zone_id = "${var.zone_id}"
      records = ["${lookup(aws_acm_certificate.main.domain_validation_options[count.index], "resource_record_value")}"]
      ttl = 60
    }
    
    resource "aws_acm_certificate_validation" "main" {
      certificate_arn = "${aws_acm_certificate.main.arn}"
      validation_record_fqdns = ["${aws_route53_record.validation.*.fqdn}"]
    }
    
    output "arn" {
      value = "${aws_acm_certificate.main.arn}"
    }

Use it as a module like this:

    module "acm_ops" {
      source = "modules/aws_acm_certificate"
      domain_names = ["ops.acme.net", "*.ops.acme.net"]
      zone_id = "${aws_route53_zone.external.id}"
    }
    
    module "acm_marketing" {
      source = "modules/aws_acm_certificate"
      domain_names = ["acme.com", "*.acme.com"]
      zone_id = "${aws_route53_zone.acme.id}"
    }

    The module accepts a list of domain names and a Route53 zone ID, and will generate a unified validated certificate, returning the ARN of the certificate which you can then use with your ELB or CloudFront resources. Peeking inside, this makes use of lookup() and splatting to parse the validation options and create all the necessary DNS records.

    The full source code for this module is available on GitHub.

  • AWS Account Switching with Ansible

I recently worked on a project involving multiple AWS accounts, with different projects and environments spread through those accounts in different combinations. Having opted to use Ansible for driving deployments, I looked at its built-in capabilities for account switching. It turns out you can easily inject credentials authenticating as another IAM user, but this can only be done at a per-task (or perhaps per-block?) level. This might seem flexible at first glance, but when you consider that you have to duplicate tasks, and therefore roles, and even playbooks, whenever you have to use different accounts, it quickly becomes unwieldy. That’s not even considering the insane amount of boilerplate you get when forced to specify credentials for each and every task. Perhaps the biggest blocker is that Ansible has no support for assuming IAM roles, which is amplified by the fact that most of the core AWS modules still rely on boto2, which has patchy support for this at best, and won’t be improving any time soon.

I spent some time digging in the boto2 and boto3 docs to find commonalities in authentication support, and eventually figured that I should be able to inject temporary credentials via environment variables. Thankfully even the session token issued with temporary credentials (such as when assuming a role) is supported in boto2, albeit via a different environment variable (AWS_SECURITY_TOKEN instead of boto3’s AWS_SESSION_TOKEN). Now I just needed a way to obtain the credentials and set them before playbook execution.
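
    In boto3 terms, the essence of the trick is just this (a sketch, with a hypothetical profile name):

    import os

    import boto3

    # Resolve credentials for a profile (which may itself be configured to assume a role)
    session = boto3.Session(profile_name='canon-staging')
    creds = session.get_credentials().get_frozen_credentials()

    # Export them under the names each library understands
    os.environ['AWS_ACCESS_KEY_ID'] = creds.access_key
    os.environ['AWS_SECRET_ACCESS_KEY'] = creds.secret_key
    if creds.token:
        os.environ['AWS_SESSION_TOKEN'] = creds.token   # boto3
        os.environ['AWS_SECURITY_TOKEN'] = creds.token  # boto2's name for the same thing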

    My first pass was a wrapper script, making use of AWS CLI calls to STS and parsing out the required bits with jq. This worked, proving the concept, but lacked finesse and intelligence as you’d still need to purposely decide which role to assume before running a playbook.
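
    It looked something like this (a rough sketch rather than the original; the role ARN is whatever you need to assume):

    #!/bin/sh
    # Usage: aws-run <role-arn> <playbook> [ansible-playbook args...]
    CREDS=$(aws sts assume-role --role-arn "$1" --role-session-name ansible \
        --query Credentials --output json)
    export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r .AccessKeyId)
    export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r .SecretAccessKey)
    export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r .SessionToken)
    export AWS_SECURITY_TOKEN="$AWS_SESSION_TOKEN"  # boto2 reads this name
    shift
    exec ansible-playbook "$@"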

What I really wanted was a way to automatically figure out which AWS account should be operated on, based on the project and/or environment being managed. Since I already have a fairly consistent approach to writing playbooks, where the environment and project are almost always provided as extra vars, this should be easy!

    I’ve previously made use of Ansible vars plugins; this is a very underdocumented feature of Ansible that whilst primarily designed for injecting group/host vars from alternative sources, actually provides a really flexible entrypoint into a running Ansible process in which you can do whatever you want. The outputs of a vars plugin are host variables, but with a little cheekiness you can manipulate the environment – which happens to be where Boto and Boto3 look for credentials!

Vars plugins, however cool, are just plugins. There are inputs and outputs, but those do not include a way to inspect existing variables (either global or per-host) from within the plugin itself. Personally I find this a major shortcoming in this particular plugin architecture, but since the required information is always passed as extra vars, I decided to manually parse the CLI arguments to extract them in the plugin, rather than relying on Ansible to do it.

Here’s how I went about it. Starting in the vars_plugins directory (relative to your playbooks), here is a skeleton plugin that runs but doesn’t yet do anything useful.

    from __future__ import (absolute_import, division, print_function)
    __metaclass__ = type
    
    DOCUMENTATION = '''
        vars: aws
        version_added: "2.5"
        short_description: Nothing useful yet
        description:
            - Is run by Ansible
            - Runs without error
            - Does nothing, returns nothing
        notes:
            - Nothing to note
    '''

    from ansible.plugins.vars import BaseVarsPlugin
    
    class VarsModule(BaseVarsPlugin):
        def __init__(self, *args):
            super(VarsModule, self).__init__(*args)
    
        def get_vars(self, loader, path, entities, cache=True):
            super(VarsModule, self).get_vars(loader, path, entities)
            return {}
    
    
    # vim: set ft=python ts=4 sts=4 sw=4 et:

    We can extend this to parse the CLI arguments with ArgParse, making sure to use parse_known_args() so that we don’t have to duplicate the entire set of Ansible arguments.

    from __future__ import (absolute_import, division, print_function)
    __metaclass__ = type
    
    DOCUMENTATION = '''
        vars: aws
        version_added: "2.5"
        short_description: Nothing useful yet
        description:
            - Is run by Ansible
            - Runs without error
            - Does nothing, returns nothing
        notes:
            - Nothing to note
    '''
    
    import argparse

    from ansible.plugins.vars import BaseVarsPlugin
    
    def parse_cli_args():
        parser = argparse.ArgumentParser()
        parser.add_argument('-e', '--extra-vars', action='append')
        opts, unknown = parser.parse_known_args()
        args = dict()
        if opts.extra_vars:
            args['extra_vars'] = dict(e.split('=', 1) for e in opts.extra_vars if '=' in e)
        return args
    
    
    class VarsModule(BaseVarsPlugin):
        def __init__(self, *args):
            super(VarsModule, self).__init__(*args)
            cli_args = parse_cli_args()
            self.extra_vars = cli_args.get('extra_vars', dict())
    
        def get_vars(self, loader, path, entities, cache=True):
            super(VarsModule, self).get_vars(loader, path, entities)
            return {}
    
    
    # vim: set ft=python ts=4 sts=4 sw=4 et:

Now any extra vars are available in dictionary form, making it easy to figure out which environment and project we’re working on. We’ll run playbooks like this:

    ansible-playbook do-a-thing.yml -e env=staging -e project=canon

    Next, we’ll build up a configuration to specify which account should be used for different projects/environments. In my situation, the makeup was complex due to some projects having all environments in a single account and some accounts having more than one project, so I needed to model this in a reusable manner. This is the structure I came up with. aws_profiles is a dictionary where the keys are names of AWS CLI/SDK profiles (as configured in ~/.aws), and the values are dictionaries of extra vars to match on.

    ---
    aws_profiles:
      canon-staging:
        env:
          - stable
          - staging
        project: canon
      canon-production:
        env: production
        project: canon
      ops:
        env: ops
    
    # vim: set ft=yaml ts=2 sts=2 sw=2 et:

    Parsing this took a bit of thought, and some rubber ducking on zatech, but I eventually figured it out. This could probably be leaner but it balances well in my opinion. We store this configuration in vars_plugins/aws.yml, where the plugin can easily read it.

    from __future__ import (absolute_import, division, print_function)
    __metaclass__ = type
    
    DOCUMENTATION = '''
        vars: aws
        version_added: "2.5"
        short_description: Selects AWS credentials based on extra vars
        description:
            - Reads profile matching rules from aws.yml alongside the plugin
            - Resolves temporary credentials for each configured profile
            - Exports credentials process-wide via AWS_* environment variables
        notes:
            - Set ANSIBLE_AWS_PROFILE to override profile selection
    '''
    
    import argparse
    import os, re, yaml

    from ansible.errors import AnsibleParserError
    from ansible.module_utils.six import iteritems, string_types
    from ansible.plugins.vars import BaseVarsPlugin
    
    try:
        import boto3
        import botocore.exceptions
        HAS_BOTO3 = True
    except ImportError:
        HAS_BOTO3 = False
    
    
    def parse_cli_args():
        parser = argparse.ArgumentParser()
        parser.add_argument('-e', '--extra-vars', action='append')
        opts, unknown = parser.parse_known_args()
        args = dict()
        if opts.extra_vars:
            args['extra_vars'] = dict(e.split('=', 1) for e in opts.extra_vars if '=' in e)
        return args
    
    
    def load_config():
        ''' Test for configuration file and return configuration dictionary '''
    
        DIR = os.path.dirname(os.path.realpath(__file__))
        with open(os.path.join(DIR, 'aws.yml'), 'r') as stream:
            try:
                config = yaml.safe_load(stream)
                return config
            except yaml.YAMLError as e:
                raise AnsibleParserError('Failed to read aws.yml: {0}'.format(e))
    
    
    class VarsModule(BaseVarsPlugin):
        def __init__(self, *args):
            super(VarsModule, self).__init__(*args)
            cli_args = parse_cli_args()
            self.extra_vars = cli_args.get('extra_vars', dict())
            self.config = load_config()
            self._connect_profiles()
            self._export_credentials()
    
    
        def _connect_profiles(self):
            for profile in self._profiles():
                self._init_session(profile)
    
    
        def _init_session(self, profile):
            if not hasattr(self, 'sessions'):
                self.sessions = dict()
            self.sessions[profile] = boto3.Session(profile_name=profile)
    
    
        def _credentials(self, profile):
            return self.sessions[profile].get_credentials().get_frozen_credentials()
    
    
        def _export_credentials(self):
            self.aws_profile = None
            profiles = self.config.get('aws_profiles', None)
    
            if isinstance(profiles, dict):
                profiles_list = list(profiles.keys())
            else:
                profiles_list = profiles or []
    
            credentials = {profile: self._credentials(profile) for profile in profiles_list}
    
            profile_override = os.environ.get('ANSIBLE_AWS_PROFILE')
            default_profile = None
            if profile_override:
                if profile_override in profiles:
                    default_profile = profile_override
            elif isinstance(profiles, dict) and self.extra_vars:
                for profile, rules in iteritems(profiles):
                    if isinstance(rules, dict):
                        rule_matches = {var: False for var in rules.keys()}
                        for var, vals in iteritems(rules):
                            if isinstance(vals, string_types):
                                vals = [vals]
                            if var in self.extra_vars and self.extra_vars[var] in vals:
                                rule_matches[var] = True
                        if all(rule_matches.values()):
                            default_profile = profile
                            break
    
            if default_profile:
                self.aws_profile = default_profile
                os.environ['AWS_ACCESS_KEY_ID'] = credentials[default_profile].access_key
                os.environ['AWS_SECRET_ACCESS_KEY'] = credentials[default_profile].secret_key
                if credentials[default_profile].token:
                    # boto2 and boto3 look for different names for the session token
                    os.environ['AWS_SECURITY_TOKEN'] = credentials[default_profile].token
                    os.environ['AWS_SESSION_TOKEN'] = credentials[default_profile].token
    
            cleaner = re.compile('[^a-zA-Z0-9_]')
            for profile, creds in iteritems(credentials):
                profile_clean = cleaner.sub('_', profile).upper()
                os.environ['{}_AWS_ACCESS_KEY_ID'.format(profile_clean)] = creds.access_key
                os.environ['{}_AWS_SECRET_ACCESS_KEY'.format(profile_clean)] = creds.secret_key
                if creds.token:
                    os.environ['{}_AWS_SECURITY_TOKEN'.format(profile_clean)] = creds.token
                    os.environ['{}_AWS_SESSION_TOKEN'.format(profile_clean)] = creds.token
    
    
        def get_vars(self, loader, path, entities, cache=True):
            super(VarsModule, self).get_vars(loader, path, entities)
            return {}
    
    
    # vim: set ft=python ts=4 sts=4 sw=4 et:

This got busy real quick, so let’s break it down a little.

In __init__(), we first parse any extra vars from the CLI, then read the configuration file and store a config dictionary.

    _connect_profiles() then loops through each configured profile and instantiates a boto3 session for it, saving the session objects as attributes on our plugin class.

    _export_credentials() is where the magic happens. First we retrieve temporary credentials for each of the connected profiles – this includes the usual secret access key plus a session token. We then check for an environment variable ANSIBLE_AWS_PROFILE, a play on the usual AWS_PROFILE, which allows us to override the account selection when invoking Ansible. Should this not be specified, we iterate the profile specifications to determine whether the supplied extra vars match any profile. If they do, default_profile is populated and the earlier-acquired credentials are exported using the usual AWS_* environment variables. Finally, credentials for all profiles are exported under prefixed environment variable names, so that they can be referenced on a per-task basis.
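
    The ANSIBLE_AWS_PROFILE override means you can force a particular account regardless of what the extra vars would otherwise select, for example:

    ANSIBLE_AWS_PROFILE=ops ansible-playbook do-a-thing.yml -e env=staging -e project=canon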

This approach takes advantage of the fact that environment variables set here propagate process-wide: all Ansible modules running on the control host can see them, and will automatically use them to authenticate with AWS.
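
    A quick way to sanity-check which account a play has landed in is to ask AWS directly – the AWS CLI reads the same environment variables the plugin exports:

    - name: Confirm which account we are operating in
      command: aws sts get-caller-identity
      register: caller
      changed_when: false

    - debug:
        var: caller.stdout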

For tasks that you know will always run against one specific account, you can reference the corresponding prefixed environment variables to supply credentials to the module. For example:

    ---
    
    - hosts: localhost
      connection: local
      
      pre_tasks:
      
        - name: Validate extra vars
          assert:
            that:
              - env is defined
              - project is defined
              - name is defined
      
      tasks:
      
        - name: Launch EC2 instance
          ec2:
            assign_public_ip: yes
            group: external
            image: ami-aabbccde
            instance_tags:
              Name: "{{ name }}"
              env: "{{ env }}"
              project: "{{ project }}"
            instance_type: t2.medium
            keypair: ansible
            vpc_subnet_id: subnet-22334456
            wait: yes
          register: result_ec2
    
        - name: Create DNS record
          route53:
            aws_access_key: "{{ lookup('env', 'OPS_AWS_ACCESS_KEY_ID') | default(omit, true) }}"
            aws_secret_key: "{{ lookup('env', 'OPS_AWS_SECRET_ACCESS_KEY') | default(omit, true) }}"
            security_token: "{{ lookup('env', 'OPS_AWS_SECURITY_TOKEN') | default(omit, true) }}"
            command: create
            overwrite: yes
            record: "{{ name }}.example.net"
            value: "{{ item.public_ip }}"
            zone: example.net
          with_flattened:
            - "{{ result_ec2.instances | default([]) }}"
            - "{{ result_ec2.tagged_instances | default([]) }}"
    
    # vim: set ft=ansible ts=2 sts=2 sw=2 et:

In this playbook, the ec2 task launches an instance in the account that was matched based on the env and project variables provided at runtime. The route53 task, however, always creates a corresponding DNS record for the instance using the ops AWS profile.

Wrapping up, I added all of this functionality and more to my Ansible AWS Vars Plugin, which you can grab from GitHub and use or modify as you see fit.