Fault Tolerant Multi AZ EC2, On a beer budget – live from AWS Meetup

Filmed on 18th of March 2021 at the Adelaide AWS User Group, where Arran Peterson presented on how to put together best practice (and cheap!) cloud architecture for business continuity. The title:

“Enterprise grade fault tolerant multi-AZ server deployment on a beer budget”

Recording

RATED ‘PG’ – Mild course language.

Presenter

Arran Peterson
Arran Peterson

Arran is an Infrastructure Consultant with a passion for Microsoft Unified Communications and the true flexibility and scalability of cloud-based solutions.
As a Senior Consultant, Arran brings his expertise in enterprise environments to work with clients around Microsoft Unified Communications product portfolio of Office 365, Exchange and Skype/Teams, along with expertise around transitioning to the cloud-based platforms including AWS, Azure and Google.

More Reading

Amazon Elastic Block Store

https://aws.amazon.com/ebs/

AWS Sydney outage prompts architecture rethink

https://www.itnews.com.au/news/aws-sydney-outage-prompts-architecture-rethink-420506

Chalice Framework

https://aws.github.io/chalice/

Adelaide AWS User Group

https://www.meetup.com/en-AU/Amazon-Web-Services-User-Group-Adelaide/events/276728885/


Pimp my VS Code

Those who know me, know that I have a keen interest in software tools and exploring the various different ways that people use them. I take great joy in exploring custom or 3rd party plugins and add-ons to get the most out of the tools I use every day. From OS automation tools (like BetterTouchTool) to custom screen savers (Brooklyn is my current favourite), I love it all.

On a good day, I spend quite a bit of time in Visual Studio Code, my IDE of choice. VS Code has all that you need right out of the box, but why stop there? Heres a list of some of my favourite VS Code Extensions that I now consider essential when doing a fresh install.

Indent-Rainbow and Bracket Pair Colorizer 2 are must installs for me. Both really simple, change colours of indents and brackets so you can easily see them at a glance. Always useful when working with ident heavy languages like YAML.

GitLense is another essential if you are working with Git repositories. GitLense integrates lots of various Git tools and information into the editor. My favourite feature of GitLense is the current line blame, you can see it in the screenshot above which shows an unobtrusive annotation at the end of each line as you select it. The annotation shows commit information for that piece of code.

Beautify helps you make your code beautiful. Beautify can automatically indent Javascript, JSON, CSS, and HTML.

Better Comments makes your comments human readable by changing the colour of comments based on an opening tag. You can even define your own.

Source: Better Comments Documentation in Visual Studio Code

Next up, some extensions that I install to match the work I’m doing. In my day to day work, I’m regularly authoring infrastructure templates for Azure and AWS (ARM and CloudFormation). To assist with making this as simple as possible I install some specific extensions for syntax highlighting, autocompletion and even do some code snippet referencing.

Azure Resource Manager (ARM) Tools is a collection of extensions for working with Azure made by Microsoft. This one has lots of features so I’ll just pick a few. You can use the ‘arm!’ shortcut to create a blank ARM template with all the property you need – this one makes life so much better, spend less time lining up brackets in JSON and more time defining resources!

Image showing the arm template scaffolding snippet
Source: Azure Resource Manager (ARM) Tools Documentation in Visual Studio Code

Each time you use a snippet, you can also use tab complete to go through commonly modified configurations, again, less time reading documentation more time writing code!

Image showing snippet tab stops with pre-populated parameter values
Source: Azure Resource Manager (ARM) Tools Documentation in Visual Studio Code

CloudFormation Template Linter and CloudFormation Resource Snippets add some similar functionality for working with AWS CloudFormation templates. While neither of these are created by Amazon, they both do a good job at implementing similar functionality to the above ARM tools.

Next up is one of my new favourites, Dash, sorry Windows guru’s this one’s only on Mac. Dash is an API documentation browser which can hook into your VS code to quickly search documentation (from their 200+ built in doc sets, or add your own GitHub doc sets). Sounds boring, but I think it’s far from it. I’ve loaded mine up with lots of Microsoft Azure Documentation and AWS documentation. It’s really handy to be able to highlight a resource type or PowerShell Command, hit control + H and have the document reference instantly pop up, each time it saves me minutes.

Dash - Visual Studio Marketplace
Source: Dash Documentation in Visual Studio Code

Finally, my icon and colour theme I use VSCode Icons and Atom One Dark. This really comes down to personal preference. I like the syntax colour coding included in the Atom One Dark theme, I find it useful especially when writing PowerShell. VSCode icons is the most popular icons extension, and I’ve had no issues since installation.

Source: Atom One Dark Theme Documentation in Visual Studio Code

Thats my round up for my must have extensions. Are there any missing off this list that you think should be here? – Comment below with your must have extensions.

Cheers, Joel


Cognito authentication integration with Django using authorization code grant.

Note: Assumed knowledge of AWS Cognito backend configuration and underlying concepts, mostly it’s just the setup from an application integration perspective that is talked about here.

Recently we have been working on a Django project where a secure and flexible authentication system was required, as most of our existing structure is on AWS we chose Cognito as the backend.

Below are the steps we took to get this working and some insights learned on the way.

Django Warrant

The first attempt was using django_warrant, this is probably going to be the first thing that comes up when you google ‘how to django and cognito’.

Django_warrant works by injecting an authentication backend into django which does some magic that allows your username/password to be submitted and checked against a configured user pool, on success it authenticates you and if required creates a stub django user.

The basics of this were very easy to get working and integrated but had a few issues such as:

  • We still see username/password requests and have to send them on.
  • By default can only be configured for one user pool.
  • Does not support federated identity provider workflows.
  • Github project did not seem super active or updated.

Ultimately we chose not to use this module, however inspiration was taken from its source code to do some of the user handling stuff we implemented later on.

Custom authorization_code workflow implementation

This involves using the cognito hosted login form, which does both user pool and connected identity provider authentication (O365/Azure, Google, Facebook, Amazon) .

The form can be customised with HTML, CSS, images and put behind a custom URL, other aspects of the process and events can be changed and reacted upon using triggers and lambda.

Once you are authenticated in cognito it redirects you back to the page of your choosing (usually your applications login page or custom endpoint) with a set of tokens, using these tokens you then grab the authenticated users details and authenticate them within the context of your app.

The difference between authorization code grant and implicit grant are:

  • Implicit grant
    • Intended for client side authentication (javascript applications mostly)
    • Sends both the id_token (JWT) and acccess_token in the redirect response
    • Sends the tokens with an #anchor before them so it is not seen by the web server
    • https://your-app/login#id_token=n&auth_token=n
  • Authorization code grant
    • Intended for server side authentication
    • Sends a authorization code in the redirect response
    • Sends this as a normal GET parameter
    • https://your-app/login?code=n
    • Your application holds a preconfigured secret
    • Code + secret get turned into id_token token and access_token via oauth2/token endpoint

We chose to use the authorization code grant workflow, it takes a bit more effort to setup but is generally more secure and alleviates any hacky javascript shenanigans that would be needed to get implicit grant working with a django server based backend.

After these steps you can use boto3 or helpers to turn those tokens into a set of attributes (email, name, other custom attributes) kept by cognito. Then you simply hook this up to your internal user/session logic by matching them with your chosen attributes like email, username etc.

I was unable to find any specific library support to handle some aspects of this, like the token handling in python or the django integration so i have included some code which may be useful.

Code

This can be integrated into a view to get the user details from Cognito based on a token, this will be sitting at the redirect URL that cognito returns from.

import warrant
import cslib.aws

def tokenauth(request):
    authorization_code = request.GET.get("code")
    token_grabber = cslib.aws.CognitoToken(
        <client_id>
        <client_secret>
        <domain>
        <redir>
        <region>?
    )

    id_token, access_token = token_grabber.get(authorization_code)

    if id_token and access_token:
        # This uses warrant (different than django_warrant)
        # A helper lib that wraps cognito
        # Plain boto3 can do this also.  
        cognito = warrant.Cognito(
            <user_pool_id>
            <client_id>
            id_token=id_token,
            access_token=access_token,
        )

        # Their lib is a bit broken, because we dont supply a username it wont
        # build a legit user object for us, so we reach into the cookie jar....
        # {'given_name': 'Joe', 'family_name': 'Smith', 'email': 'joe@jtwo.solutions'}
        data = cognito.get_user()._data
        return data
    else:
        return None

Class that handles the oauth/token2 workflow, this is mysteriously missing from the boto3 library which seems to handle everything else quite well…

from http.client import HTTPSConnection
from base64 import b64encode
import urllib.parse
import json

class CognitoToken(object):
    """
    Why you no do this boto3...
    """
    def __init__(self, client_id, client_secret, domain, redir, region="ap-southeast-2"):
        self.client_id = client_id
        self.client_secret = client_secret
        self.redir = redir
        self.token_endpoint = "{0}.auth.{1}.amazoncognito.com".format(domain, region)
        self.token_path = "/oauth2/token"

    def get(self, authorization_code):
        headers = {
            "Authorization" : "Basic {0}".format(self._encode_auth()),
            "Content-type": "application/x-www-form-urlencoded",
        }

        query = urllib.parse.urlencode({
                "grant_type" : "authorization_code",
                "client_id" : self.client_id,
                "code" : authorization_code,
                "redirect_uri" : self.redir,
            }
        )

        con = HTTPSConnection(self.token_endpoint)
        con.request("POST", self.token_path, body=query, headers=headers)
        response = con.getresponse()

        if response.status == 200:
            respdata = str(response.read().decode('utf-8'))
            data = json.loads(respdata)
            return (data["id_token"], data["access_token"])

        return None, None

    def _encode_auth(self):
        # Auth is a base64 encoded client_id:secret
        string = "{0}:{1}".format(self.client_id, self.client_secret)
        return b64encode(bytes(string, "utf-8")).decode("ascii")

Further reading


Tagging EC2 EBS Volumes in Auto Scaling Groups

Tagging becomes a huge part of your life when in the public cloud. Metadata is thrown around like hotcakes, and why not. At cloudstep.io we preach the ways of the DevOps gods and especially infrastructure as code for repeatable and standardised deployments. This way everything is uniform and everything gets a TAG!

I ran into an issue recently where I would build an EC2 instance and capture the operating system into an AMI as part of a CloudFormation stack. This AMI would then be used as part of a launch configuration and subsequent auto scaling group. The original EC2 instance had every tag needed across all parts that make up the virtual machine including:

  • EBS root volume
  • EBS data volumes
  • Elastic Network Interfaces (ENI)
  • EC2 Instance itself

When deploying my auto scaling group all the user level tags I’d applied had been removed from the volumes and ENI. This caused a few issues:

  1. EBS volumes couldn’t be tagged for billing.
  2. EBS volumes couldn’t be snapped based on tag level policies in Lifecycle Manager.
  3. Objects didn’t have a ‘Name’ tag which made it hard in the console to understand which virtual machine instance the object belonged too.

There are two methods I derived to add my tags back that I’ll share with you. The tags needed to be added upon launch of the instance when the auto scaling group added a server. The methods I used were:

  1. The auto scaling group has a Launch Configuration where the ‘User data’ field runs a script block at startup.
  2. Initiate a Lambda whenever CloudTrail logged an API reference of a launch event of an instance using CloudWatch.

Tagging with the User Data property and PowerShell

User data is simply:

When you launch an instance in Amazon EC2, you have the option of passing user data to the instance that can be used to perform common automated configuration tasks and even run scripts after the instance starts. You can pass two types of user data to Amazon EC2: shell scripts and cloud-init directives.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html

Try {
 # Use the metadata service to discover which instance the script is running on
 $InstanceId = (Invoke-WebRequest '169.254.169.254/latest/meta-data/instance-id').Content
 $AvailabilityZone = (Invoke-WebRequest '169.254.169.254/latest/meta-data/placement/availability-zone').Content
 $Region = $AvailabilityZone.Substring(0, $AvailabilityZone.Length -1)
 $mac = (Invoke-WebRequest '169.254.169.254/latest/meta-data/network/interfaces/macs/').content
 $URL = "169.254.169.254/latest/meta-data/network/interfaces/macs/"+$mac+"/interface-id"
 $eni = (Invoke-WebRequest $URL).content
# Get the list of volumes attached to this instance
 $BlockDeviceMappings = (Get-EC2Instance -Region $Region -Instance $InstanceId).Instances.BlockDeviceMappings
 $Tags = (Get-EC2Instance -Region $Region -Instance $InstanceId).Instances.tag

 }
Catch{Write-Host "Could not access the AWS API, are your credentials loaded?" -ForegroundColor Yellow}
$BlockDeviceMappings | ForEach-Object -Process {
        $volumeid = $_.ebs.volumeid # Retrieve current volume id for this BDM in the current instance
        # Set the current volume's tags
        $Tags | ForEach-Object -Process {
        If($_.Key -notlike "aws:*"){
        New-EC2Tag -Resources $volumeid -Tags @{ Key = $_.Key ; Value = $_.Value } # Add tag to volume
        }
        }
}
# Set the current nics tag
$Tags | ForEach-Object -Process {
  If($_.Key -notlike "aws:*"){
        New-EC2Tag -Resources $eni -Tags @{ Key = $_.Key ; Value = $_.Value } # Add tag to eni
  }
}


This script block is great and works a treat with newly created instances from an Amazon Marketplace AMI’s e.g. a vanilla Windows Server 2019 template. The launch configuration would apply the script as a part of the cfn-init function at startup. Unfortunately I’d already used the cfn-init function as part of the original image customisation and capture, the cfn-init would not re-run and didn’t execute this script block. So back to the drawing board in my scenario.

Tagging with CloudWatch and Lambda Function

The second solution was to create a Lambda function and trigger it using an Amazon CloudWatch Events rule. The Instance ID is parsed from the CloudWatch event in JSON to the Lambda function.

Here is the Lambda function that is written in python2.7 and leverages the boto3 and JSON modules.



from __future__ import print_function
import json
import boto3
def lambda_handler(event, context):
  print('Received event: ' + json.dumps(event, indent=2))
  ids = []
  try:
      ec2 = boto3.resource('ec2')
      items = event['detail']['responseElements']['instancesSet']['items']
      for item in items:
        ids.append(item['instanceId'])
      base = ec2.instances.filter(InstanceIds=ids)
      for instance in base:
        ec2tags = instance.tags
        tags = [n for n in ec2tags if not n["Key"].startswith("aws:") ]
        print('   original tags:', ec2tags)
        print('   applying tags:', tags)
        for volume in instance.volumes.all():
          print('    volume:', volume)
          if volume.tags != ec2tags:
            volume.create_tags(DryRun=False, Tags=tags)
        for eni in instance.network_interfaces:
          print('    eni:', eni)
          eni.create_tags(DryRun=False, Tags=tags)
      return True
  except Exception as e:
    print('Something went wrong: ' + str(e))
    return False   



The OVF package is invalid and cannot be deployed – In the trenches with the AWS Discovery Connector

I was working with a customer recently who had trouble deploying the AWS Discovery Connector to their VMware environment. AWS offer this appliance as an OVA file. For those who aren’t aware, OVA (Open Virtualisation Archive) is an open standard used to describe virtual infrastructure to be deployed on a hypervisor of your choice. Typically speaking, these files are hashed with an algorithm to ensure that the contents of the files are not changed or modified in transit (prior to being deployed within your own environment.)

At the time of writing, AWS currently offer this appliance hashed in two flavours… MD5 or SHA256. All sounds quite reasonable right?

  • Download the OVA with a hash of your choice
  • Deploy to VMware.
  • Profit???

Wrong! I was surprised to receive an email from my customer stating that their deployment had failed (see below.)

There’s a small clue here…

The Solution

My immediate response was to fire up google and do some reading. Surely someone had blogged about this before? After all…. I am no VMware expert. I finally arrived at the VMware knowledge base, where I began sifting through supported ciphers for ESX/ESXi and vCenter. The findings were quite interesting, you can find them summarised below:

  • If your VMware cluster consists of hosts which run ESX/ESXi 4.1 or less (hopefully no one) – MD5 is supported
  • If your VMware cluster consists of hosts which run ESX/ESXi 5.x or 6.0 – SHA1 is supported
  • If your VMware cluster consists of hosts which run ESX/ESXi 6.5 or greater – SHA256 is supported

In the particular environment I was working in, the customer had multiple environments with a mix of 5.5 and 6.0 physical hosts. As I was short on time, I had no real way of telling if the MD5 hashed image would deploy on a newer environment. I also don’t have a VMware development environment to test this approach on (by design.)

After a few more minutes of googling, I was rewarded with another VMware knowledge base article. VMware provide a small utility called “OVFTool.” This applications sole purpose in life is to convert OVA files (you guessed it) ensuring that they are hashed with supported cipher of your choice. In my scenario, the file was re-written using the supported SHA1 cipher. All of this was triggered from a windows command line by executing:

ovftool.exe –shaAlgorithm=SHA1 <source image.ova> <destination image.ova>

After this I was able to successfully deploy the AWS Discovery Connector OVA as expected using my freshly minted image.

You can grab a copy of the tool – here

You can read more about VMware supported ciphers – here

Finally, I should call out that this solution is not specific to deploying the AWS Discovery Connector. Consider this approach if you are experiencing similar symptoms deploying another OVA based appliance in your VMware environment.