Automating Azure Site Recovery with PowerShell

In a recent consulting engagement, I needed to perform a large-scale migration of a company’s virtual machine (VM) fleet from an on-premises datacenter to Microsoft Azure. Think about what that actually means – we’re picking up compute workloads that are (in most cases) essential for day-to-day business operations and re-homing them in a new slice of a Microsoft-managed datacenter. Having come out the other end and completed the project, I thought I would shed some light on the tools I used and developed to make that vision a reality.

In this particular engagement, the customer is a large enterprise with a VMware environment servicing 300+ VMs. When we consider the business value behind each of these compute workloads, it quickly becomes apparent that selecting the right tooling and approach is vital to delivering a successful outcome whilst causing as little disruption to the business as possible.

Enter Azure Site Recovery

Azure Site Recovery (ASR) is Microsoft’s Disaster Recovery as a Service solution, which can replicate workloads running on physical and virtual machines from one location to a secondary location. As a disaster recovery platform, it allows workloads to fail over and subsequently fail back in a disaster scenario. ASR can also be used to migrate workloads to Azure by completing the failover without failing back.

Why Should We Automate Azure Site Recovery?

I like to automate things like this because a computer following a written process will always perform it the same way, so we know what the output will look like. In this case, that means a like-for-like VM that looks and feels as it did in its previous life, before being migrated. When we introduce an operator into the mix, we also introduce the human element. Things like resource names and groups, VM specs, disk settings, network location and IP addresses all need to be configured for each VM migration, and every one of them is an opportunity for human error.

To have success running migrations at scale, it is important to use known, well-tested, repeatable processes. For me, that means figuring out the best way to use a tool and then automating it so that you (or anyone else) can use it the right way, every time, easily.

How Can We Automate Azure Site Recovery?

I use PowerShell as an automation tool on top of ASR for a couple of reasons. The main one is that Microsoft provide and maintain a set of PowerShell modules for interacting with Azure resources, including ASR. This is known as the Az module – see our previous post on the Azure PowerShell Az module for a deeper explanation. PowerShell can also run almost anywhere thanks to PowerShell Core, a cross-platform edition of PowerShell that runs on Windows, macOS and Linux.

Armed with PowerShell and the Az module, we can get cracking on the fun stuff – bashing out some lines of code. My approach and methodology here usually involve a fair bit of back and forth, playing with the commands available to me and learning the best ways to drive them. Importantly, you don’t want to do this with live data; setting up an isolated sandpit with dummy data will go a long way towards letting you upskill on the tools while making sure your production systems remain untouched.

Once I’ve got a handle on the commands that are needed and how they fit together, I make an MVP (minimum viable product) script. The idea here is to demonstrate that it’s possible for the tooling to work (it’s not pretty, but it works). To paint a picture, one of my MVP scripts will usually have a bunch of variables at the start declaring all the information that is required: things like VM name, source location, target location and so on. From here, I usually design the script to be run line by line. This is mostly for simplicity’s sake; complexity can come later, right now it just needs to be as simple as possible. At this stage, we can demonstrate our capability to perform a migration with PowerShell. A quick example of this is setting up a replication job – preceding this line, I run a series of Get statements to build up all the variables seen in the command below.
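To illustrate, the Get statements I’m referring to typically look something like the sketch below. This isn’t lifted from the real script – the vault, fabric, policy, network and VM names are all placeholders – but it shows the shape of the variables that feed the replication command.

$vault = Get-AzRecoveryServicesVault -Name "MigrationVault"
Set-AzRecoveryServicesAsrVaultContext -Vault $vault

# Source side: the configuration server fabric, its protection container and the VM to replicate
$fabric = Get-AzRecoveryServicesAsrFabric -FriendlyName "OnPremConfigServer"
$container = Get-AzRecoveryServicesAsrProtectionContainer -Fabric $fabric
$replicationPolicy = Get-AzRecoveryServicesAsrProtectionContainerMapping -ProtectionContainer $container -Name "VMwareToAzurePolicy"
$vm = Get-AzRecoveryServicesAsrProtectableItem -ProtectionContainer $container -FriendlyName "APP-SERVER-01"
$ProcessServer = $fabric.FabricSpecificDetails.ProcessServers[0]
$Account = $fabric.FabricSpecificDetails.RunAsAccounts[0]

# Target side: where the VM will land in Azure
$ResourceGroup = Get-AzResourceGroup -Name "rg-migration-target"
$LogStorageAccount = Get-AzStorageAccount -ResourceGroupName "rg-migration-target" -Name "migrationlogsa"
$vnet = Get-AzVirtualNetwork -Name "vnet-target" -ResourceGroupName "rg-migration-target"
$failoverSubnetName = "subnet-workloads"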

$replicationJob = New-AzRecoveryServicesAsrReplicationProtectedItem -VMwareToAzure `
    -ProtectableItem $vm `
    -Name (New-Guid).Guid `
    -ProtectionContainerMapping $replicationPolicy `
    -ProcessServer $ProcessServer `
    -Account $Account `
    -RecoveryResourceGroupId $ResourceGroup.ResourceId `
    -LogStorageAccountId $LogStorageAccount.Id `
    -RecoveryAzureNetworkId $vnet.Id `
    -RecoveryAzureSubnetName $failoverSubnetName

From here, I like to put some lipstick on it and make it feel like a more polished product. Personally, I like to use a series of questions and prompts to generate the variables I described in the last paragraph. I also add status checks and prompts for the operator to continue. An example of this is performing a failover: once the operator confirms they are ready to begin, the script executes the failover, then continuously checks the failover job status until it has completed and, once complete, tells the user running the script that it’s done. Here is an example of a status check that I wrote for checking the progress of a failover job.

do {
    Clear-Host
    Write-Host "======== Monitoring Failover ========"
    Write-Host "This will refresh every 5 seconds."
    try {
        $failoverJob = Get-ASRJob -Name $failoverJob.Name
    }
    catch {
        Write-Host -ForegroundColor Red "ERROR - Unable to get status of Failover job"
        Write-Host -ForegroundColor Red "ERROR - " + $_
        log "ERROR" "Unable to get status of Failover job"
        log "ERROR" $_
        exit 
    }
    Write-Host "Failover status for $($VMName.FriendlyName) is $($failoverJob.State)"
    Start-Sleep -Seconds 5
} while (($failoverJob.State -eq "InProgress") -or ($failoverJob.State -eq "NotStarted"))
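For context, a loop like this normally sits right after an operator confirmation and the failover kick-off itself. A rough sketch of that preceding step might look like the following (assuming $rpi already holds the replication protected item and an unplanned failover is what your scenario calls for):

# Give the operator a final chance to back out
$answer = Read-Host "Ready to start the failover for $($rpi.FriendlyName)? (y/n)"
if ($answer -ne "y") { Write-Host "Aborting at operator request."; exit }

# Kick off the failover towards Azure and capture the job so the loop above can monitor it
$failoverJob = Start-AzRecoveryServicesAsrUnplannedFailoverJob -ReplicationProtectedItem $rpi -Direction PrimaryToRecovery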

Once you get this far, the sky is the limit. Like most things, it can evolve over time. I like to add error handling and logging so we can gracefully handle a failure and have an audit trail of operations. I take this approach with most of the processes I automate; I think it’s important to start small and work up from there.
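The log calls in the earlier snippet refer to a small helper function that isn’t shown in this post. A minimal sketch of something equivalent, writing timestamped lines to a file next to the script, could be as simple as:

function log {
    param([string]$Level, [string]$Message)
    # Append a timestamped entry to a plain text log file for the audit trail
    $line = "{0} [{1}] {2}" -f (Get-Date -Format "yyyy-MM-dd HH:mm:ss"), $Level, $Message
    Add-Content -Path "$PSScriptRoot\migration.log" -Value $line
}

log "INFO" "Replication job submitted"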


Azure PowerShell ‘Az’ Module

https://azure.microsoft.com/en-us/blog/azure-powershell-az-module-version-1/

Microsoft released a new PowerShell module specifically for Azure late last year, called “Az”. On the plus side, Az ensures that both Windows PowerShell and PowerShell Core users can get the latest Azure tooling on every platform, be it Windows PowerShell or PowerShell Core on my preferred operating system, macOS.

Microsoft state that the Az module will be updated on a two-week cadence and will always be up to date, so that’s nice.

I’d resisted upgrading to the new Az module until the completion of a recent customer engagement, so as to avoid any complexity that a switch in modules might introduce. Call me risk averse, I know. . . So now that the project is complete, I’m excited to make the switch.

OK, so how do I upgrade from AzureRM to Az?

If you’ve been using PowerShell for Azure, you undoubtedly already have the AzureRM module installed. So it’s out with the old and in with the new. . . To accomplish this task I used some simple PowerShell to find the installed modules with a name like AzureRM and then uninstall them. Here is the code I lazily leeched from my colleague Arran Peterson after he successfully uninstalled the old modules.

Remove all the old AzureRM modules first . . .

$azurerm = Get-Module -ListAvailable | Where-Object { $_.Name -like "AzureRM*" }

ForEach ($module in $azurerm) {
    $name = $module.Name
    $version = $module.Version
    Uninstall-Module -Name $name -MaximumVersion $version -Force
}
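Once the loop finishes, a quick check should come back empty (just a sanity check I like to run, not strictly required):

Get-Module -ListAvailable -Name AzureRM*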

At the time of writing, the latest version available from the PowerShell Gallery is 1.5.0: https://www.powershellgallery.com/packages/Az/1.5.0

To install the module simply open PowerShell on your machine and enter:

Install-Module -Name Az

Boom, it’s that easy. . .
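From there, a quick sanity check confirms the new module is working (assuming you have an Azure account handy to sign in with):

Connect-AzAccount
Get-AzSubscription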

OK great, but won’t this break all my scripts?

When I first heard of the new module and the change in cmdlet namespace, my first reaction was shock. . . I’ve produced loads of PowerShell for customers over the past couple of years that uses the “AzureRM” named cmdlets.

Microsoft state on their PowerShell Az blog that ‘Users are not required to migrate from AzureRM, as AzureRM will continue to be supported. However, it is important to note that all new Azure PowerShell features will appear only in ‘Az’.’ So my old stuff would continue to work, but they also state ‘Az and AzureRM cannot be executed in the same PowerShell session.’ So I’d need to make customers aware that they cannot mix AzureRM and Az cmdlets within a single session.

This all sounded like a bunch of annoying conversations and explanations I’d be faced with. I began to feel frustrated and questioned why Microsoft saw the need to rename all of their cmdlets. I could feel a hate blog brewing. . .

However, as I read more I came across a diamond in the rough. . . AzureRM aliases. Ah, someone at Microsoft has considered my pain. . . I could feel the catharsis as I read the official migration guide (https://github.com/Azure/azure-powershell/blob/master/documentation/migration-guides/Az.1.0.0-migration-guide.md) and came across the following statement: ‘To make the transition to these new cmdlet names simpler, Az introduces two new cmdlets, Enable-AzureRmAlias and Disable-AzureRmAlias. Enable-AzureRmAlias creates aliases from the older cmdlet names in AzureRM to the newer Az cmdlet names. The cmdlet allows creating aliases in the current session, or across all sessions by changing your user or machine profile.’
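In practice, that means a one-liner before running any legacy scripts. Scoped to the current user it persists across sessions, and there’s a matching cmdlet to switch it off again:

Enable-AzureRmAlias -Scope CurrentUser
# ...run your existing AzureRM-flavoured scripts...
Disable-AzureRmAlias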

What Now?

It’s time for a coffee, then back to more PowerShell. . . Happy days. . .


Tagging EC2 EBS Volumes in Auto Scaling Groups

Tagging becomes a huge part of your life when you’re in the public cloud. Metadata is thrown around like hotcakes, and why not? At cloudstep.io we preach the ways of the DevOps gods, especially infrastructure as code for repeatable and standardised deployments. This way everything is uniform and everything gets a TAG!

I ran into an issue recently where I would build an EC2 instance and capture the operating system into an AMI as part of a CloudFormation stack. This AMI would then be used as part of a launch configuration and subsequent auto scaling group. The original EC2 instance had every tag needed across all parts that make up the virtual machine including:

  • EBS root volume
  • EBS data volumes
  • Elastic Network Interfaces (ENI)
  • EC2 Instance itself

When deploying my auto scaling group all the user level tags I’d applied had been removed from the volumes and ENI. This caused a few issues:

  1. EBS volumes couldn’t be tagged for billing.
  2. EBS volumes couldn’t be snapped based on tag level policies in Lifecycle Manager.
  3. Objects didn’t have a ‘Name’ tag, which made it hard to tell in the console which virtual machine instance the object belonged to.

There are two methods I came up with to add my tags back, which I’ll share with you. The tags needed to be added when the auto scaling group launched a new instance. The methods I used were:

  1. Use the auto scaling group’s launch configuration, where the ‘User data’ field runs a script block at startup.
  2. Trigger a Lambda function via a CloudWatch Events rule whenever CloudTrail logs the API call for an instance launch.

Tagging with the User Data property and PowerShell

User data is simply:

When you launch an instance in Amazon EC2, you have the option of passing user data to the instance that can be used to perform common automated configuration tasks and even run scripts after the instance starts. You can pass two types of user data to Amazon EC2: shell scripts and cloud-init directives.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
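One detail worth noting for Windows instances: the user data script needs to be wrapped in <powershell> tags so that EC2Launch/EC2Config executes it as PowerShell at startup, along these lines (the comment stands in for the full tagging script that follows):

<powershell>
# ...tagging script goes here...
</powershell>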

Try {
    # Use the instance metadata service to discover which instance the script is running on
    $InstanceId = (Invoke-WebRequest -Uri 'http://169.254.169.254/latest/meta-data/instance-id' -UseBasicParsing).Content
    $AvailabilityZone = (Invoke-WebRequest -Uri 'http://169.254.169.254/latest/meta-data/placement/availability-zone' -UseBasicParsing).Content
    $Region = $AvailabilityZone.Substring(0, $AvailabilityZone.Length - 1)
    $mac = (Invoke-WebRequest -Uri 'http://169.254.169.254/latest/meta-data/network/interfaces/macs/' -UseBasicParsing).Content
    $URL = "http://169.254.169.254/latest/meta-data/network/interfaces/macs/" + $mac + "/interface-id"
    $eni = (Invoke-WebRequest -Uri $URL -UseBasicParsing).Content
    # Get the list of volumes and the tags attached to this instance
    $BlockDeviceMappings = (Get-EC2Instance -Region $Region -Instance $InstanceId).Instances.BlockDeviceMappings
    $Tags = (Get-EC2Instance -Region $Region -Instance $InstanceId).Instances.tag
}
Catch {
    Write-Host "Could not access the AWS API, are your credentials loaded?" -ForegroundColor Yellow
}

# Copy the instance-level tags to each attached EBS volume
$BlockDeviceMappings | ForEach-Object -Process {
    $volumeid = $_.ebs.volumeid # Retrieve the volume id for this block device mapping
    $Tags | ForEach-Object -Process {
        If ($_.Key -notlike "aws:*") {
            New-EC2Tag -Resources $volumeid -Tags @{ Key = $_.Key; Value = $_.Value } # Add tag to volume
        }
    }
}

# Copy the instance-level tags to the network interface
$Tags | ForEach-Object -Process {
    If ($_.Key -notlike "aws:*") {
        New-EC2Tag -Resources $eni -Tags @{ Key = $_.Key; Value = $_.Value } # Add tag to ENI
    }
}


This script block is great and works a treat with newly created instances from an Amazon Marketplace AMI, e.g. a vanilla Windows Server 2019 template. The launch configuration applies the script as part of the cfn-init function at startup. Unfortunately, I’d already used cfn-init as part of the original image customisation and capture, so cfn-init would not re-run and didn’t execute this script block. So, back to the drawing board in my scenario.

Tagging with CloudWatch and Lambda Function

The second solution was to create a Lambda function and trigger it using an Amazon CloudWatch Events rule. The instance ID is parsed from the CloudWatch event JSON that is passed to the Lambda function.
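For reference, the CloudWatch Events rule needs to match the CloudTrail record for the RunInstances API call. I haven’t reproduced the exact rule from this environment here, but a pattern along these lines does the job:

{
  "source": ["aws.ec2"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["ec2.amazonaws.com"],
    "eventName": ["RunInstances"]
  }
}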

Here is the Lambda function, written in Python 2.7 and leveraging the boto3 and json modules.



from __future__ import print_function
import json
import boto3


def lambda_handler(event, context):
    print('Received event: ' + json.dumps(event, indent=2))
    ids = []
    try:
        ec2 = boto3.resource('ec2')
        items = event['detail']['responseElements']['instancesSet']['items']
        for item in items:
            ids.append(item['instanceId'])
        base = ec2.instances.filter(InstanceIds=ids)
        for instance in base:
            ec2tags = instance.tags
            tags = [n for n in ec2tags if not n["Key"].startswith("aws:")]
            print('   original tags:', ec2tags)
            print('   applying tags:', tags)
            for volume in instance.volumes.all():
                print('    volume:', volume)
                if volume.tags != ec2tags:
                    volume.create_tags(DryRun=False, Tags=tags)
            for eni in instance.network_interfaces:
                print('    eni:', eni)
                eni.create_tags(DryRun=False, Tags=tags)
        return True
    except Exception as e:
        print('Something went wrong: ' + str(e))
        return False



Add VC Accounts to Microsoft Teams Channels with Azure Automation

At cloudstep.io® HQ, Microsoft Teams is a big part of how we organise our digital asset structure in the business. We are a consulting firm by trade; as new prospects become paying customers, we add them as a team. The default General channel is used for administration and accounts, and additional channels are created per project name or scope of works. We find ourselves no longer needing to go into the dark corners of SharePoint administration (commonly referred to in the office as ‘SwearPoint!’). We have adopted Microsoft Teams as our preferred web, audio and video conferencing platform for internal and external meetings. Our board room video conferencing unit runs a full version of Windows 10 and Microsoft Teams, set up as a ‘do it yourself’ alternative to the off-the-shelf room systems. The VC unit requirements we had were:

  • cloudstep.io®, our web application, uses a full desktop browser experience.
  • Mouse and keyboard are preferred for web navigation inside the app.
  • A full OS on the VC unit is preferred, to eliminate employees having to BYOD and connect either physically or wirelessly for screen presentation.
  • We can connect to third-party conferencing platforms (Zoom, WebEx, GoToMeeting, Chime etc.) by installing the add-ons for guest access directly on the machine for our partner-led meetings.
  • Wirelessly present our Macs, iPads, iPhones, Androids and Windows laptops.
  • We are all ‘power users’ and can handle the meeting join experience in the Microsoft Teams client without the need for the single ‘click-to-join’ button on the table that the Microsoft Teams Room (MTR) system provides via a touch device.

We have a boardroom account that has a 365 license so it can leverage the desktop tools. Windows 10 automatically logs in each morning and the Microsoft Teams client starts automatically. The bill of materials is, notably:

  • Intel NUC
  • Windows 10
    • Teams Client
    • Office 365 Pro Plus (Word, Excel, PowerPoint, OneNote)
    • Windows 10 Calendar (Connect to Office 365 Mailbox)
    • AirServer client (ChromeCast, MiraCast, AirPlay)
    • Chrome Browser
  • Office 365 user license
  • Logitech Meetup camera
  • Biggest screen we could fit in the room
  • Microsoft Bluetooth keyboard and mouse

The VC mailbox type is set to ‘room’, with the following settings to enhance the experience for scheduled meetings in the board room:


#Add tips when searching in Outlook
Set-Mailbox -Identity $VC -MailTip "This room is equipped to support native Teams & Skype for Business Meetings remember to add meeting invite details via the Teams outlook button in the ribbon." 

#Auto Accept
Set-CalendarProcessing -Identity $VC -AutomateProcessing AutoAccept -AddOrganizerToSubject $false -RemovePrivateProperty $false -DeleteComments $false -DeleteSubject $false -AddAdditionalResponse $true -AdditionalResponse "Your meeting is now scheduled and if it was enabled as a Teams Meeting will be available from the conference room client."
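A quick way to confirm the settings have taken effect is to read them back (nothing the setup depends on, just a sanity check):

Get-Mailbox -Identity $VC | Format-List MailTip
Get-CalendarProcessing -Identity $VC | Format-List AutomateProcessing,AddAdditionalResponse,AdditionalResponse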

This has worked well over the last 12 months. The only user experience problem we have had is when running a meeting from the VC unit: if the account isn’t a member of the team where the data being presented is stored, it cannot see or open the content. A simple solution for this is automation. We investigated two automation solutions available in the Microsoft services we already have:

  1. Flow (Office 365 Suite)
  2. Azure Automation (Azure Subscription)

Unfortunately option 1 didn’t have any native integration for triggers based on Office 365 group or team creation, so we resorted to a quick Azure PowerShell runbook that executes on a simple schedule. The steps the runbook needed to run were:

  1. Get a list of all the teams.
  2. Query them against the UnifiedGroup properties to find those where…
    1. AccessType equals ‘Public’
    2. CreationDate is within the last 2 days
  3. Check the newly created team’s group membership for the VC unit username.
  4. If it doesn’t exist, add the VC unit with the ‘Member’ role.

Here’s the runbook:
Write-verbose "Getting Credentials ..." -Verbose
$Credentials = Get-AutomationPSCredential -Name 'Admin-365'
Write-verbose  "Credential Imported : $($Credentials.UserName)" -Verbose

$o365Cred = New-Object System.Management.Automation.PSCredential ($Credentials.UserName, $Credentials.Password)
Write-verbose  "Credential Loaded : $($o365Cred.UserName)" -Verbose
Write-verbose 'Connecting to 365 ...' -Verbose
$Session = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri https://outlook.office365.com/powershell-liveid/ -Credential $Credentials -Authentication Basic -AllowRedirection
Write-verbose 'Importing UnifiedGroups PowerShell Commands ...' -Verbose
Import-PSSession -Session $Session -DisableNameChecking:$true -AllowClobber:$true | Out-Null
Write-verbose 'Connecting to Teams ...' -Verbose
Connect-MicrosoftTeams -Credential $Credentials

$creationdate = ((Get-Date).AddDays(-2))
$teams = get-team
#$groups = Get-UnifiedGroup |Where-Object {$_.WelcomeMessageEnabled -like "False" -and $_.AccessType -like "Public" -and $_.WhenCreated -ge $creationdate}
$TeamsOutput = @()
foreach ($Team in $Teams) {
    $UnifiedGroup = Get-UnifiedGroup -Identity $Team.GroupId
    if ($UnifiedGroup.AccessType -like "Public" -and $UnifiedGroup.WhenCreated -ge $creationdate) {
        Write-Verbose "Processing team named: $($UnifiedGroup.DisplayName)" -Verbose
        $VC = Get-TeamUser -GroupId $Team.GroupId | Where-Object {$_.User -like "user@domain.com"}
        If ($VC.count -eq 0) {
            Write-Verbose "VC not member, adding..." -Verbose
            Add-TeamUser -GroupId $Team.GroupId -User "user@domain.com" -Role Member
        }
        else {
            Write-Verbose "VC is member already" -Verbose
        }
    }

    $TeamsOutput += $UnifiedGroup
}
Write-verbose "Total teams processed for selection: $($TeamsOutput.Count)" -Verbose 

The result is simple

Additional member added via PowerShell

The next day, when the board room account is logged in, the Microsoft Teams client will have access to all the teams, channels, files, OneNote and apps. This is great for native Teams meetings, but also when we have customers in the board room without the need for an online meeting. The VC account can see the required teams and channel data to present on the physical display.

This solution doesn’t have to be for video conferencing units; you may have some standard members you want on all groups, or you might want to enforce a certain owner or member list.

Hello Microsoft Teams! Bye bye SwearPoint, may you remain in the background forever.


Using the AWS CLI for Process Automation

Amazon Web Services is a well-established cloud provider. In this blog, I am going to explore how we can interface with the orange cloud titan programmatically. First of all, let’s explore why we may want to do this. You might be thinking, “But hey, the folks at AWS have built a slick web interface which offers all the capability I could ever need.” Whilst this is true, repetitive tasks quickly become onerous. Additionally, manual repetition introduces the opportunity for human error. That sounds like something we should avoid, right? After all, many of the core tenets of the DevOps movement are built on these principles (“to increase the speed, efficiency and quality of software delivery”, amongst others).

From a technology perspective, we achieve this by establishing automated services. This presents a significant speed advantage as automated processes are much faster than their manual counterparts. The quality of the entire release process improves because steps in the pipeline become standardised, thus creating predictable outcomes.

Here at cloudstep, this is one of our core beliefs when operating a cloud infrastructure platform. Simply put, the portal is a great place to look around and check reporting metrics. However, any services should be provisioned as code, once again to realise efficiency and improve overall quality.

How do we go about this and what are some example use cases?

AWS provide an open source CLI bundle which enables you to interface directly with their public APIs. Typically speaking, this is done using a terminal of your choice (Linux shells, Windows Command Prompt, PowerShell, PuTTY, remotely… you name it, it’s there). Additionally, they also offer SDKs which provide a great starting point for developing applications on top of their services in many different languages (PowerShell, Java, .NET, JavaScript, Ruby, Python, PHP and Go).

So let’s get into it… The first thing you’ll want to do is walk through the process of aligning your operating environment with any mandatory prerequisites, then you can install the AWS CLI tools in a flavour of your choice. The process is well documented, so I won’t cover it off here.

Link – https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html

Once you have the tools installed, you will need to provide the CLI tools with a base level of configuration, which is stored in a profile of your choice. Running “aws configure” from a terminal of your choice is the fastest way to do this. Here you will provide IAM credentials to interface with your tenant, a default region and an output format. For the purpose of this example I’ve set my region to “ap-southeast-2” and my output format to “JSON”.

aws configure example

From here I could run “aws ec2 describe-instances” to validate that my profile had been defined correctly within the AWS CLI tools. The expected return is a list of the EC2 instances hosted within my AWS account, as shown below.

aws ec2 describe-instances example
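If the raw JSON is a bit much, a JMESPath query trims the output down to the essentials. This isn’t required for the validation step, just a handy variation:

aws ec2 describe-instances --query "Reservations[].Instances[].{Id:InstanceId,State:State.Name}" --output table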

This shouldn’t take more than 5 minutes to get you up and running. However, don’t stop here. The AWS CLI supports almost all of the capability found within the management portal. Therefore, if you’re in an operations role and your company is investing in AWS in 2019, you should spend some time learning how to interface with services such as DynamoDB, EC2, S3/Glacier, IAM, SNS and SWF using the AWS CLI.

Let’s have a look at a more practical example, whereby automating a simple task can potentially save you hours of time each year. As a Mac user (you’ve probably already picked up on that) I often need to fire up a Windows PC for Visual Studio or Visio. AWS is a great use case for this. I simply fire up my machine when I need it and shut it down when I’m done. I pay them a couple of bucks a month for some storage costs and some compute hours, and I’m a happy camper. Simple, right?

Let’s unpack it further. I am not only a happy camper, I’m also a lazy camper. Firing up my VM to do my day job means:

  • Opening my browser and navigating to the AWS management console
  • Authenticating to the console
  • Navigating to the EC2 service
  • Scrolling through a long list of instances looking for my jumpbox
  • Starting my VM
  • Waiting for the network interface to refresh so I can get the public IP for RDP purposes.

This is all getting too hard, right? All of this has to happen before I can even do my job, and sometimes I have to do it a few times each day. Maybe it’s time to practice what I preach? I could automate all of this using the AWS Tools for PowerShell, which would let me run a single script that saves me hours each year (employers love that). Whilst this example won’t necessarily increase the overall quality of my work, it does provide me with a predictable outcome every single time.

For a measly 20 lines of PowerShell, I was able to define an executable script which authenticates to the AWS EC2 service and checks the power state of the VM in question. If the VM is already running, it returns the connectivity details for my RDP client. If the VM is not running, it fires up my instance, waits for the NIC to refresh and then returns the connectivity details for my RDP client. I then have a script based on the same logic to shut down my VM to save money when I’m not using the service. All of this takes less than 5 seconds to execute.

PowerShell Automation Example
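The screenshot doesn’t lend itself to copy and paste, so here’s a rough sketch of the same idea using the AWS Tools for PowerShell. The instance ID is a placeholder, and the module plus a credential profile are assumed to be in place already:

# Placeholder for the jumpbox instance ID
$instanceId = "i-0123456789abcdef0"

# Check the current power state of the instance
$instance = (Get-EC2Instance -InstanceId $instanceId).Instances[0]

if ($instance.State.Name -ne "running") {
    # Start the instance and wait until it is running with a public IP assigned
    Start-EC2Instance -InstanceId $instanceId | Out-Null
    do {
        Start-Sleep -Seconds 5
        $instance = (Get-EC2Instance -InstanceId $instanceId).Instances[0]
    } until ($instance.State.Name -eq "running" -and $instance.PublicIpAddress)
}

# Hand back the connection details for the RDP client
Write-Host "RDP to $($instance.PublicIpAddress)"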

The AWS CLI tools provide an interface to interact with the cloud provider programmatically. In this simple example we looked at automating a manual process which has the potential to save hours of time each year whilst also ensuring a predictable outcome for each execution. Each of the serious public cloud players offers similar capability. If you are looking to increase your overall efficiency, improve the quality of your work and automate monotonous tasks, consider investing some effort into learning how to interface with your cloud provider of choice programmatically. You will be surprised how many repetitive tasks you can bowl over when you maximise the usage of the tools available to you.