Understanding Undocumented ARM Oddities

Over the past year I’ve been working heavily with Azure Resource Manager (ARM) templates to create safe, reusable and consistent deployments of virtual infrastructure. When producing ARM templates, it’s important to understand what resource types are available and what values to use in your template. I always use the Azure Template Reference to understand exactly how to define a certain resource type. However, sometimes you’ll run into situations where the Azure Template Reference doesn’t cover something that can be done in the Azure portal. So, how do we figure out how to template it if it’s not in the reference documentation?

Export Templates – Perhaps the quickest way to solve this problem is to use the native ‘Export Template’ blade in the Azure portal. For this, you will need to deploy your resource and configure it as you’d like using the Azure portal. Once your resource is ready, open the Export Template blade on it. This generates an ARM template based on the current running state of your resource. From here, you can inspect the generated template and see whether your undocumented settings or configuration have been captured.

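If you’d rather stay on the command line, the same export can be driven with the Az PowerShell module. This is a hedged sketch – the resource group name and output path are placeholders:

# Export the ARM template for a whole resource group; names are placeholders.
# Export-AzResourceGroup can also be scoped to individual resources via -Resource.
Export-AzResourceGroup -ResourceGroupName "my-resource-group" -Path ".\exported-template.json"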

Azure Resource Explorer – Next stop is the Azure Resource Explorer, which provides a visual interface for examining the Azure APIs. With Azure Resource Explorer, you can explore the current running state of an Azure environment in JSON format. This can be very useful when attempting to reverse engineer an existing resource or environment. While Azure Resource Explorer doesn’t return data that can be used directly in an ARM template, it can be used as a mechanism to learn the syntax of resource properties that are missing from the Azure Template Reference.

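You can pull much the same JSON view from PowerShell, too. A rough equivalent, with illustrative resource names:

# Fetch the current state of a resource, including its properties, as JSON.
# Resource group and site names are illustrative.
Get-AzResource -ResourceGroupName "my-rg" -Name "my-app" -ExpandProperties |
    ConvertTo-Json -Depth 10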

When issues are encountered with undocumented resources, typically the fastest way to resolve them is to deploy the resources manually using the Azure portal (clicky clicky), then reverse engineer the ARM template with a combination of the Export Template function and Azure Resource Explorer. Going down the route of doing everything in ARM templates can lead to a lot of trial and error before you get a fully automated deployment – for now, at least.

Cheers,

Joel


Undocumented ARM Oddities – .Net Core App Services

Every once in a while when working with ARM templates, you come across something that is missing from the official Microsoft ARM template reference. In my case yesterday, I was looking to update the configuration of an Azure App Service to use the DotNetCore stack (rather than .NET 4.8).

While I initially thought this would be a quick job – simply look up the ARM reference and make the required changes – I found that there was nothing about DotNetCore in the ARM reference. Funnily enough, there is a value for “netFrameworkVersion”, but don’t be deceived: if you are looking to set up DotNetCore, this value is not for you (it’s for regular .NET only).

To better understand the problem, I Clickly Clicky’d an App Service and configured it for DotNetCore (Clickly Clicky is our lingo for deploying infrastructure using the portal rather than a CLI or template). With this, I attempted my usual trick of exporting a template and inspecting the JSON it spits out. However, much to my amazement, I couldn’t see any reference to DotNetCore in there at all.

In the end, it was the Azure Resource Explorer that came to my rescue. I used the tool to explore the example I created and found a value called “CURRENT_STACK” in the properties of the “Microsoft.Web/sites/config” resource type.
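
If you want to pull that same value back without the Resource Explorer UI, the app’s metadata collection can be listed through the management API. A hedged sketch using Invoke-AzRestMethod, with placeholder IDs (note the list operation is a POST):

# List the web app's metadata collection, which is where CURRENT_STACK lives.
# Subscription, resource group and app names below are placeholders.
$path = "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.Web/sites/my-app/config/metadata/list?api-version=2020-06-01"
(Invoke-AzRestMethod -Method POST -Path $path).Content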

After playing with this for a while, I was able to translate it into my ARM template with the following JSON.

{
    "type": "Microsoft.Web/sites",
    "name": "[variables('WebSiteName')]",
    "apiVersion": "2020-06-01",
    "location": "[resourceGroup().location]",
    "kind": "app",
    "properties": {
        "siteConfig": {
            "metadata": [{
                "name": "CURRENT_STACK",
                "value": "dotnetcore"
            }]
        }
    }
}
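
To sanity-check the result, the template can be deployed with the Az module and the stack verified in the portal afterwards. Resource group and file names here are placeholders:

# Deploy the template containing the CURRENT_STACK metadata; names are placeholders.
New-AzResourceGroupDeployment -ResourceGroupName "my-rg" -TemplateFile ".\webapp.json"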

Hopefully this helps anyone who encounters this problem.

Cheers,

Joel


Automating Azure Site Recovery with PowerShell

In a recent consulting engagement, I needed to perform a large-scale migration of a company’s virtual machine (VM) fleet from an on-premises datacenter to Microsoft Azure. Thinking about what that actually means – we’re picking up many compute workloads that are (in most cases) essential for day-to-day business operation and re-homing them to a new slice of a Microsoft-managed datacenter. After coming out the other end and completing the project, I thought I would shed some light on the tools that I used and developed to make the vision a reality.

In this particular engagement, the customer is a large enterprise with a VMware environment servicing 300+ VMs. When we consider the business value behind each of these compute workloads, it quickly becomes apparent that selecting the right tooling and approach is vital to deliver a successful outcome whilst causing as little disruption to the business as possible.

Enter Azure Site Recovery

Azure Site Recovery (ASR) is Microsoft’s Disaster Recovery as a Service solution which can replicate workloads running on physical and virtual machines from one location to a secondary location. As a disaster recovery platform, it’s possible for workloads to failover and successfully failback in a disaster scenario. ASR can also be used to migrate workloads to Azure by completing the failover component without failing back.

Why Should We Automate Azure Site Recovery?

I like to automate things like this because a computer following a written process will always perform it the same way – we know what the output will look like. In this case, that means a like-for-like VM that looks and feels as it did in its previous life before being migrated. When we introduce an operator into the mix, we also introduce the human element. Things like resource names and groups, VM specs, disk settings, network location and IP addresses all need to be configured for each VM migration.

To have success running migrations at scale, it is important to use known, well-tested, repeatable processes. For me, that means figuring out the best way to use a tool, then automating it so that you (or anyone else) can use it the right way, every time, easily.

How Can We Automate Azure Site Recovery?

I use PowerShell as an automation tool on top of ASR for a couple of reasons. The main one is that Microsoft provides and maintains a set of PowerShell modules for interacting with Azure resources, including ASR, known as the Az module – see our previous post on the Azure PowerShell Az module for a deeper explanation. PowerShell can also run almost anywhere thanks to PowerShell Core, a cross-platform edition of PowerShell that runs on Windows, macOS and Linux.
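
Getting started is only a couple of lines – installing the Az module and signing in:

# Install the Az module for the current user and authenticate to Azure.
Install-Module -Name Az -Scope CurrentUser
Connect-AzAccount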

Armed with PowerShell and the Az module, we can get cracking on the fun stuff – bashing out some lines of code. My approach here usually involves a fair bit of back and forth, playing with the commands available to me and learning the best ways to drive them. Importantly, you don’t want to do this with live data; setting up an isolated sandpit with dummy data will go a long way in letting you build up knowledge of the tools while making sure your production systems remain untouched.
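
A good first step in that sandpit is simply discovering what the module offers, something like:

# List every ASR-related cmdlet shipped in Az.RecoveryServices,
# then pull up the built-in examples for one that looks interesting.
Get-Command -Module Az.RecoveryServices -Name *Asr*
Get-Help New-AzRecoveryServicesAsrReplicationProtectedItem -Examples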

Once I’ve got a handle on the commands that are needed and how they fit together, I make an MVP (minimum viable product) script. The idea here is to demonstrate that it’s possible for the tooling to work (it’s not pretty, but it works). To paint a picture, one of my MVP scripts will usually have a bunch of variables at the start declaring all the info that is required – things like VM name, source location, target location and so on. From here, I usually design the script to be run line by line; this is mostly for simplicity’s sake – complexity can come later, for now it just needs to be as simple as possible. At this stage, we can demonstrate our capability to perform a migration with PowerShell. A quick example of this is setting up a replication job; preceding this line, I run a series of get statements to build up all the variables seen in the command below.
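
To give a feel for those get statements, here’s a hedged sketch of the lookups that could feed the replication command below – the cmdlets are real Az.RecoveryServices commands, but every name and value is illustrative:

# Point the ASR cmdlets at the right Recovery Services vault first.
$vault = Get-AzRecoveryServicesVault -Name "MigrationVault"
Set-AzRecoveryServicesAsrVaultContext -Vault $vault

# Look up the source environment (fabric), its protection container and the VM.
$fabric = Get-AzRecoveryServicesAsrFabric -FriendlyName "OnPremVMware"
$container = Get-AzRecoveryServicesAsrProtectionContainer -Fabric $fabric
$vm = Get-AzRecoveryServicesAsrProtectableItem -ProtectionContainer $container -FriendlyName "MyAppServer01"
$replicationPolicy = Get-AzRecoveryServicesAsrProtectionContainerMapping -ProtectionContainer $container -Name "MyPolicyMapping"

# Process server and run-as account come off the VMware fabric details.
$ProcessServer = $fabric.FabricSpecificDetails.ProcessServers[0]
$Account = $fabric.FabricSpecificDetails.RunAsAccounts[0]

# Target-side resources in Azure.
$ResourceGroup = Get-AzResourceGroup -Name "migrated-vms-rg"
$LogStorageAccount = Get-AzStorageAccount -ResourceGroupName "asr-cache-rg" -Name "asrcachestorage"
$vnet = Get-AzVirtualNetwork -ResourceGroupName "network-rg" -Name "prod-vnet"
$failoverSubnetName = "workload-subnet"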

$replicationJob = New-AzRecoveryServicesAsrReplicationProtectedItem -VMwareToAzure `
    -ProtectableItem $vm `
    -Name (New-Guid).Guid `
    -ProtectionContainerMapping $replicationPolicy `
    -ProcessServer $ProcessServer `
    -Account $Account `
    -RecoveryResourceGroupId $ResourceGroup.ResourceId `
    -LogStorageAccountId $LogStorageAccount.Id `
    -RecoveryAzureNetworkId $vnet.Id `
    -RecoveryAzureSubnetName $failoverSubnetName

From here, I like to put some lipstick on it and make it feel like a more polished product. Personally, I like to use a series of questions and prompts to generate the variables I described in the last paragraph. I also add status checks and operator prompts to continue. An example of this is performing a failover: once the operator confirms they are ready to begin, the command executes the failover, then continuously checks the failover job status until it has completed, and finally tells the user running the script that it’s complete. Here is an example of a status check that I wrote for checking the progress of a failover job.

do {
    Clear-Host
    Write-Host "======== Monitoring Failover ========"
    Write-Host "This will refresh every 5 seconds."
    try {
        # Refresh the job object so we see the latest state
        $failoverJob = Get-ASRJob -Name $failoverJob.Name
    }
    catch {
        Write-Host -ForegroundColor Red "ERROR - Unable to get status of Failover job"
        Write-Host -ForegroundColor Red ("ERROR - " + $_)
        # 'log' is a custom logging helper defined elsewhere in the script
        log "ERROR" "Unable to get status of Failover job"
        log "ERROR" $_
        exit
    }
    Write-Host "Failover status for $($VMName.FriendlyName) is $($failoverJob.State)"
    Start-Sleep -Seconds 5
} while (($failoverJob.State -eq "InProgress") -or ($failoverJob.State -eq "NotStarted"))
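
For completeness, the $failoverJob variable that loop polls would come from kicking off the failover itself – roughly like this, where $rpi is a replication protected item looked up beforehand (lookup omitted here):

# Start an unplanned failover for a replicated item and capture the job,
# which the monitoring loop above then polls until completion.
$failoverJob = Start-AzRecoveryServicesAsrUnplannedFailover -ReplicationProtectedItem $rpi -Direction PrimaryToRecovery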

Once you get this far, the sky is the limit. Like most things, it can evolve over time; I like to add error handling and logging so we can handle a failure elegantly and keep an audit trail of operations. I take this approach with most of the processes I automate – I think it’s important to start small and work up from there.