AWS EventBridge Triggering SSM Automation IAM Role Error

I recently wanted to create an Amazon EventBridge rule that will schedule an SSM Automation document.

A rule watches for certain events (cron in my case) and then routes them to AWS targets that you choose. You can create a rule that performs an AWS action automatically when another AWS action happens, or a rule that performs an AWS action regularly on a set schedule.

EventBridge needs permission to call SSM Start Automation Execution with the supplied Automation document and parameters. The rule will offer the generation of a new IAM role for this task.

In my case I received an error like below:

Error Output

The Automation definition for an SSM Automation target must contain an AssumeRole that evaluates to an IAM role ARN.

If you recieving this error you can create the role manually using the following CloudFormation Template.

AWSTemplateFormatVersion: '2010-09-09'
Description: AWS CloudFormation template IAM Roles for Event Bridge | SSM Automation

Resources:
  AutomationServiceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - events.amazonaws.com
          Action: sts:AssumeRole
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/service-role/AmazonSSMAutomationRole
      Path: "/"
      RoleName: EventBridgeAutomationServiceRole

Undocumented ARM Oddities – .Net Core App Services

Every once in a while, when working with ARM templates you come across something that is missing from the official Microsoft ARM template reference. In my case yesterday, I was looking to update the configuration of an Azure App Service to use the DotNetCore stack (rather than .NET 4.8).

While I initially thought this would be a quick job to simply look up the ARM reference and make the required changes, I found that there was nothing about DotNetCore in the ARM reference. Funny enough, there is a value for “netFrameworkVersion”, but don’t be deceived, if you are looking to setup DotNetCore – this value is not for you (this is for regular .Net only).

To better understand the problem, I Clickly Clikcy’d in an App Service and configured it for DotNetCore (Clickly Clicky is our lingo for deploying infrastructure using the portal rather than a CLI or template). With this, I attempted my usual trick of exporting a template and observing the JSON it spits out. However, much to my amazement I couldn’t see any reference to dotnetcore in there at all.

In the end it was the Azure Resource Explorer which came to my rescue. Used the tool to explore the example I created and found a value called “CURRENT_STACK” in the properties of the “Microsoft.Web/sites/config” resource type.

After playing this this for a while, I was able to translate this into my ARM template with the following JSON.

{
    "type": "Microsoft.Web/sites",
    "name": "[variables('WebSiteName')]",
    "apiVersion": "2020-06-01",
    "location": "[resourceGroup().location]",
    "kind": "app",
    "properties": {
        "siteConfig": {
            "metadata": [{
                "name": "CURRENT_STACK",
                "value": "dotnetcore"
            }]
        },

Hopefully this helps anyone who encounters this problem.

Cheers,

Joel


Azure Application Insights – No Client Source IP Address

Working with one of your customers this week who is implementing Azure API Management alongside their web applications. We are funnelling all the request logs into an Application Insights services to manage visibility of the end-to-end transaction data. We noticed that all the client GET requests had ‘0.0.0.0’ in Client IP Address.

Request PropertiesValue
Client IP address0.0.0.0

I since learned that Microsoft obfuscate this data from Azure Monitor as it’s ingested into Applications Insights for what I call a ‘privacy policy‘. As this was a corporate application anonymity wasn’t needed and the development team wanted to understand when a request was made from their application either from inside corporate network or an unknown internet address.

A good habit to get into is first do a quick review of the latest API version for ‘Microsoft.Insights/components’ which does show a boolean value for DisableIpMasking.

{
  "name": "string",
  "type": "Microsoft.Insights/components",
  "apiVersion": "2020-02-02-preview",
  "location": "string",
  "tags": {},
  "kind": "string",
  "properties": {
    "Application_Type": "string",
    "Flow_Type": "Bluefield",
    "Request_Source": "rest",
    "HockeyAppId": "string",
    "SamplingPercentage": "number",
    "DisableIpMasking": "boolean",
    "ImmediatePurgeDataOn30Days": "boolean",
    "WorkspaceResourceId": "string",
    "publicNetworkAccessForIngestion": "string",
    "publicNetworkAccessForQuery": "string"
  }
}

Reviewing the property values for ApplicationInsightsComponentProperties object DisableIpMasking gave the following short but sweet answer.

NameTypeRequriedValue
DisableIpMaskingbooleanNoDisable IP masking.

Yeah I reckon that is worth a shot!

Update ApplicationInsightsComponentProperties value DisableIpMasking

As this value only seems to be exposed through the API we have to either push a new incremental ARM template through the sausage maker or perform a API request directly. An API request seems like the quicker request method, but doing this in a script with authentication and correct structure takes time. I have a nice trick when wanting to update or add a value to an object when either of those feel like overkill.

  1. Navigate to the Azure Resource Explorer
  2. Find the Application Insights Resource Group
  3. Select Providers > Microsoft.Insights
  4. Select Components > ‘Application Insights Name

You will be shown the JSON definition of your Application Insights Object. You can tell this by the line:

"type": "microsoft.insights/components"

To know your in the right place, under properties there will be many values, we should see Application_Type, InstrumentationKey, ConnectionString, Retention, but what will be missing is DisableIpMasking. So it’s as simple as adding it.

  1. Up the top of the page toggle the blue switch to ‘Read/Write’ from ‘Read Only’.
  2. Select ‘Edit‘.
  3. Remember to add a ‘,’ to the previous last line (in my case “HockeyAppToken“) before adding your new property.

The final step is to use the PUT button to update the object. Which intern has authenticated you to the API using your existing login token, constructed the JSON object and is sending a ‘POST’ method to the API endpoint for ‘management.azure.com/subscriptions/<subscriptionId>/resourceGroups/<rgName>/providers/microsoft.insights/components/<resourceName>?api-version=2015-05-01‘. Much simpler than doing a Powershell or Bash script, what a clever little tool it is.

The result will be that new request in Application Insights will have the source NAT IP address. Unfortunately all previous requests will remain scrubbed with ‘0.0.0.0’.

Closing thoughts

This is a great way to tweak services while attempting to understand whether it’s the correct knob to turn in the Azure service. But while it’s quick, it isn’t documented. If you have a repository of deployment ARM templates make sure you go back and amend the deployment JSON. The day will come when it gets re-deployed and it wont come out the sausage maker the same. The finger will get pointed back at that Azure administrator who doesn’t follow good DevOps practices.


ARM Template Role Assignment Learnings

ARM templates are one of those things that the learning curve can be considered steep, but once you get there they make your life so much easier you’re glad you did it. If you’re like me, Google is your friend and whenever you hit an issue with your latest template you resort to searching error messages and hope someone else has not only found the solution, but also been kind enough to write it up. And the latter point is where this post comes in, when doing ARM Template Role Assignments, there are a couple of gotchas that I often forget and when Google doesn’t have any “I’m Feeling Lucky” results, it’s time I try to be a nice person!

Let’s set the scene, doing permissions in the template rather than after the deployment allows you to use incremental updates, and stops people doing “clicky clicky” changes in the environment. You know those people, “I’ll just do it quickly…”, “I’ll fix it later, honest”. And hell, we’ve all done it, but let’s try to be better than that.

So permissions is all about scope, you can assign to the Resource Group of the resource itself. The approach is actually different, and this is explained perfectly here, there is every chance you’ve stumbled across this post because of this error:

"error": {
"code": "InvalidCreateRoleAssignmentRequest",
"message": "The request to create role assignment '{guid}' is not valid. Role assignment scope '/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/Microsoft.Insights/components/{resourceGroupName}' must match the scope specified on the URI  '/subscriptions/{resourceGroupName}/resourcegroups/{resourceGroupName}'."
  }

After reading the Microsoft documentation you’d think the scope property would help you, but instead I found it was easier to follow the approach the kind people at Stack Overflow explained. So lets look at some examples:

Parameters & Variables:

    "parameters": {        
"runbookAutomationOperators": {
            "value": [
                "a195af43-xxxx-49fd-xxxx-c1e0de11b118",
                "5deb670c-xxxx-4642-xxxx-de5290266bad",
                "2541d966-xxxx-4d1d-xxxx-8ecc6c2e8a39"
            ]
        }
}
"variables": {
        "automationOperatorId": "d3881f73-407a-4167-8283-e981cbba0404",
        "readerId": "acdd72a7-3385-48ef-bd42-f606fba81ae7"
    },

Notes:

  • I’ve put an array of accounts so I can loop through them and assign them
  • I’ve put the built in Azure roles as variables as they are referenced more than once (you can find the Id’s by running Get-AzRoleDefinition)

Resource Group:

        {
            "type": "Microsoft.Authorization/roleAssignments",
            "apiVersion": "2020-04-01-preview",
            "name": "[guid(concat(resourceGroup().id), resourceGroup().name, variables('readerId'), parameters('runbookAutomationOperators')[copyIndex()])]",
            "copy": {
                "name": "resourceGroupReader",
                "count": "[length(parameters('runbookAutomationOperators'))]"
                },
            "dependsOn": [],
            "properties": {
                "roleDefinitionId": "[concat('/subscriptions/', subscription().subscriptionId, '/providers/Microsoft.Authorization/roleDefinitions/', variables('readerId'))]",                        
                "principalId": "[parameters('runbookAutomationOperators')[copyIndex()]]"
            }
        }

Notes:

  • The role assignment is at the top level, as we’re doing a resource group deployment it’s the RG where the rights will be written
  • The name has to be a unique guid, so I’ve included the actual account id in the string be used to build the guid. By doing so it ensures if you have multiple assignments (which you will) that they are unique as it’s combing the Resource Group and the account being assigned to the RG
  • I’ve done a copy because I want to run it for the length of the array, in this case 3

Resource:

        {
            "type": "Microsoft.Automation/automationAccounts/providers/roleAssignments",
            "apiVersion": "2020-04-01-preview",
            "name": "[concat(parameters('automationAccountName'), '/Microsoft.Authorization/', guid(concat(resourceGroup().id), resourceId('Microsoft.Automation/automationAccounts', parameters('automationAccountName')),variables('automationOperatorId'), parameters('runbookAutomationOperators')[copyIndex()]))]",
            "copy": {
                "name": "runbookAutomationOperators",
                "count": "[length(parameters('runbookAutomationOperators'))]"
                },
            "dependsOn": [
                "[parameters('automationAccountName')]"
            ],
            "properties": {
                "roleDefinitionId": "[concat('/subscriptions/', subscription().subscriptionId, '/providers/Microsoft.Authorization/roleDefinitions/', variables('automationOperatorId'))]",
                "principalId": "[parameters('runbookAutomationOperators')[copyIndex()]]"
            }
        },

Notes:

  • This example is for an automation account and that is key, because you actually specify the type of account in the type (something you can easily change for other resource, e.g. Microsoft.Storage/storageAccounts)
  • Similar to the RG, you need a unique guid, so I’ve included the resource itself as well as the account we are allocating, so it once again is unique
  • Again we’ve done a copy so it’s ran three times, allocating the users to this Operator Role

So I talked about the name having to be a guid in both examples, but it’s a point I’d like to talk about some more. firstly, if it’s not you’ll get this error:

The role assignment ID must be a GUID

I mean, you can’t blame MS for this error, it is pretty clear. But how do you make a guid? Well I thought ARM Templates have guid functions so I jumped over to the MS doco. But on reading this I definitely over thought these two lines of text:

  • The returned value isn’t a random string, but rather the result of a hash function on the parameters. The returned value is 36 characters long. It isn’t globally unique. To create a new GUID that isn’t based on that hash value of the parameters, use the newGuid function.
  • Returns a value in the format of a globally unique identifier. This function can only be used in the default value for a parameter.

And if you’ve not had enough coffee you might think, I just want a bloody unique Guid and I don’t want to mess around with default values, especially as we’re doing a copy loop so we need some salt to make it different. But, just to state the obvious, this actually makes sense. You want to create a string that will be the same every single time, because you want to be able to run this incrementally. If it was unique, your IAM role assignment page would be a shambles as every time you try run your pipeline it’s come up with a lovely new guid! So the guid function is your friend, just ensure you include all the attributes so that the base string is unique. E.g. if you’re doing an Automation account, you want both that resource and the account you are assigning as the base string. If you don’t include that resource, then it could clash with another role assignment of that user in the RG, and if you don’t include user, well then it’s going to be the same for all users being granted access to that account.

While I appreciate this may be obvious I do hope it is useful as I really didn’t think the vendor doco was particularly clear, if anything else it’ll save Future Dave from working this out yet again because if he had a memory he’d be dangerous…


Lazy Afternoons with Azure ARM Conditionals

Like many IT professionals I spent my early years in the industry working in customer oriented support roles. Working in a support role can be challenging. You are the interface between the people who consume IT services and the complexity of the IT department. You interact with a broad range of people who use the IT services you are responsible for to do the real work of the business. Positives include broad exposure to a wide range of issues and personality types. Some of the roles I worked in had the downside of repetitive reactive work, however, the flip side to this is, that this is where I really started to understand the value proposition of automation.

Unfortunately the nature of being subjected to reactive support work and grumpy frustrated consumers is you can fall into the trap of just going through the motions and become caught in a cycle of just doing a good job and not striving to make a real difference. Not this late in the day. . . This phrase was used by a co-worker who for anonymity purposes we will call “Barry”. Barry had grown tired of changes to the cyclical rhythm of his daily support routine and at the slightest suggestion of any activity or initiative that came remotely close to threatening a lazy afternoon would be rebutted with “Not this late in the day. . .” Sometimes not this late in the day could have been interchanged with not this late in the year. . .sigh.

What I learnt from Barry, is there is no need to do repetitive tasks. Humans are terrible at this, despite best intentions we make mistakes which introduce subtle differences into environments, this almost always causes stability and reliability issues and no one wants that. We all like lazy afternoons, so in the spirit of doing more with less I’d like to share some examples of how the use of conditionals within ARM templates will change your life. . .well maybe just make your afternoon a little easier and the consumers of your services happier. . . Win!

Conditionals support in ARM has been around for a little while now, but are not as widely used as I believe they should be. Conditionals are useful in scenarios where you may want to optionally include a resource that would have previously required the use of separate templates or complex nesting scenarios.

In the following example I am provisioning a virtual machine and I want to optionally include a data disk and/or a public IP.

In the parameters section of the ARM template I’ve included a couple of parameters to determine if we want to create a public IP (pip) and a Data Disk (dataDisk) as follows:

"pip": {
"type": "string",
"allowedValues": [
"Yes",
"No"
],
"metadata": {
"description": "Select whether the VM should have a public ip."
}
},
"dataDisk": {
"type": "string",
"allowedValues": [
"Yes",
"No"
],
"metadata": {
"description": "Select whether the VM should have an additional data disk."
}
},

In the variables section I’ve defined a public IP object and a dataDisk array. I’ll use these values later if “Yes” is chosen for either my pip or dataDisk.

"pipObject": {
"id": "[resourceId('Microsoft.Network/publicIPAddresses',variables('publicIPAddressName'))]"
},
"dataDisks": [
{
"name": "[concat(parameters('virtualMachineName'),'_Data1')]",
"caching": "None",
"diskSizeGB": "[parameters('vmDataDisk1Size')]",
"lun": 0,
"createOption": "Empty",
"managedDisk": {
"storageAccountType": "[parameters('vmDataDisk1Sku')]"
}
}
]

Condition is an optional keyword that you can use with any of the resource objects within your ARM templates. To make it easily identifiable as a conditionally deployed resource, I’d suggest placing above the other required keywords of the resource.
Conditionally create a public IP resource by adding a “condition”: keyword. I’m using an equals comparison function to determine if the pip parameter is set to “Yes” or “No”. In this case, if its set to “Yes”, then the resource is created.

{
"condition": "[equals(parameters('pip'), 'Yes')]",
"apiVersion": "2015-06-15",
"type": "Microsoft.Network/publicIPAddresses",
"name": "[variables('publicIPAddressName')]",
"location": "[parameters('location')]",
"properties": {
"publicIPAllocationMethod": "[variables('publicIPAddressType')]",
"dnsSettings": {
"domainNameLabel": "[variables('dnsNameForPublicIP')]"
}
}
},

Ok so far that’s pretty straight forward, the tricky part is how we tell other resources that consume our public IP about it (like the network interface) in the condition where we have created a public IP we need to provide a pipObject containing a resourceId for the “id”: keyword. Or in the case of not creating a public IP we need to provide a null value. For this I’ve chosen to use a “if” logical function in conjunction with the “equals” comparison function to either provide the aforementioned resourceId or a json null value.

{
"name": "[toLower(concat('nic',parameters('virtualMachineName')))]",
"type": "Microsoft.Network/networkInterfaces",
"apiVersion": "2018-04-01",
"location": "[parameters('location')]",
"properties": {
"ipConfigurations": [
{
"name": "ipconfig1",
"properties": {
"subnet": {
"id": "[variables('subnetRef')]"
},
"privateIPAllocationMethod": "Dynamic",
"publicIPAddress": "[if(equals(parameters('pip'), 'Yes'), variables('pipObject'), json('null'))]"
}
}
]
},

Similarly I am doing much the same thing with the dataDisk

{
"name": "[parameters('virtualMachineName')]",
"type": "Microsoft.Compute/virtualMachines",
"apiVersion": "2018-06-01",
"location": "[parameters('location')]",
"dependsOn": [
"[resourceid('Microsoft.Network/networkInterfaces/', toLower(concat('nic',parameters('virtualMachineName'))))]"
],
"properties": {
"hardwareProfile": {
"vmSize": "[parameters('virtualMachineSize')]"
},
"storageProfile": {
"imageReference": {
"publisher": "SUSE",
"offer": "SLES-Standard",
"sku": "12-SP3",
"version": "latest"
},
"osDisk": {
"name": "[concat(parameters('virtualMachineName'),copyIndex(1))]",
"createOption": "fromImage"
},
"dataDisks": "[if(equals(parameters('dataDisk'), 'Yes'), variables('dataDisks'), json('null'))]"
},
"networkProfile": {
"networkInterfaces": [
{
"id": "[resourceid('Microsoft.Network/networkInterfaces/', toLower(concat('nic',parameters('virtualMachineName'))))]"
}
]
},
"osProfile": {
"computerName": "[parameters('virtualMachineName')]",
"adminUsername": "[parameters('adminUsername')]",
"adminPassword": "[parameters('adminPassword')]"
}
}
}

So there we have it, the use of conditionals has reduced the need for more complex scenarios like nested templates or multiple templates for different scenarios. Next time you get a request that’s not the status quo perhaps you can accommodate it with a few conditionals. . . Now back to my lazy afternoon.