How To Easily Enable Windows Azure Diagnostics Remotely

One of the great things about life is PowerShell. That could be the end of the post, but instead I’m going to add some steps on how to enable Diagnostics on a Windows Azure service once it has been deployed. This turned out to be a long post with lots of PowerShell to get excited about.

Apart from the obvious separation of duties, you don’t need to write code in your service to enable diagnostics unless you are specifically debugging a service deployment issue. You will probably get it wrong anyway by collecting 800 counters at 1 second intervals, which is unlikely to help anyone.

What I typically start out collecting is pretty basic: CPU, memory usage, network and, if it is an ASP.NET site, some ASP.NET-specific counters. Basically I’m looking for enough information to keep a finger on the pulse. These can be collected at intervals of between 5 and 15 minutes, because we are really interested in the load over time rather than individual peaks.

Getting Started

First you need to make sure you have a management certificate created and installed.

Once you have that, you need to download the PowerShell cmdlets from CodePlex.

Next, figure out what the thumbprint of your certificate is. You can see this in the Windows Azure management portal when you click your certificate. While you are there, make a note of your subscription Id, storage account name and key, and the service name.

Tip: While I’m figuring out the right commands to use, I usually set these up as variables.

Rather than dump the whole script on you, let me walk you through what is happening and why. You can do this interactively in a PowerShell window if you want to follow along. The entire script is at the end.

Here are a few lines of the script to set the variables up:

# TODO: Cert thumbprint of a certificate already installed into the Windows Azure Portal
$thumb = "PUTTHETHUMBHERE"
$cert = get-item cert:\CurrentUser\My\$thumb
# TODO: Subscription Id
$subid = "PUTYOURSUBSCRIPTIONIDHERE"
# TODO: Storage Account Name
$SAN = "PUTYOURSTORAGEACCOUNTNAMEHERE"
# TODO: Storage Account Key
$SAK = "PUTYOURSTORAGEACCOUNTKEYHERE"
# TODO: Service Name
$serviceName = "PUTYOURSERVICENAMEHERE"

As you can see, we can grab the actual certificate very easily using the get-item cmdlet.
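One thing that can trip you up here is a thumbprint pasted with spaces or stray characters in it. Here is a minimal sketch of a check you could run before the get-item call. Format-Thumbprint is a hypothetical helper of my own naming, not part of the cmdlets, and the sample value is made up:

```powershell
# Hypothetical helper (not part of the Azure cmdlets): tidy up a pasted thumbprint.
function Format-Thumbprint {
    param([string]$Thumbprint)

    # Keep only hex digits; this drops spaces and any invisible characters.
    $clean = ($Thumbprint -replace '[^0-9a-fA-F]', '').ToUpper()

    # A SHA-1 certificate thumbprint is 20 bytes, i.e. 40 hex characters.
    if ($clean.Length -ne 40) {
        throw "Expected 40 hex characters but found $($clean.Length)."
    }
    return $clean
}

# Made-up sample value, pasted with spaces
$thumb = Format-Thumbprint "00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff 00 11 22 33"
$thumb
```

If the helper throws, the value you copied is probably truncated or is not a thumbprint at all.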

Getting a list of services

You can check really quickly to see if you have everything configured correctly by using the Get-HostedServices cmdlet:

Get-HostedServices -SubscriptionId $subid -Certificate $cert

This cmdlet should return a list of the services you have deployed to that subscription.

To get information on an individual service you can use Get-HostedService:

Get-HostedService -SubscriptionId $subid -Certificate $cert -ServiceName $serviceName

Before I get carried away, you can get a full list of cmdlets to explore using:

Get-Command -Module WAPPSCmdlets

Getting the current deployment

Back to diagnostics. When configuring diagnostics, you need to know which deployment to work with. You specify the deployment using the deployment id. You can grab the current deployment using the Get-Deployment cmdlet:

Get-Deployment -ServiceName $serviceName -Certificate $cert -SubscriptionId $subid -Slot Production

One property of the result is DeploymentId. You can grab that and save it to another variable using:

$did = (Get-Deployment -ServiceName $serviceName -Certificate $cert -SubscriptionId $subid -Slot Production).DeploymentId

Once you have the DeploymentId, you use it in many of the other commands to set up log collection etc.

To get a list of roles you can grab the RoleInstanceList from the deployment:

$roles = (Get-Deployment -ServiceName $serviceName -Certificate $cert -SubscriptionId $subid -Slot Production).RoleInstanceList

This returns both the role names as well as the instance names.

Note: There is a cmdlet Get-DiagnosticAwareRoles which returns the role names but not the instance names. There is also a Get-DiagnosticAwareRoleInstances which returns the instance names as strings. You could use both commands…

Just Web or Worker roles

Some diagnostics collection you only want to configure on web roles. I haven’t found a way to get the PowerShell cmdlets to return a specific role type, so instead I always make sure my role name has the word “Web” in it and use the following:

$webroles = $roles | where { $_.RoleName -match "Web"}

I haven’t figured out a better way of doing this. If you do, let me know.

Guess how I find worker roles?

$workerroles = $roles | where { $_.RoleName -notmatch "Web"}
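If you want to see how the name matching behaves without touching a real deployment, here is a small sketch using stand-in objects. The role and instance names are invented, and the objects only carry the two properties used in this post:

```powershell
# Stand-in objects (invented names) with the two properties used in this post.
$roles = @(
    New-Object PSObject -Property @{ RoleName = "MyWebRole";    InstanceName = "MyWebRole_IN_0" }
    New-Object PSObject -Property @{ RoleName = "MyWorkerRole"; InstanceName = "MyWorkerRole_IN_0" }
    New-Object PSObject -Property @{ RoleName = "webfrontend";  InstanceName = "webfrontend_IN_0" }
)

# -match is a case-insensitive regular expression match, so "webfrontend" counts too.
# Wrapping in @(...) keeps the result an array even when only one role matches.
$webroles    = @($roles | where { $_.RoleName -match "Web" })
$workerroles = @($roles | where { $_.RoleName -notmatch "Web" })

$webroles.Count      # 2
$workerroles.Count   # 1
```

The flip side of matching on the name is that a worker role with "Web" anywhere in its name would land in the web list, so the naming convention really does matter.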

Anyway, if you have a list of roles and instance names, along with the storage account name and key and the deployment id, you are ready to go!

Configure the Windows Azure Logs

To configure the Windows Azure Logs you can use the Set-WindowsAzureLog cmdlet:

$roles | foreach { Set-WindowsAzureLog -LogLevelFilter "Error" -RoleName $_.RoleName -InstanceId $_.InstanceName -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did -BufferQuotaInMB 50 -TransferPeriod 5 }

This tells the diagnostics agent to copy the logs every 5 minutes. The total amount of data in the local log will not exceed 50MB.

To check the configuration was set you have to use the Get-DiagnosticConfiguration cmdlet:

$roles | foreach {Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName Logs }

Configure the Diagnostics Infrastructure Logs

To configure collection of the Windows Azure infrastructure logs use the Set-InfrastructureLog cmdlet:

$roles | foreach { Set-InfrastructureLog -LogLevelFilter "Error" -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 50 -TransferPeriod 5 -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did }

To check the configuration use:

$roles | foreach {Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName DiagnosticInfrastructureLogs }

Configure the Windows Event Logs

To configure the collection of the Windows Event Logs, you first need to specify which logs to collect. If you just want to collect the Application and System logs, create an array:

$logs = "Application!*","System!*"

You can then use this list in the following:

$roles | foreach { Set-WAEventLog -EventLogs $logs -LogLevel "Error" -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 10 -TransferPeriod 5 -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did }

To check the configuration use:

$roles | foreach {Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName WindowsEventLogs }

Configuring Performance Counters

To configure performance counters to be collected, you need to first create one or more PerformanceCounterConfiguration objects, then pass them into the Set-PerformanceCounter cmdlet.

To create a PerformanceCounterConfiguration object use:

$cpu_perfcounter = new-object Microsoft.WindowsAzure.Diagnostics.PerformanceCounterConfiguration
$cpu_perfcounter.CounterSpecifier = "\Processor(_Total)\% Processor Time"
$cpu_perfcounter.SampleRate = new-object TimeSpan(0,5,0)
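The sample rate is just a .NET TimeSpan, so nothing here is Azure specific. A quick sketch of two equivalent ways to build a 5 minute interval in PowerShell (the variable names are mine):

```powershell
# Constructor arguments are hours, minutes, seconds
$fiveMinutes = New-Object TimeSpan(0, 5, 0)

# The same interval using the static helper on the TimeSpan type
$alsoFive = [TimeSpan]::FromMinutes(5)

$fiveMinutes -eq $alsoFive    # True
$fiveMinutes.TotalSeconds     # 300
```

Either form can be assigned to SampleRate. I find the FromMinutes version easier to read when the interval gets tweaked later.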

For all roles, you should create performance counters for the following:

  • \Processor(_Total)\% Processor Time
  • \Memory\Available Mbytes
  • \Memory\Committed Bytes
  • \Network Interface(*)\Bytes Received/sec
  • \Network Interface(*)\Bytes Sent/sec

For a web role you should add at least:

  • \ASP.NET Applications(__Total__)\Requests/Sec

Once you have created each PerformanceCounterConfiguration object, create an array and pass it to the Set-PerformanceCounter cmdlet:

$web_perf_counters = $cpu_perfcounter, $mem_available_bytes_perfcounter, $mem_committed_bytes_perfcounter, $net_received_perfcounter, $net_sent_perfcounter, $asp_app_requests_perfcounter
$webroles | foreach { Set-PerformanceCounter -PerformanceCounters $web_perf_counters -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 10 -TransferPeriod 5 -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did }

To view the performance counters, you can use:

$roles | foreach { Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName PerformanceCounters }

If you actually want to view the performance counter names and sample times you need to look at the datasources property:

$roles | foreach { (Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName PerformanceCounters).DataSources }

Configuring IIS Log Files

Windows Azure lets you configure the collection of any log files. Typically for a web role you would want to grab the IIS log files. In order to do this, you have to configure a datasource that specifies which folder the log files will be located in. To figure this out, the pattern is:

C:\Resources\directory\<DeploymentId>.<RoleName>.DiagnosticStore\LogFiles

Which can be constructed using:

$logfile = "C:\Resources\directory\" + $did + "." + $WebRoles[0].RoleName + ".DiagnosticStore\LogFiles"
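If you have more than one web role, the folder is different for each role name. A small sketch of wrapping the pattern above in a function so it can be built per role. Get-IisLogPath is a hypothetical name of my own, and the deployment id and role name are made-up sample values:

```powershell
# Hypothetical helper: builds the IIS log folder from the pattern shown above.
function Get-IisLogPath {
    param([string]$DeploymentId, [string]$RoleName)

    # {0} is the deployment id, {1} is the role name
    'C:\Resources\directory\{0}.{1}.DiagnosticStore\LogFiles' -f $DeploymentId, $RoleName
}

# Made-up sample values
$samplePath = Get-IisLogPath -DeploymentId "0123456789abcdef" -RoleName "MyWebRole"
$samplePath
# C:\Resources\directory\0123456789abcdef.MyWebRole.DiagnosticStore\LogFiles
```

In the real script you would pass $did and the RoleName of each entry in $webroles instead of the sample values.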

Other log folders follow a similar pattern. Failed Request Logs:

C:\Resources\directory\<DeploymentId>.<RoleName>\FailedReqLogFile

and Crash Dumps:

C:\Resources\directory\<DeploymentId>.<RoleName>\CrashDumps

To create the actual datasource, you need to create an instance of DirectoryConfiguration:

$iisDirectorySource = new-object -TypeName Microsoft.WindowsAzure.Diagnostics.DirectoryConfiguration
$iisDirectorySource.Path = $logfile
$iisDirectorySource.DirectoryQuotaInMB = 100
$iisDirectorySource.Container = "wad-iis-logfiles"

You can then set the directory sources:

$webroles | foreach { Set-FileBasedLog -DirectoriesConfiguration @($iisDirectorySource) -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 1024 -TransferPeriod 5 -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did }

Phew – made it.

Here is the full script for setting the configuration, followed by a script that will display the configuration:

# TODO: Cert thumbprint of a certificate already installed into the Windows Azure Portal
$thumb = "ENTERTHUMBHERE"
$cert = get-item cert:\CurrentUser\My\$thumb
# TODO: Subscription Id
$subid = "ENTERSUBIDHERE"
# TODO: Storage Account Name
$SAN = "ENTERSTORAGEACCOUNTNAMEHERE"
# TODO: Storage Account Key
$SAK = "ENTERSTORAGEACCOUNTKEYHERE"
# TODO: Service Name
$serviceName = "ENTERSERVICENAMEHERE"

# You can leave the remainder of the script alone. It collects a basic set of counters at the sample interval set below and sets up logging of event logs etc.
# Does not set up Failed Request Logs
# 
# Logs to collect
$logs = "Application!*","System!*"
# configure performance counters sample TimeSpan (h,m,s) - recommended 5 - 15 minutes
$counter_time = new-object TimeSpan(0,10,0)
# Configure default collection time (recommended 5-15 minutes)
$transferPeriod = 5

# performance counters
$cpu_perfcounter = new-object Microsoft.WindowsAzure.Diagnostics.PerformanceCounterConfiguration
$cpu_perfcounter.CounterSpecifier = "\Processor(_Total)\% Processor Time"
$cpu_perfcounter.SampleRate = $counter_time

$mem_available_bytes_perfcounter = New-Object Microsoft.WindowsAzure.Diagnostics.PerformanceCounterConfiguration
$mem_available_bytes_perfcounter.CounterSpecifier = "\Memory\Available Mbytes"
$mem_available_bytes_perfcounter.SampleRate = $counter_time

$mem_committed_bytes_perfcounter = New-Object Microsoft.WindowsAzure.Diagnostics.PerformanceCounterConfiguration
$mem_committed_bytes_perfcounter.CounterSpecifier = "\Memory\Committed Bytes"
$mem_committed_bytes_perfcounter.SampleRate =$counter_time

$asp_app_requests_perfcounter =  New-Object Microsoft.WindowsAzure.Diagnostics.PerformanceCounterConfiguration
$asp_app_requests_perfcounter.CounterSpecifier = "\ASP.NET Applications(__Total__)\Requests/Sec"
$asp_app_requests_perfcounter.SampleRate = $counter_time

$net_received_perfcounter =  New-Object Microsoft.WindowsAzure.Diagnostics.PerformanceCounterConfiguration
$net_received_perfcounter.CounterSpecifier = "\Network Interface(*)\Bytes Received/sec"
$net_received_perfcounter.SampleRate = $counter_time

$net_sent_perfcounter =  New-Object Microsoft.WindowsAzure.Diagnostics.PerformanceCounterConfiguration
$net_sent_perfcounter.CounterSpecifier = "\Network Interface(*)\Bytes Sent/sec"
$net_sent_perfcounter.SampleRate = $counter_time

# deployment Id
$did = (Get-Deployment -ServiceName $serviceName -Certificate $cert -SubscriptionId $subid -Slot Production).DeploymentId
# Role List
$roles = (Get-Deployment -ServiceName $serviceName -Certificate $cert -SubscriptionId $subid -Slot Production).RoleInstanceList

#Web Role List
$webroles = $roles | where { $_.RoleName -match "Web"}
#worker Role List
$workerroles = $roles | where { $_.RoleName -notmatch "Web"}

$roles | foreach { Set-WindowsAzureLog -LogLevelFilter "Error" -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 50 -TransferPeriod $transferPeriod -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did}

$roles | foreach { Set-InfrastructureLog -LogLevelFilter "Error" -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 50 -TransferPeriod $transferPeriod -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did }

$logs = "Application!*","System!*"
$roles | foreach { Set-WAEventLog -EventLogs $logs -LogLevel "Error" -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 10 -TransferPeriod $transferPeriod -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did }

#Web Role Specific Configuration
if ($WebRoles -ne $null)
{
    $logfile = "C:\Resources\directory\" + $did + "." + $WebRoles[0].RoleName + ".DiagnosticStore\LogFiles"

    $iisDirectorySource = new-object -TypeName Microsoft.WindowsAzure.Diagnostics.DirectoryConfiguration
    $iisDirectorySource.Path = $logfile
    $iisDirectorySource.DirectoryQuotaInMB = 100
    $iisDirectorySource.Container = "wad-iis-logfiles"

    $webroles | foreach { Set-FileBasedLog -DirectoriesConfiguration @($iisDirectorySource) -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 1024 -TransferPeriod $transferPeriod -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did }
    
    $web_perf_counters = $cpu_perfcounter, $mem_available_bytes_perfcounter, $mem_committed_bytes_perfcounter, $net_received_perfcounter, $net_sent_perfcounter, $asp_app_requests_perfcounter
    $webroles | foreach { Set-PerformanceCounter -PerformanceCounters $web_perf_counters -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 10 -TransferPeriod $transferPeriod -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did } 
}

#Worker Role Specific Configuration
if ($workerroles -ne $null)
{    
    $worker_perf_counters = $cpu_perfcounter, $mem_available_bytes_perfcounter, $mem_committed_bytes_perfcounter, $net_received_perfcounter, $net_sent_perfcounter
    $workerroles | foreach { Set-PerformanceCounter -PerformanceCounters $worker_perf_counters -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferQuotaInMB 10 -TransferPeriod $transferPeriod -StorageAccountName $SAN -StorageAccountKey $SAK -DeploymentId $did } 
}

The display script:

# TODO: Cert thumbprint of a certificate already installed into the Windows Azure Portal
$thumb = "ENTERTHUMBHERE"
$cert = get-item cert:\CurrentUser\My\$thumb
# TODO: Subscription Id
$subid = "ENTERSUBIDHERE"
# TODO: Storage Account Name
$SAN = "ENTERSTORAGEACCOUNTNAMEHERE"
# TODO: Storage Account Key
$SAK = "ENTERSTORAGEACCOUNTKEYHERE"
# TODO: Service Name
$serviceName = "ENTERSERVICENAMEHERE"


# deployment Id
$did = (Get-Deployment -ServiceName $serviceName -Certificate $cert -SubscriptionId $subid -Slot Production).DeploymentId
# Role List
$roles = (Get-Deployment -ServiceName $serviceName -Certificate $cert -SubscriptionId $subid -Slot Production).RoleInstanceList

$roles | foreach {
    write-host "======================================================"
    write-host $_.RoleName 
    write-host $_.InstanceName

    write-host "Event Logs"
    Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName WindowsEventLogs | out-host
        
    write-host "Infrastructure Logs"
    Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName DiagnosticInfrastructureLogs | out-host

    write-host "Logs"
    Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName Logs | out-host

    write-host "Performance Counters"
    Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName PerformanceCounters | out-host    
    (Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName PerformanceCounters).DataSources | out-host
    
    
    write-host "Directories"
    Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName Directories | out-host
    (Get-DiagnosticConfiguration -DeploymentId $did -StorageAccountName $SAN -StorageAccountKey $SAK -RoleName $_.RoleName -InstanceId $_.InstanceName -BufferName Directories).DataSources | fl | out-host
    
    write-host ""
}

THIS POSTING IS PROVIDED “AS IS” WITH NO WARRANTIES, AND CONFERS NO RIGHTS, EVEN IF YOU HAVE CHOCOLATE
