Azure Windows VM Scale Set

Available Metrics

Possible metric types that can be tracked on a given resource.

Metric Type Data Type Description
AggregatedMetric
Object
Allows for aggregating existing metrics over a period of time. Useful for running sums or averages, and for extracting a single data item from collection-based metrics
AzureAdvisorHealthMetric
AzureAdvisorHealth
AvailabilityState String
Summary String
ReasonType String
ReasonChronicity String
DetailedStatus String
OccuredTime DateTime
ReportedTime DateTime
Tracks Azure health using latest Azure Health API - more info @ https://docs.microsoft.com/en-us/rest/api/resourcehealth/availabilitystatuses/getbyresource/
AzureAdvisorRecommendationMetric
AzureAdvisorRecommendation[]
Category String
Impact String
LastUpdated Nullable`1
Risk String
Problem String
Solution String
Tracks Azure recommendations using latest Azure Advisor API - more info @ https://docs.microsoft.com/en-us/rest/api/advisor/
AzureMonitorMetric
Double
Tracks Azure metrics using latest Azure Monitor API - more info @ https://docs.microsoft.com/en-us/rest/api/monitor/
AzureServiceFabricStatus
String
Indicates the state of the Azure Service Fabric
AzureVirtualMachineOperations
AzureOperation
Name String
Category String
Description String
Caller String
EventName String
Level String
Status String
SubStatus String
ExtendedInfo String
EventTimestamp DateTime
AzureVmssInstanceDetails
AzureVmssInstanceDetails[]
Instance String
Size String
ProvisioningState String
PowerState String
AgentState String
StateDetails String
Tracks detailed information about Azure VM instances as a list.
DerivedMetric
Double
Allows for deriving new metrics from existing ones. Useful for combining existing metrics together or for multiplying metrics by a factor
InternalUrlResponseCode
String
Tracks the HTTP result of testing an internal IP address. Possible values are HTTP status names: OK, Unauthorized, etc.
InternalUrlResponseTime
Double
Tracks response time of http request to internal IP address
InternalUrlState
UrlStatus[]
Host String
Down Boolean
ResponseTime Double
StatusCode Int32
ErrorMessage String
Timestamp String
Tracks results of http request to internal IP address
LinkedMetric
Object
Allows for tracking of metrics from other resources. Useful when it is important to evaluate metrics from different resources side-by-side
ResourceInstanceCount
Int32
Tracks current number of compute instances
ResourceStatus
String
Tracks overall status of the resource. This is an important metric as it is used to drive Uptime reports. Possible values: Ready, Down, Unknown and in some cases Stopped
WindowsCustomEventLogEntry
EventEntity
UniqueId Guid
EventId Int64
MachineName String
Message String
Source String
UserName String
EntryType String
Timestamp DateTime
Tracks entries from the Windows Event Log.
WindowsEventLogEntry
EventEntity
UniqueId Guid
EventId Int64
MachineName String
Message String
Source String
UserName String
EntryType String
Timestamp DateTime
Tracks entries from the Windows Event Log.
WindowsPerformanceCounter
Double
Tracks performance counters defined as individual metrics. Any performance counter might be tracked.
WindowsPerformanceCounterMultiInstance
PerformanceCounterInstance[]
Server String
Instance String
Value Double
Tracks multi-instance performance counters. It returns an array of PerformanceCounterInstance objects for each counter instance.
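
The AggregatedMetric and DerivedMetric descriptions above can be illustrated with a small Python sketch. This is not CloudMonix code or its configuration syntax; the helper names aggregate_average and derive_scaled are hypothetical and only show the kind of calculation the two metric types describe (averaging recent samples, and multiplying a metric by a factor).

```python
from statistics import mean


def aggregate_average(samples, window):
    """Average the most recent `window` samples (hypothetical helper).

    Returns None when there are no samples to aggregate.
    """
    recent = samples[-window:]
    return mean(recent) if recent else None


def derive_scaled(value, factor):
    """Derive a new value by multiplying an existing metric by a factor."""
    return value * factor


# Illustrative data only: four CPU readings and a free-memory reading in MB.
cpu_samples = [40.0, 55.0, 70.0, 85.0]
cpu_avg_last_two = aggregate_average(cpu_samples, 2)
memory_free_gb = derive_scaled(2048.0, 1 / 1024)
```

Here cpu_avg_last_two averages only the two newest readings, and memory_free_gb converts a value in MB to GB by scaling it.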

Available Commands

Possible commands that can be executed on a given resource. Ultimate subscription is required.

Command Type Description
AzureVmScaleSetInstanceReboot Reboots specified VM ScaleSet instance
AzureVmScaleSetInstanceReimage Reimages specified VM ScaleSet instance
AzureVmScaleSetStart Starts specified VM ScaleSet instance
AzureVmScaleSetStopDeallocate Deallocates specified VM ScaleSet instance
WebRequest Runs custom WebRequest to specified URL

Default Templates

Default monitoring templates provided by CloudMonix.

Pre-configured Metrics

Metric Name Metric Type Description
ApplicationsEventLogs WindowsEventLogEntry Tracks entries from the Windows Event Log (Application source)
AvgDiskReadPerSec WindowsPerformanceCounter Tracks the average time it takes to make a read across all drives
AvgDiskReadQueueLength WindowsPerformanceCounter Tracks the average Read queue length across all drives
AvgDiskWritePerSec WindowsPerformanceCounter Tracks the average time it takes to make a write across all drives
AvgDiskWriteQueueLength WindowsPerformanceCounter Tracks the average Write queue length across all drives
CpuTime WindowsPerformanceCounter Tracks overall CPU utilization on the monitored server
CpuTime5MinAverage AggregatedMetric Tracks 5-minute CPU utilization average across all instances within monitored VMSS
DiskFreeSpace WindowsPerformanceCounterMultiInstance Tracks free space in Megabytes on each disk
DiskFreeSpacePct WindowsPerformanceCounterMultiInstance Tracks amount of free space in % on each disk
DiskFreeSpaceTotal WindowsPerformanceCounter Tracks total amount of free space across all drives
DiskIdleTime WindowsPerformanceCounter Tracks the percentage of time the disk is idle. Sustained numbers below 20% indicate an over-saturated disk.
DiskReadsBytesPerSec WindowsPerformanceCounter Tracks the Read throughput across all drives
DiskReadSpeed WindowsPerformanceCounter Tracks average time, in seconds, it takes to read data from the disk
DiskWritesBytesPerSec WindowsPerformanceCounter Tracks the Write throughput across all drives
DiskWriteSpeed WindowsPerformanceCounter Tracks average time, in seconds, it takes to write data to the disk
FabricProcessDetected AggregatedMetric
FabricProcessMemory WindowsPerformanceCounter
InboundNetworkErrors WindowsPerformanceCounterMultiInstance Tracks the number of inbound packets that could not be received because of errors
Instances AzureVmssInstanceDetails
MemoryCommittedPct WindowsPerformanceCounter Tracks the amount of virtual memory in use. It is the ratio of Committed Bytes to the Commit Limit
MemoryFree WindowsPerformanceCounter Tracks free memory (in MBs) on the monitored instance
OutboundNetworkErrors WindowsPerformanceCounterMultiInstance Tracks the number of outbound packets that could not be transmitted because of errors
PagingFileUsage WindowsPerformanceCounter Overall usage of all paging files in %
ServiceFabricAvgInvocationTime WindowsPerformanceCounterMultiInstance Tracks the time taken to execute service methods in milliseconds
ServiceFabricAvgRequestTime WindowsPerformanceCounterMultiInstance Tracks time taken (in milliseconds) by the service to process requests
ServiceFabricExceptions WindowsPerformanceCounterMultiInstance Tracks the number of times that service methods throw an exception per second
ServiceFabricInvocations WindowsPerformanceCounterMultiInstance Tracks the number of times that service methods are invoked per second
ServiceFabricOutstandingRequests WindowsPerformanceCounterMultiInstance Tracks the number of requests being processed by services
ServiceFabricStatus AzureServiceFabricStatus Tracks the Service Fabric status of the monitored Scale Set
Status ResourceStatus Tracks the overall running status of the monitored instances within Scale Set
SystemEventLogs WindowsEventLogEntry Tracks entries from the Windows Event Log (System source)
SystemUptime WindowsPerformanceCounter

Pre-configured Alerts

Alert Name Expression Severity Description
Fabric Process Not Running FabricProcessDetected == 0 Warning
High CPU CpuTime > 70 Warning Raises an alert when CPU utilization for a specific instance is over 70% for the last 5 minutes sustained
Instance Was Rebooted SystemUptime < 600 Warning
Low Disk Space (Any(DiskFreeSpace, "Value < 1024 && Contains(Instance,\":\")") || Any(DiskFreeSpacePct, "Value < 20 && Contains(Instance,\":\")") ) && DiskFreeSpaceTotal > 0 Warning Raises an alert when any of the disks has less than 1GB or 20% of free space left
Low Memory MemoryFree < 100 Warning Raises an alert if the amount of available physical memory on a specific instance falls below 100 MB, sustained for the last 2 monitoring cycles
Resource Outage Status == "Down" Error Raises an alert when the monitored server is reported as not-Ready by Azure, or if no metrics come through from diagnostic agents, for a sustained period of time
Service Fabric Is Not Ready ServiceFabricStatus != "Ready" Warning Raises an alert when monitored Service Fabric resource is reported as not-Ready by Azure for a sustained period of time
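
The Low Disk Space expression combines two Any(...) checks over multi-instance counters with a guard on DiskFreeSpaceTotal. The Python sketch below mirrors that logic for illustration only; it is not the CloudMonix expression engine. The dataclass and function names are hypothetical, and reading Contains(Instance, ":") as a filter for drive-letter instances such as C: (skipping aggregate instances) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PerformanceCounterInstance:
    # Mirrors the Server / Instance / Value fields listed for
    # WindowsPerformanceCounterMultiInstance.
    server: str
    instance: str
    value: float


def low_disk_space(free_mb, free_pct, free_total):
    """Evaluate the Low Disk Space alert logic (illustrative sketch)."""

    def is_drive(item):
        # Assumption: instances containing ":" are drive letters like "C:".
        return ":" in item.instance

    any_low_mb = any(i.value < 1024 and is_drive(i) for i in free_mb)
    any_low_pct = any(i.value < 20 and is_drive(i) for i in free_pct)
    return (any_low_mb or any_low_pct) and free_total > 0


# Illustrative data: C: has plenty of megabytes free but only 15% free.
free_mb = [
    PerformanceCounterInstance("vm0", "C:", 5000.0),
    PerformanceCounterInstance("vm0", "_Total", 500.0),
]
free_pct = [PerformanceCounterInstance("vm0", "C:", 15.0)]
alert = low_disk_space(free_mb, free_pct, free_total=5000.0)
```

In this sample the "_Total" entry is ignored because it has no ":" in its name, and the alert is raised by the percentage check alone.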

Pre-configured Actions

Action Name Command Type Expression Severity Description
Daily reboot AzureVmScaleSetInstanceReboot CheckTimeUtc.Hour == (InstanceIndex % 24) Information Disabled by default. Reboots VMSS instances once per day. A reboot happens when the instance's index matches the current clock hour (in UTC). For example, the 1st instance is rebooted at UTC midnight, the 2nd instance at 1am UTC, etc. For deployments with 25+ instances, this action reboots every 24th instance. For example, for a deployment with 100 instances, at UTC midnight the 1st, 25th, 49th, 73rd and 97th instances will be rebooted; at 1am UTC, the 2nd, 26th, 50th, 74th and 98th instances will be rebooted; etc. More information here: http://support.cloudmonix.com/support/solutions/articles/5000629071
Low Ram Reboot AzureVmScaleSetInstanceReboot MemoryFree < 100 Warning Disabled by default. Reboots the VMSS instance if available memory drops below 100 MB for 5 minutes sustained. This action will not be executed more than once per hour due to the Suspended period setting.
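
The Daily reboot expression, CheckTimeUtc.Hour == (InstanceIndex % 24), can be checked with a few lines of Python. This is a sketch for working out which instances match a given hour, not part of CloudMonix. It assumes InstanceIndex is zero-based, which is inferred from the statement that the 1st instance is rebooted at UTC midnight; the function name is hypothetical.

```python
def instances_rebooted_at(hour_utc, instance_count):
    """Return 1-based positions of instances matching the reboot expression.

    Assumes a zero-based InstanceIndex, so position = index + 1.
    """
    return [
        index + 1
        for index in range(instance_count)
        if index % 24 == hour_utc
    ]


# For a 100-instance deployment:
midnight = instances_rebooted_at(0, 100)  # positions matching hour 0
one_am = instances_rebooted_at(1, 100)    # positions matching hour 1
```

Under that assumption, every instance matches exactly one hour of the day, so each instance is rebooted once per 24 hours and no more than ceil(instance_count / 24) instances are rebooted in the same hour.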

Pre-configured Metrics

Metric Name Metric Type Description
ApplicationsEventLogs WindowsEventLogEntry Tracks entries from the Windows Event Log (Application source)
CpuTime WindowsPerformanceCounter Tracks overall CPU utilization on the monitored server
CpuTime30MinAverage AggregatedMetric Tracks 30-minute CPU utilization average across all instances within monitored VMSS
DiskFreeSpace WindowsPerformanceCounterMultiInstance Tracks free space in Megabytes on each disk
DiskFreeSpaceTotal WindowsPerformanceCounter Tracks total amount of free space across all drives
DiskIdleTime WindowsPerformanceCounter Tracks the percentage of time the disk is idle. Sustained numbers below 20% indicate an over-saturated disk.
DiskReadSpeed WindowsPerformanceCounter Tracks average time, in seconds, it takes to read data from the disk
DiskWriteSpeed WindowsPerformanceCounter Tracks average time, in seconds, it takes to write data to the disk
Instances AzureVmssInstanceDetails
MemoryCommittedPct WindowsPerformanceCounter Tracks the amount of virtual memory in use. It is the ratio of Committed Bytes to the Commit Limit
MemoryFree WindowsPerformanceCounter Tracks free memory (in MBs) on the monitored instance
RecommendedActions AzureAdvisorRecommendationMetric Tracks recommended actions for specified resource.
Status ResourceStatus Tracks the overall running status of the monitored instances within Scale Set
SystemEventLogs WindowsEventLogEntry Tracks entries from the Windows Event Log (System source)

Pre-configured Alerts

Alert Name Expression Severity Description
High CPU CpuTime > 70 Warning Raises an alert when CPU utilization for a specific instance is over 70% for the last 5 minutes sustained
Low Disk Space Any(DiskFreeSpace, "Value < 1024 && Contains(Instance,\":\")") && DiskFreeSpaceTotal > 0 Warning Raises an alert when any of the disks has less than 1GB of free space left
Low Memory MemoryFree < 100 Warning Raises an alert if the amount of available physical memory on a specific instance falls below 100 MB, sustained for the last 2 monitoring cycles
Resource Outage Status == "Down" Error Raises an alert when the monitored server is reported as not-Ready by Azure, or if no metrics come through from diagnostic agents, for a sustained period of time

Pre-configured Actions

Action Name Command Type Expression Severity Description
Daily reboot AzureVmScaleSetInstanceReboot CheckTimeUtc.Hour == (InstanceIndex % 24) Information Disabled by default. Reboots VMSS instances once per day. A reboot happens when the instance's index matches the current clock hour (in UTC). For example, the 1st instance is rebooted at UTC midnight, the 2nd instance at 1am UTC, etc. For deployments with 25+ instances, this action reboots every 24th instance. For example, for a deployment with 100 instances, at UTC midnight the 1st, 25th, 49th, 73rd and 97th instances will be rebooted; at 1am UTC, the 2nd, 26th, 50th, 74th and 98th instances will be rebooted; etc. More information here: http://support.cloudmonix.com/support/solutions/articles/5000629071
Low Ram Reboot AzureVmScaleSetInstanceReboot MemoryFree < 100 Warning Disabled by default. Reboots the VMSS instance if available memory drops below 100 MB for 5 minutes sustained. This action will not be executed more than once per hour due to the Suspended period setting.

Pre-configured Metrics

Metric Name Metric Type Description
ApplicationsEventLogs WindowsEventLogEntry Tracks entries from the Windows Event Log (Application source)
AspNetApplicationRestarts WindowsPerformanceCounter Tracks the number of times that an application has been restarted during the Web server's lifetime. Application restarts are incremented each time an Application_OnEnd event is raised. An application restart can occur because of changes to the Web.config file, changes to assemblies stored in the application's Bin directory, or when an application must be recompiled due to numerous changes in ASP.NET Web pages. Unexpected increases in this counter can mean that problems are causing the Web application to recycle.
AspNetBytesOut WindowsPerformanceCounter Tracks total size in bytes of responses sent to a client. Does not include HTTP response headers.
AspNetErrors WindowsPerformanceCounter Tracks the average number of errors that occurred per second during the execution of HTTP requests. Includes any parser, compilation, or run-time errors.
AspNetRequests WindowsPerformanceCounter Tracks the number of requests executed per second. This represents the current throughput of the application. Under constant load, this number should remain within a certain range, barring other server work (such as garbage collection, cache cleanup thread, external server tools, and so on).
AspNetRequestsQueued WindowsPerformanceCounter Tracks the number of requests waiting for service from the queue. When this number starts to increment linearly with increased client load, the Web server computer has reached the limit of concurrent requests that it can process.
AspNetRequestsRejected WindowsPerformanceCounter Tracks the total number of requests not executed because of insufficient server resources to process them. This counter represents the number of requests that return a 503 HTTP status code, indicating that the server is too busy
AspNetRequestsWaitTime WindowsPerformanceCounter Tracks the number of milliseconds that the most recent request waited in the queue for processing
CpuTime WindowsPerformanceCounter Tracks overall CPU utilization on the monitored server
CpuTime30MinAverage AggregatedMetric Tracks 30-minute CPU utilization average across all instances within monitored VMSS
DiskFreeSpace WindowsPerformanceCounterMultiInstance Tracks free space in Megabytes on each disk
DiskFreeSpaceTotal WindowsPerformanceCounter Tracks total amount of free space across all drives
DiskIdleTime WindowsPerformanceCounter Tracks the percentage of time the disk is idle. Sustained numbers below 20% indicate an over-saturated disk.
DiskReadSpeed WindowsPerformanceCounter Tracks average time, in seconds, it takes to read data from the disk
DiskWriteSpeed WindowsPerformanceCounter Tracks average time, in seconds, it takes to write data to the disk
Instances AzureVmssInstanceDetails
MemoryCommittedPct WindowsPerformanceCounter Tracks the amount of virtual memory in use. It is the ratio of Committed Bytes to the Commit Limit
MemoryFree WindowsPerformanceCounter Tracks free memory (in MBs) on the monitored instance
RecommendedActions AzureAdvisorRecommendationMetric Tracks recommended actions for specified resource.
Status ResourceStatus Tracks the overall running status of the monitored instances within Scale Set
SystemEventLogs WindowsEventLogEntry Tracks entries from the Windows Event Log (System source)

Pre-configured Alerts

Alert Name Expression Severity Description
High CPU CpuTime > 70 Warning Raises an alert when CPU utilization for a specific instance is over 70% for the last 5 minutes sustained
Low Disk Space Any(DiskFreeSpace, "Value < 1024 && Contains(Instance,\":\")") && DiskFreeSpaceTotal > 0 Warning Raises an alert when any of the disks has less than 1GB of free space left
Low Memory MemoryFree < 100 Warning Raises an alert if the amount of available physical memory on a specific instance falls below 100 MB, sustained for the last 2 monitoring cycles
Requests are Queueing Up AspNetRequestsQueued > 10 Warning Raises an alert when the number of queued requests exceeds 10, for 5 minutes sustained. Queued requests indicate that IIS or backend processes are not able to process the requests quickly enough
Resource Outage Status == "Down" Error Raises an alert when the monitored server is reported as not-Ready by Azure, or if no metrics come through from diagnostic agents, for a sustained period of time

Pre-configured Actions

Action Name Command Type Expression Severity Description
Daily reboot AzureVmScaleSetInstanceReboot CheckTimeUtc.Hour == (InstanceIndex % 24) Information Disabled by default. Reboots VMSS instances once per day. A reboot happens when the instance's index matches the current clock hour (in UTC). For example, the 1st instance is rebooted at UTC midnight, the 2nd instance at 1am UTC, etc. For deployments with 25+ instances, this action reboots every 24th instance. For example, for a deployment with 100 instances, at UTC midnight the 1st, 25th, 49th, 73rd and 97th instances will be rebooted; at 1am UTC, the 2nd, 26th, 50th, 74th and 98th instances will be rebooted; etc. More information here: http://support.cloudmonix.com/support/solutions/articles/5000629071
Low Ram Reboot AzureVmScaleSetInstanceReboot MemoryFree < 100 Warning Disabled by default. Reboots the VMSS instance if available memory drops below 100 MB for 5 minutes sustained. This action will not be executed more than once per hour due to the Suspended period setting.