CloudMonix | Azure Windows VM Scale Set Documentation

Available Metrics

Possible metric types that can be tracked on a given resource.

Metric Type Data Type Description

AggregatedMetric

Object

Allows for aggregating existing metrics over a period of time. Useful for running sums or averages, and for extracting a single data item from collection-based metrics

AzureAdvisorHealthMetric

AzureAdvisorHealth

AvailabilityState	String
Summary	String
ReasonType	String
ReasonChronicity	String
DetailedStatus	String
OccuredTime	DateTime
ReportedTime	DateTime

Tracks Azure health using latest Azure Health API - more info @ https://docs.microsoft.com/en-us/rest/api/resourcehealth/availabilitystatuses/getbyresource/

AzureAdvisorRecommendationMetric

AzureAdvisorRecommendation[]

Category	String
Impact	String
LastUpdated	Nullable`1
Risk	String
Problem	String
Solution	String

Tracks Azure recommendations using latest Azure Advisor API - more info @ https://docs.microsoft.com/en-us/rest/api/advisor/

AzureMonitorMetric

Double

Tracks Azure metrics using latest Azure Monitor API - more info @ https://docs.microsoft.com/en-us/rest/api/monitor/

AzureServiceFabricStatus

String

Indicates the state of the Azure Service Fabric

AzureVirtualMachineOperations

AzureOperation

Name	String
Category	String
Description	String
Caller	String
EventName	String
Level	String
Status	String
SubStatus	String
ExtendedInfo	String
EventTimestamp	DateTime

AzureVmssInstanceDetails

AzureVmssInstanceDetails[]

Instance	String
Size	String
ProvisioningState	String
PowerState	String
AgentState	String
StateDetails	String

Tracks detailed information about Azure VM instances as a list.

DerivedMetric

Double

Allows for deriving new metrics from existing ones. Useful for combining existing metrics together or for multiplying metrics by a factor

InternalUrlResponseCode

String

Tracks the HTTP result from testing an internal IP address. Possible values are HTTP status names: OK, Unauthorized, etc.

InternalUrlResponseTime

Double

Tracks the response time of an HTTP request to an internal IP address

InternalUrlState

UrlStatus[]

Host	String
Down	Boolean
ResponseTime	Double
StatusCode	Int32
ErrorMessage	String
Timestamp	String

Tracks the results of an HTTP request to an internal IP address

LinkedMetric

Object

Allows for tracking of metrics from other resources. Useful when it is important to evaluate metrics from different resources side-by-side

ResourceInstanceCount

Int32

Tracks current number of compute instances

ResourceStatus

String

Tracks overall status of the resource. This is an important metric as it is used to drive Uptime reports. Possible values: Ready, Down, Unknown and in some cases Stopped

WindowsCustomEventLogEntry

EventEntity

UniqueId	Guid
EventId	Int64
MachineName	String
Message	String
Source	String
UserName	String
EntryType	String
Timestamp	DateTime

Tracks entries from the Windows Event Log.

WindowsEventLogEntry

EventEntity

UniqueId	Guid
EventId	Int64
MachineName	String
Message	String
Source	String
UserName	String
EntryType	String
Timestamp	DateTime

Tracks entries from the Windows Event Log.

WindowsPerformanceCounter

Double

Tracks performance counters defined as individual metrics. Any performance counter can be tracked.

WindowsPerformanceCounterMultiInstance

PerformanceCounterInstance[]

Server	String
Instance	String
Value	Double

Tracks multi-instance performance counters. Returns an array of PerformanceCounterInstance objects, one for each counter instance.
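
To make the AggregatedMetric and DerivedMetric descriptions above more concrete, here is a minimal Python sketch of the three ideas they mention: averaging a metric over a time window, extracting one item from a collection-based metric, and multiplying an existing metric by a factor. The function names, the sample layout, and the window handling are illustrative assumptions, not CloudMonix's implementation or API.

```python
from datetime import datetime, timedelta

def aggregate_average(samples, now, window):
    """Average the (timestamp, value) samples that fall inside the window ending at `now`."""
    recent = [value for ts, value in samples if now - window <= ts <= now]
    return sum(recent) / len(recent) if recent else None

def extract_instance(counter_instances, instance_name):
    """Pick one item out of a collection-based metric (dicts with Server/Instance/Value keys)."""
    for item in counter_instances:
        if item["Instance"] == instance_name:
            return item["Value"]
    return None

def derive_scaled(value, factor):
    """Derive a new metric by multiplying an existing one by a factor."""
    return value * factor

now = datetime(2024, 1, 1, 12, 0)
cpu_samples = [
    (now - timedelta(minutes=40), 90.0),  # older than the 30-minute window, ignored
    (now - timedelta(minutes=20), 50.0),
    (now - timedelta(minutes=10), 70.0),
]
cpu_30min_avg = aggregate_average(cpu_samples, now, timedelta(minutes=30))

# Hypothetical multi-instance counter values, in megabytes
disk_free = [
    {"Server": "vm0", "Instance": "C:", "Value": 2048.0},
    {"Server": "vm0", "Instance": "D:", "Value": 512.0},
]
# Extract the C: drive value and convert megabytes to gigabytes
c_drive_free_gb = derive_scaled(extract_instance(disk_free, "C:"), 1 / 1024)
```

The dictionary keys mirror the Server, Instance and Value fields listed for PerformanceCounterInstance; everything else is made up for the example.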

Available Commands

Possible commands that can be executed on a given resource. Ultimate subscription is required.

Command Type		Description
AzureVmScaleSetInstanceReboot		Reboots specified VM ScaleSet instance
AzureVmScaleSetInstanceReimage		Reimages specified VM ScaleSet instance
AzureVmScaleSetStart		Starts specified VM ScaleSet instance
AzureVmScaleSetStopDeallocate		Deallocates specified VM ScaleSet instance
WebRequest		Runs custom WebRequest to specified URL

Default Templates

Default monitoring templates provided by CloudMonix.

Sample configuration for Windows Service Fabric Scale Set

Pre-configured Metrics

Metric Name	Metric Type	Description
ApplicationsEventLogs	WindowsEventLogEntry	Tracks entries from the Windows Event Log (Application source)
AvgDiskReadPerSec	WindowsPerformanceCounter	Tracks the average time it takes to make a read across all drives
AvgDiskReadQueueLength	WindowsPerformanceCounter	Tracks the average Read queue length across all drives
AvgDiskWritePerSec	WindowsPerformanceCounter	Tracks the average time it takes to make a write across all drives
AvgDiskWriteQueueLength	WindowsPerformanceCounter	Tracks the average Write queue length across all drives
CpuTime	WindowsPerformanceCounter	Tracks overall CPU utilization on the monitored server
CpuTime5MinAverage	AggregatedMetric	Tracks 5-minute CPU utilization average across all instances within monitored VMSS
DiskFreeSpace	WindowsPerformanceCounterMultiInstance	Tracks free space in Megabytes on each disk
DiskFreeSpacePct	WindowsPerformanceCounterMultiInstance	Tracks amount of free space in % on each disk
DiskFreeSpaceTotal	WindowsPerformanceCounter	Tracks total amount of free space across all drives
DiskIdleTime	WindowsPerformanceCounter	Tracks the percentage of time the disk was idle. Sustained numbers below 20% indicate an over-saturated disk.
DiskReadsBytesPerSec	WindowsPerformanceCounter	Tracks the Read throughput across all drives
DiskReadSpeed	WindowsPerformanceCounter	Tracks average time, in seconds, it takes to read data from the disk
DiskWritesBytesPerSec	WindowsPerformanceCounter	Tracks the Writes throughput across all drives
DiskWriteSpeed	WindowsPerformanceCounter	Tracks average time, in seconds, it takes to write data to the disk
FabricProcessDetected	AggregatedMetric
FabricProcessMemory	WindowsPerformanceCounter
InboundNetworkErrors	WindowsPerformanceCounterMultiInstance	Tracks the number of inbound packets that could not be received because of errors
Instances	AzureVmssInstanceDetails
MemoryCommittedPct	WindowsPerformanceCounter	Tracks the amount of virtual memory in use. It is the ratio of Committed Bytes to the Commit Limit
MemoryFree	WindowsPerformanceCounter	Tracks free memory (in MBs) on the monitored instance
OutboundNetworkErrors	WindowsPerformanceCounterMultiInstance	Tracks the number of outbound packets that could not be transmitted because of errors
PagingFileUsage	WindowsPerformanceCounter	Overall usage of all paging files in %
ServiceFabricAvgInvocationTime	WindowsPerformanceCounterMultiInstance	Tracks the time taken to execute service methods in milliseconds
ServiceFabricAvgRequestTime	WindowsPerformanceCounterMultiInstance	Tracks time taken (in milliseconds) by the service to process requests
ServiceFabricExceptions	WindowsPerformanceCounterMultiInstance	Tracks the number of times that service methods throw an exception per second
ServiceFabricInvocations	WindowsPerformanceCounterMultiInstance	Tracks the number of times that service methods are invoked per second
ServiceFabricOutstandingRequests	WindowsPerformanceCounterMultiInstance	Tracks the number of requests being processed by services
ServiceFabricStatus	AzureServiceFabricStatus	Tracks the Service Fabric status of the monitored Scale Set
Status	ResourceStatus	Tracks the overall running status of the monitored instances within Scale Set
SystemEventLogs	WindowsEventLogEntry	Tracks entries from the Windows Event Log (System source)
SystemUptime	WindowsPerformanceCounter

Pre-configured Alerts

Alert Name	Expression	Severity	Description
Fabric Process Not Running	`FabricProcessDetected == 0`	Warning
High CPU	`CpuTime > 70`	Warning	Raises an alert when CPU utilization on a specific instance stays above 70% for the last 5 minutes
Instance Was Rebooted	`SystemUptime < 600`	Warning
Low Disk Space	`(Any(DiskFreeSpace, "Value < 1024 && Contains(Instance,\":\")") || Any(DiskFreeSpacePct, "Value < 20 && Contains(Instance,\":\")") ) && DiskFreeSpaceTotal > 0`	Warning	Raises an alert when any of the disks has less than 1GB or 20% of free space left
Low Memory	`MemoryFree < 100`	Warning	Raises an alert if available physical memory on a specific instance stays below 100MB for the last 2 monitoring cycles
Resource Outage	`Status == "Down"`	Error	Raises an alert when the monitored server is reported as not-Ready by Azure, or if no metrics come through from diagnostic agents, for a sustained period of time
Service Fabric Is Not Ready	`ServiceFabricStatus != "Ready"`	Warning	Raises an alert when monitored Service Fabric resource is reported as not-Ready by Azure for a sustained period of time
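
The Low Disk Space expression above combines two collection checks with a guard on the total. A rough Python sketch of the same logic follows. It assumes that `Any` returns true when at least one item in the collection satisfies the condition, that `Contains(Instance, ":")` is there to select drive-letter instances such as `C:` and skip aggregate instances such as `_Total`, and that the `DiskFreeSpaceTotal > 0` guard suppresses the alert when no disk data has been collected. None of these readings is confirmed by the text above, and the function names are hypothetical.

```python
def any_match(instances, predicate):
    """Assumed semantics of Any(collection, condition): true if any item satisfies it."""
    return any(predicate(item) for item in instances)

def is_drive(item):
    """Assumed purpose of Contains(Instance, ":"): keep drive letters such as 'C:'."""
    return ":" in item["Instance"]

def low_disk_space(disk_free_space, disk_free_space_pct, disk_free_space_total):
    low_mb = any_match(disk_free_space, lambda i: i["Value"] < 1024 and is_drive(i))
    low_pct = any_match(disk_free_space_pct, lambda i: i["Value"] < 20 and is_drive(i))
    return (low_mb or low_pct) and disk_free_space_total > 0

# Hypothetical samples: C: has plenty of megabytes free but only 15% of its capacity
free_mb = [{"Instance": "C:", "Value": 5000.0}, {"Instance": "_Total", "Value": 900.0}]
free_pct = [{"Instance": "C:", "Value": 15.0}, {"Instance": "_Total", "Value": 15.0}]

alert_fires = low_disk_space(free_mb, free_pct, disk_free_space_total=5000.0)
alert_without_data = low_disk_space(free_mb, free_pct, disk_free_space_total=0.0)
```

In this sample the `_Total` row is below 1024 MB but is ignored by the drive-letter filter, so the alert fires only because of the percentage check on `C:`.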

Pre-configured Actions

Action Name	Command Type	Expression	Severity	Description
Daily reboot	AzureVmScaleSetInstanceReboot	`CheckTimeUtc.Hour == (InstanceIndex % 24)`	Information	Disabled by default. Reboots VMSS instances once per day. A reboot happens when the instance's index matches the current clock hour (in UTC). For example, the 1st instance is rebooted at UTC midnight, the 2nd instance at 1am UTC, etc. For deployments with 25+ instances, every 24th instance is rebooted in the same hour. For example, in a deployment with 100 instances, the 1st, 25th, 49th, 73rd and 97th instances are rebooted at UTC midnight; the 2nd, 26th, 50th, 74th and 98th instances at 1am UTC; etc. More information here: http://support.cloudmonix.com/support/solutions/articles/5000629071
Low Ram Reboot	AzureVmScaleSetInstanceReboot	`MemoryFree < 100`	Warning	Disabled by default. Reboots the VMSS instance if available memory stays below 100MB for 5 minutes. This action will not be executed more than once per hour because of the Suspended period setting.
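
The Daily reboot schedule can be worked out by hand from `CheckTimeUtc.Hour == (InstanceIndex % 24)`. The short Python sketch below does that arithmetic. It assumes `InstanceIndex` is zero-based, which is consistent with the 1st instance being rebooted at UTC midnight; the function name is hypothetical.

```python
def instances_due_for_reboot(hour_utc, instance_count):
    """Return 1-based positions of the instances whose zero-based index
    satisfies hour_utc == index % 24."""
    return [index + 1 for index in range(instance_count) if hour_utc == index % 24]

midnight_batch = instances_due_for_reboot(0, 100)  # [1, 25, 49, 73, 97]
one_am_batch = instances_due_for_reboot(1, 100)    # [2, 26, 50, 74, 98]
small_set = instances_due_for_reboot(0, 10)        # [1]
```

With 24 or fewer instances each hour reboots at most one instance; beyond that, instances whose indexes differ by a multiple of 24 share the same hour.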

Sample configuration for basic Azure VM Scale Set

Pre-configured Metrics

Metric Name	Metric Type	Description
ApplicationsEventLogs	WindowsEventLogEntry	Tracks entries from the Windows Event Log (Application source)
CpuTime	WindowsPerformanceCounter	Tracks overall CPU utilization on the monitored server
CpuTime30MinAverage	AggregatedMetric	Tracks 30-minute CPU utilization average across all instances within monitored VMSS
DiskFreeSpace	WindowsPerformanceCounterMultiInstance	Tracks free space in Megabytes on each disk
DiskFreeSpaceTotal	WindowsPerformanceCounter	Tracks total amount of free space across all drives
DiskIdleTime	WindowsPerformanceCounter	Tracks the percentage of time the disk was idle. Sustained numbers below 20% indicate an over-saturated disk.
DiskReadSpeed	WindowsPerformanceCounter	Tracks average time, in seconds, it takes to read data from the disk
DiskWriteSpeed	WindowsPerformanceCounter	Tracks average time, in seconds, it takes to write data to the disk
Instances	AzureVmssInstanceDetails
MemoryCommittedPct	WindowsPerformanceCounter	Tracks the amount of virtual memory in use. It is the ratio of Committed Bytes to the Commit Limit
MemoryFree	WindowsPerformanceCounter	Tracks free memory (in MBs) on the monitored instance
RecommendedActions	AzureAdvisorRecommendationMetric	Tracks recommended actions for specified resource.
Status	ResourceStatus	Tracks the overall running status of the monitored instances within Scale Set
SystemEventLogs	WindowsEventLogEntry	Tracks entries from the Windows Event Log (System source)

Pre-configured Alerts

Alert Name	Expression	Severity	Description
High CPU	`CpuTime > 70`	Warning	Raises an alert when CPU utilization on a specific instance stays above 70% for the last 5 minutes
Low Disk Space	`Any(DiskFreeSpace, "Value < 1024 && Contains(Instance,\":\")") && DiskFreeSpaceTotal > 0`	Warning	Raises an alert when any of the disks has less than 1GB of free space left
Low Memory	`MemoryFree < 100`	Warning	Raises an alert if available physical memory on a specific instance stays below 100MB for the last 2 monitoring cycles
Resource Outage	`Status == "Down"`	Error	Raises an alert when the monitored server is reported as not-Ready by Azure, or if no metrics come through from diagnostic agents, for a sustained period of time
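
Several alerts above fire only when a condition holds across consecutive monitoring cycles rather than on a single reading. How CloudMonix evaluates this internally is not described here; the Python sketch below shows one plausible reading, in which the condition must be true for each of the last N samples. The function name and the sample values are hypothetical.

```python
def sustained(samples, predicate, cycles):
    """True when the most recent `cycles` samples all satisfy the predicate.
    Returns False if fewer than `cycles` samples are available."""
    if len(samples) < cycles:
        return False
    return all(predicate(value) for value in samples[-cycles:])

def below_100(value):
    """Mirrors the MemoryFree < 100 expression."""
    return value < 100

# Hypothetical MemoryFree readings in MB, oldest first
steady_drop = [250, 120, 95, 80]
brief_dip = [250, 95, 120, 80]

low_memory_alert = sustained(steady_drop, below_100, cycles=2)
no_alert_on_dip = sustained(brief_dip, below_100, cycles=2)
```

Requiring consecutive samples keeps a single brief dip from raising an alert, at the cost of a delay of a few monitoring cycles before a real problem is reported.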

Pre-configured Actions

Action Name	Command Type	Expression	Severity	Description
Daily reboot	AzureVmScaleSetInstanceReboot	`CheckTimeUtc.Hour == (InstanceIndex % 24)`	Information	Disabled by default. Reboots VMSS instances once per day. A reboot happens when the instance's index matches the current clock hour (in UTC). For example, the 1st instance is rebooted at UTC midnight, the 2nd instance at 1am UTC, etc. For deployments with 25+ instances, every 24th instance is rebooted in the same hour. For example, in a deployment with 100 instances, the 1st, 25th, 49th, 73rd and 97th instances are rebooted at UTC midnight; the 2nd, 26th, 50th, 74th and 98th instances at 1am UTC; etc. More information here: http://support.cloudmonix.com/support/solutions/articles/5000629071
Low Ram Reboot	AzureVmScaleSetInstanceReboot	`MemoryFree < 100`	Warning	Disabled by default. Reboots the VMSS instance if available memory stays below 100MB for 5 minutes. This action will not be executed more than once per hour because of the Suspended period setting.

Sample configuration for IIS farm on Azure VM Scale Set

Pre-configured Metrics

Metric Name	Metric Type	Description
ApplicationsEventLogs	WindowsEventLogEntry	Tracks entries from the Windows Event Log (Application source)
AspNetApplicationRestarts	WindowsPerformanceCounter	Tracks the number of times that an application has been restarted during the Web server's lifetime. Application restarts are incremented each time an Application_OnEnd event is raised. An application restart can occur because of changes to the Web.config file, changes to assemblies stored in the application's Bin directory, or when an application must be recompiled due to numerous changes in ASP.NET Web pages. Unexpected increases in this counter can mean that problems are causing the Web application to recycle.
AspNetBytesOut	WindowsPerformanceCounter	Tracks total size in bytes of responses sent to a client. Does not include HTTP response headers.
AspNetErrors	WindowsPerformanceCounter	Tracks the average number of errors that occurred per second during the execution of HTTP requests. Includes any parser, compilation, or run-time errors.
AspNetRequests	WindowsPerformanceCounter	Tracks the number of requests executed per second. This represents the current throughput of the application. Under constant load, this number should remain within a certain range, barring other server work (such as garbage collection, cache cleanup thread, external server tools, and so on).
AspNetRequestsQueued	WindowsPerformanceCounter	Tracks the number of requests waiting for service from the queue. When this number starts to increment linearly with increased client load, the Web server computer has reached the limit of concurrent requests that it can process.
AspNetRequestsRejected	WindowsPerformanceCounter	Tracks the total number of requests not executed because of insufficient server resources to process them. This counter represents the number of requests that return a 503 HTTP status code, indicating that the server is too busy
AspNetRequestsWaitTime	WindowsPerformanceCounter	Tracks the number of milliseconds that the most recent request waited in the queue for processing
CpuTime	WindowsPerformanceCounter	Tracks overall CPU utilization on the monitored server
CpuTime30MinAverage	AggregatedMetric	Tracks 30-minute CPU utilization average across all instances within monitored VMSS
DiskFreeSpace	WindowsPerformanceCounterMultiInstance	Tracks free space in Megabytes on each disk
DiskFreeSpaceTotal	WindowsPerformanceCounter	Tracks total amount of free space across all drives
DiskIdleTime	WindowsPerformanceCounter	Tracks the percentage of time the disk was idle. Sustained numbers below 20% indicate an over-saturated disk.
DiskReadSpeed	WindowsPerformanceCounter	Tracks average time, in seconds, it takes to read data from the disk
DiskWriteSpeed	WindowsPerformanceCounter	Tracks average time, in seconds, it takes to write data to the disk
Instances	AzureVmssInstanceDetails
MemoryCommittedPct	WindowsPerformanceCounter	Tracks the amount of virtual memory in use. It is the ratio of Committed Bytes to the Commit Limit
MemoryFree	WindowsPerformanceCounter	Tracks free memory (in MBs) on the monitored instance
RecommendedActions	AzureAdvisorRecommendationMetric	Tracks recommended actions for specified resource.
Status	ResourceStatus	Tracks the overall running status of the monitored instances within Scale Set
SystemEventLogs	WindowsEventLogEntry	Tracks entries from the Windows Event Log (System source)

Pre-configured Alerts

Alert Name	Expression	Severity	Description
High CPU	`CpuTime > 70`	Warning	Raises an alert when CPU utilization on a specific instance stays above 70% for the last 5 minutes
Low Disk Space	`Any(DiskFreeSpace, "Value < 1024 && Contains(Instance,\":\")") && DiskFreeSpaceTotal > 0`	Warning	Raises an alert when any of the disks has less than 1GB of free space left
Low Memory	`MemoryFree < 100`	Warning	Raises an alert if available physical memory on a specific instance stays below 100MB for the last 2 monitoring cycles
Requests are Queueing Up	`AspNetRequestsQueued > 10`	Warning	Raises an alert when the number of queued requests stays above 10 for 5 minutes. Queued requests indicate that IIS or backend processes are not able to process the requests quickly enough
Resource Outage	`Status == "Down"`	Error	Raises an alert when the monitored server is reported as not-Ready by Azure, or if no metrics come through from diagnostic agents, for a sustained period of time

Pre-configured Actions

Action Name	Command Type	Expression	Severity	Description
Daily reboot	AzureVmScaleSetInstanceReboot	`CheckTimeUtc.Hour == (InstanceIndex % 24)`	Information	Disabled by default. Reboots VMSS instances once per day. A reboot happens when the instance's index matches the current clock hour (in UTC). For example, the 1st instance is rebooted at UTC midnight, the 2nd instance at 1am UTC, etc. For deployments with 25+ instances, every 24th instance is rebooted in the same hour. For example, in a deployment with 100 instances, the 1st, 25th, 49th, 73rd and 97th instances are rebooted at UTC midnight; the 2nd, 26th, 50th, 74th and 98th instances at 1am UTC; etc. More information here: http://support.cloudmonix.com/support/solutions/articles/5000629071
Low Ram Reboot	AzureVmScaleSetInstanceReboot	`MemoryFree < 100`	Warning	Disabled by default. Reboots the VMSS instance if available memory stays below 100MB for 5 minutes. This action will not be executed more than once per hour because of the Suspended period setting.