Monitoring Workflow Manager 1.0

 

Updated: October 26, 2012

To keep Workflow Manager 1.0 highly available and reliable, monitor your server so that you can confirm it is healthy and detect failures as soon as they occur, in time to take corrective action. This article discusses the capabilities available for monitoring your Workflow Manager 1.0 environment.

Typical ways to monitor a server include the following:

  1. Performance counters

  2. Event tracing

  3. PowerShell

  4. System Center Operations Manager Management Pack

Performance Counters

Performance counters show how well the server is performing. They are grouped into counter sets.

Workflow Manager 1.0 emits its own performance counters to help you monitor your server. It defines two counter sets, Management and Dispatcher, and each individual counter belongs to one of them. To view the counters, open Performance Monitor on a computer with Workflow Manager 1.0 installed and look at the "Workflow Management" and "Workflow Dispatcher" counter sets.

The following table summarizes the performance counters available in these two sets.

| Index | Performance Counter | Details |
| --- | --- | --- |
| 1 | Management requests per second | Number of requests processed by the front end per second on a given node. |
| 2 | Workflow events per second | Number of successful PublishNotification calls per second on a given node. |
| 3 | Management request failures per second | Number of front-end calls per second on a given node that result in an error response to the caller. The errors could be caused by bad requests, authorization errors, or validation errors. |
| 4 | Authorization errors per second | Number of authorization errors per second on a given node. |
| 5 | Publish workflow event duration | Average latency of publishing a workflow notification. |
| 6 | Episodes outstanding | Number of workflow instances executing on a given backend node. |
| 7 | Episodes failed per second | Number of workflow instance execution errors reported per second on a given backend node. |
| 8 | Events processed per second | Number of workflow notifications successfully processed per second on a given node. |

The following is an example of a health model derived from the above performance counters.

| Symptom | Source | Content: Cause, Resolution, Summary |
| --- | --- | --- |
| Node not appearing to be processing any messages. | RequestsProcessedPerSecond | No activity for 10 minutes. |
| Workflow instance not appearing to complete | (EpisodesCompletedPerSecond / RequestsProcessedPerSecond) * 100 | Below N% - N can be user defined; for example, 10. |
| Workflow Instance Failure | RequestsFailedPerSecond | Number of failures. |
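
The rules in this health model can be expressed as simple checks over sampled counter values. The following is a minimal Python sketch, not part of Workflow Manager 1.0. It assumes the counter readings have already been collected (for example, exported from Performance Monitor) into a time-ordered list. The `Sample` type, the `evaluate` function, the symptom strings, and the default thresholds are hypothetical names and values chosen only to mirror the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One reading of the health-model sources (hypothetical container)."""
    minutes: float             # time of the reading, in minutes
    requests_processed: float  # RequestsProcessedPerSecond
    episodes_completed: float  # EpisodesCompletedPerSecond
    requests_failed: float     # RequestsFailedPerSecond


def evaluate(samples: List[Sample], idle_minutes: float = 10.0,
             min_completion_pct: float = 10.0) -> List[str]:
    """Return the health-model symptoms shown by time-ordered samples."""
    if not samples:
        return []
    symptoms = []
    latest = samples[-1]

    # Rule 1: no activity for idle_minutes. Only flag this when the
    # samples actually span the whole idle window.
    cutoff = latest.minutes - idle_minutes
    window = [s for s in samples if s.minutes >= cutoff]
    spans_window = samples[0].minutes <= cutoff
    if spans_window and all(s.requests_processed == 0 for s in window):
        symptoms.append("Node not processing messages")

    # Rule 2: completion percentage below N (user defined; 10 here).
    if latest.requests_processed > 0:
        pct = latest.episodes_completed / latest.requests_processed * 100
        if pct < min_completion_pct:
            symptoms.append("Workflow instances not completing")

    # Rule 3: any reported failures in the latest reading.
    if latest.requests_failed > 0:
        symptoms.append("Workflow instance failure")
    return symptoms
```

Calling `evaluate` with three all-zero samples taken at minutes 0, 5, and 10 reports only the idle-node symptom. The completion rule is skipped when no requests were processed, which avoids dividing by zero.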

You can also add performance counters from Windows such as CPU and Memory Utilization.

Event Tracing

Workflow Manager 1.0 components use Event Tracing for Windows (ETW) for tracing. ETW is well suited to tracing because it adds little performance overhead, and ETW logs are smaller than logs in other formats. All components of the service use an ETW provider named Microsoft-Workflow.

Workflow Manager 1.0 uses the following ETW channels, which are available by default.

  • Operational Channel: This channel is used for traces that report critical issues requiring operator involvement. Examples include a service faulting or an SLA threshold being reached.

  • Debug Channel: All diagnostic traces use this channel.

  • Analytic Channel: This channel is used for high value traces, such as the amount of time taken to complete an operation. The events can have additional metadata like scope or operation name.

A complete list of events generated by Workflow Manager 1.0 can be found in the Microsoft.Workflow.EventDefinitions.man ETW Manifest file located in the [InstallDrive]:\Program Files\Workflow Manager\1.0\Workflow folder.

The events in that file that are particularly important for monitoring the health of your server are listed in the table below.

| Issue | Event IDs emitted |
| --- | --- |
| WF backend startup failed | 289 |
| Unhandled exception | 1, 10, 19 |
| Frequent unhandled exceptions in a particular node | 5 events of 1, 10, or 19 within 30 minutes |
| Frequent Service Started events | 5 events of 288 or 582 within 30 minutes |
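
The two "frequent" rules in the table are sliding-window counts: an alert is raised when 5 matching events fall within any 30-minute span. The following is a minimal Python sketch of that check, not part of Workflow Manager 1.0. It assumes the events have already been read from the ETW log into time-ordered `(timestamp_in_minutes, event_id)` pairs. The function name, the constants, and the input shape are hypothetical, and treating events exactly 30 minutes apart as inside the window is an assumption.

```python
from collections import deque
from typing import Iterable, List, Set, Tuple

# Event ID groups taken from the table above.
UNHANDLED_EXCEPTION_IDS = {1, 10, 19}
SERVICE_STARTED_IDS = {288, 582}


def frequent_events(events: Iterable[Tuple[float, int]],
                    watched_ids: Set[int],
                    threshold: int = 5,
                    window_minutes: float = 30.0) -> List[float]:
    """Return the timestamps at which the threshold was met.

    events must be (timestamp_in_minutes, event_id) pairs sorted by time.
    """
    recent = deque()  # timestamps of matching events still in the window
    alerts = []
    for timestamp, event_id in events:
        if event_id not in watched_ids:
            continue
        recent.append(timestamp)
        # Discard matching events older than the window (assumption:
        # an event exactly window_minutes old still counts).
        while timestamp - recent[0] > window_minutes:
            recent.popleft()
        if len(recent) >= threshold:
            alerts.append(timestamp)
    return alerts
```

For example, passing `UNHANDLED_EXCEPTION_IDS` flags a node that logs event 1, 10, or 19 five times within half an hour, while the same five events spread over 45 minutes produce no alert.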

PowerShell Cmdlets

PowerShell is a convenient way to administer your Workflow Manager 1.0 server. Workflow Manager 1.0 includes cmdlets that report the state of the Workflow farm and its health status. A shortcut in the Workflow Manager 1.0 Programs group in the Start menu opens the Workflow PowerShell prompt. You can also invoke these cmdlets programmatically by importing the Workflow Manager 1.0 PowerShell modules. All Workflow Manager 1.0 cmdlets are defined in the Microsoft.Workflow.Commands PowerShell module, found in the Workflow Manager 1.0 installation directory.

There are two cmdlets that are particularly useful for server monitoring: Get-WFFarm and Get-WFFarmStatus.

Get-WFFarm

The Get-WFFarm cmdlet is a quick way to retrieve the details of your Workflow farm. It returns the following information about your farm.

| Value | Description |
| --- | --- |
| Hosts | Lists the hosts (or computers) in your farm. |
| Endpoints | Lists both the HTTP and HTTPS endpoints on the hosts. |
| WFFarmDBConnectionString | The connection string for the workflow farm database. The workflow farm database contains all of the configuration information for the farm. |
| RunAsAccount | The account under which the workflow backend service runs. |
| AdminGroup | The Windows Authentication security group that is configured as the Administrators group for the Workflow farm. |
| InstanceDBConnectionString | The connection string for the Instance database. The Instance database contains instance information for your persisted workflows. It is highly recommended that you do not update any information in this database. This connection string is used only as input to other offline cmdlets, such as those used for disaster recovery. |
| ResourceDBConnectionString | The connection string for the Resource database. The Resource database contains your workflow and activity definitions. It is highly recommended that you do not update any information in this database. This connection string is used only as input to other offline cmdlets, such as those used for disaster recovery. |
| HttpPort | The HTTP port of the Workflow front end, if the service is configured with HTTP. |
| HttpsPort | The HTTPS port of the Workflow front end. |
| OutboundCertificate | The thumbprint of the outbound certificate, and whether this certificate was autogenerated during installation. |
| SslCertificate | The thumbprint of the SSL certificate, and whether this certificate was autogenerated during installation. |

Get-WFFarmStatus

Note

Get-AzureWFFarmStatus is not included in Workflow Manager 1.0, but will be included as part of the 1.0 RTM.

The Get-WFFarmStatus cmdlet provides the basic status of the farm and its nodes.

For each node, Get-WFFarmStatus reports the health of the Workflow Backend Windows service and whether the Workflow front end was reachable on that node.

Management Pack

Note

Workflow Manager 1.0 does not include a Management Pack as part of installation, but it will be available for download separately around the time of our 1.0 RTM. This Management Pack will support Microsoft System Center 2012 as well as System Center 2007 R2.

The performance counters, event traces, and PowerShell cmdlets provide insight into the health of the farm. However, enterprise-class reliability requires not only constant monitoring of the server but also an alerting mechanism that activates when a failure is detected. The Microsoft System Center Operations Manager Management Pack provides this alerting capability.

The majority of the events and performance counters covered in this article will be supported in the System Center Management Pack. The management pack will be targeted at monitoring the Workflow Manager 1.0 farm and its nodes, and not particularly targeted at monitoring Workflow Manager 1.0 artifacts such as workflow instances.

The following diagram shows a typical health model for Workflow Manager 1.0.

[Figure: Workflow health model]
