Diagnostics and Error Reporting

Microsoft Lync Server 2010 significantly improves diagnostics and debugging for administrators and application developers. Microsoft Unified Communications Managed API (UCMA) 3.0 includes a number of improvements that help developers quickly discover, diagnose, and resolve failures in the platform or application that can cause a call or session to fail or terminate.

As part of the manageability and alerting capabilities in Lync Server 2010, UCMA 3.0 applications, Lync, and Lync Server 2010 place errors on the wire. These errors provide a first hint to an administrator tracking a failure in the Lync Server 2010 system. Errors are categorized and surfaced to the administrator of a Lync Server 2010 deployment according to severity and frequency of occurrence. Based on these factors, alerts are raised to the administrator to indicate failures or recurring issues within the Lync Server 2010 deployment or the application pool.

If the UCMA 3.0 platform causes the termination of an existing session or declines an incoming session, the platform automatically sends the appropriate error code on the wire in a header with the following form, shown in Backus-Naur form.

ms-diagnostics HCOLON ErrorId SEMI source-param SEMI reason-param *(SEMI generic-param)

ErrorId = unsigned-integer

Required. Value MUST be within unsigned integer range. ErrorId represents a specific error condition, and SHOULD be used by the SIP client to determine appropriate error handling behavior.

source-param = "source=" source-value

source-value = quoted-string

Optional. Value SHOULD be the FQDN or the IP address of the SIP server generating the header.

reason-param = "reason=" reason-value

reason-value = quoted-string

Optional. The value should give a specific reason or explanation for the error. The SIP client SHOULD NOT use this parameter value for defining error handling behavior. This parameter value can be used for SIP server troubleshooting purposes.

*(SEMI generic-param)

Optional. generic-param can be used to define custom attribute-value pairs to convey additional information to the SIP client on how to troubleshoot or fix the problem.
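The grammar above can be read as a recipe for assembling a header line. The following Python sketch is only an illustration of that grammar, not part of UCMA: the function names `quote` and `format_ms_diagnostics` are hypothetical, and the 32-bit upper bound for ErrorId is an assumption, because the text says only "unsigned integer range".

```python
def quote(value):
    """Render a value as a quoted-string, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def format_ms_diagnostics(error_id, source=None, reason=None, extra=None):
    """Build an ms-diagnostics header line from its parts (hypothetical helper).

    error_id is required; source, reason, and the extra attribute-value
    pairs (a dict of generic-param names to values) are optional.
    """
    # Assumption: "unsigned integer range" is taken to mean 32 bits.
    if isinstance(error_id, bool) or not isinstance(error_id, int):
        raise ValueError("error_id must be an integer")
    if not 0 <= error_id <= 0xFFFFFFFF:
        raise ValueError("error_id is outside the unsigned integer range")

    parts = [str(error_id)]
    if source is not None:
        parts.append("source=" + quote(source))
    if reason is not None:
        parts.append("reason=" + quote(reason))
    for name, value in (extra or {}).items():
        parts.append(name + "=" + quote(value))
    return "ms-diagnostics: " + ";".join(parts)
```

For instance, `format_ms_diagnostics(61001, source="app.contoso.com", reason="No input")` yields `ms-diagnostics: 61001;source="app.contoso.com";reason="No input"`.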

The following are three example headers. An "ms-diagnostics" header is sent for application endpoints; an "ms-client-diagnostics" header is sent for user endpoints.

ms-diagnostics: 24081;Component="RTCC/4.0.0.0_ContosoApplication";Reason="Endpoint termination";Source=applicationserver.contoso.com

ms-client-diagnostics: 24083;Component="RTCC/4.0.0.0_ContosoApplication";Reason="Message was received out of dialog.";request=BENOTIFY;Source=applicationserver.contoso.com

ms-client-diagnostics: 24067;Component="RTCC/4.0.0.0_ContosoApplication";Reason="Mcu is rolling over";Source=applicationserver.contoso.com
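When reading logs or traces, it can help to split such a header into its error code and parameters. The following Python sketch shows one way to do that. It is not a UCMA or Lync API: `parse_diagnostics` is a hypothetical name, and the choices to lowercase parameter names and to tolerate unquoted values and stray spaces are assumptions based on the example headers above.

```python
import re

# A parameter is name=value, where value is either a quoted-string
# (which may contain semicolons) or a bare token that runs to the next ';'.
_PARAM = re.compile(r'\s*([^=;\s]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^;]*)')


def parse_diagnostics(header_line):
    """Split a diagnostics header into (header name, error id, parameters)."""
    name, _, value = header_line.partition(":")
    error_text, _, rest = value.strip().partition(";")
    error_id = int(error_text.strip())  # raises ValueError if not numeric

    params = {}
    for match in _PARAM.finditer(rest):
        key = match.group(1).lower()  # assumption: names are case-insensitive
        raw = match.group(2).strip()
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            # Drop the surrounding quotes and undo backslash escapes.
            raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
        params[key] = raw
    return name.strip(), error_id, params
```

Applied to the second example header, this returns the name `ms-client-diagnostics`, the error code `24083`, and a dictionary whose `request` entry is `BENOTIFY`. In keeping with the grammar notes, error handling should branch on the numeric code only, and the `reason` text should be kept for troubleshooting.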

"ms-diagnostics" and "ms-client-diagnostics" do not cross federation boundaries. To provide diagnostic information to an administrator across a federation boundary, use an “ms-diagnostics-public” header.

For the full error code descriptions for UCMA 3.0, UCCP, and Lync Server 2010, see [MS-OCER]: Client Error Reporting Protocol Specification.

It is strongly advised that an application add its own diagnostic header whenever the application itself causes the API call that sends a failure or rejection, either directly or as a reaction to changing server or platform conditions. Applications can use the members of the DiagnosticsInformation class to supply the diagnostic code.

The UCMA 3.0 platform has reserved a set of diagnostic ranges solely for use by developers so that errors raised from the application will be captured and logged within the Lync Server 2010 reporting infrastructure.

The following table lists the reserved ranges.

Range and range name | Description
60000 - 60999 | These codes should be used to indicate "success": terminations or responses that are expected in the normal functioning of the application. An example is an application hanging up due to the user completing an Interactive Voice Response (IVR) application.
61000 - 61999 | These codes should be used to indicate expected failures: terminations or responses that are expected in common error cases. An example is an application hang-up due to a user not providing any input to an IVR, and the application timing out.
62000 - 62999 | These codes should be used to indicate unexpected failures: terminations or responses due to the application entering some unexpected state or corner case. An example is an AVFlow changing its state to Idle in the middle of a speech recognition attempt, and the application choosing to terminate at that time.
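A monitoring script or log filter might map a diagnostic code to one of the three reserved application ranges. The following Python sketch illustrates that lookup. The function name `classify_application_code` and the category labels are hypothetical and are not defined by UCMA; only the numeric ranges come from the table above.

```python
# Reserved application ranges from the table above, with illustrative labels.
_APPLICATION_RANGES = (
    (60000, 60999, "success"),
    (61000, 61999, "expected failure"),
    (62000, 62999, "unexpected failure"),
)


def classify_application_code(code):
    """Return the category label for an application diagnostic code.

    Returns None when the code lies outside the ranges reserved for
    applications, for example a code raised by the platform itself.
    """
    for low, high, label in _APPLICATION_RANGES:
        if low <= code <= high:
            return label
    return None
```

With this sketch, a code such as 61005 maps to "expected failure", while a platform code such as 24081 maps to `None`.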