HangRecoveryAction

Specifies the recovery action taken by the cluster service in response to a heartbeat countdown timeout.

Attribute    Value
Data type    DWORD
Access       Read/write
Structure    CLUSPROP_DWORD
Minimum      WatchdogActionDisable (0)

Windows Server 2008 R2 and Windows Server 2003:  ClussvcHangActionDisable (0) is the minimum value.

Maximum      WatchdogActionBugCheckAlsoFromProcess (4)

Windows Server 2008 R2 and Windows Server 2003:  ClussvcHangActionBugCheckMachine (3) is the maximum value.

Default      WatchdogActionBugCheckOnlyFromNetFt (3)

Windows Server 2008 R2:  ClussvcHangActionBugCheckMachine (3) is the default value.

Windows Server 2003:  ClussvcHangActionTerminateService (2) is the default value.

 

Remarks

The Cluster network driver (ClusNet) maintains a countdown timer. When the timer reaches 0 (zero), the driver performs the action specified by the HangRecoveryAction property. Whenever the driver receives a Cluster service heartbeat, it resets the countdown timer to the value of the ClusSvcHeartbeatTimeout property. Additionally, when the Cluster service stops for any reason, the driver automatically turns off the countdown timer.
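The countdown behavior described above can be modeled in a few lines of C. This is only an illustrative sketch, not the driver's implementation: the watchdog_model type, its fields, and all function names are hypothetical, and the timeout is counted in abstract ticks rather than in the units used by ClusSvcHeartbeatTimeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the countdown; not the driver's actual data structure. */
typedef struct {
    uint32_t timeout_ticks;   /* stands in for the ClusSvcHeartbeatTimeout value */
    uint32_t remaining_ticks; /* current countdown value */
    bool     armed;           /* false once the Cluster service has stopped */
} watchdog_model;

/* Start monitoring with a full countdown. */
void watchdog_start(watchdog_model *w, uint32_t timeout_ticks)
{
    assert(w != NULL && timeout_ticks > 0);
    w->timeout_ticks   = timeout_ticks;
    w->remaining_ticks = timeout_ticks;
    w->armed           = true;
}

/* A heartbeat from the Cluster service resets the countdown. */
void watchdog_heartbeat(watchdog_model *w)
{
    if (w->armed)
        w->remaining_ticks = w->timeout_ticks;
}

/* The Cluster service stopped: turn the countdown off. */
void watchdog_service_stopped(watchdog_model *w)
{
    w->armed = false;
}

/* Advance the countdown by one tick. Returns true exactly when the
   countdown reaches 0, which is the point at which the action selected
   by HangRecoveryAction would be carried out. */
bool watchdog_tick(watchdog_model *w)
{
    if (!w->armed || w->remaining_ticks == 0)
        return false;
    w->remaining_ticks--;
    return w->remaining_ticks == 0;
}
```

In this model a heartbeat that arrives before the countdown expires postpones the timeout by a full interval, and a stopped service never triggers the action.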

The HangRecoveryAction property can be set to the following values.

Value    Description

WatchdogActionDisable

Windows Server 2008 R2 and Windows Server 2003:  The name of the value is ClussvcHangActionDisable.

0

NetFt WatchDog: Takes no action.

Core Operations WatchDog: Takes no action.

Windows Server 2008 R2 and Windows Server 2003:  Disables the cluster heartbeat and monitoring mechanism.

WatchdogActionLog

Windows Server 2008 R2 and Windows Server 2003:  The name of the value is ClussvcHangActionLog.

1

NetFt WatchDog: Logs a system event.

Core Operations WatchDog: Logs a system event.

Windows Server 2008 R2 and Windows Server 2003:  Log an event in the system log of the event viewer when a heartbeat countdown timeout occurs.

WatchdogActionTerminateProcess

Windows Server 2008 R2 and Windows Server 2003:  The name of the value is ClussvcHangActionTerminateService.

2

NetFt WatchDog: Terminates the Cluster service.

Core Operations WatchDog: Terminates the cluster service.

Windows Server 2008 R2 and Windows Server 2003:  Terminate the cluster service when a heartbeat countdown timeout occurs.

WatchdogActionBugCheckOnlyFromNetFt

Windows Server 2008 R2 and Windows Server 2003:  The name of the value is ClussvcHangActionBugCheckMachine.

3

NetFt WatchDog: Bugchecks the machine.

Core Operations WatchDog: Terminates the cluster service.

Windows Server 2008 R2 and Windows Server 2003:  Create a system Stop error (BugCheck) when a heartbeat countdown timeout occurs.

WatchdogActionBugCheckAlsoFromProcess

Windows Server 2008 R2 and Windows Server 2003:  This value is not available.

4

NetFt WatchDog: Bugchecks the machine.

Core Operations WatchDog: Bugchecks the machine.
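The table above can be summarized as a lookup from the property value and the watchdog that fired to the resulting response. The following sketch uses the value names and numbers from the table, but declares them in a local enum for illustration; the watchdog_source and watchdog_response types and the response_for function are hypothetical names, and the sketch covers only the current value names, not the behavior noted for Windows Server 2008 R2 and Windows Server 2003.

```c
#include <assert.h>

/* Values as listed in the table; declared locally for illustration. */
typedef enum {
    WatchdogActionDisable                 = 0,
    WatchdogActionLog                     = 1,
    WatchdogActionTerminateProcess        = 2,
    WatchdogActionBugCheckOnlyFromNetFt   = 3,
    WatchdogActionBugCheckAlsoFromProcess = 4
} hang_recovery_action;

/* Hypothetical names for the two watchdogs and their possible responses. */
typedef enum { WATCHDOG_NETFT, WATCHDOG_CORE_OPERATIONS } watchdog_source;

typedef enum {
    RESPONSE_NONE,
    RESPONSE_LOG_EVENT,
    RESPONSE_TERMINATE_SERVICE,
    RESPONSE_BUGCHECK,
    RESPONSE_INVALID            /* value outside the documented range */
} watchdog_response;

/* Map a HangRecoveryAction value to the response of the given watchdog. */
watchdog_response response_for(hang_recovery_action action, watchdog_source source)
{
    assert(source == WATCHDOG_NETFT || source == WATCHDOG_CORE_OPERATIONS);

    switch (action) {
    case WatchdogActionDisable:
        return RESPONSE_NONE;
    case WatchdogActionLog:
        return RESPONSE_LOG_EVENT;
    case WatchdogActionTerminateProcess:
        return RESPONSE_TERMINATE_SERVICE;
    case WatchdogActionBugCheckOnlyFromNetFt:
        /* Only the NetFt watchdog bugchecks; the other terminates the service. */
        return source == WATCHDOG_NETFT ? RESPONSE_BUGCHECK
                                        : RESPONSE_TERMINATE_SERVICE;
    case WatchdogActionBugCheckAlsoFromProcess:
        return RESPONSE_BUGCHECK;
    }
    return RESPONSE_INVALID;
}
```

The only values for which the two watchdogs differ are 3 and 4, so the source argument matters only in the WatchdogActionBugCheckOnlyFromNetFt case.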

 

Note  In some extreme cases, system services may also stop responding, and actions 1 and 2 may not succeed. In such cases, action 3 (bugcheck) is the only effective recovery measure.

If the action is set to bugcheck the cluster node, Windows stops with Stop error 0x9E (USER_MODE_HEALTH_MONITOR). The Stop error causes a failover to another cluster node. Additionally, if the node where the Stop error occurs is configured to capture a memory dump file, you may be able to use the information in the dump file to diagnose why the cluster node stopped responding.

The following is an example of a stack trace from a kernel memory dump of a bugcheck that the Cluster network driver initiated:

ChildEBP    RetAddr
f9c33ea8    f6e2e11f    nt!KeBugCheckEx+0x19
f9c33ecc    f6e2e836    clusnet!CnpCheckClussvcHang+0xef
f9c33ef0    805070d7    clusnet!CnpHeartBeatDpc+0x47e
f9c33fa4    8050735d    nt!KiTimerExpiration+0x371
f9c33ff4    80543ccf    nt!KiRetireDpcList+0x63

The Bugcheck error code is similar to the following error code: BugCheck 9E, {812d5b08, 3c, 0, 0}
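When triaging many dump summaries, it can help to pull the code and its four parameters out of a line in the format shown above. The following is a minimal sketch, assuming the line has exactly that shape and that every field is a hexadecimal value that fits in 32 bits, as in the sample; the bugcheck_info type and parse_bugcheck_line function are hypothetical helpers, not part of any Windows API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for one parsed bugcheck summary line. */
typedef struct {
    unsigned int code;       /* for example 0x9E */
    unsigned int params[4];  /* the four values inside the braces */
} bugcheck_info;

/* Parse "BugCheck <code>, {<p1>, <p2>, <p3>, <p4>}" where every field
   is hexadecimal. Returns true only when all five fields were read. */
bool parse_bugcheck_line(const char *line, bugcheck_info *out)
{
    assert(line != NULL && out != NULL);

    int fields = sscanf(line, "BugCheck %x, {%x, %x, %x, %x}",
                        &out->code,
                        &out->params[0], &out->params[1],
                        &out->params[2], &out->params[3]);
    return fields == 5;
}
```

For the sample line, this yields a code of 0x9E and a second parameter of 0x3c, which is 60 in decimal. Reading that parameter as a timeout in seconds is an assumption here and should be checked against the documentation for Stop error 0x9E.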

Note  You must manually configure the server to generate a memory dump file in response to a Bugcheck.

Examples

The property value portion of a property list entry for HangRecoveryAction can be set with the following example code:


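#include <windows.h>
#include <clusapi.h>

// 1 = WatchdogActionLog: log a system event on a heartbeat countdown timeout.
DWORD          ClusSvcHangActionData = 1;
CLUSPROP_DWORD ClusSvcHangActionValue;

// Fill in the value portion of the property list entry.
ClusSvcHangActionValue.Syntax.dw = CLUSPROP_SYNTAX_LIST_VALUE_DWORD;
ClusSvcHangActionValue.cbLength  = sizeof(DWORD);
ClusSvcHangActionValue.dw        = ClusSvcHangActionData;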


Requirements

Minimum supported client

None supported

Minimum supported server

Windows Server 2003 Enterprise, Windows Server 2003 Datacenter

See also

CLUSPROP_DWORD

 

 

© 2014 Microsoft