Maintaining Physical Disk Resources

When a Physical Disk resource has been created and assigned to a cluster, routine disk maintenance creates an ownership conflict between the Cluster Administrator and the system utilities used to perform the maintenance. Windows Server 2003 with Service Pack 1 (SP1) introduces maintenance mode for disk resources to address the ownership issue without causing a resource failover.

Administrators use tools such as ChkDsk and VSS as part of weekly maintenance operations to ensure that disks are functional and that there are no operational issues. These tools require exclusive access to the volume while they run, so applications cannot read from or write to the disk while the tools are in use. The administrator expects the disk maintenance to succeed without a ChkDsk failure and without a failover of the disk that ChkDsk is run against.

Under normal circumstances, cluster disk resources fail over when ChkDsk (in fix-error mode), a VSS restore, or any other tool that locks or dismounts the volume is run against a clustered disk. These tools fail partway through because the cluster disk resource fails its health check, which causes the cluster service to fail over the disk to the other node. As a result, the node where the tools are running loses access to the disk.

Prior to Windows Server 2003 with SP1, the only way to run these tools without causing a failover was for the administrator to set the LooksAlive and IsAlive polling intervals of the disk resource to their maximum values before running them. Windows Server 2003 with SP1 adds maintenance mode support, which allows the administrator to keep the resource online and run maintenance utilities without causing a failover.

Maintenance mode is a mechanism, provided through failover cluster cmdlets and the Failover Cluster API, that places the specified resource in a mode in which health checking is disabled. After maintenance mode is enabled for a resource, Resource Monitor ignores health check calls on the resource even though the resource remains online. This allows tools such as ChkDsk to run against the resource.

Administrators should note that while ChkDsk is running, the disk resource is not available to applications even though the resource is online. Additionally, if the resource genuinely becomes unavailable while it is in maintenance mode, any application or resource that depends on it will fail. It is therefore critical that the administrator remove maintenance mode as soon as the administrative task is done; leaving a resource in maintenance mode may lead to unexpected application failures. The mode persists on the node until the administrator explicitly removes it. However, it does not persist across groups, and any failover or reboot of the node clears it.

Windows Server 2008 and Windows Server 2003:  Failover cluster cmdlets are not available; the Cluster.exe command is used instead.
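
The interaction between health checking and maintenance mode can be illustrated with a small model. The following Python sketch is not the cluster API: the DiskResource and ResourceMonitor classes and the run_tool function are hypothetical names introduced only for illustration, and the model assumes that exactly one health-check poll occurs while the tool holds the volume.

```python
class DiskResource:
    """Toy stand-in for a clustered Physical Disk resource (illustration only)."""

    def __init__(self, name, owner, other_node):
        self.name = name
        self.owner = owner              # node that currently owns the disk
        self.other_node = other_node    # node the disk would fail over to
        self.volume_locked = False      # True while a tool such as ChkDsk holds the volume
        self.maintenance_mode = False

    def is_alive(self):
        # In this model the health check fails whenever the volume is locked.
        return not self.volume_locked


class ResourceMonitor:
    """Toy health checker: polls the resource and fails it over when unhealthy."""

    def __init__(self, resource):
        self.resource = resource
        self.failovers = 0

    def poll(self):
        res = self.resource
        if res.maintenance_mode:
            return "skipped"            # health checks are ignored in maintenance mode
        if res.is_alive():
            return "healthy"
        # Failed health check: move the disk to the other node.
        res.owner, res.other_node = res.other_node, res.owner
        res.maintenance_mode = False    # a failover clears maintenance mode
        res.volume_locked = False       # the tool loses its hold on the volume
        self.failovers += 1
        return "failover"


def run_tool(disk, monitor, use_maintenance_mode):
    """Simulate a tool that locks the volume across one health-check poll."""
    original_owner = disk.owner
    if use_maintenance_mode:
        disk.maintenance_mode = True
    disk.volume_locked = True
    outcome = monitor.poll()            # health check fires while the tool runs
    completed = disk.owner == original_owner
    disk.volume_locked = False
    disk.maintenance_mode = False       # remove maintenance mode as soon as the task is done
    return outcome, completed
```

In this model, running the tool without maintenance mode produces a failover and the tool does not complete on the original node, whereas running it with maintenance mode enabled causes the poll to be skipped and leaves ownership unchanged.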

The cluster administrator can set, remove, or query a disk resource's maintenance mode through failover cluster cmdlets. A calling application can set or query maintenance mode by calling the ClusterResourceControl API function with the CLUSCTL_RESOURCE_SET_MAINTENANCE_MODE or CLUSCTL_RESOURCE_QUERY_MAINTENANCE_MODE cluster resource control code, respectively.

A resource must be online and must not be the quorum resource in order to set, remove, or query maintenance mode. Events that set or remove the maintenance mode of a resource are logged in the system event log.

Windows Server 2003:  If an event takes the resource offline or the resource fails, the resource monitor automatically cancels maintenance mode for that resource. Moving a resource through administrative action or through node failure cancels maintenance mode for that resource.



© 2014 Microsoft