Failover Clustering and Always On Availability Groups (SQL Server)

Applies to: SQL Server - Windows only

Always On availability groups, the high availability and disaster recovery solution introduced in SQL Server 2012 (11.x), requires Windows Server Failover Clustering (WSFC). Also, though Always On availability groups is not dependent upon SQL Server Failover Clustering, you can use a failover cluster instance (FCI) to host an availability replica for an availability group. It is important to know the role of each clustering technology, and to know what considerations are necessary as you design your Always On availability groups environment.

Note

For information about Always On availability groups concepts, see Overview of Always On Availability Groups (SQL Server).

Windows Server Failover Clustering and Availability Groups

Deploying Always On availability groups requires a Windows Server Failover Cluster (WSFC). To be enabled for Always On availability groups, an instance of SQL Server must reside on a WSFC node, and both the WSFC and the node must be online. Furthermore, each availability replica of a given availability group must reside on a different node of the same WSFC. The only exception is that while being migrated to another WSFC, an availability group can temporarily straddle two clusters.

Always On availability groups relies on the Windows Server Failover Cluster (WSFC) to monitor and manage the current roles of the availability replicas that belong to a given availability group and to determine how a failover event affects the availability replicas. A WSFC resource group is created for every availability group that you create. The WSFC monitors this resource group to evaluate the health of the primary replica.

The quorum for Always On availability groups is based on all nodes in the WSFC regardless of whether a given cluster node hosts any availability replicas. In contrast to database mirroring, there is no witness role in Always On availability groups.

The overall health of a WSFC is determined by the votes of a quorum of nodes in the cluster. If the WSFC goes offline because of an unplanned disaster, or due to a persistent hardware or communications failure, manual administrative intervention is required. A Windows Server or WSFC administrator will need to force a quorum and then bring the surviving cluster nodes back online in a non-fault-tolerant configuration.
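The point that every WSFC node votes, whether or not it hosts an availability replica, can be illustrated with a small sketch. This is a deliberately simplified model and not how WSFC is implemented: it assumes one vote per node and ignores any WSFC quorum witness and dynamic quorum adjustments. The function name `has_quorum` and the node names are hypothetical.

```python
from typing import Dict


def has_quorum(node_online: Dict[str, bool]) -> bool:
    """Return True when a strict majority of voting nodes is online.

    Assumption: every node carries exactly one vote, and there is no
    quorum witness or dynamic quorum adjustment.
    """
    total_votes = len(node_online)
    online_votes = sum(1 for is_up in node_online.values() if is_up)
    return online_votes > total_votes // 2


# NODE03 hosts no availability replica, but its vote still counts.
cluster = {"NODE01": True, "NODE02": False, "NODE03": True}
print(has_quorum(cluster))
```

In this model, losing a second node out of three leaves the cluster without quorum, which is the situation where an administrator would have to force quorum.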

Important

Always On availability groups registry keys are subkeys of the WSFC. If you delete and re-create a WSFC, you must disable and re-enable the Always On availability groups feature on each instance of SQL Server that hosted an availability replica on the original WSFC.

For information about running SQL Server on WSFC nodes and about WSFC quorum, see Windows Server Failover Clustering (WSFC) with SQL Server.

SQL Server Failover Cluster Instances (FCIs) and Availability Groups

You can set up a second layer of failover at the server-instance level by implementing a SQL Server FCI together with the WSFC. An availability replica can be hosted by either a standalone instance of SQL Server or an FCI. Only one FCI partner can host a replica for a given availability group. When an availability replica is running on an FCI, the possible owners list for the availability group will contain only the active FCI node.

Always On availability groups does not depend on any form of shared storage. However, if you use a SQL Server failover cluster instance (FCI) to host one or more availability replicas, each of those FCIs will require shared storage as per standard SQL Server failover cluster instance installation.

For more information about additional prerequisites, see the "Prerequisites and Restrictions for Using a SQL Server Failover Cluster Instance (FCI) to Host an Availability Replica" section of Prerequisites, Restrictions, and Recommendations for Always On Availability Groups (SQL Server).

Comparison of Failover Cluster Instances and Availability Groups

Regardless of the number of nodes in the FCI, an entire FCI hosts a single replica within an availability group. The following table describes the distinctions in concepts between nodes in an FCI and replicas within an availability group.

| | Nodes within an FCI | Replicas within an availability group |
|---|---|---|
| Uses WSFC | Yes | Yes |
| Protection level | Instance | Database |
| Storage type | Shared | Non-shared. While the replicas in an availability group do not share storage, a replica that is hosted by an FCI uses a shared storage solution as required by that FCI. The storage solution is shared only by nodes within the FCI and not between replicas of the availability group. |
| Storage solutions | Direct attached, SAN, mount points, SMB | Depends on node type |
| Readable secondaries | No* | Yes |
| Applicable failover policy settings | WSFC quorum; FCI-specific; Availability group settings** | WSFC quorum; Availability group settings |
| Failed-over resources | Server, instance, and database | Database only |

*Whereas synchronous secondary replicas in an availability group are always running on their respective SQL Server instances, secondary nodes in an FCI actually have not started their respective SQL Server instances and are therefore not readable. In an FCI, a secondary node starts its SQL Server instance only when the resource group ownership is transferred to it during an FCI failover. However, on the active FCI node, when an FCI-hosted database belongs to an availability group, if the local availability replica is running as a readable secondary replica, the database is readable.

**Failover policy settings for the availability group apply to all replicas, whether they are hosted in a standalone instance or an FCI.

Note

For more information about the number of nodes supported within FCIs and Always On availability groups for different editions of SQL Server, see Features Supported by the Editions of SQL Server 2012 (https://go.microsoft.com/fwlink/?linkid=232473).

Considerations for hosting an Availability Replica on an FCI

Important

If you plan to host an availability replica on a SQL Server Failover Cluster Instance (FCI), ensure that the Windows Server 2008 host nodes meet the Always On prerequisites and restrictions for Failover Cluster Instances (FCIs). For more information, see Prerequisites, Restrictions, and Recommendations for Always On Availability Groups (SQL Server).

SQL Server Failover Cluster Instances (FCIs) do not support automatic failover by availability groups, so any availability replica that is hosted by an FCI can only be configured for manual failover.
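The manual-failover restriction above lends itself to a simple pre-deployment check. The following is a minimal sketch that assumes a planned topology is described in plain Python objects. The `Replica` class, its fields, and the function name are hypothetical and are not part of any SQL Server API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    instance: str        # name of the hosting SQL Server instance
    hosted_on_fci: bool  # True when the instance is an FCI
    failover_mode: str   # "AUTOMATIC" or "MANUAL"


def fci_replicas_with_automatic_failover(replicas: List[Replica]) -> List[str]:
    """Return the instances that break the rule: an FCI-hosted replica
    must be configured for manual failover."""
    return [
        r.instance
        for r in replicas
        if r.hosted_on_fci and r.failover_mode.upper() != "MANUAL"
    ]


planned = [
    Replica("fciInstance1", hosted_on_fci=True, failover_mode="AUTOMATIC"),
    Replica("Instance3", hosted_on_fci=False, failover_mode="AUTOMATIC"),
]
print(fci_replicas_with_automatic_failover(planned))
```

Here only the FCI-hosted replica is reported. The stand-alone instance is free to use either failover mode.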

You might need to configure a WSFC to include shared disks that are not available on all nodes. For example, consider a WSFC across two data centers with three nodes. Two of the nodes host a SQL Server failover cluster instance (FCI) in the primary data center and have access to the same shared disks. The third node hosts a stand-alone instance of SQL Server in a different data center and does not have access to the shared disks from the primary data center. This WSFC configuration supports the deployment of an availability group if the FCI hosts the primary replica and the stand-alone instance hosts the secondary replica.

When choosing an FCI to host an availability replica for a given availability group, ensure that an FCI failover could not potentially cause a single WSFC node to attempt to host two availability replicas for the same availability group.

The following example scenario illustrates how this configuration could lead to problems:

1. Marcel configures a WSFC with two nodes, NODE01 and NODE02. He installs a SQL Server failover cluster instance, fciInstance1, on both NODE01 and NODE02 where NODE01 is the current owner for fciInstance1.
2. On NODE02, Marcel installs another instance of SQL Server, Instance3, which is a stand-alone instance.
3. On NODE01, Marcel enables fciInstance1 for Always On availability groups. On NODE02, he enables Instance3 for Always On availability groups. Then he sets up an availability group for which fciInstance1 hosts the primary replica, and Instance3 hosts the secondary replica.
4. At some point fciInstance1 becomes unavailable on NODE01, and the WSFC causes a failover of fciInstance1 to NODE02. After the failover, fciInstance1 is an Always On availability groups-enabled instance running under the primary role on NODE02. However, Instance3 now resides on the same WSFC node as fciInstance1. This violates the Always On availability groups constraint that each replica of an availability group must reside on a different WSFC node.

To correct the problem that this scenario presents, the stand-alone instance, Instance3, must reside on another node in the same WSFC as NODE01 and NODE02.
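The scenario above can be checked mechanically before an availability group is created. The sketch below assumes you can list, for each instance that hosts a replica, the WSFC nodes it could run on: all FCI nodes for an FCI, and the single host node for a stand-alone instance. The function name `nodes_at_risk` and the data layout are hypothetical.

```python
from typing import Dict, List, Set


def nodes_at_risk(
    replica_instances: List[str],
    possible_nodes: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Return the WSFC nodes on which two or more replicas of the same
    availability group could end up, mapped to the instances involved."""
    by_node: Dict[str, List[str]] = {}
    for instance in replica_instances:
        for node in sorted(possible_nodes[instance]):
            by_node.setdefault(node, []).append(instance)
    return {node: hosts for node, hosts in by_node.items() if len(hosts) > 1}


# Marcel's original layout: Instance3 shares NODE02 with the FCI.
original = {"fciInstance1": {"NODE01", "NODE02"}, "Instance3": {"NODE02"}}
print(nodes_at_risk(["fciInstance1", "Instance3"], original))

# Corrected layout: Instance3 moves to a third node (NODE03 is hypothetical).
corrected = {"fciInstance1": {"NODE01", "NODE02"}, "Instance3": {"NODE03"}}
print(nodes_at_risk(["fciInstance1", "Instance3"], corrected))
```

The first call reports NODE02 as a potential collision point. The second returns an empty result because no FCI failover can place both replicas on one node.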

For more information about SQL Server FCIs, see Always On Failover Cluster Instances (SQL Server).

Restrictions on Using the WSFC Failover Cluster Manager with Availability Groups

Do not use the Failover Cluster Manager to manipulate availability groups, for example:

  • Do not add or remove resources in the clustered service (resource group) for the availability group.

  • Do not change any availability group properties, such as the possible owners and preferred owners. These properties are set automatically by the availability group.

  • Do not use the Failover Cluster Manager to move availability groups to different nodes or to fail over availability groups. The Failover Cluster Manager is not aware of the synchronization status of the availability replicas, and doing so can lead to extended downtime. You must use Transact-SQL or SQL Server Management Studio.
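As an illustration of the Transact-SQL route, the sketch below builds the failover statement as a string. It is a helper for illustration only: the function name and the availability group name `MyAg` are hypothetical, and the statement would still need to be executed on the instance that hosts the target secondary replica, through whatever client you normally use.

```python
def build_failover_statement(ag_name: str, force: bool = False) -> str:
    """Build an ALTER AVAILABILITY GROUP failover statement.

    force=False gives a planned manual failover (FAILOVER).
    force=True gives a forced failover that can lose data
    (FORCE_FAILOVER_ALLOW_DATA_LOSS).
    """
    # Bracket-quote the identifier; a literal "]" is escaped by doubling it.
    quoted = "[" + ag_name.replace("]", "]]") + "]"
    action = "FORCE_FAILOVER_ALLOW_DATA_LOSS" if force else "FAILOVER"
    return f"ALTER AVAILABILITY GROUP {quoted} {action};"


print(build_failover_statement("MyAg"))
# prints: ALTER AVAILABILITY GROUP [MyAg] FAILOVER;
```

Unlike a move in the Failover Cluster Manager, a planned `FAILOVER` issued this way is handled by SQL Server, which knows the synchronization state of the replicas.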

Warning

Using the Failover Cluster Manager to move a failover cluster instance hosting an availability group to a node that is already hosting a replica of the same availability group may result in the loss of the availability group replica, preventing it from being brought online on the target node. A single node of a failover cluster cannot host more than one replica for the same availability group. For more information on how this occurs, and how to recover, see the blog Replica unexpectedly dropped in availability group.

See Also

Overview of Always On Availability Groups (SQL Server)
Enable and Disable Always On Availability Groups (SQL Server)
Monitor Availability Groups (Transact-SQL)
Always On Failover Cluster Instances (SQL Server)