Lead Hosts and Cluster Management (AppFabric 1.1 Caching)

A Microsoft AppFabric 1.1 for Windows Server cache cluster is a dynamic group of servers working together to provide a single unified logical cache for your application's data. Orchestrating cluster operations between the cache hosts requires some overhead. The cluster management role is responsible for managing the cache hosts and, ultimately, the cache cluster.

There are two main options for how the cache cluster management role is performed. The first option is for this role to be performed by special cache hosts, referred to as lead hosts. This is also referred to as "onloading". The second option is for this role to be performed by SQL Server. This is also referred to as "offloading", because the responsibility is offloaded to SQL Server instead of the cache cluster itself.

If you store your cache cluster configuration data in a shared network folder (XML), you must use onloading with lead host management. Note that lead hosts perform the same caching duties as other cache hosts not designated as lead hosts, but they have the additional responsibilities of working with other lead hosts to perform the cluster management role.

In Windows Server AppFabric 1.0, the default for a cache cluster that stored its configuration data in SQL Server was to offload the cache cluster management role to SQL Server. This has changed in AppFabric 1.1. The default for new cache clusters is to use onloading, in which lead hosts manage the cache cluster. This improves the availability of the cache cluster, because the cluster can remain partially functional if the configuration store becomes unavailable, whether that store is an XML file or a SQL Server database. Note that operations that inspect or change the configuration of the cache cluster are unavailable while the configuration store is down.

Note

If you upgrade an existing AppFabric 1.0 cache cluster to AppFabric 1.1, the upgrade does not change the cache cluster management behavior. If the upgraded cache cluster uses offloading and you want to change it, you'll have to recreate the cache cluster using Windows PowerShell commands. For more information and examples, see Automated Installation (AppFabric 1.1 Caching). To recreate the cluster more easily, you can use the Export-CacheClusterConfig and Import-CacheClusterConfig commands. However, you must ensure that the leadHostManagement attribute is set to "true". For more information, see Lead Hosts and Cluster Management (AppFabric 1.1 Caching).

It is still possible to offload all cache cluster management responsibilities to SQL Server. First, you must manually create the cache cluster with the New-CacheCluster command and set the Offloading parameter to "true". The other requirement is that the Provider must be SQL Server (System.Data.SqlClient).
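The two requirements above can be summarized as a small check. The following Python sketch is illustrative only: the function name resolve_cluster_management is hypothetical and is not part of AppFabric, and the "XML" provider label in the usage lines is an assumed name for the shared-folder store. The sketch encodes only the rule that offloading requires the System.Data.SqlClient provider.

```python
SQL_PROVIDER = "System.Data.SqlClient"  # provider name given in the text above


def resolve_cluster_management(provider: str, offloading: bool) -> str:
    """Return what performs the cluster management role.

    Hypothetical helper, not an AppFabric API. It models one rule only:
    offloading is valid only when the provider is SQL Server.
    """
    if not offloading:
        # Default in AppFabric 1.1: lead hosts manage the cluster.
        return "lead hosts"
    if provider != SQL_PROVIDER:
        raise ValueError("offloading requires the System.Data.SqlClient provider")
    return "SQL Server"


# "XML" is an assumed label for the shared network folder store.
print(resolve_cluster_management("XML", offloading=False))
print(resolve_cluster_management(SQL_PROVIDER, offloading=True))
```

Requesting offloading with any other provider raises an error in this sketch, which mirrors the requirement that the Provider must be SQL Server.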

The following table shows how your choice at installation time relates to your options for cluster management. For more information about choosing which of these configuration options is right for you, see Cluster Configuration Storage Options.

Cluster configuration storage type    Cluster configuration storage location    Cluster management
XML file                              shared network folder                     lead hosts
SQL Server database                   SQL Server                                SQL Server or lead hosts (default)
Custom provider                       custom store                              lead hosts

Cluster Management Role Duties

There are two main configuration settings that determine how the cluster functions with respect to cluster management:

  • leadHostManagement: This cluster-level setting determines what will perform the cluster management role. When true, lead hosts perform the cluster management role. If you have chosen to store your cluster configuration settings in a shared network folder, true is the only valid value for this setting. False indicates that either SQL Server or a custom provider will perform the cluster management role. When using SQL Server or a custom provider to store cluster configuration settings, you can set this setting to true and let lead hosts perform the cluster management role.

  • leadHost: This cache host-level setting determines which cache hosts will be lead hosts when lead hosts perform the cluster management role. Even if SQL Server is going to perform the cluster management role, the installation program designates lead hosts, in case you later change the leadHostManagement setting.

For more information about changing these settings, see Set the Cluster Management Role and Lead Host Designations (AppFabric 1.1).

When Lead Hosts Perform the Cluster Management Role

When the leadHostManagement and leadHost settings are true, the cache host is elevated to a level of increased responsibility in the cluster and designated as a lead host. In addition to the normal cache host's operations related to caching data, the lead host also works with other lead hosts to manage the cluster operations.

When a Lead Host Fails

For the cache cluster to remain available, a majority of lead hosts must remain available. This is more of a risk in small clusters than it is in large ones because it takes fewer server failures to cause the cluster to shut itself down.

Note

When lead hosts perform the cluster management role, if a majority of lead hosts fail, the entire cache cluster shuts down.

For example, consider the six-server cache cluster shown in the following diagram. In this example, lead hosts perform the cluster management role and two cache hosts have been designated to be lead hosts.

[Diagram: Cache cluster lead hosts]

If any of the normal cache hosts in the cluster fail, the cluster could keep running. Data on the failed hosts would be lost (assuming high availability was not enabled), but the rest of the cluster could continue serving and storing data. In fact, the cluster could keep functioning even if it lost all four cache hosts not designated as lead hosts.

If just one of the two lead hosts failed, however, the entire cache cluster would shut itself down, because one remaining lead host out of two is no longer a majority. To mitigate this issue, you have the option of designating additional lead hosts.
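The majority rule in this example can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that "majority" means strictly more than half of the designated lead hosts; the function name is hypothetical and does not correspond to any AppFabric API.

```python
def has_lead_host_majority(total_lead_hosts: int, failed_lead_hosts: int) -> bool:
    """True if strictly more than half of the lead hosts are still running.

    Assumption: "majority" is a strict majority of designated lead hosts.
    """
    if not 0 <= failed_lead_hosts <= total_lead_hosts:
        raise ValueError("failed_lead_hosts must be between 0 and total_lead_hosts")
    running = total_lead_hosts - failed_lead_hosts
    return running * 2 > total_lead_hosts


# Six-server example: two lead hosts and four ordinary cache hosts.
# Failures of ordinary cache hosts do not enter the calculation at all.
print(has_lead_host_majority(2, 0))  # both lead hosts running -> True
print(has_lead_host_majority(2, 1))  # one lead host failed -> False
```

With two lead hosts, a single lead-host failure leaves one of two running, which is exactly half and therefore not a majority.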

Designating Additional Lead Hosts

You can designate additional lead hosts after installation. However, it is important to consider that assigning too many lead hosts can also be a problem:

  • There must always be a majority of lead hosts available for the cache cluster to remain running. The more hosts designated as lead hosts, the fewer server failures the cluster will be able to sustain and remain operable.

  • In small clusters, where one or two lead host failures could cause the cluster to fail, we recommend that you designate more lead hosts.

  • In large clusters, five to seven lead hosts should be sufficient to ensure that a cluster in the range of 50 cache servers is responsive.
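To make the trade-off concrete, the sketch below computes how many lead-host failures a cluster can absorb for a given number of lead hosts. It counts lead-host failures only and assumes "majority" means strictly more than half; the function name is hypothetical.

```python
def tolerated_lead_host_failures(lead_hosts: int) -> int:
    """Largest number of lead hosts that can fail while a strict majority remains."""
    if lead_hosts < 1:
        raise ValueError("a cluster needs at least one lead host")
    # f failures are tolerated while 2 * (lead_hosts - f) > lead_hosts.
    return (lead_hosts - 1) // 2


for n in (1, 2, 3, 5, 7):
    print(n, "lead hosts ->", tolerated_lead_host_failures(n), "tolerated failures")
```

Under this assumption, one or two lead hosts tolerate no lead-host failure, three tolerate one, five tolerate two, and seven tolerate three. Going from an odd count to the next even count adds no tolerance.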

For more information about changing lead host designations, see Set the Cluster Management Role and Lead Host Designations (AppFabric 1.1).

Changes in Microsoft AppFabric 1.1 for Windows Server

To increase the availability of the cache cluster, AppFabric 1.1 has changed the process used to designate default lead hosts. AppFabric 1.1 will automatically set each cache host added to the cache cluster as a lead host up to a maximum of seven lead hosts. You are still able to designate additional lead hosts using the Set-CacheHostConfig command with the IsLeadHost parameter set to "true". It is also possible to remove a cache host from the lead host role by setting IsLeadHost to "false".
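The default designation described above can be modeled as follows. This is a simplified sketch, not AppFabric's actual implementation: it assumes hosts are added one at a time, that none are removed, and that no manual IsLeadHost changes are made. The host names and the function name are hypothetical.

```python
from typing import Dict, List

MAX_DEFAULT_LEAD_HOSTS = 7  # limit stated in the text above


def default_lead_host_flags(hosts_in_join_order: List[str]) -> Dict[str, bool]:
    """Map each host to its default lead-host flag.

    Simplified model: the first seven hosts added become lead hosts,
    and every later host is added as an ordinary cache host.
    """
    return {
        host: index < MAX_DEFAULT_LEAD_HOSTS
        for index, host in enumerate(hosts_in_join_order)
    }


# Hypothetical host names for a nine-server cluster.
hosts = [f"cachehost{i}" for i in range(1, 10)]
flags = default_lead_host_flags(hosts)
print(sum(flags.values()), "lead hosts out of", len(hosts))
```

In this model a nine-server cluster ends up with seven lead hosts by default; the last two hosts would become lead hosts only if IsLeadHost were set explicitly.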

When SQL Server Performs the Cluster Management Role

When the cache cluster was created with offloading enabled, the leadHostManagement setting is false. In this scenario, regardless of the leadHost setting, each cache host only performs its normal non-lead host responsibilities related to caching data. The instance of SQL Server that is used for storing cluster configuration settings is also used to perform the cluster management role.

When a Server Failure Occurs

For the cluster to remain available when SQL Server performs the cluster management role (offloading), one or more cache hosts must be able to access the SQL Server database.

For example, consider the six-server cache cluster shown in the following diagram.

[Diagram: Cluster Management Role Set to SQL Server]

In this example, SQL Server is performing the cluster management role, and all six cache hosts can dedicate their resources to data access for the cache clients.

If any one of the cache hosts in the cluster fails, the data on that server is lost (assuming that high availability is not enabled), but the cluster keeps running. Data on the other cache hosts continues to be available to the cache clients. In fact, in this scenario, the cluster could keep functioning if it lost five of the six cache hosts.

If SQL Server fails, the entire cluster shuts down within a few minutes. To mitigate this issue, we highly recommend that you use Microsoft Windows Server 2008 Failover Clustering (https://go.microsoft.com/fwlink/?LinkId=130692) to host a "clustered" database resource for the cache cluster configuration storage location and cluster management role.
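The availability condition for offloading can be written as a simple predicate. This is a minimal sketch of the rule described in this section, ignoring the few minutes of delay before shutdown; the function name is hypothetical.

```python
def offloaded_cluster_available(sql_server_up: bool, hosts_reaching_sql: int) -> bool:
    """Model of availability when SQL Server performs cluster management.

    The cluster stays up only if SQL Server is running and at least one
    cache host can still access the SQL Server database.
    """
    if hosts_reaching_sql < 0:
        raise ValueError("hosts_reaching_sql cannot be negative")
    return sql_server_up and hosts_reaching_sql >= 1


# Six-server example: five of six cache hosts lost, SQL Server still up.
print(offloaded_cluster_available(True, 1))   # True
# All six cache hosts healthy, but SQL Server has failed.
print(offloaded_cluster_available(False, 6))  # False
```

The second case is why SQL Server becomes the single point of failure under offloading, and why a clustered database resource is recommended.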

See Also

Concepts

AppFabric Caching Physical Architecture Diagram (AppFabric 1.1 Caching)
AppFabric Caching Logical Architecture Diagram (AppFabric 1.1 Caching)

  2012-09-12