
Scaling Rules

Updated: October 2011

Author: http://msdn.microsoft.com/zh-tw/library/hh307529



Scaling Rules

Scaling rules provide the scaling engine with actions that are appropriate for a particular set of metrics. To offer greater flexibility, the scaling engine uses a provider pattern for both the scaling rules and the metrics. The MetricProvider class polls to collect data and alerts the scaling engine to any new metrics. The scaling engine receives the metric data and passes it to an implementation of the IScalingLogicProvider interface. The logic provider class contains the logic that analyzes the metrics and determines the appropriate scaling action. Both of these providers offer extensibility points for the scaling engine. These extensibility points give developers the option to write additional providers that are specific to their applications.

Time-Based Scaling Rules

Time-based scaling rules do not require diagnostic data. Instead, they use DateTime values to determine when to increase and decrease scale. Applications that implement time-based scaling rules tend to conform to the Predictable Bursting pattern. This pattern is common when increases in workloads, as well as the duration of those increases, can be expressed as schedules that represent peak and off-peak times. Time-based scaling rules typically define start times, duration, and instance counts.
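As a minimal sketch of the idea (the function and parameter names here are illustrative, not part of the scaling engine described in this article), a time-based rule reduces to a daily schedule window and a pair of instance counts:

```csharp
using System;

// Hypothetical time-based rule: return the peak instance count while inside
// a daily [startTime, startTime + duration) window, and the off-peak count
// outside it. Assumes the window does not cross midnight.
int TargetInstanceCount(
    TimeSpan startTime,       // daily window start, e.g. 08:00
    TimeSpan duration,        // window length, e.g. 10 hours
    int peakInstanceCount,
    int offPeakInstanceCount,
    DateTime utcNow)
{
    TimeSpan timeOfDay = utcNow.TimeOfDay;
    bool inWindow = timeOfDay >= startTime && timeOfDay < startTime + duration;
    return inWindow ? peakInstanceCount : offPeakInstanceCount;
}
```

A real rule would likely also distinguish weekdays from weekends and handle windows that span midnight, but the shape is the same: start time, duration, and instance counts.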

Load-Based Scaling Rules

Scaling rules that are based on the workload are suitable for most situations because their metrics reflect the health of the system itself. This means that you can apply load-based scaling rules to loads that fit the Growing Fast, Predictable Bursting, and Unpredictable Bursting patterns. These rules use metrics such as data from performance counters, and also use DateTime values to determine durations. The duration is an important piece of the equation that determines scaling actions, and it must be set appropriately. On one hand, a longer duration at a specified threshold is likely to indicate that the system is indeed encountering higher loads. On the other hand, waiting too long to determine what to do can undermine the effectiveness of the scaling action. Load-based scaling rules often define a threshold range, duration, and the metric data. They may also define other parameters, such as the instance count range and, in extreme cases, the number of instances that constitute an increment. In most cases, increasing the count one instance at a time is sufficient.

It is important to note that load-based scaling rules are not limited to performance counter data or time for their metrics. Developers can use the scaling engine's extensibility points to choose other sources for metrics or events. For example, new database rows can trigger scaling, such as when a batch job writes completion rows that trigger the next batch job. If the next job is more intensive than the previous one, scaling may be required.

Ensuring Accuracy

The final point about scaling rules is how to ensure that they are accurate. Each new Windows Azure instance can take up to 5 minutes to start. In some load patterns, where the workload only increases for a short period of time, scaling the application through load-based rules may not be fast enough. This is because detecting the increase and determining the correct scaling action can take just as long as the load spike itself. In Windows Azure, pricing is computed and rounded up to one hour increments. Creating a new instance, for even a short time period such as 5 minutes, still incurs a charge of one hour, relative to the instance’s size. In these situations, developers may need to tweak the load-based scaling rules, use a different type of scaling rule, or not use scaling at all.
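The cost implication of hour-increment billing can be shown with a short calculation (a sketch only; actual Windows Azure pricing also depends on instance size and other variables):

```csharp
using System;

// Billed compute time rounds up to the next whole hour per instance.
int BilledHours(TimeSpan runTime)
{
    return (int)Math.Ceiling(runTime.TotalHours);
}

// An instance that runs for only 5 minutes is still billed for 1 full hour,
// which is why short-lived, load-triggered scale-ups can be disproportionately
// expensive for brief load spikes.
```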

Implementing the Scaling Engine

Up to this point, the scaling engine has been discussed as a single component, but it is, of course, composed of classes and interfaces. Some, like the MetricProvider class and the IScalingLogicProvider interface, have already been mentioned. The other pieces are the ActionEngine, ScalingManager, and MetricDataStoreManager classes. The ActionEngine class performs the scaling actions provided by the ScalingEngine class, which receives them from the scaling logic providers. The ActionEngine class reads from and writes to the deployed configuration files. The ScalingManager class presents applications with a façade to interact with, and instantiates the ScalingEngine class and the configured metric providers. The MetricDataStoreManager class collects and stores the defined metrics. Lastly, the ScalingEngine class forms the bridge between the MetricProvider class and the scaling logic provider.

The following illustration shows the scaling engine's major classes and interfaces. (Note that, for clarity, the term "scaling engine" refers to the entire library of classes.)

[Illustration: the scaling engine's major classes and interfaces]

A scaling engine should be loosely coupled with its target applications. Loose coupling ensures that new applications need only handle their business concerns, and that existing applications can be brought under the engine's management with minimal or no changes. The scaling engine itself must operate as a singleton. Although that may seem like an obvious point, it is worth noting when you consider that one of the hosting options discussed in the "Hosting the Scaling Engine" section later in this article is Existing Roles. This option deploys the engine with the target application itself. In this situation, you must make sure that the scaling engine only starts once even though the role may have multiple instances. It is also important to ensure that the scaling engine remains active even when only the first instance (instance 0) remains.
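One simple (if fragile) way to approximate the start-once requirement is to let only the role's first instance host the engine. The helper below keys off the suffix of the instance ID that Windows Azure assigns; the exact ID format is an assumption of this sketch, and a more robust design would coordinate through a shared lease or lock instead.

```csharp
using System;

// Hypothetical helper: decide whether this instance should host the
// singleton scaling engine, based on the role instance ID suffix.
// Windows Azure instance IDs typically end with the instance index,
// e.g. "deployment18(123).MyApp.WebRole_IN_0" for the first instance
// (this naming convention is an assumption, not a contract).
bool ShouldHostScalingEngine(string roleInstanceId)
{
    return roleInstanceId.EndsWith("_0", StringComparison.Ordinal)
        || roleInstanceId.EndsWith(".0", StringComparison.Ordinal);
}
```

In the role's OnStart method you would pass RoleEnvironment.CurrentRoleInstance.Id to this check. Because instance 0 is the last instance to remain when the role scales down, tying the engine to it also satisfies the requirement that the engine stays active.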

Implementing Diagnostics

This section focuses on load-based scaling rules that require metrics from the system. Specifically, these metrics refer to the performance counter data collected and used by the scaling engine. Because the scaling engine is a singleton, and there can be a number of instances of the target application, performance data must be aggregated. A simple solution to aggregation is to take advantage of the diagnostic data collection paradigm that Windows Azure uses. In a properly configured Windows Azure application, diagnostic data, such as performance counters, accumulates in local storage for each instance. Based on a configured schedule, the data later transfers from local storage into a Windows Azure table named WADPerformanceCountersTable.

For web applications, a common metric to use for scalability is the average of the number of requests. The Windows Server® 2008 R2 operating system, and in fact all other modern Windows operating systems, provide a performance counter for this metric that is named Requests/Sec. The counter is under the ASP.NET Apps v4.0.30319 category. To instrument a Windows Azure web application with the Requests/Sec counter, start the DiagnosticMonitor class within the WebRole (RoleEntryPoint) class and provide it with configuration settings that add the counter as a data source and set the transfer schedule. The following code example shows how to do this.

public override bool OnStart()
{
  DiagnosticMonitorConfiguration configuration = DiagnosticMonitor.GetDefaultInitialConfiguration();
  configuration.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
  configuration.PerformanceCounters.DataSources.Add(
    new PerformanceCounterConfiguration
    {
      CounterSpecifier = @"\ASP.NET Apps v4.0.30319(__Total__)\Requests/Sec",
      SampleRate = TimeSpan.FromSeconds(5)
    });
 
  DiagnosticMonitor.Start("DiagnosticConnectionString", configuration);
 
  return base.OnStart();
}

Monitoring Diagnostics

Because the performance counter data exists in the WADPerformanceCountersTable table, you can use classes that derive from the TableServiceContext and TableServiceEntity classes in the Windows Azure SDK to read the rows. The SDK is located at http://msdn.microsoft.com/en-us/windowsazure/cc974146.aspx. Note that the WADPerformanceCountersTable table can contain extraneous rows. You are only interested in new rows, rows from the current deployment, rows for the target role, and rows for the specific counter. When it queries for new rows, the scaling engine already knows the time range it requires. For example, it may query for rows created between one minute ago and the current time. Because the PartitionKey column for any given row represents, in ticks, the universal time at which the row was inserted into the WADPerformanceCountersTable table, it can be used in the query criteria. Use the RowKey column to filter for the current deployment and target role. It is a composite value that consists of the deployment ID, the role name, the role instance, and a unique sequence; however, only the first two of these values are helpful in this particular case. Finally, to filter for the specific counter, use the CounterName column. A query formulated on the PartitionKey and RowKey columns has excellent performance.

The following two pieces of code show a simple example of how to retrieve the performance counter data from table storage and then average it. The first code example defines the Windows Azure table context and the entity classes.

class PerformanceCounterDataEntity : TableServiceEntity
{
  public string Role { get; set; }
  public string DeploymentId { get; set; }
  public string RoleInstance { get; set; }
  public string CounterName { get; set; }
  public double CounterValue { get; set; }
}
 
class PerformanceCounterServiceContext : TableServiceContext
{
  public IQueryable<PerformanceCounterDataEntity> PerformanceCounterValues
  {
    get { return CreateQuery<PerformanceCounterDataEntity>("WADPerformanceCountersTable"); }
  }
 
  public PerformanceCounterServiceContext(string baseAddress, StorageCredentials credentials)
    : base(baseAddress, credentials)
  {
  }
}

The second code example shows how to query for the performance counters.

// An example of retrieving the storage account for the dev fabric
// Actual implementation may vary
var account = CloudStorageAccount.Parse("UseDevelopmentStorage=true");
 
// Create the table service context
var context = new PerformanceCounterServiceContext(
  account.TableEndpoint.ToString(), account.Credentials);
 
// Application values
string roleName = "MyApplication.WebApp";
string deploymentId = "2f8551396c69472091bfbf780ff614c4";
DateTime end = DateTime.UtcNow;
DateTime start = end.Subtract(TimeSpan.FromMinutes(1));
 
// Query values
string startTicks = string.Format("0{0}", start.Ticks);
string endTicks = string.Format("0{0}", end.Ticks);
string rowKey = string.Format("{0}__{1}", deploymentId, roleName);
string counterName = @"\ASP.NET Apps v4.0.30319(__Total__)\Requests/Sec";
 
// Retrieve a list of values with the specified criteria
var values = (from pcv in context.PerformanceCounterValues
        where
          pcv.PartitionKey.CompareTo(startTicks) >= 0 &&
          pcv.PartitionKey.CompareTo(endTicks) <= 0 &&
          pcv.RowKey.CompareTo(rowKey) >= 0 &&
          pcv.CounterName.CompareTo(counterName) == 0
        select pcv).ToList();
if (values.Count > 0)
{
  // Determine the average of the counter values
  // Actual implementation of this block may vary
  var averages = from v in values
          group v by v.CounterName into g
          select new
          {
            CounterName = g.Key,
            // Round the average to two decimals
            Average = Math.Round(g.Average(v => v.CounterValue), 2)
          };
}

Scaling Rules Revisited

The earlier section, "Scaling Rules," gave an overview of the two different types of rules: time-based scaling rules and load-based scaling rules. Load-based rules are arguably more interesting than time-based ones because of the added complexity required to aggregate metrics and monitor them. In this section, a more detailed discussion of the logic that encapsulates load-based rules concludes the discussion of the major components that comprise the scaling engine. Implementations of the IScalingLogicProvider interface contain the scaling rules logic and return instances of the ScalingActions class to the ScalingEngine class. The scaling providers, such as the provider that assesses performance counter data, evaluate metrics to determine if certain thresholds have been reached for a given duration.

The following code is an example implementation of the IScalingLogicProvider interface. This scaling logic provider examines thresholds. (Note that this code is only an example and portions have been omitted for brevity.)

class PerformanceCounterScalingLogicProvider : IScalingLogicProvider
{
  public string ProviderName { get; private set; }
  public string RoleName { get; private set; }
  public string CounterName { get; private set; }
  public double HighThresholdValue { get; private set; }
  public double LowThresholdValue { get; private set; }
  public int MaximumInstanceCount { get; private set; }
  public int MinimumInstanceCount { get; private set; }
 
  public void Initialize(IProviderConfiguration config)
  {
    var settings = config.Settings;
 
    ProviderName = config.Name;
    RoleName = settings["RoleName"].Value;
    CounterName = settings["CounterName"].Value;
    HighThresholdValue = double.Parse(settings["HighThreshold"].Value);
    LowThresholdValue = double.Parse(settings["LowThreshold"].Value);
    MaximumInstanceCount = int.Parse(settings["MaxInstanceCount"].Value);
    MinimumInstanceCount = int.Parse(settings["MinInstanceCount"].Value);
  }
 
  public ScalingAction Evaluate(IEnumerable<Metric> metrics)
  {
    var action = new ScalingAction
    {
      Count = 1,
      Role = RoleType.Web,
      RoleName = RoleName,
      MinInstanceCount = MinimumInstanceCount,
      MaxInstanceCount = MaximumInstanceCount
    };
 
    var performanceCounterMetrics = (from m in metrics
                     where string.Compare(m.Name, CounterName, true) == 0
                     select m).ToList();
    foreach(var metric in performanceCounterMetrics)
    {
      if (metric.CurrentValue > HighThresholdValue)
      {
        // Logic omitted to determine duration at this above-threshold level
 
        action.Activity = InstanceAction.Increase;
        action.Description = "Above Threshold";
 
        return action;
      }
      else if (metric.CurrentValue < LowThresholdValue)
      {
        // Logic omitted to determine duration at this below-threshold level
 
        action.Activity = InstanceAction.Decrease;
        action.Description = "Below Threshold";
 
        return action;
      }
 
    }
 
    return null;
  }
}

The Initialize method shows that the property values are set from configuration values. The following code is an example of what the configuration of the PerformanceCounterScalingLogicProvider object can look like.

<scalingLogicProviders>
 <scalingLogicProvider
  name="PerformanceCounterScalingLogicProvider" 
  type="MyScalingEngine.ScalingLogicProviders.PerformanceCounterScalingLogicProvider, MyScalingEngine">
  <metricProviders>
   <metricProvider name="PerformanceCounterMetricProvider"/>
  </metricProviders>
  <settings>
   <setting name="RoleName" value="Xipanit.Website"/>
   <setting name="TimeAboveThreshold" value="90"/>
   <setting name="TimeBelowThreshold" value="120"/>
   <setting name="LowThreshold" value="6"/>
   <setting name="HighThreshold" value="12"/>
   <setting name="MinInstanceCount" value="1" />
   <setting name="MaxInstanceCount" value="5" />
   <setting name="CounterName" value="\ASP.NET Apps v4.0.30319(__Total__)\Requests/Sec"/>
  </settings>
 </scalingLogicProvider>
</scalingLogicProviders>

In this configuration, the scaling engine only scales up if the application receives an average of 12 requests per second (or more) for 90 seconds, and scales down if the application receives an average of 6 requests per second (or fewer) for 120 seconds. The HighThreshold value is based on a heuristic that an ASP.NET application can handle 12 concurrent connections per CPU core, together with the assumption that the role is a Small instance with one core. Of course, you can tweak this value based on the application's overall performance.
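That heuristic can be made explicit in a short helper (a sketch only; the 12-requests-per-core figure is the rule of thumb cited above, not a measured limit, and the function name is illustrative):

```csharp
// Rule of thumb from the text: an ASP.NET application can handle roughly
// 12 concurrent connections per CPU core.
const int RequestsPerCore = 12;

double SuggestedHighThreshold(int coreCount)
{
    return RequestsPerCore * coreCount;
}

// A Small instance has 1 core, giving the HighThreshold of 12 used in the
// configuration above; a 2-core instance would suggest a threshold of 24.
```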

Performing Scaling Actions

The ScalingAction class contains properties that indicate to the ActionEngine instance whether to increment or decrement the instance counts and by how many. The action originates with the scaling logic providers, is passed to the ScalingEngine instance, and finally arrives at the ActionEngine instance. The ActionEngine is a multi-threaded class, but it handles actions sequentially per service (for example, the application MyApplication.Cloud). The sequential logic ensures that each application only processes one action at a time, which prevents in-place upgrade contention and other problems. To accomplish this, each action that causes an in-place upgrade is followed by a blocking call until the upgrade is complete. The blocking call polls the application's operation state until it returns a Completed value. The other possible operation states are InProgress and Failed. These are the same values you can see in other management tools.

The following code example shows you how to use this tracking ID to retrieve the operation state. Recall that the tracking ID for the configuration changes is contained in the HTTP response header.

// Open the certificates store
var store = new X509Store(StoreName.My, StoreLocation.CurrentUser);
store.Open(OpenFlags.ReadOnly);
 
// Retrieve the certificate with the specified thumbprint
var certificates = store.Certificates.Find(
  X509FindType.FindByThumbprint, "0A6B64403A7835DBB9909DE59B76D09FC2555B76", false);
var certificate = certificates[0];
 
// Create the channel to the Windows Azure Management APIs
var channel = ServiceManagementHelper.CreateServiceManagementChannel(
  new Uri("https://management.core.windows.net"), certificate);
 
using (var scope = new OperationContextScope((IContextChannel)channel))
{
  // The subscription and tracking IDs are examples, only
  string subscriptionId = "2FE830D2-20B2-4C99-9527-775EF0D548BA";
  string trackingId = "16f20cb1bfa140e69abf0108c26c0476";
 
  var operation = channel.GetOperationStatus(subscriptionId, trackingId);
  if (operation != null)
  {
    string status = operation.Status;
 
    // status will be Completed, InProgress or Failed
  }
}
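The snippet above retrieves the status once; the blocking behavior described earlier wraps such a call in a poll loop. The sketch below takes the status call as a delegate so it stands apart from the management channel (the helper name, poll interval, and timeout are illustrative):

```csharp
using System;
using System.Threading;

// Hypothetical helper: poll an operation's status until it leaves the
// InProgress state or the timeout elapses. Returns the final status string:
// "Completed", "Failed", or "InProgress" if the timeout was reached.
string WaitForOperation(Func<string> getStatus, TimeSpan pollInterval, TimeSpan timeout)
{
    DateTime deadline = DateTime.UtcNow + timeout;
    string status = getStatus();
    while (status == "InProgress" && DateTime.UtcNow < deadline)
    {
        Thread.Sleep(pollInterval);
        status = getStatus();
    }
    return status;
}
```

In the ActionEngine, the delegate would wrap channel.GetOperationStatus(subscriptionId, trackingId).Status, and a Failed result would be surfaced so the engine does not queue further actions against an inconsistent deployment.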

Using the Scaling Engine

After a scaling engine is in place, it is time to revisit the original scalability issues that brought about the need for a scaling engine in the first place. The most challenging part of building a scaling engine may be how to configure the providers that represent the scaling rules. However, first classifying the application’s load into one of the four known patterns (On and Off, Growing Fast, Predictable Bursting, and Unpredictable Bursting) will make this configuration easier.

For on and off loads, a scaling engine is not appropriate because the scaling engine requires a deployed application to manage. If an application with an on and off load keeps its package deployed during inactive periods, then the load itself may be closer to the Predictable Bursting pattern. For fast-growing loads, the PerformanceCounterScalingLogicProvider configuration might use a moderate above-threshold level, but pair the low threshold with a very long duration. This prevents the application from being scaled down, because the assumption is that the load will continue to increase and never decrease.

On the other hand, both predictable and unpredictable bursting loads can use load-based scaling rules such as the PerformanceCounterScalingLogicProvider. This provider handles both of these types of load well because it can detect the changes in the load. If more accurate timing is required and the load is predictable, developers can use a provider that implements time-based scaling rules rather than the PerformanceCounterScalingLogicProvider. For bursting loads, the providers can use more aggressive above-threshold levels because their below-threshold limits still allow the applications to scale down.

To tune these configuration values, the scaling logic providers emit diagnostic data of their own. The data includes information about the thresholds that were crossed and the durations that prompted each scaling action. This tuning process may involve some iteration, but once the levels are set, they may not require additional changes until the application's performance or the load's characteristics change.
