4: Autoscaling and Microsoft Azure

Article
10/29/2014

Retired Content
This content and the technology described is outdated and is no longer being maintained. For more information, see Transient Fault Handling.

On this page:
What is Autoscaling? | What is the Autoscaling Application Block? | Instance Autoscaling | Application Throttling | Rules and Actions - Constraint Rules, Reactive Rules | Logging | The Autoscaling Lifecycle - Determine Requirements and Constraints, Specify Rules, Run the Application, Collect and Analyze the Results | When Should You Use the Autoscaling Application Block? | You Want Your Application to Respond Automatically to Changes in Demand | You Want to Manage the Costs Associated with Running Your Application | You Have Predictable Times When Your Application Requires Additional Resources | When Should You Not Use the Autoscaling Application Block - Simple Applications, Controlling Costs, Applications That Are Not Scalable | Using the Autoscaling Application Block | Adding the Autoscaling Application Block to Your Visual Studio Project | Hosting the Autoscaling Application Block | Changes to Your Azure Application | The Service Information | Adding Throttling Behavior to Your Application - Using Instance Autoscaling and Throttling Together | The Autoscaling Rules - Implementing Schedule-based Autoscaling Without Reactive Rules | Monitoring the Autoscaling Application Block | Advanced Usage Scenarios - Scale Groups, Using Different Ratios at Different Times, Using Notifications, Integrating with the Application Lifecycle, Extending the Autoscaling Application Block, Custom Actions, Custom Operands, Custom Stores, Custom Logging, Using the WASABiCmdlets | Sample Configuration Settings - Average Rule Evaluation Period, Long Rule Evaluation Period, Configuring the Stabilizer | Using the Planning Tool | How the Autoscaling Application Block Works | The Metronome | The Data Collectors | The Service Information Store | The Data Points Store | The Rule Evaluator | The Rules Store | The Logger | The Scaler | The Tracker | More Information

What is Autoscaling?

One of the key benefits that the Microsoft Azure™ technology platform delivers is the ability to rapidly scale your application in the cloud in response to changes in demand.

Scalability is a key feature of Microsoft Azure.

When you deploy an application to Azure, you deploy roles: web roles for the externally facing portions of your application and worker roles to handle back-end processing. When you run your application in Azure, your roles run as role instances (you can think of role instances as virtual machines). You can specify how many role instances you want for each of your roles; the more instances you have, the more computing power you have available for that role, but the more it will cost you. There are, of course, some specific design requirements if your roles are to operate correctly when there are multiple instances of that role, but Azure looks after the infrastructure requirements for you. For more information about design requirements, see "Building a Scalable, Multi-Tenant Application for Azure."

Bharath Says:
Scaling by adding instances is often referred to as scaling out. Azure also supports scaling up by using larger role instances instead of more role instances.

You can specify the size and the number of instances you require for each of your roles when you first deploy an application to Azure. You can also add or remove role instances on the fly while the application is running, either manually through the Azure portal, or programmatically by using the Azure Management API.

By adding and removing role instances to your Azure application while it is running, you can balance the performance of the application against its running costs. You can add new instances when demand is high, and remove instances when you no longer need them in order to reduce running costs.

If you rely on manual interventions to scale your application, you may not always achieve the optimal balance between costs and performance; an operator may respond late, or underestimate the number of role instances that you need to maintain throughput.

Poe Says:
You also need to consider the cost of having human operators performing this task, especially if you have hundreds or even thousands of role instances running in Azure data centers around the globe.

An autoscaling solution reduces the amount of manual work involved in dynamically scaling an application. It can do this in two different ways: either preemptively, by setting constraints on the number of role instances based on a timetable, or reactively, by adjusting the number of role instances in response to counters or measurements that you collect from your application or from the Azure environment.

You will still need to evaluate the results of your autoscaling solution on a regular basis to ensure that it is delivering the optimal balance between costs and performance. Your environment is unlikely to be static: the overall number of users can change, access patterns by users can change, your application may perform differently as it stores more data, or you may deploy your application to additional Azure data centers.

You should evaluate your autoscaling behavior on a regular basis. Even with autoscaling in place, fire-and-forget is not the best practice.

Scaling your application by adjusting the number of role instances may not be the best or only way to scale your application. For example, you may want to modify the behavior of your application in some way during bursts in demand, or to alter the number of Azure queues, or the size of your SQL Azure database. An autoscaling solution may not be limited to just adjusting the number of role instances.

What is the Autoscaling Application Block?

The Autoscaling Application Block ("WASABi") is a part of the Enterprise Library Integration Pack for Microsoft Azure.

Ed Says:
The Autoscaling Application Block shares with other Enterprise Library blocks many design features, such as how you configure it and use it in your code.

The application block allows you to define how your Azure Application can automatically handle changes in the load levels that it might experience over time. It helps you minimize your operational costs, while still providing excellent performance and availability to your users. It also helps to reduce the number of manual tasks that your operators must perform.

The application block works through a collection of user-defined rules, which control when and how your application should respond when the load varies. Rules are either constraint rules that set limits on the minimum and maximum number of role instances in your Azure application, or reactive rules that adjust the current number of role instances based on counters or measurements that you collect from your application.

Poe Says:
Rules are stored in XML documents. This makes them easy to edit. It also makes it possible to build custom editors for the rules in your application. The Tailspin Surveys application shows how this can be done.

Constraint rules can have an associated timetable that specifies the times when the rule is active. Constraint rules enable you to proactively set the number of role instances that your application can use: the minimum number of role instances helps you to meet your service level agreement (SLA) commitments, and the maximum number of role instances helps you to control the running costs of your Azure application.

Reactive rules use values that are derived either from system metrics such as CPU utilization, or from business metrics such as the number of unprocessed documents in the application. The application block collects these metrics and saves them as data points. A data point is simply the value of a metric with an associated timestamp to indicate when the application block collected the value. A reactive rule uses an aggregate value (such as average, maximum, minimum, or last) calculated from data points over a specified period. A reactive rule compares the current aggregate value to a threshold value, and based on the result performs one or more actions; for example, adding two new web role instances and notifying an operator. Reactive rules help your application respond to unexpected bursts (or collapses) in your application's workload.
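
The way a reactive rule derives an aggregate value from data points and compares it to a threshold can be illustrated with a minimal Python sketch. This is not the application block's API: the names DataPoint, aggregate, and exceeds are hypothetical, and the sketch assumes that data points outside the period are simply ignored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class DataPoint:
    """A metric value plus the time at which it was collected."""
    timestamp: datetime
    value: float


# Hypothetical table of the aggregate functions named in the text.
AGGREGATES = {
    "average": lambda values: sum(values) / len(values),
    "maximum": max,
    "minimum": min,
    "last": lambda values: values[-1],
}


def aggregate(points: Iterable[DataPoint], function: str,
              now: datetime, period: timedelta) -> Optional[float]:
    """Aggregate the data points collected during the period that ends at now."""
    recent = sorted(
        (p for p in points if now - period <= p.timestamp <= now),
        key=lambda p: p.timestamp,
    )
    if not recent:
        return None  # Nothing collected in the period, so there is no value.
    return AGGREGATES[function]([p.value for p in recent])


def exceeds(points: Iterable[DataPoint], function: str, now: datetime,
            period: timedelta, threshold: float) -> bool:
    """Return True when the aggregate value is above the threshold."""
    value = aggregate(points, function, now, period)
    return value is not None and value > threshold
```

A rule built on this sketch would call exceeds with, for example, the "average" function, a one-hour period, and a threshold of 80, and then perform its actions when the result is True.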

The Autoscaling Application Block supports the following techniques for handling varying load levels:

Instance Scaling. The Autoscaling Application Block varies the number of role instances to accommodate variations in the load on the application.
Throttling. The Autoscaling Application Block limits or disables certain (relatively) expensive operations in your application when the load is above certain thresholds.

These two autoscaling techniques are not mutually exclusive, and you can use both to implement a hybrid autoscaling solution in your application.

Bharath Says:
In Azure, changing the number of role instances takes time, so to respond quickly you may want to throttle your application until the new role instances are available.

Figure 1 shows the relationship between the Autoscaling Application Block and your Azure application.

Figure 1

The Autoscaling Application Block and Azure

This diagram illustrates how the Autoscaling Application Block collects data from your Azure environment and uses that data in rules to determine if it should initiate any scaling actions in your Azure application.

Note

The Autoscaling Application Block can be hosted either in Azure or on premises.

Instance Autoscaling

The Autoscaling Application Block allows you to automatically scale out the number of Azure role instances (web and worker roles) to closely match the demands of your application. This is an effective technique for controlling the running costs of your application, because in Azure, you only pay for instances that you actually use.

Jana Says:
It's important to control the costs of running the application, and keeping the number of role instances to a minimum helps us achieve that goal.

Of course, it is important to set explicit boundaries for the autoscaling behavior in your Azure application. Because you are billed for each provisioned role instance (regardless of whether it is running or stopped), you must set a maximum number of instances for each role type in your application. Otherwise, an application error that causes your number of role instances to increase could result in a significant (and unexpected) cost at the end of the month. You can also set a minimum number of role instances to ensure that your application runs and is resilient in the face of any failures.

Note

You must have a minimum of two role instances to be eligible for the Azure SLA guarantees.

You shouldn't expect the Autoscaling Application Block to be able to start new role instances instantaneously; it takes Azure a finite time to launch (or terminate) a role instance. The time taken is typically in the order of 10 minutes (at the time of writing this guide), but this can vary depending on a number of factors; for example, the number of role instances you are adding, the size of the role instances you are adding, and the current level of activity within the Azure data center.

Note

At the time of this writing, partial compute instance hours are billed as full compute hours for each clock hour an instance is deployed. For example, if you deploy a Small compute instance at 10:50 and delete the deployment at 11:10, then you will be billed for two Small compute hours, one compute hour for usage during 10:50 to 11:00 and another compute hour for usage during 11:00 to 11:10. Therefore, it makes sense to keep new instances alive for the remainder of the clock hour during which they were started. For more information, see "Usage Charge Details for Azure Bills."
The stabilizer takes this into account for reactive rules (explained below), but you should consider designing your constraint rules (also explained below) so that they scale down just before the end of the clock hour.
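
The clock-hour arithmetic in this note can be checked with a small Python sketch. The function name billed_clock_hours is hypothetical, and the sketch assumes that a deployment deleted exactly on the hour does not incur a charge for the hour that is just starting; the note does not state how that boundary case is billed.

```python
from datetime import datetime, timedelta


def billed_clock_hours(deployed: datetime, deleted: datetime) -> int:
    """Count the clock hours touched between deployment and deletion."""
    if deleted <= deployed:
        return 0
    first_hour = deployed.replace(minute=0, second=0, microsecond=0)
    # Assumption: deleting exactly on the hour does not touch the next hour.
    last_instant = deleted - timedelta(microseconds=1)
    last_hour = last_instant.replace(minute=0, second=0, microsecond=0)
    return (last_hour - first_hour) // timedelta(hours=1) + 1


# The example from the note: deployed at 10:50, deleted at 11:10.
example = billed_clock_hours(datetime(2011, 8, 12, 10, 50),
                             datetime(2011, 8, 12, 11, 10))
```

Here example evaluates to 2, which matches the two Small compute hours described in the note. The date used is arbitrary; only the times of day matter.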

Application Throttling

Instead of adjusting the number of role instances in response to changes in demand, you can use the Autoscaling Application Block to change the way your application behaves under various conditions. This technique allows you to specify modes of operation that are appropriate to certain load levels or times of day or user type.

Jana Says:
It's important to choose carefully what you throttle. Users will expect the core functionality of your application to be available at all times.

For example, you can define different modes of operation for normal operation, for when there is very little load on your application, or for extreme bursts in activity.

When the load on your application is very low, you might want to perform certain background processing tasks that are not time critical, but that might be resource intensive, such as exporting data or calculating statistics. If your SLA requires you to have a minimum of two instances to run your application, then you can use this technique to better utilize these instances by occupying them with background processing tasks.
Under normal load, you might want to avoid executing background tasks, but otherwise run your application as normal.
When an extreme burst in activity occurs, you might want to disable certain functionality so that your application remains usable. For example, you can disable autocomplete functionality, switch to a lightweight version of your user interface or disable some functionality for trial users while still providing full support for paying customers.
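
One way to picture these three modes of operation in application code is the following Python sketch. It is only an illustration of the idea: the Mode names and the feature checks are hypothetical and are not part of the Autoscaling Application Block.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical operating modes matching the three cases above."""
    QUIET = "quiet"    # very little load
    NORMAL = "normal"  # normal load
    BURST = "burst"    # extreme burst in activity


def background_tasks_enabled(mode: Mode) -> bool:
    # Resource-intensive background work runs only when the load is very low.
    return mode is Mode.QUIET


def autocomplete_enabled(mode: Mode) -> bool:
    # Autocomplete is switched off during an extreme burst.
    return mode is not Mode.BURST


def full_functionality_enabled(mode: Mode, is_paying_customer: bool) -> bool:
    # During a burst, only paying customers keep full functionality.
    return mode is not Mode.BURST or is_paying_customer
```

Keeping each decision in a small function like this makes it easier to test how the application behaves in every mode before you rely on throttling in production.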

You can use application throttling very effectively in combination with instance scaling. It can take up to 10 minutes for Azure to add a new role instance, so when a sudden burst of activity occurs, you can use application throttling to help reduce the load on your application while the new role instances start. However, if you have a large number of instances, it can take time for the configuration change to propagate to all the instances, by which time your new instances may have started. In addition, if your application is already scaled to the maximum number of role instances permitted by your constraint rules, then application throttling can help to provide the maximum performance for the core functionality in your application.

Poe Says:
Depending on the size and complexity of your application, throttling may not happen faster than adding a new instance. Therefore, you must test it in your environment. You should also remember that not all of your instances will react to throttling at the same time.

Rules and Actions

The Autoscaling Application Block uses rules and actions to determine how your application should respond to changes in demand. As described earlier, there are two types of rules: constraint rules and reactive rules, each with their own actions.

Constraint Rules

For many applications, the load pattern is predictable. For example, in a business application, the highest load is during office hours; on a consumer website, the highest load is between 18:00 and 20:00. In these scenarios, you can proactively scale your Azure application to meet the anticipated additional workload. You can use constraint rules to address this scenario.

Constraint rules consist of one or more actions to set minimum and maximum values for the number of instances of a target, a rank, and optionally a timetable that defines when the rule is in effect. If there is no timetable, the rule is always in effect.

Poe Says:
You should set the minimum value to ensure that you continue to meet your SLAs. You should set the maximum value to limit your costs and meet your budgetary goals.

You can use a timetable to control the number of role instances that should be available at particular times. For example, you could create a rule to increase the minimum and maximum number of web and worker role instances in your application between 9:00 and 11:00 on Monday mornings when you know that demand for your application will be higher than usual.

You can also specify default rules that are always active and that specify default maximum and minimum values for the number of role instances for each web and worker role type in your application. Importantly, constraint rules always take precedence over reactive rules, to ensure that these reactive rules cannot continue to add new role instances above a maximum value or remove role instances below a minimum level.

Note

By default (at the time of this writing), Azure subscriptions are permitted to use up to 20 CPU cores. This value can be increased on request. For more information, see the Azure Support page.

It is possible that multiple constraint rules are in effect at the same time because of overlapping times in their timetables. In this case, the Autoscaling Application Block uses the rank of the rules to determine which rule takes precedence. Higher-ranked rules override lower-ranked rules.

Poe Says:
One is the lowest rank. You should use it for all your default rules. You should always assign a rank to your constraint rules so that it's clear which one should take precedence.

Here are some examples of constraint rules:

For web role A, the default minimum number of instances is set to two and the maximum to four. This rule uses the default rank.
On relatively busy Fridays, the minimum number of instances is set to four and the maximum to eight for web role A. This rule uses a higher rank.
For worker role B, the default constraint is a minimum of two instances and a maximum of four instances.
On Saturdays and Sundays between 2:00 PM and 6:00 PM, for worker role B, set the minimum number of instances to three and the maximum to six.
On the last Friday of every month, for scale group A, set the minimum number of instances to three and the maximum to six (scale groups are described later in this chapter).

Figure 2 illustrates the behavior of the Autoscaling Application Block when you have multiple constraint rules defined. The scenario displayed in the diagram uses three separate constraint rules to determine the number of instances of worker role A in your application. There are no reactive rules in this simple scenario.

Figure 2

Using multiple constraint rules and no reactive rules

The three constraint rules in effect in this scenario work as follows:

The first constraint rule is always active. It sets the maximum and minimum number of instances for worker role A to two. This rule has the lowest rank.
The second constraint rule is active between 8:00 AM and 10:00 AM every day. In the diagram, label A shows when this rule became active on August 7, and label B shows when this rule became inactive. It sets the maximum and minimum number of instances for worker role A to four. It has the highest rank of the three rules, so when it overlaps with any other rules it takes precedence.
The third constraint rule is active every Friday (in the diagram, 12 August is a Friday, and labels C and D show when this rule became active and inactive). It sets the maximum and minimum number of instances for worker role A to three. It has a lower rank than the second rule, so between 8:00 AM and 10:00 AM on Fridays, the second rule overrides this rule; in the diagram, label E shows when this happened.
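
The precedence between these three rules can be sketched in Python as follows. This is an illustration rather than the block's implementation: the ConstraintRule type and the effective_range function are hypothetical, and the rank values 1, 2, and 3 are assumptions that only preserve the relative order described above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class ConstraintRule:
    name: str
    rank: int
    minimum: int
    maximum: int
    is_active: Callable[[datetime], bool]  # stands in for the timetable


def effective_range(rules: List[ConstraintRule],
                    now: datetime) -> Optional[Tuple[int, int]]:
    """Return (minimum, maximum) from the highest-ranked active rule."""
    active = [rule for rule in rules if rule.is_active(now)]
    if not active:
        return None
    winner = max(active, key=lambda rule: rule.rank)
    return (winner.minimum, winner.maximum)


# The three rules for worker role A; rank numbers are assumed.
RULES = [
    ConstraintRule("always", 1, 2, 2, lambda t: True),
    ConstraintRule("mornings", 3, 4, 4, lambda t: 8 <= t.hour < 10),
    ConstraintRule("fridays", 2, 3, 3, lambda t: t.weekday() == 4),
]
```

With these rules, a Friday at 9:00 AM yields (4, 4) because the morning rule outranks the Friday rule, a Friday at noon yields (3, 3), and any other day at noon yields (2, 2).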

Figure 3 shows the effect of using multiple constraint rules with different maximum and minimum values, but without any reactive rules.

Figure 3

Using constraint rules with maximum and minimum values without reactive rules

This scenario uses a number of different constraint rules with different maximum and minimum values. You can see how the instance count always remains between the minimum and maximum limits.

The reconciliation algorithm that the Autoscaling Application Block uses when it evaluates constraint rules works as follows:

If the current instance count is less than the minimum instance count specified by the constraint rules at the current time, then increase the current instance count to the minimum value. This occurs at label A on the diagram.
If the current instance count is greater than the maximum instance count specified by the constraint rules at the current time, then decrease the current instance count to the maximum value. This occurs at label B on the diagram.
Otherwise, leave the current instance count unchanged. This occurs at label C on the diagram.
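
The three cases of this algorithm amount to clamping the current instance count to the range set by the constraint rules. A minimal Python sketch, with the hypothetical function name reconcile, looks like this:

```python
def reconcile(current: int, minimum: int, maximum: int) -> int:
    """Bring the current instance count inside the [minimum, maximum] range."""
    if minimum > maximum:
        raise ValueError("minimum must not be greater than maximum")
    if current < minimum:
        return minimum  # Case at label A: raise the count to the minimum.
    if current > maximum:
        return maximum  # Case at label B: lower the count to the maximum.
    return current      # Case at label C: leave the count unchanged.
```

For a range of two to four instances, a current count of one becomes two, a count of six becomes four, and a count of three stays at three.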

Reactive Rules

It is not always possible to predict when demand will increase for your application or when there will be temporary bursts of demand. The Autoscaling Application Block also allows you to create reactive rules that trigger a scaling action when an aggregate value derived from a set of data points exceeds a certain threshold.

The Autoscaling Application Block can monitor the value of performance counters, Azure queue lengths, instance counts, and any custom-defined business metrics to scale the application when those values exceed specified thresholds. The application block refers to these values as operands, where an operand defines three things:

The counter or metric
The aggregate function, such as average or maximum
The time interval over which the application block calculates the aggregate function

For example, the Autoscaling Application Block can monitor the CPU usage of your web role instances. When the CPU usage performance counter average for the last hour goes above a threshold of 80%, the rule will perform an action to add new web role instances to handle this load, which should cause the average CPU usage levels to drop (assuming the load does not increase significantly). It will continue to add web role instances until the average CPU usage falls below the threshold. The reverse works as well. For example, if the average CPU usage over the last hour falls below a threshold of 40%, then the rule will perform an action to remove web role instances until the average CPU usage is above the threshold value. Reactive rules can adjust the role instance count by an absolute number or by a proportion.

Typically, reactive rules are paired with one rule to scale up/out and another to scale down/in.
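
The paired rules in the CPU example can be sketched in Python as a single decision function. The function name propose_count and its parameters are hypothetical. The text does not say how the block rounds a proportional change, so this sketch assumes that it rounds up and changes the count by at least one instance.

```python
import math
from typing import Optional


def propose_count(current: int, average_cpu: float,
                  scale_out_above: float = 80.0,
                  scale_in_below: float = 40.0,
                  step: int = 1,
                  proportion: Optional[float] = None) -> int:
    """Propose a new instance count from the average CPU usage."""
    if average_cpu > scale_out_above:
        direction = 1       # Scale-out rule fires.
    elif average_cpu < scale_in_below:
        direction = -1      # Scale-in rule fires.
    else:
        return current      # Neither rule fires between the two thresholds.
    if proportion is None:
        delta = step        # Adjust by an absolute number.
    else:
        # Assumption: round a proportional change up, to at least one instance.
        delta = max(1, math.ceil(current * proportion))
    return max(0, current + direction * delta)
```

The gap between the 40% and 80% thresholds keeps the two rules from undoing each other when the load hovers around a single value. The proposed count is still subject to the limits set by the constraint rules.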

Reactive rules use an expression to specify a condition to evaluate to determine whether the rule should perform a scaling action. The actions that a reactive rule can trigger include:

Changing the instance count value of the rule's target. The action can increment or decrement the count by a number or by a proportion.
Changing the configuration of a hosted service. This action provides new values for entries in the application's ServiceConfiguration.cscfg file.
Sending a notification to an operator.
Switching to a different operating mode when you have configured your application to use application throttling.
Executing a custom action.

An action generates a notification if the action fails.

Note

If your application uses multiple web and worker roles, you will need to define an action for each web and worker role that you want to scale. You can use scale groups to simplify this task.

Example reactive rules include:

If the CPU utilization performance counter, averaged over the last hour for worker role A (across all instances) is greater than 80%, then perform an action.
If the minimum length of an Azure queue over the last six hours was greater than 50, then perform an action.

Rules can have a simple Boolean expression that compares a single value to a threshold, or a complex expression that includes a Boolean combination of multiple comparisons based on multiple operands. An example rule with a complex expression is:

If CPU utilization averaged for the last hour was growing, and the queue length remained above 120 for the last two hours, then perform an action.
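
A complex expression of this kind is a Boolean combination of simpler comparisons. The Python sketch below shows the idea using plain lists of collected values. The function names are hypothetical, and the definition of "growing" (the latest value in the period is higher than the earliest) is an assumption made for illustration, not the block's definition.

```python
from typing import Sequence


def is_growing(values: Sequence[float]) -> bool:
    """Assumed meaning of growing: the latest value is above the earliest."""
    return len(values) >= 2 and values[-1] > values[0]


def stayed_above(values: Sequence[float], threshold: float) -> bool:
    """True when every collected value in the period is above the threshold."""
    return len(values) > 0 and min(values) > threshold


def should_act(cpu_last_hour: Sequence[float],
               queue_last_two_hours: Sequence[float]) -> bool:
    """Combine the two comparisons with a logical AND."""
    return is_growing(cpu_last_hour) and stayed_above(queue_last_two_hours, 120)
```

The action is performed only when both conditions hold; if the queue length dips to 120 or below at any point in the two hours, the expression is false even when CPU utilization is growing.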

Figure 4 illustrates the behavior of the Autoscaling Application Block when you have a reactive rule defined in addition to multiple constraint rules. The scenario displayed in the diagram uses three separate constraint rules to determine the minimum and maximum number of instances of worker role A in your application.

Figure 4

Constraint rules interacting with reactive rules

The three constraint rules that are in effect for worker role A are as follows:

The first constraint rule is always active. It sets the minimum number of instances for worker role A to two and the maximum to five. It has the lowest ranking.
The second constraint rule is active between 8:00 AM and 10:00 AM every day (Label A on the diagram shows when this rule becomes active for the first time). It sets the minimum number of instances for worker role A to four and the maximum to six. It has the highest ranking of the three rules, so when it overlaps with any other rules it takes precedence.
The third constraint rule is active every Friday (in the diagram, 12 August is a Friday, and Label B shows when this rule becomes active). It sets the minimum number of instances for worker role A to three and the maximum to five. It has a lower ranking than the second rule, so between 8:00 AM and 10:00 AM on Fridays, the second rule overrides this rule.

In addition, there are two reactive rules that can adjust the instance count of worker role A:

If the minimum number of unprocessed documents during the last hour was greater than 10, then increase the instance count of worker role A by one.
If the maximum number of unprocessed documents during the last hour was less than 10, then decrease the instance count of worker role A by one.

In the scenario shown in Figure 4, you can see how the constraint rules always limit the number of instances, providing absolute floor and ceiling values that cannot be crossed. The reactive rules can adjust the number of role instances within these limits. In the diagram, labels C and D show times when the first constraint rule enforced limits on the number of instances that the reactive rule proposed. Labels E and F show times when the second constraint rule enforced limits on the number of instances that the reactive rule proposed; at these times, the second constraint rule overrides the first constraint rule.

Note

If there is no constraint rule active for a role when the rule evaluation process runs, and a reactive rule tries to change the number of role instances, then the Autoscaling Application Block will log a message that it cannot perform any scaling actions on the role. The block will not change the current number of role instances.
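
The interaction described in this section, including the case in the note where no constraint rule is active, can be sketched in Python. The function names are hypothetical; the thresholds and limits are taken from the two reactive rules and the constraint rules in the scenario above.

```python
from typing import Optional, Tuple


def reactive_delta(min_unprocessed: int, max_unprocessed: int) -> int:
    """Change proposed by the two reactive rules for worker role A."""
    if min_unprocessed > 10:
        return 1    # The backlog never dropped to 10 or fewer: add an instance.
    if max_unprocessed < 10:
        return -1   # The backlog never reached 10: remove an instance.
    return 0


def next_count(current: int, delta: int,
               bounds: Optional[Tuple[int, int]]) -> int:
    """Apply a proposed change within the limits of the active constraint rule."""
    if bounds is None:
        # No constraint rule is active: leave the instance count unchanged.
        return current
    minimum, maximum = bounds
    return max(minimum, min(maximum, current + delta))
```

For example, with five instances running under the first constraint rule (limits of two and five), a proposal to add an instance leaves the count at five. Under the second constraint rule (limits of four and six), the same proposal raises the count to six.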

Poe Says:
A rule can perform one or more actions.

Multiple reactive rules can trigger different, conflicting actions at the same time. In this case, the Autoscaling Application Block can reconcile the conflicting actions.

Poe Says:
You should be careful about assigning ranks to your reactive rules. It is better to rely on the reconciliation process to determine which scaling action should be performed.

For more information about how the block reconciles conflicting rules, see the topic "Understanding Rule Ranks and Reconciliation" on MSDN.

Logging

Whether you use instance autoscaling, application throttling, or a combination of the two approaches, the Autoscaling Application Block can log information about its activities. For example, it can write a log entry:

When it starts or stops instances, including information about why the Autoscaling Application Block added or removed them.
When the application switches between modes of operation, including information about what triggered the throttling behavior.

You can use this information to help analyze your Azure costs, and to identify predictable patterns in the utilization levels of your application.

The Autoscaling Lifecycle

Figure 5 illustrates the lifecycle of the autoscaling process from the perspective of operations personnel.

Figure 5

The lifecycle of the autoscaling process

The lifecycle of the autoscaling process consists of four stages that operations staff can iterate over multiple times as they refine the autoscaling behavior of your application.

Determine Requirements and Constraints

The first stage is to determine the requirements and constraints for autoscaling behavior in your application. To determine the two types of requirements, you must:

Identify any predictable patterns of demand for your application's services.
Specify how you want your application to respond to unpredicted bursts and collapses in demand for its services.

The possible constraints you will face include:

Budgetary constraints on the running costs of your Azure application.
Any commitments to an SLA with your application's users.

Specify Rules

Based on the requirements and constraints that you identified in the previous step, you must formulate a set of rules to specify the autoscaling behavior of the application within your constraints. You can use constraint rules to define the behavior of the application in response to predictable changes in demand, and reactive rules to define the behavior of the application in response to unpredictable changes in demand.

Run the Application

After you have configured the rules, the Autoscaling Application Block can evaluate the rules and execute the autoscaling actions in your application as the application faces real changes in demand. The Autoscaling Application Block will log the rule evaluation results and the autoscaling actions that it performs.

Collect and Analyze the Results

You should regularly analyze the information that the Autoscaling Application Block logs about its activities in order to evaluate how well your rules are meeting your initial requirements and working within the constraints. For example, you may discover that your rules do not always enable your application to scale sufficiently to meet demand or that the rules are preventing you from meeting your SLA commitments in all circumstances. In these cases, you should re-evaluate your requirements and constraints to ensure that they are still valid and, if necessary, adjust your rules. You may be able to identify new, predictable usage patterns that will allow you to preemptively scale your application rather than relying on reactive rules.

You should continue to iterate over this process because usage patterns for your application will change over time and the existing set of rules may become sub-optimal for your requirements and constraints.

When Should You Use the Autoscaling Application Block?

This section describes three scenarios in which you should consider using the Autoscaling Application Block in your Azure solution.

You Want Your Application to Respond Automatically to Changes in Demand

The Autoscaling Application Block helps you to manage two competing requirements in your Azure applications. The first is to maintain the performance levels of your application in the face of changing levels of demand. If your application's web or worker roles experience changes in their workload over time, varying significantly by the hour, the day, or the week, and you want your application to respond to these changes in demand automatically, then the Autoscaling Application Block can increase or decrease the number of role instances automatically based on pre-configured rules.

Beth Says:
	To keep users using your application, it must always be responsive.

New role instances can take at least 10 minutes to start up, so you can also use the application throttling feature of the Autoscaling Application Block when you need to respond quickly (within seconds or minutes) to a burst in activity.

You Want to Manage the Costs Associated with Running Your Application

The second, competing requirement is to minimize the running costs of your Azure application. Although additional web and worker role instances will enable your application to maintain response times for users and maintain throughput for background tasks when there is a burst in activity, these additional role instances cost money. Azure bills for web and worker role instances by the hour, and these compute costs are typically a large proportion of the running costs of an Azure application. For a more detailed discussion of how you can estimate your Azure running costs, see the chapter "How Much Will It Cost?" in the book "Moving Applications to the Cloud."

Beth Says:
	The profitability of the application is directly affected by its running costs.

The Autoscaling Application Block helps to manage costs by removing unnecessary role instances and by allowing you to set maximum values for the number of role instances. However, there may be circumstances in which your application sees an additional burst in activity when it is already running the maximum configured number of instances. In this case, your application can respond by using application throttling. The throttling rules can define when your application should switch to an operating mode that is less resource intensive or disable non-critical functionality. In this way, the application can maintain the responsiveness of its UI or the throughput of critical processes without starting additional role instances.

You Have Predictable Times When Your Application Requires Additional Resources

The rules used by the Autoscaling Application Block allow you to define when the number of role instances should increase or decrease. When you know in advance that there will be a burst in demand, you can start additional role instances before the burst takes place by using autoscaling rules to define a timetable that specifies the number of role instances that should be available at particular times.

Poe Says:
	You can collect and analyze historical data and use your knowledge of external factors that trigger changes in demand to help predict workloads.

When Should You Not Use the Autoscaling Application Block

There are some scenarios in which you should not use the Autoscaling Application Block in your Azure application.

Simple Applications

Autoscaling often adds little value for relatively simple applications or applications that have a limited number of users. For example, many small web applications never need more than two web role instances, even during bursts of activity.

Adding the Autoscaling Application Block to your application increases the complexity of your application. Therefore, you should evaluate whether or not the benefits of adding autoscaling behavior outweigh the additional complexity to the design of your application.

Jana Says:
	You should consider designing your application to be scalable, even if your application does not require scalability right now. Usually, you cannot make an existing application scalable without having to re-engineer it.

Controlling Costs

If you want to treat some of the costs of your Azure application as fixed costs, then you may want to fix the number of role instances in your application. This way, you can predict the exact cost of this portion of your monthly Azure bill. You cannot treat all Azure costs as fixed costs: for example, data transfer costs and Azure storage costs will always vary based on the quantity of data you are transferring and storing.

Applications That Are Not Scalable

Autoscaling only makes sense for applications that you design to be scalable. If your application is not horizontally scalable, because its design is such that you cannot improve its performance by adding additional instances, then you should not use the Autoscaling Application Block to perform instance autoscaling. For example, a simple web role may not be scalable because it uses a session implementation that is not web farm friendly. For a discussion of session state in Azure applications, see Storing Session State in the book "Moving Applications to the Cloud, 2nd Edition." For a discussion of some of the design issues associated with scalable worker roles, see Scaling Applications by Using Worker Roles in the book "Developing Applications for the Cloud, 2nd Edition."

Ed Says:
	The Autoscaling Application Block automates the scaling process for applications that are already scalable. Using the Autoscaling Application Block does not automatically make your application scalable.

Using the Autoscaling Application Block

Using the Autoscaling Application Block includes tasks that developers perform and tasks that IT pros perform. Figure 6 relates the key tasks to the actions of the Autoscaling Application Block in Azure.

Figure 6

Using the Autoscaling Application Block

This section describes, at a high level, how to use the Autoscaling Application Block. It is divided into the following main sub-sections. The order of these sections reflects the order in which you would typically perform the associated tasks. Developers will perform some of these tasks and administrators will perform others. The description of each task suggests who, in practice, is likely to perform each one.

Adding the Autoscaling Application Block to your Visual Studio Project. This section describes how you, as a developer, can prepare your Microsoft Visual Studio® development system solution to use the block.
Hosting the Autoscaling Application Block. This section describes how you, as a developer, can host the Autoscaling Application Block in your Azure application.
Changes to your Azure application. This section describes the changes that you need to make in your Azure application so that it works with the Autoscaling Application Block.
The service information. This section describes how you, as a developer, define your application's service information.
Adding throttling behavior to your application. This section describes how you, as a developer, can modify your application so that it can be throttled by your autoscaling rules.
The autoscaling rules. This section describes how you, as an administrator, can define your autoscaling rules.
Monitoring the Autoscaling Application Block. This section describes how you, as an administrator, can monitor your autoscaling rules and how to use the data that you collect.
Advanced usage scenarios. This section describes some additional scenarios, such as using scale groups and extending the Autoscaling Application Block.

Bharath Says:
	You would typically perform these tasks when you are creating the host application for the application block, and work with the IT Pro to determine the required functionality.

Adding the Autoscaling Application Block to Your Visual Studio Project

As a developer, before you can write any code that uses the Autoscaling Application Block, you must configure your Visual Studio project with all of the necessary assemblies, references, and other resources that you'll need. For information about how you can use NuGet to prepare your Visual Studio project to work with the Autoscaling Application Block, see the topic "Adding the Autoscaling Application Block to a Host" on MSDN.

Markus Says:
	NuGet makes it very easy for you to configure your project with all of the prerequisites for using the Autoscaling Application Block. You can download the NuGet package, extract the DLLs and add them to your project manually, or download the source code for the block and build it yourself.

Hosting the Autoscaling Application Block

You can host the Autoscaling Application Block in an Azure role or in an on-premises application such as a simple console application or a Windows service. This section discusses some of the reasons that you might choose one or the other of these approaches and provides links to resources that explain how to write the code that hosts the Autoscaling Application Block.

Jana Says:
	You must decide where you will host the block: either in Azure or in an on-premises application.

The Autoscaling Application Block enables you to add autoscaling behavior to your Azure applications, and as such, it must be able to communicate with Azure to make changes to the number of role instances that make up your application. Your Azure application might be a simple application made up of a small number of roles, all running in the same hosted service in the same data center, or have hundreds of different roles running in multiple hosted services in multiple data centers. Whatever the structure of your application and wherever you choose to host the Autoscaling Application Block, it must be able to interact with your application by using the Azure Service Management API, and it must be able to access diagnostic data such as performance counter values in order to evaluate reactive rules.

The Autoscaling Application Block is designed to work with very large Azure applications with hundreds of different roles.

If you host the Autoscaling Application Block in Azure, then you do not need to transfer any of the data that the application block uses out of the cloud. However, you may need to transfer diagnostic data between data centers if you host parts of your application in other geographical locations. The advantages of hosting the Autoscaling Application Block in Azure are the same as for hosting any application in the cloud: reliability and scalability. However, you will need to pay to host the role that contains the application block in Azure. You could host the application block in a worker role that also performs other tasks, but from the perspective of manageability and security you should host the application block in its own worker role or even in its own hosted service. For information about how to host the Autoscaling Application Block in Azure, see the topic "Hosting the Autoscaling Application Block in a Worker Role" on MSDN.

Ed Says:
	Using the Autoscaling Application Block in code is very similar to using the other Enterprise Library application blocks. The topic "Using Enterprise Library in Applications" in the main Enterprise Library documentation describes how to reference the Enterprise Library assemblies, how Enterprise Library handles dependencies, and how to work with Enterprise Library objects.

If you choose to host the application block in Azure, and plan to run multiple instances of the role that hosts it for added reliability, you must make sure that you configure the application block to use a blob execution lease in the advanced configuration settings. This setting ensures that only a single instance of the application block is able to evaluate rules at any point in time. For information about how to make this configuration setting, see the topic "Entering Configuration Information" on MSDN.

Note

The default configuration settings assume that you will have a single instance of the worker role that hosts the application block. You must change this if you plan to scale the role that hosts the Autoscaling Application Block.
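The effect of an execution lease can be illustrated with a small model. The following Python sketch is not the block's implementation (the block is a .NET library and holds its lease on an Azure blob); the class `ExecutionLease`, the method `try_acquire`, the instance names, and the 60-second duration are all assumptions made for this example. It only shows the idea that one host instance at a time may evaluate rules, and that another instance can take over once the lease lapses.

```python
from datetime import datetime, timedelta


class ExecutionLease:
    """In-memory stand-in for a lease on a shared blob (illustrative only)."""

    def __init__(self, duration):
        self.duration = duration
        self.owner = None
        self.expires = None

    def try_acquire(self, instance_id, now):
        """Grant or renew the lease if it is free, expired, or already ours."""
        free = self.owner is None or now >= self.expires
        if free or self.owner == instance_id:
            self.owner = instance_id
            self.expires = now + self.duration
            return True
        return False


lease = ExecutionLease(timedelta(seconds=60))
t0 = datetime(2014, 10, 29, 9, 0, 0)

# The first host instance gets the lease and may evaluate rules.
first = lease.try_acquire("instance_0", t0)
# A second instance asking ten seconds later is refused.
second = lease.try_acquire("instance_1", t0 + timedelta(seconds=10))
# If the first instance stops renewing, the second can take over after expiry.
takeover = lease.try_acquire("instance_1", t0 + timedelta(seconds=90))
```

In this model, an instance that fails to acquire the lease simply skips rule evaluation and tries again later.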

Hosting the application block on-premises means that the block must remotely access the diagnostic data from your Azure application that it needs for reactive rules. An advantage of hosting the application block locally is that it may simplify integration with other tools and processes that run on premises. It may also be convenient to have the Autoscaling Application Block running locally when you are developing and testing your Azure application. For information about how to host the Autoscaling Application Block in an on-premises application, see the topic "Hosting the Autoscaling Application Block in an On-Premises Application" on MSDN.

Changes to Your Azure Application

The Autoscaling Application Block is designed to minimize the changes you need to make to your Azure application. The application block can add and remove role instances from your application by using the Azure Service Management API. This does not require any changes in your application.

However, reactive rules can use performance counter data to determine whether the application block should change the current number of role instances. If you are using performance counters in your reactive rules, then you must take steps to ensure that your application saves the performance counter data to Azure storage where the application block's data collection process can access it.

Markus Says:
	You can also instrument your Azure application with custom performance counters to use in your reactive rules.

For more information about the code changes you must make in your Azure application to enable it to save performance counter data, see the topic "Collecting Performance Counter Data" on MSDN.

You can also use the Azure Diagnostics Configuration File (diagnostics.wadcfg) to configure your performance counters. For more details, see "How to Use the Azure Diagnostics Configuration File" on MSDN.

The Service Information

Before the Autoscaling Application Block can perform any autoscaling operations on your Azure application, you need to configure the service information that describes your Azure application. By default, this service information is stored in an XML document in an Azure blob that is accessible to the application block.

Ed Says:
	The service information defines the aspects of your Azure application that are relevant to the Autoscaling Application Block.

The service information includes the following information about the Azure features that make up your application.

For each Azure subscription that contains resources that you want to be able to scale automatically, the service information contains the subscription ID, certificate thumbprint, and details of where the application block can find the management certificate it needs to be able to issue scaling requests.
For each Azure hosted service that contains resources that you want to be able to scale automatically, the service information contains the names of the deployment slots where the application to be scaled is running.
The application block can only use the Azure roles that are listed in the service information as sources of performance counter data or as targets for autoscaling. For each role listed in the service information, the service information identifies the storage account where Azure saves the role's diagnostic data. The application block reads the performance counter data that the reactive rules use from this storage account.
The names of any queues whose length the application block monitors.
The definitions of the scale groups. These are described later in this chapter.

The Autoscaling Application Block rules can only operate on targets (roles and scale groups) that are identified in the application block's service information. For further information see the topic "Storing Your Service Information Data" on MSDN.

The service information also enables you to control how aggressively you want to autoscale your Azure application by specifying cool-down periods. A cool-down period is the period after a scaling operation has taken place during which the application block will not perform any further scaling operations. A cool-down period is enabled via the optimizing stabilizer feature of the application block. You can define different cool-down periods for scale-up and scale-down operations, and specify default cool-down periods that individual roles can override. The shorter the cool-down period, the more aggressive the application block will be in issuing scaling requests. However, by setting short cool-down periods for both scale-up and scale-down operations, you risk introducing an oscillation whereby the application block repeatedly scales up and then scales down a role. If not specified, the application block uses a default of 20 minutes for the cool-down period.
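The cool-down behavior can be modeled in a few lines. The following Python sketch is illustrative only: the class `CooldownTracker`, its method names, and the way the two cool-down periods are applied (each direction is checked against the time of the last scaling operation on the role) are assumptions for this example, not the block's API. The 20-minute values mirror the default mentioned above.

```python
from datetime import datetime, timedelta


class CooldownTracker:
    """Hypothetical model of per-role cool-down periods (not the block's API)."""

    def __init__(self, scale_up_cooldown, scale_down_cooldown):
        self.cooldowns = {"up": scale_up_cooldown, "down": scale_down_cooldown}
        self.last_scaled = {}  # role name -> time of the last scaling operation

    def can_scale(self, role, direction, now):
        """Return True if the role is outside the cool-down for this direction."""
        last = self.last_scaled.get(role)
        if last is None:
            return True
        return now - last >= self.cooldowns[direction]

    def record_scale(self, role, now):
        """Remember when a scaling operation was issued for the role."""
        self.last_scaled[role] = now


tracker = CooldownTracker(timedelta(minutes=20), timedelta(minutes=20))
start = datetime(2014, 10, 29, 9, 0)
tracker.record_scale("RoleA", start)

# Five minutes after scaling, a further operation on RoleA is suppressed.
blocked = tracker.can_scale("RoleA", "down", start + timedelta(minutes=5))
# After the cool-down has elapsed, scaling is permitted again.
allowed = tracker.can_scale("RoleA", "down", start + timedelta(minutes=25))
```

Shortening both periods in this model makes it easy to see how a role could be scaled up and then down again in quick succession, which is the oscillation risk described above.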

Bharath Says:
	There is no point in setting cool-down periods to less than ten minutes. Azure can often take ten minutes to complete a scaling operation on a role, during which time it will not accept any additional scaling requests for that role anyway.

The service information also enables you to configure when, during the hour, you want to allow scaling operations to take place. Because Azure bills by the clock hour, you may want to use role instances for as long as possible within an hour. To achieve this, you can specify that scale up operations can only take place during the first X minutes of the hour and that scale down operations can only take place during the last Y minutes of the hour.

Note

You need to allow enough time for the scale down operations to complete before the end of the hour. Otherwise, you will be billed for the next hour.
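The "first X minutes" and "last Y minutes" restriction can be expressed as a simple check on the minute of the hour. This Python sketch is a model under stated assumptions: the function name `scaling_allowed`, its parameters, and the sample values of 15 and 10 minutes are hypothetical and are not taken from the block's configuration schema.

```python
from datetime import datetime


def scaling_allowed(direction, now, up_first_minutes, down_last_minutes):
    """Hypothetical check: is a scale operation permitted at this point in the hour?"""
    if direction == "up":
        # Scale up only during the first X minutes of the clock hour.
        return now.minute < up_first_minutes
    if direction == "down":
        # Scale down only during the last Y minutes of the clock hour.
        return now.minute >= 60 - down_last_minutes
    raise ValueError("direction must be 'up' or 'down'")


# Assumed settings for illustration: X = 15 minutes, Y = 10 minutes.
up_early = scaling_allowed("up", datetime(2014, 10, 29, 9, 5), 15, 10)
up_late = scaling_allowed("up", datetime(2014, 10, 29, 9, 30), 15, 10)
down_mid = scaling_allowed("down", datetime(2014, 10, 29, 9, 30), 15, 10)
down_late = scaling_allowed("down", datetime(2014, 10, 29, 9, 52), 15, 10)
```

When choosing Y, remember that the scale-down operation itself takes time to complete, so a very small value may let the operation run past the end of the hour.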

With the exception of scale groups, which are a convenience when it comes to authoring rules, the developers of the application typically define the service information; they know about the structure of the application, and what can and cannot be safely scaled.

Using the Autoscaling Application Block does not automatically make your Azure roles scalable. Although Azure provides the infrastructure that enables your applications to scale, you are responsible for ensuring that your web and worker roles will run correctly when there is more than one instance of the role. For example, it may not be possible to parallelize some algorithms.

Jana Says:
	You should ensure that your service information data only references roles that are scalable.

Note

See the section "The Map Reduce Algorithm" in the book Developing Applications for the Cloud for information about a technique for parallelizing large calculations across multiple role instances.

For web roles to be scalable, they should be "web farm friendly." In particular, if they make use of session state, then the session state provider must either share or synchronize your session state data across your role instances. In Azure, you can use the session state provider that stores session state in the shared cache. For more information, see "Session State Provider" on MSDN.

Note

To minimize the risk of disclosing sensitive information, you should encrypt the contents of the service information store. For more information, see the topic "Encrypting the Rules Store and the Service Information Store" on MSDN.

Adding Throttling Behavior to Your Application

The Autoscaling Application Block enables you to use two different autoscaling mechanisms in your Azure applications. You can either use autoscaling rules to change the number of role instances or use autoscaling rules to modify the behavior of your application, typically by throttling the application so that it uses fewer resources. Examples of throttling behavior include temporarily disabling some non-essential features in your application, and switching to a less resource-intensive version of the UI.

Ed Says:
	While using instance autoscaling requires minimal changes to your application because the Autoscaling Application Block scales your application by adding or removing role instances, using throttling will require more extensive changes to your application.

There are two scenarios in which you might decide to use throttling.

You can use throttling instead of instance autoscaling for some or all of the roles in your application. You might choose to do this if your role does not support running with multiple instances or because you can achieve better scaling results by changing the behavior of the role rather than adding or removing instances.
You want your application to respond almost immediately to a burst in demand. With throttling, you can change the behavior of the application as soon as the application block executes a reactive rule action without having to wait for Azure to start a new role instance. Depending on the size and complexity of your application, throttling may not take effect faster than instance scaling.

To add throttling behavior to your Azure application, you must modify your application so that it responds to throttling requests. For more information about how your Azure application can detect a request for throttling behavior, see the topic "Implementing Throttling Behavior" on MSDN.

You must also create a set of reactive rules that use the changeSetting action to notify your application that it should enable or disable some throttling behavior. For information about how to define the throttling autoscaling rules, see the topic "Defining Throttling Autoscaling Rules" on MSDN.

For a complete example of how the Tailspin Surveys application uses throttling behavior, see Chapter 5, "Making Tailspin Surveys More Elastic" in this guide.
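As a rough illustration of what "responding to a throttling request" means inside an application, the following Python sketch models an application that changes its feature set when a named setting changes. It is not taken from Tailspin Surveys or from the block: the setting name `ThrottlingMode`, the mode values, the feature names, and the class `ThrottledApp` are all hypothetical, and a real Azure role would receive the change through its configuration-change notifications rather than a direct method call.

```python
# Hypothetical throttling modes; the setting name and its values are assumptions.
FEATURES_BY_MODE = {
    "Normal": {"surveys", "analysis", "export"},
    "Throttled": {"surveys", "analysis"},
    "HighlyThrottled": {"surveys"},
}


class ThrottledApp:
    """Toy application that adapts its feature set to a throttling setting."""

    def __init__(self):
        self.mode = "Normal"

    def on_setting_changed(self, name, value):
        """Called when a configuration setting changes (for example, by a rule)."""
        if name != "ThrottlingMode":
            return  # ignore unrelated settings
        if value not in FEATURES_BY_MODE:
            raise ValueError(f"unknown throttling mode: {value}")
        self.mode = value

    def is_enabled(self, feature):
        """Return True if the feature is available in the current mode."""
        return feature in FEATURES_BY_MODE[self.mode]


app = ThrottledApp()
export_before = app.is_enabled("export")

# A reactive rule action asks the application to shed non-essential work.
app.on_setting_changed("ThrottlingMode", "HighlyThrottled")
export_after = app.is_enabled("export")
surveys_after = app.is_enabled("surveys")
```

Keeping the mapping from mode to features in one place makes it easier for developers and administrators to agree on what each throttling level actually disables.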

Using Instance Autoscaling and Throttling Together

You can use instance autoscaling exclusively or throttling exclusively in your Azure application, or use them together.

If you decide to use them together, you need to take into account how they will interact. You should be aware of a number of differences between them when you are creating your autoscaling rules.

Instance autoscaling rules can take up to ten minutes to have an effect because of the time taken by Azure to launch new role instances. Throttling autoscaling rules can affect the behavior of your application almost immediately.
Instance autoscaling rules are limited by the configurable cool-down periods that set a minimum time before the application block can scale the same role again; there are no cool-down periods for throttling autoscaling rules.
Instance autoscaling rules are always limited by constraint rules. Throttling autoscaling rules are not limited by constraint rules.

A single reactive rule can have an action that performs instance autoscaling and an action that performs throttling.

You can use rule ranks to control the precedence of reactive rules that perform instance autoscaling and reactive rules that perform throttling.

The Autoscaling Rules

Autoscaling actions take place in response to rules that define when the Autoscaling Application Block should scale roles up or down. By default, these rules are stored in an XML document in an Azure blob that is accessible to the application block.

The Autoscaling Application Block supports two types of rules that define autoscaling behavior: constraint rules and reactive rules. Typically, the administrators of the application are responsible for creating, monitoring, and maintaining these rules. They can perform these tasks by editing the XML document that contains the rules, or through a user interface (UI) created by the developers of the application.

Poe Says:
	The next chapter describes an Azure application with an example of a web-based UI for managing rules.

When you are creating your autoscaling rules, you can only create rules for the roles that the developers listed in the service information. You should plan your rules in three stages:

Design the default (or "baseline") constraint rules.
Design any additional constraint rules.
Design your reactive rules.

You should create a default constraint rule for every role that is listed in the service information. A default rule does not have a timetable, so it is always active; it has a rank of zero, so it can be overridden by any other constraint rules; it should have minimum and maximum role instance values that define the default values you want when no other constraint rules are active. These default rules should ensure that you always have the minimum number of role instances that you need to meet your SLA commitments, and that you don't go over budget by running too many role instances.

Bharath Says:
	Default rules guard your SLAs!

Beth Says:
	Default rules guard your wallet!

Note

The application block will log an error if a reactive rule attempts to scale a target that does not have a constraint rule. In this scenario, the application block will not perform the scaling action on the target.

After you have created your default rules, you can define any additional constraint rules that handle expected periods of above or below normal workload for your application. These additional constraint rules have timetables that specify when the rule is active, a rank that is higher than the rank of the default constraint rules to ensure that they override them, and appropriate values for the maximum and minimum role instance counts. For example, your application might experience increased workloads between 9:00 AM and 10:00 AM every morning, or on the last Friday of every month, or decreased workloads between 1:00 PM and 5:00 PM every day, or during the month of August.

If you want to specify a fixed number of instances for a role, you can use a constraint rule with the maximum and minimum values set to the same number. In this case, reactive rules will not have any effect on the number of role instances.
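The interaction between constraint rules and reactive rules can be sketched as a clamp: the highest-ranked active constraint rule supplies the allowed range, and any instance count requested by a reactive rule is limited to that range. The Python below is a simplified model, not the block's code; the names `Constraint`, `effective_range`, and `clamp`, the `active` flag that stands in for the timetable check, and the sample ranks and counts are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Constraint:
    """Hypothetical model of a constraint rule for one role."""
    rank: int
    minimum: int
    maximum: int
    active: bool = True  # stands in for the timetable check


def effective_range(constraints):
    """Pick the active constraint with the highest rank."""
    active = [c for c in constraints if c.active]
    if not active:
        return None  # no constraint rule: the scaling action is not performed
    top = max(active, key=lambda c: c.rank)
    return top.minimum, top.maximum


def clamp(requested, constraints, current):
    """Limit a reactive rule's requested instance count to the constraint range."""
    bounds = effective_range(constraints)
    if bounds is None:
        return current
    low, high = bounds
    return max(low, min(high, requested))


default = Constraint(rank=1, minimum=2, maximum=5)
fixed = Constraint(rank=100, minimum=4, maximum=4)

capped = clamp(7, [default], current=3)          # limited to the default maximum
pinned = clamp(7, [default, fixed], current=4)   # min == max, so the count is fixed
floored = clamp(1, [default], current=3)         # raised to the default minimum
ignored = clamp(7, [], current=3)                # no constraint rule, no change
```

The `pinned` case corresponds to the fixed-instance-count scenario described above: with the minimum equal to the maximum, no reactive request can change the result.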

The constraint rules enable you to plan for expected changes in workload.

Poe Says:
	You may already have a good idea about when your application's workload changes based on your knowledge and experience of the application and your organization. However, you will gain a deeper insight by monitoring and analyzing your application.

The reactive rules enable you to plan for unexpected changes in your application's workload. A reactive rule works by monitoring an aggregate value derived from a set of data points such as performance counter values, and then performing a scaling operation when the aggregate value reaches a threshold. The challenge with reactive rules is knowing which aggregates and data points, or combination of aggregates and data points, you should use in your reactive rules. For example, if you have a reactive rule that monitors CPU utilization, but your application is I/O bound, the reactive rule won't trigger a scaling action at the correct time. Another example is if you have a reactive rule that monitors the length of an Azure queue. If it doesn't matter to the functionality of the application if the queue is emptied later, you will be wasting resources if you scale up to clear the queue earlier than is necessary.
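To make the "aggregate, then compare to a threshold" idea concrete, here is a minimal Python sketch of a reactive rule that averages a set of data points and requests a scale-up when the average exceeds a threshold. The function name `evaluate_reactive_rule`, the choice of an average as the aggregate, the threshold of 80, and the sample values are assumptions for this example and are not measurements from a real application.

```python
from statistics import mean


def evaluate_reactive_rule(data_points, threshold, step=1):
    """Hypothetical rule: scale up by `step` when the average exceeds the threshold."""
    if not data_points:
        return 0  # no data collected, so no scaling action
    aggregate = mean(data_points)
    return step if aggregate > threshold else 0


cpu_samples = [72.0, 85.0, 91.0, 88.0]  # illustrative CPU utilization values (%)
busy = evaluate_reactive_rule(cpu_samples, threshold=80)
idle = evaluate_reactive_rule([20.0, 35.0], threshold=80)
```

The same shape applies to other data points, such as queue lengths. For an I/O-bound application, CPU samples like these would stay low even under heavy load, so this rule would not request a scale-up when one is needed; this is why the choice of data point matters.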

Poe Says:
	You must monitor and analyze the behavior of your application to understand what data points and aggregates, or combination of data points and aggregates, work best as a proxy measure of your application's performance. You may need multiple rules because different aspects of your application may have different performance characteristics.

If performance counters or Azure queue lengths don't work well as ways of measuring your application's performance, the application's developers can instrument the application to generate custom business metrics to use in rules.

If your reactive rules use performance counter data from your Azure application, you must make sure that your application transfers the performance counter data that the rules consume to Azure Diagnostics storage. For an example of how to do this, see the section "Collecting Performance Counter Data from Tailspin Surveys" in Chapter 5, "Making Tailspin Surveys More Elastic" of this guide.

For information about defining rules, see the section "Rules and Actions" earlier in this chapter.

Note

To minimize the risk of disclosing sensitive information, you should encrypt the contents of the rules store. For more information, see the topic "Encrypting the Rules Store and the Service Information Store" on MSDN.

Implementing Schedule-based Autoscaling Without Reactive Rules

In some scenarios, you may want to use a schedule to precisely control the number of role instances at different times. You may want to do this because you want your running costs to be more predictable, or because you do not anticipate any unexpected bursts in demand for your application. You can achieve this goal by using only constraint rules and not reactive rules. Furthermore, your constraint rules should each have the maximum instance count equal to the minimum instance count.

The following snippet shows a simple set of rules that implement schedule-based autoscaling without using reactive rules. The default constraint rule sets the role instance count to two; the peak-time rule, which has a higher rank, sets the role instance count to four for two hours each day starting at 08:00.

<rules 
  xmlns="https://schemas.microsoft.com/practices/2011/entlib/autoscaling/rules"
  enabled="true">
  <constraintRules>
    <rule name="Default" description="Always active" 
          enabled="true" rank="1">
      <actions>
        <range min="2" max="2" target="RoleA"/>
      </actions>
    </rule>
    
    <rule name="Peak" description="Active at peak times"
          enabled="true" rank="100">
      <actions>
        <range min="4" max="4" target="RoleA"/>
      </actions>
      <timetable startTime="08:00:00" duration="02:00:00">
        <daily/>
      </timetable>
    </rule>
  </constraintRules>
  
  <reactiveRules/>

  <operands/>
</rules>
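To make the effect of these rules concrete, the following Python sketch models the two constraint rules and works out which instance count applies at a given time. It is a simplified model, not the block's rule evaluator: the ConstraintRule type and the instance_count function are hypothetical, the sketch assumes that the highest-ranked active rule wins, and it ignores timetables that cross midnight.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional


@dataclass(frozen=True)
class ConstraintRule:
    """Hypothetical model of a constraint rule whose min equals its max."""
    name: str
    rank: int
    count: int                            # min == max, so one value is enough
    start: Optional[time] = None          # None means "always active"
    duration: Optional[timedelta] = None

    def is_active(self, now: datetime) -> bool:
        if self.start is None or self.duration is None:
            return True
        begin = datetime.combine(now.date(), self.start)
        # Assumption: the end of the window is exclusive.
        return begin <= now < begin + self.duration


def instance_count(rules, now):
    """Return the count from the highest-ranked rule that is active at `now`."""
    active = [r for r in rules if r.is_active(now)]
    return max(active, key=lambda r: r.rank).count


rules = [
    ConstraintRule("Default", rank=1, count=2),
    ConstraintRule("Peak", rank=100, count=4,
                   start=time(8, 0), duration=timedelta(hours=2)),
]

print(instance_count(rules, datetime(2012, 1, 2, 9, 0)))    # inside the peak window
print(instance_count(rules, datetime(2012, 1, 2, 12, 0)))   # outside the peak window
```

Because each rule pins the minimum and maximum to the same value, the result is a fixed instance count for every point in the schedule.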

Monitoring the Autoscaling Application Block

Over time, usage patterns for your application will change. The overall number of users will go up or down, people will start to use the application at different times, the application may gain new features, and people will use some parts of the application less and some parts more. As a consequence, your constraint and reactive rules may no longer deliver the optimal balance of performance and cost.

The Autoscaling Application Block logs detailed information about the behavior of your rules so you can analyze which rules were triggered and at what times. This information, in combination with other performance monitoring data that you collect, will help you analyze the effectiveness of your rule set and determine what changes you should make to re-optimize the rules.

Ed Says:
	The Autoscaling Application Block can use the Enterprise Library Logging Block logger, System.Diagnostics logging, or a custom logger.

How frequently you analyze your rule behavior depends on how dynamic your application's operating environment is.

Poe Says:
	Keeping your autoscaling rules optimized for your specific requirements is an ongoing task that you must plan for.

For information about the logging data that the application block generates, see the topic "Autoscaling Application Block Logging" on MSDN.

Figure 7 shows the data sources you may want to use when you are analyzing the behavior of the Autoscaling Application Block and your application.

Figure 7

Monitoring your autoscaling behavior

The application block provides interfaces that enable you to read its configuration information, the autoscaling rules, and the service information from the stores. You can also access the data points collected by the application block, such as performance counters and queue lengths that it uses when it evaluates the reactive rules. The application block also provides some methods that help you read and parse log messages that it has generated and written to the Azure Diagnostics log table using the system diagnostics logging infrastructure.

For more information about reading from the rules store, see the IRulesStore interface in the API documentation.

For more information about reading from the service information store, see the IServiceInformationStore interface in the API documentation.

For more information about reading from the data points store, see the IDataPointsStore interface in the API documentation.

For more information about reading and parsing the Autoscaling Application Block log messages, see the topic "Reading the Autoscaling Application Block Log Messages."

For a complete example of using the different data sources to visualize the Autoscaling Application Block activities, see the section "Visualizing the Autoscaling Actions" in Chapter 5, "Making Tailspin Surveys More Elastic."

Advanced Usage Scenarios

This section provides guidance about when you should use some of the advanced features of the Autoscaling Application Block.

Scale Groups

In an application with many web and worker roles, you may find it difficult to create and manage the large number of rules you need to define the autoscaling behavior in your application. In this scenario, scale groups provide a convenient way to define rules that can act on multiple roles at once. You should define the scale groups you need before you start creating rules.

Poe Says:
	Scale groups are a convenience. They help to minimize the number of rules you need to create and manage.

To define a scale group, you must identify the roles that will make up the scale group, and assign each role type in the scale group a ratio. The block uses these ratios to calculate the number of instances of each member of the scale group when it performs a scaling action. The following table shows a small example scale group; in practice, scale groups are likely to consist of many more targets.

Target	Ratio
Target A (Worker role A in Service Host A)	2
Target B (Worker role A in Service Host B)	1
Target C (Web role A in Service Host A)	3

Note

A scale group can include targets that refer to roles in different hosted services.

The application block does not use transactions to perform operations on the members of a scale group and scale groups do not guarantee to preserve the ratios between the role instances. For example, a constraint rule may limit the number of instances suggested by a reactive rule for some roles in a scale group, or an operator may manually change the number of instances of one or more roles independently of your autoscaling rules.

A reactive rule can use a scale group as its target. The following table shows the effect of scaling the scale group by an increment of two using the ratios shown in the previous table.

Target	Initial instance count	Instance count after scaling
Target A	4	8
Target B	2	4
Target C	6	12

The result is calculated as follows:

(Initial instance count) + (Increment * Ratio)

The following table shows the effect of a reactive rule scaling the scale group by 50%.

Target	Initial instance count	Instance count after scaling
Target A	4	8
Target B	2	3
Target C	6	15

The result is calculated as follows:

(Initial instance count) + (Increment * Ratio * Initial instance count)
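The two formulas above can be checked with a short Python sketch that reproduces the values in both tables. The function names are hypothetical and the code only mirrors the arithmetic shown here. It does not model how the block rounds fractional results, which this chapter does not describe.

```python
# Ratios and initial instance counts from the example scale group.
RATIOS = {"Target A": 2, "Target B": 1, "Target C": 3}
INITIAL = {"Target A": 4, "Target B": 2, "Target C": 6}


def scale_by_increment(initial, ratios, increment):
    """(Initial instance count) + (Increment * Ratio), per target."""
    return {t: initial[t] + increment * ratios[t] for t in initial}


def scale_by_proportion(initial, ratios, proportion):
    """(Initial instance count) + (Increment * Ratio * Initial instance count)."""
    return {t: initial[t] + proportion * ratios[t] * initial[t] for t in initial}


# Scaling by an increment of two gives 8, 4, and 12.
print(scale_by_increment(INITIAL, RATIOS, 2))

# Scaling by 50% gives 8, 3, and 15 (as floats, because 0.5 is a float).
print(scale_by_proportion(INITIAL, RATIOS, 0.5))
```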

You can also use scale groups when you specify constraint rules. A constraint rule uses the ratios to determine the maximum and minimum values of the role instance counts. An example constraint rule specifies a maximum count of five and a minimum count of two. The following table shows the maximum and minimum instance count values of the individual roles that make up our example scale group.

Target	Minimum instance count	Maximum instance count
Target A	4	10
Target B	2	5
Target C	6	15
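The per-role limits in the table above follow from multiplying the constraint rule's minimum and maximum by each ratio. The following Python sketch shows that calculation, together with a hypothetical clamp helper that illustrates how such a range could limit a count suggested by a reactive rule. Neither function is part of the application block.

```python
# Ratios from the example scale group.
RATIOS = {"Target A": 2, "Target B": 1, "Target C": 3}


def group_ranges(ratios, minimum, maximum):
    """Per-target (min, max) instance counts for a constraint on a scale group."""
    return {t: (minimum * r, maximum * r) for t, r in ratios.items()}


def clamp(value, low, high):
    """Keep a suggested instance count inside the allowed range."""
    return max(low, min(high, value))


ranges = group_ranges(RATIOS, minimum=2, maximum=5)
print(ranges)

# A hypothetical suggestion of 12 instances for Target A is held at its maximum.
low, high = ranges["Target A"]
print(clamp(12, low, high))
```

The clamp step is also one reason why the ratios between members of a scale group are not guaranteed to be preserved: a limit can apply to one member without applying to the others.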

Using Different Ratios at Different Times

You can use multiple scale groups with the same members to apply different ratios at different times. For example, you could define the two scale groups shown in the following tables:

Scale Group A

Target	Ratio
Target A (Worker role A in Service Host A)	2
Target B (Worker role A in Service Host B)	1
Target C (Web role A in Service Host A)	3

Scale Group B

Target	Ratio
Target A (Worker role A in Service Host A)	3
Target B (Worker role A in Service Host B)	1
Target C (Web role A in Service Host A)	1

You could then define two rules, as shown in the following table:

Rule name	Timetable	Rank	Target
Default rule	Always active	1	Scale Group A
Batch processing rule	Sundays between 2:00 and 4:00	20	Scale Group B

Both rules target the same roles, but apply different ratios. The batch processing constraint rule will override the default constraint rule on Sundays between 2:00 and 4:00 in the morning and use the ratios defined for scale group B.

Poe Says:
	Don't make things too complicated by putting a role into too many scale groups. It will make it more difficult for you to understand why the Autoscaling Application Block has set a particular instance count value.

Using Notifications

You may decide that you want to preview any scaling operations suggested by the Autoscaling Application Block before the application block sends them to Azure. This may be useful when you are testing the block and want to double check the scaling operations before they happen, or if you want to use the operator's knowledge of the application to refine your autoscaling rules and "tweak" the scaling actions.

Note

The application block can send notifications and perform scaling actions at the same time, so that operators are notified of the scaling operations that the application block performs.

You can configure the application block to send an email message to one or more operators/administrators. The email message provides full details of all of the scaling operations that the application block suggested based on the current set of autoscaling rules.

Poe Says:
	You can use notifications while you are evaluating or testing the application block. The notifications can tell you what operations the application block would perform given the current set of rules.

For more information about how to configure notifications, see the topic "Using Notifications and Manual Scaling" on MSDN.

Integrating with the Application Lifecycle

When you deploy an application to Azure, you can deploy to either the staging or the production deployment slot. Typically, you deploy to the staging environment first, where you can perform any final tests before promoting the contents of the staging deployment to the production environment.

If you are testing your autoscaling behavior, you will need to have separate service information definitions and rules for each slot or modify the service information definition when you promote from staging to production.

The following code snippet from a service information definition file shows how the roles and scale groups are defined for the different deployment slots.

<?xml version="1.0" encoding="utf-8"?>
<serviceModel ... >
  <subscriptions>
    <subscription name="Autoscaling Sample" ...>
      <services>
        <service dnsPrefix="stagingautoscalingservice" slot="Staging">
          <roles>
            <role alias="Staging.AutoScaling.WebApp"
                  roleName="AutoScaling.WebApp" ... />
          </roles>
        </service>
        <service dnsPrefix="productionautoscalingservice" slot="Production">
          <roles>
            <role alias="Production.AutoScaling.WebApp"
                  roleName="AutoScaling.WebApp" ... />
          </roles>
        </service>
        <service dnsPrefix="stagingscalegroup" slot="Staging">
          <roles>
            <role alias="Staging.Autoscaling.Scalegroup.Billing"
                  roleName="Autoscaling.Scalegroup.Billing" ... />
            <role alias="Staging.Autoscaling.Scalegroup.BillProcessor"
                  roleName="Autoscaling.Scalegroup.BillProcessor" ... />
            <role alias="Staging.Autoscaling.Scalegroup.InvoiceReporting"
                  roleName="Autoscaling.Scalegroup.InvoiceReporting" ... />
          </roles>
        </service>
        <service dnsPrefix="productionscalegroup" slot="Production">
          <roles>
            <role alias="Production.Autoscaling.Scalegroup.Billing"
                  roleName="Autoscaling.Scalegroup.Billing" ... />
            <role alias="Production.Autoscaling.Scalegroup.BillProcessor"
                  roleName="Autoscaling.Scalegroup.BillProcessor" ... />
            <role alias="Production.Autoscaling.Scalegroup.InvoiceReporting"
                  roleName="Autoscaling.Scalegroup.InvoiceReporting" ... />
          </roles>
        </service>
      </services>
      <storageAccounts>
        ...
      </storageAccounts>
    </subscription>
  </subscriptions>

  <scaleGroups>
    <scaleGroup name="StagingScaleGroupA">
      <roles>
        <role roleAlias="Staging.Autoscaling.Scalegroup.Billing" ... />
        <role roleAlias="Staging.Autoscaling.Scalegroup.BillProcessor" ... />
        <role roleAlias="Staging.Autoscaling.Scalegroup.InvoiceReporting" ... />
      </roles>
    </scaleGroup>
    <scaleGroup name="ProductionScaleGroupA">
      <roles>
        <role roleAlias="Production.Autoscaling.Scalegroup.Billing" ... />
        <role roleAlias="Production.Autoscaling.Scalegroup.BillProcessor" ... />
        <role roleAlias="Production.Autoscaling.Scalegroup.InvoiceReporting" ... />
      </roles>
    </scaleGroup>
  </scaleGroups>
</serviceModel>

Poe Says:
	All role aliases and scale group names must be unique within the service information.

Extending the Autoscaling Application Block

In Enterprise Library, pretty much everything is extensible. The Autoscaling Application Block is no exception. It offers five key extension points if you want to extend or modify its functionality.

Ed Says:
	You can also download the source code and make any changes you want. The license permits this.

For more information, see the topic "Extending and Modifying the Autoscaling Application Block" on MSDN.

For more information, see the "Extensibility Hands-on Labs for Microsoft Enterprise Library 5.0."

Custom Actions

If you need to add a new action to the existing scaling and throttling actions, you can create a custom action. There are three steps to creating a custom action.

1. Create code that implements the action.
2. Create code that can deserialize your custom action from the rules store. If you are using the built-in rules store, this will require deserialization from XML.
3. Configure the application block to use the custom action.

For more information about custom actions, see the topic "Creating a Custom Action" on MSDN.

For an example of a custom action, see Chapter 5, "Making Tailspin Surveys More Elastic."

Custom Operands

In a reactive rule, an operand defines an aggregate value calculated from data points that the application block collects. If you need to add a new operand to the existing performance counter and queue length operands, you can create a custom operand. There are three steps to creating a custom operand.

1. Create code that implements a custom data collector.
2. Create code that can deserialize your custom operand from the rules store. If you are using the built-in rules store, this will require deserialization from XML.
3. Configure the application block to use the custom operand.

Beth Says:
	By using custom operands, you can use business metrics in your rule definitions.

For more information about custom operands, see the topic "Creating a Custom Operand" on MSDN.

For an example of a custom operand, see Chapter 5, "Making Tailspin Surveys More Elastic."

Custom Stores

The Autoscaling Application Block uses two stores: one for rules and one for service information. For each of these stores, the application block includes two implementations: storing the data as XML in an Azure blob or storing the data as XML in a file on the local file system. The first is used when you host the application block in an Azure role; the second is used when you host it in an on-premises application.

If you need to be able to manipulate your rules or service information in a different tool, you could replace these implementations with stores that store the data in a different format and in a different location; for example, JSON in a local file or in a SQL Server database.

Jana Says:
	Using SQL Server as a rules store could be useful if your application requires a large number of rules.

For more information about creating a custom rules store, see the topic "Creating a Custom Rules Store" on MSDN.

For more information about creating a custom service information store, see the topic "Creating a Custom Service Information Store" on MSDN.

Custom Logging

The Autoscaling Application Block can use the logger in the System.Diagnostics namespace or the Enterprise Library Logging Application Block to log details of the autoscaling activities it performs.

If you want to use a different logging infrastructure, you can implement a custom logger for the application block. This may be useful if you want to integrate with your existing logging infrastructure to keep all your logs in a single location.

For more information about creating a custom logger, see the topic "Creating a Custom Logger" on MSDN.

Using the WASABiCmdlets

You can use the WASABiCmdlets Windows PowerShell® Cmdlets to perform operations on the Autoscaling Application Block from a Windows PowerShell script. With the WASABiCmdlets, you can enable and disable rules and rule evaluation, modify rule ranks, adjust the behavior of the stabilizer, and more.

In combination with the Windows Azure PowerShell Cmdlets, and System Center Operations Manager (SCOM) or other manageability tools, you can implement a powerful custom autoscaling solution.

For more information about the WASABiCmdlets, see the topic "Using the WASABiCmdlets Windows PowerShell Cmdlets" on MSDN.

For more information about the Windows Azure PowerShell Cmdlets, see "Windows Azure PowerShell Cmdlets."

For more information about SCOM, see "System Center Operations Manager" on TechNet.

Sample Configuration Settings

The Autoscaling Application Block has a large number of configuration settings that you can use to control how it performs autoscaling for your application. This section describes some sample configurations to illustrate how you can configure the application block to address specific requirements. These illustrations are only guidelines: you should analyze your own requirements and the behavior of your Azure application to determine the optimal configuration settings for your application.

These sample configurations refer to the Autoscaling Application Block configuration settings and to the autoscaling rules.

For more information about configuring the Autoscaling Application block, see the topic "Entering Configuration Information" on MSDN.

For more information about writing autoscaling rules, see the topics "Defining Constraint Rules" and "Defining Reactive Rules" on MSDN.

Determining the optimum set of timing values for a solution is usually an iterative process. During those iterations, you should take the timing values shown in the following table into consideration. The table shows the default values for the key configuration settings.

Configuration item	Location	Default value
Instance count collection interval	Hardcoded	Two minutes
Performance counter collection interval	Hardcoded	Two minutes
Queue length collection interval	Hardcoded	Two minutes
Rule evaluation interval	Configuration file	Four minutes
Tracking interval	Configuration file	Five minutes
Rules store monitoring interval: specifies how often the application block checks for changes to the rules	Configuration file	30 seconds
Service information store monitoring interval: specifies how often the application block checks for changes to the service information	Configuration file	30 seconds
Periods specified by constraint rules	Rules store	None
Operand timespan used to calculate aggregate values	Rules store	None
Cool-down periods (scaling up and down) used by the stabilizer	Service information store	20 minutes
Periods at the start and end of the hour when scaling operations do not occur	Service information store	None
Azure related timings such as Azure diagnostics transfer rate and performance counter sampling rates.	Azure configuration or application code	None

Figure 8 illustrates the relationships between some of the timing configuration settings in the previous table.

Figure 8

Timing relationships

The following list explains what is happening at each point on the diagram:

1. Your Azure application captures performance monitoring data. You typically configure these counters either in the OnStart method of your Azure roles, or through Azure configuration. This data is captured in memory.
2. Your Azure application saves the performance monitoring data to the Azure diagnostics tables. You typically configure transfer periods either in the OnStart method of your Azure roles, or through Azure configuration.
3. The Autoscaling Application Block collects performance counter data from the Azure diagnostics tables and saves it in the data points store. This happens every two minutes. This value is hardcoded in the application block.
4. The Autoscaling Application Block collects instance count, queue length, and any custom metrics data from your Azure application and saves it in the data points store. This happens every two minutes. This value is hardcoded in the application block.
5. The rules evaluator runs and identifies the autoscaling rules that apply at the current point in time. The frequency at which the rules evaluator runs is specified in the configuration file. The default value is four minutes.
6. The rules evaluator retrieves the data points that it needs from the data points store. The amount of data for each rule is determined by the time span of the operand associated with the rule. For example, if the operand specifies an average over ten minutes, the rules evaluator retrieves data from the last ten minutes from the data points store.
7. The stabilizer may prevent certain scaling operations from happening. For example, the stabilizer may specify cool-down periods after the application block has performed a scaling operation or limit scaling operations to certain periods in the hour.

Poe Says:
	For Azure infrastructure-related timings, such as Azure diagnostics transfer periods and performance counter sampling rates, you need to determine timings for your application scenario. Don't assume that a one-minute transfer period is best for all scenarios.

The following sections suggest some configuration settings for specific scenarios.

Average Rule Evaluation Period

You have a web application that gets busy at some times that are hard to predict. If you begin to see an increase in the number of rejected web requests over the past five minutes, you want to take action now, in order to ensure that within the next 20 minutes you will have enough resources available to handle the increase in the number of requests. You are willing to accept that some requests may continue to be rejected for the next 20 minutes.

Bharath Says:
	Remember that Azure takes time to start up new role instances, so in this scenario you must expect some requests to be rejected while this is happening.

You also have a predictable usage pattern, so you will also use constraint rules to ensure that you reserve enough instances at the times when you know there will be a higher number of requests.

The following table shows some sample configuration values for this scenario.

Configuration item	Sample value
Rule evaluation interval	Five minutes
Tracking interval	Five minutes
Instance count collection interval	Two minutes (hardcoded)
Performance counter collection interval	Two minutes (hardcoded)
Queue length collection interval	Two minutes (hardcoded)
Cool-down period (both for scaling up and down)	20 minutes
Operand: ASP.NET application restarts	30 minutes
Operand: ASP.NET Requests queued	15 minutes
Operand: ASP.NET requests rejected	Five minutes

Long Rule Evaluation Period

Typically, your application has a very stable and constant level of demand, but it does occasionally encounter moderate increases in usage. Therefore, you decide to evaluate your autoscaling rules every 30 minutes and look for higher than average CPU utilization. You also have a scale-down rule that runs when CPU utilization starts to fall back to its normal level.

The following table shows some sample configuration values for this scenario.

Configuration item	Sample value
Rule evaluation interval	30 minutes
Tracking interval	Five minutes
Instance count collection interval	Two minutes (hardcoded)
Performance counter collection interval	Two minutes (hardcoded)
Queue length collection interval	Two minutes (hardcoded)
Cool-down period (both for scaling up and down)	20 minutes
Operand: CPU Utilization %	30 minutes

Configuring the Stabilizer

The stabilizer performs two functions for the Autoscaling Application Block: it helps to prevent fast oscillations (the "yo-yo effect") in the number of role instances by defining cool-down periods, and it helps to optimize costs by limiting scaling-up operations to the beginning of the hour and scaling-down operations to the end of the hour.

The following snippet from a service information definition shows an example of the stabilizer configuration.

<stabilizer scaleUpCooldown="00:20:00" scaleDownCooldown="00:30:00" 
     scaleUpOnlyInFirstMinutesOfHour="15" scaleDownOnlyInLastMinutesOfHour="10" 
     notificationsCooldown="00:25:00">
     <role roleAlias="BillingWorkerRole" scaleUpCooldown="00:18:00" 
        scaleDownCooldown="00:18:00" />
</stabilizer>

Note

You can configure global stabilizer settings and override them for specific roles.

In this example, the scaleUpCooldown setting prevents the application block from scaling up a role for 20 minutes after any change in the instance count for that role. Similarly, the scaleDownCooldown setting prevents the application block from scaling down a role for 30 minutes after any change in the instance count for that role.

The scaleUpOnlyInFirstMinutesOfHour setting ensures that the application block only performs scale-up operations during the first 15 minutes of the hour, and the scaleDownOnlyInLastMinutesOfHour setting ensures that scale-down operations only happen in the last 10 minutes of the hour. These two settings enable you to optimize the use of your role instances based on the Azure billing mechanism.

Note

At the time of this writing, partial compute instance hours are billed as full compute hours for each clock hour an instance is deployed. For example, if you deploy a Small compute instance at 10:50 and delete the deployment at 11:10, then you will be billed for two Small compute hours, one compute hour for usage during 10:50 to 11:00 and another compute hour for usage during 11:00 to 11:10. Therefore, it makes sense to keep new instances alive for the remainder of the clock hour during which they were started. For more information, see "Usage Charge Details for Azure Bills."
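The clock-hour arithmetic in this note can be sketched in a few lines of Python. The billed_clock_hours function is hypothetical and only mirrors the billing behavior described in the note; it is not a statement of how Azure billing is calculated.

```python
from datetime import datetime, timedelta


def billed_clock_hours(start: datetime, end: datetime) -> int:
    """Count the distinct clock hours touched by the interval [start, end)."""
    if end <= start:
        return 0
    first = start.replace(minute=0, second=0, microsecond=0)
    # Step back one microsecond so that an interval ending exactly on the
    # hour does not count the hour that is just beginning.
    last = (end - timedelta(microseconds=1)).replace(
        minute=0, second=0, microsecond=0)
    return int((last - first) / timedelta(hours=1)) + 1


# Deployed at 10:50 and deleted at 11:10: two clock hours are touched.
print(billed_clock_hours(datetime(2012, 1, 2, 10, 50),
                         datetime(2012, 1, 2, 11, 10)))

# Deployed at 10:05 and deleted at 10:55: only one clock hour is touched.
print(billed_clock_hours(datetime(2012, 1, 2, 10, 5),
                         datetime(2012, 1, 2, 10, 55)))
```

The second call shows why scaling up early in the hour and scaling down late in the hour makes the most of each billed hour.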

The two sets of settings interact with each other. In this example, scale-up operations are only allowed during the first 15 minutes of the hour. If the application block scales up a role at five minutes past the hour, the cool-down period will not allow any additional scale-up operations on that role for another 20 minutes, that is, until 25 minutes past the hour. Because of the scaleUpOnlyInFirstMinutesOfHour setting, this means that the stabilizer will not allow additional scale-up operations on this role within this clock hour.
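The interaction between the two settings can be modeled with a small Python sketch. This is an illustration, not the stabilizer's implementation: the scale_up_allowed function is hypothetical, and it assumes that "the first 15 minutes" means minutes 0 through 14 and that a scale-up is allowed again as soon as the full cool-down period has elapsed.

```python
from datetime import datetime, timedelta
from typing import Optional

SCALE_UP_COOLDOWN = timedelta(minutes=20)   # scaleUpCooldown="00:20:00"
SCALE_UP_FIRST_MINUTES = 15                 # scaleUpOnlyInFirstMinutesOfHour="15"


def scale_up_allowed(now: datetime,
                     last_change: Optional[datetime],
                     cooldown: timedelta = SCALE_UP_COOLDOWN,
                     first_minutes: int = SCALE_UP_FIRST_MINUTES) -> bool:
    """True only if both the hour window and the cool-down permit a scale-up."""
    in_window = now.minute < first_minutes
    cooled_down = last_change is None or now - last_change >= cooldown
    return in_window and cooled_down


# The role was scaled up at five minutes past the hour.
last_change = datetime(2012, 1, 2, 10, 5)

# Check every remaining minute of the same clock hour.
rest_of_hour = [datetime(2012, 1, 2, 10, m) for m in range(6, 60)]
print(any(scale_up_allowed(t, last_change) for t in rest_of_hour))

# At the start of the next hour both conditions are satisfied again.
print(scale_up_allowed(datetime(2012, 1, 2, 11, 0), last_change))
```

Up to 10:24 the cool-down blocks the operation, and from 10:15 onward the hour window blocks it, so no minute in the rest of the hour passes both checks.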

Using the Planning Tool

The planning tool is a worksheet that helps you to understand the interactions between the different timing values that govern the overall autoscaling process. You can download this worksheet from the Enterprise Library Community site on CodePlex.

You can observe how different values can interact with each other by entering the values related to your environment.

Take the example in Figure 9 where Operands 1 and 2 are performance counters and Operand 3 is a custom business metric. You are evaluating the rules every 60 minutes.

Figure 9

Planning sheet inputs

The planning sheet shows the results in Figure 10.

Figure 10

Planning results

This example demonstrates why you must be careful with your Autoscaling Application Block configuration settings. In the first hour, you can see how the timing of the data transfer means that you don't use the last values for Operands 1 and 2. You may decide to change the aggregation interval for the operands, change the log transfer interval, or decide that this behavior is appropriate for the data you collect from your application.

The real value of this tool becomes evident if you have a large number of operands.

Note

The tool works by generating a set of data on a hidden sheet. If you unhide the worksheet, you will observe many #N/A values. These are deliberate and prevent the chart from showing jagged lines.

How the Autoscaling Application Block Works

This section summarizes how the Autoscaling Application Block works. If you're not interested in the internal workings of the block, you can skip this section. Figure 11 illustrates how the key components in the Autoscaling Application Block relate to each other, to Azure, and to external components.

Figure 11

Overview of the Autoscaling Application Block

The Metronome

The work that the Autoscaling Application Block performs starts with the Metronome. The Metronome generates a "tick" count that enables it to run other activities on a regular schedule. In the Autoscaling Application Block, it runs each Data Collector activity every two minutes, the Rule Evaluator activity every t1 seconds, and the Tracker activity every t2 seconds. The default value for t1 is four minutes and for t2 is five minutes, and you can override these values in the application block's configuration settings.
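To show how these intervals interleave, the following Python sketch lists which activities would be due on each one-minute tick when the default values are used. It is a simulation under stated assumptions, not the Metronome's implementation: the SCHEDULE dictionary and the activities_due function are hypothetical, and the sketch assumes that all activities start counting from the same moment.

```python
from datetime import timedelta

# Default intervals described above; t1 and t2 are configurable in the block.
SCHEDULE = {
    "Data Collectors": timedelta(minutes=2),
    "Rule Evaluator": timedelta(minutes=4),   # t1
    "Tracker": timedelta(minutes=5),          # t2
}


def activities_due(elapsed: timedelta, schedule):
    """Names of the activities whose interval evenly divides the elapsed time."""
    return [name for name, interval in schedule.items()
            if elapsed % interval == timedelta(0)]


# Walk through the first 20 minutes, one tick per minute.
for minute in range(1, 21):
    due = activities_due(timedelta(minutes=minute), SCHEDULE)
    if due:
        print(f"{minute:2d} min: {', '.join(due)}")
```

With the default values, all three activities coincide only every 20 minutes, so the Rule Evaluator usually runs against data points that are up to two minutes old.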

The Data Collectors

Before the Autoscaling Application Block can execute any reactive rules, it must collect the data point values from which it calculates the aggregate values used in the reactive rules. A data point is the value of a metric, such as CPU utilization, at a specific point in time. The Data Collector activities retrieve data points from your Azure environment. The following table lists the possible sources of data points.

Monitoring Data Source	Description
Azure Diagnostics tables	These are the tables that Azure Diagnostics uses when it persists diagnostic data collected at run time from the Azure environment. The block retrieves performance counter data from the WADPerformanceCountersTable.
Azure Storage API	The data collector can query your application's Azure storage, including queues, blobs, and tables for custom data points. The block retrieves Azure queue lengths using this API.
Azure Storage Analytics API	The data collector can use this API to obtain data points related to your storage account, such as transaction statistics and capacity data.
Application data	The data collector can retrieve custom data points from your application. For example, the number of orders saved in an Azure table.

The Data Collector activities write the data point values they collect to the Data Points Store.

Although not shown on the diagram, the application block creates the Data Collector activities after reading the Rules Store to determine the data points that the reactive rules will use. In effect, the application block infers which data points it should collect from the rule definitions in the rules store.

The service information store holds the information about your Azure application that the Data Collector activities need to be able to access your roles and storage.

The Service Information Store

The Service Information Store stores the service information for your Azure application. This service information includes all of the information about your Azure application that the block needs to be able to collect data points and perform scaling operations.

The Data Points Store

The Data Collector activities populate the Data Points Store with the data points that they collect. The Rule Evaluator activity queries the Data Points Store for the data points that it needs to evaluate the reactive rules.

By default, the Autoscaling Application Block uses Azure table storage for the Data Points Store.

Ed Says:
	The application block does not support hosting the Data Points Store in the local Azure storage emulator. The application block uses an Azure API call that is not supported by the local storage emulator.

The Rule Evaluator

In addition to running the Data Collector activities, the Metronome also periodically runs the Rule Evaluator activity. When the Rule Evaluator activity runs, it queries the Rules Store to discover which autoscaling rules it should apply at the current time. The Rules Store caches the rules in memory, but checks at a configurable interval whether the rules have changed and, if so, reloads them from the backing store. The Rule Evaluator then queries the data points in the Data Points Store to calculate the aggregate values that it needs to evaluate the reactive rules. It also reconciles any conflicts between the rules before it executes the actions triggered by the rules.
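To make the step from data points to aggregate values concrete, the following Python sketch averages the data points that fall inside a time window and compares the result with a threshold. The function names, the 30-minute window, and the 80 percent threshold are assumptions chosen for the example, and the sketch leaves out rule ranks and conflict reconciliation entirely.

```python
from datetime import datetime, timedelta
from statistics import mean


def aggregate(points, now, window, func=mean):
    """Aggregate the values of (timestamp, value) pairs within [now - window, now]."""
    values = [value for (timestamp, value) in points if now - window <= timestamp <= now]
    # With no data in the window there is nothing to aggregate.
    return func(values) if values else None


def rule_fires(points, now, window, threshold):
    """A hypothetical reactive condition: average over the window exceeds the threshold."""
    average = aggregate(points, now, window)
    return average is not None and average > threshold


now = datetime(2012, 6, 7, 12, 0)
cpu_points = [
    (now - timedelta(minutes=50), 20.0),  # outside the 30-minute window
    (now - timedelta(minutes=20), 85.0),
    (now - timedelta(minutes=10), 90.0),
    (now, 95.0),
]

average_cpu = aggregate(cpu_points, now, timedelta(minutes=30))
fires = rule_fires(cpu_points, now, timedelta(minutes=30), threshold=80.0)
```

Only the three samples inside the window contribute to the average, so the older low reading does not hold the aggregate down. Returning no value when the window is empty lets the caller treat missing data as "do not fire" instead of as zero.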

For more information about how the application block reconciles conflicting rules and actions, see the topic "Understanding Rule Ranks and Reconciliation."

The Rules Store

The Rules Store holds a list of all of the autoscaling rules that you have defined for your Azure application. As a reminder, these rules can be constraint rules or reactive rules.

By default, the Autoscaling Application Block uses Azure table storage for the Rules Store.
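The Rules Store caching behavior described in the Rule Evaluator section (keep the rules in memory, and check at a configurable interval whether they have changed) might look something like the following Python sketch. The class name, the callables passed to it, and the use of a version value to detect changes are assumptions for this example and do not describe the block's actual implementation.

```python
class CachedRulesStore:
    """Illustrative cache: reload rules only when a periodic check sees a change."""

    def __init__(self, load_rules, get_version, check_period_seconds):
        self._load_rules = load_rules      # hypothetical: reads rules from the backing store
        self._get_version = get_version    # hypothetical: cheap change marker for the rules
        self._check_period = check_period_seconds
        self._rules = None
        self._version = None
        self._last_check = None

    def get_rules(self, now_seconds):
        due = (self._last_check is None
               or now_seconds - self._last_check >= self._check_period)
        if due:
            self._last_check = now_seconds
            version = self._get_version()
            if self._rules is None or version != self._version:
                self._rules = self._load_rules()
                self._version = version
        return self._rules


backing = {"version": 1, "rules": ["rule-a"]}
loads = []  # records each reload from the backing store


def load():
    loads.append(backing["version"])
    return list(backing["rules"])


rules_store = CachedRulesStore(load, lambda: backing["version"], check_period_seconds=30)
first = rules_store.get_rules(0)     # first call always loads
backing.update(version=2, rules=["rule-a", "rule-b"])
second = rules_store.get_rules(10)   # check not yet due: cached rules are returned
third = rules_store.get_rules(40)    # check due and version changed: rules are reloaded
```

The trade-off in a design like this is that a rule change can take up to one check period to become visible, in exchange for fewer reads of the backing store.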

A rule can trigger one or more actions. The following table describes three types of actions that the Autoscaling Application Block supports.

Action type	Description
Scale action	Performs instance autoscaling or sends a notification to an operator.
Throttling action	Performs application throttling.
Custom action	Performs a custom, user-defined action.

The Autoscaling Application Block can also propose scaling actions to an operator via a notification mechanism.

The Logger

The Logger component optionally uses the Enterprise Library Logging Application Block to save diagnostics information from the Autoscaling Application Block. You can also configure the Logger component to use other logging components such as the System.Diagnostics namespace.

For more information about the logging information that the application block generates, see the topic "Autoscaling Application Block Logging" on MSDN.

For more information about configuring the logger, see the topic "Entering Configuration Information" on MSDN.

The Scaler

The Scaler is responsible for communicating with Azure to add or remove role instances based on the rule actions. It also incorporates a stabilizer component to prevent the Scaler from repeatedly adding and removing role instances.
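One common way to stop a scaler from oscillating is a cooldown period after each scaling operation. The following Python sketch shows that general technique; the class name, method names, and the 20-minute cooldown are hypothetical, and the sketch is not a description of how the block's stabilizer is implemented or configured.

```python
from datetime import datetime, timedelta


class Stabilizer:
    """Illustrative cooldown: refuse to scale a role again until the cooldown has passed."""

    def __init__(self, cooldown):
        self._cooldown = cooldown
        self._last_scaled = {}  # role name -> time of the last scaling operation

    def allows(self, role, now):
        last = self._last_scaled.get(role)
        return last is None or now - last >= self._cooldown

    def record(self, role, now):
        self._last_scaled[role] = now


stabilizer = Stabilizer(cooldown=timedelta(minutes=20))
start = datetime(2012, 6, 7, 12, 0)
decisions = []

# Three scaling requests for the same hypothetical role at 0, 5, and 25 minutes.
for minutes in (0, 5, 25):
    now = start + timedelta(minutes=minutes)
    if stabilizer.allows("RoleA", now):
        stabilizer.record("RoleA", now)
        decisions.append((minutes, "scale"))
    else:
        decisions.append((minutes, "skip"))
```

In this sketch, the request at 5 minutes is skipped because it falls inside the cooldown that started at 0 minutes, while the request at 25 minutes goes ahead.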

Scaling operations may take some time to complete. The Scaler initiates scaling operations and adds a message to the tracking queue to record the fact that the application block has requested a scaling operation.

The Scaler can send notification email messages to an operator detailing proposed scaling actions instead of performing the actions directly.

The Tracker

The Tracker activity tracks all the scaling operations initiated by the Scaler. By default, the Metronome runs the Tracker activity every minute. Each time it runs, the Tracker activity checks which of the scaling operations in the tracking queue have completed successfully or failed. It logs the details of each completed scaling operation, including any error information if the operation failed, and then removes the corresponding entry from the queue.
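A single pass of a tracker of this kind can be sketched as follows. The function name, the status strings, and the operation identifiers are assumptions made for the example; in the block, the tracking queue and the status checks go through Azure rather than through in-memory objects.

```python
from collections import deque


def run_tracker(tracking_queue, get_status, log):
    """One pass: log and remove finished operations, keep pending ones in the queue."""
    still_pending = deque()
    while tracking_queue:
        operation_id = tracking_queue.popleft()
        status, error = get_status(operation_id)
        if status == "InProgress":
            still_pending.append(operation_id)   # check again on the next pass
        elif status == "Succeeded":
            log(f"{operation_id}: succeeded")
        else:
            log(f"{operation_id}: failed ({error})")
    tracking_queue.extend(still_pending)


# Hypothetical operation statuses, keyed by operation identifier.
statuses = {
    "op-1": ("Succeeded", None),
    "op-2": ("InProgress", None),
    "op-3": ("Failed", "quota exceeded"),
}
queue = deque(["op-1", "op-2", "op-3"])
messages = []

run_tracker(queue, statuses.__getitem__, messages.append)
```

After this pass, only the operation that is still in progress remains in the queue, and the log holds one entry for each operation that finished.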

More Information

For more information about design requirements, see "Building a Scalable, Multi-Tenant Application for Azure" on MSDN:
https://msdn.microsoft.com/en-us/library/ff966483.aspx

For more information about compute hours in Azure, see "Usage Charge Details for Azure Bills":
https://go.microsoft.com/fwlink/?LinkID=234626

For more information about Azure subscriptions, see the Azure Support page:
https://www.microsoft.com/windowsazure/support/

For more information about how the Autoscaling Application Block reconciles conflicting rules, see "Understanding Rule Ranks and Reconciliation" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680923(v=PandP.50).aspx

For a more detailed discussion of how you can estimate your Azure running costs, see the chapter "How Much Will It Cost?" in the book "Moving Applications to the Cloud":
https://msdn.microsoft.com/en-us/library/ff803375.aspx

For a discussion of session state in Azure applications, see "Storing Session State" in the book "Moving Applications to the Cloud":
https://msdn.microsoft.com/en-us/library/ff803373.aspx#sec11

For a discussion of some of the design issues associated with scalable worker roles, see "Scaling Applications by Using Worker Roles" in the book "Developing Applications for the Cloud":
https://msdn.microsoft.com/en-us/library/hh534484.aspx#sec14

For information about how you can use NuGet to prepare your Visual Studio project to work with the Autoscaling Application Block, see the topic "Adding the Autoscaling Application Block to a Host" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680920(v=PandP.50).aspx

For information about how to host the Autoscaling Application Block in Azure, see the topic "Hosting the Autoscaling Application Block in a Worker Role" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680914(v=PandP.50).aspx

For information about how to reference the Enterprise Library assemblies, how Enterprise Library handles dependencies, and how to work with Enterprise Library objects, see "Using Enterprise Library in Applications" in the main Enterprise Library documentation on MSDN:
https://msdn.microsoft.com/en-us/library/ff664560(PandP.50).aspx

If you choose to host the Autoscaling Application Block in Azure, and plan to scale the role instance that hosts the block for added reliability, you must make sure that you configure the application block to use a blob execution lease in the advanced configuration settings. For information about how to make this configuration setting, see the topic "Entering Configuration Information" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680915(v=PandP.50).aspx

For information about how to host the Autoscaling Application Block in an on-premises application, see the topic "Hosting the Autoscaling Application Block in an On-Premises Application" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680882(v=PandP.50).aspx

For more information about the code changes you must make in your Azure application to enable it to save performance counter data, see the topic "Collecting Performance Counter Data" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680886(v=PandP.50).aspx

The Autoscaling Application Block rules can only operate on targets (roles and scale groups) that are identified in the block's service information. For more information, see the topic "Storing Your Service Information Data" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680878(v=PandP.50).aspx

For information about a technique for parallelizing large calculations across multiple role instances, see the section "The Map Reduce Algorithm" in the book "Developing Applications for the Cloud":
https://msdn.microsoft.com/en-us/library/ff966483.aspx#sec18

In Azure, you can use the session state provider that stores session state in the shared cache. For more information, see the page "Session State Provider" on MSDN:
https://msdn.microsoft.com/en-us/library/gg185668.aspx

For more information about how your Azure application can detect a request for throttling behavior, see the topic "Implementing Throttling Behavior" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680896(v=PandP.50).aspx

For information about how to define the throttling autoscaling rules, see the topic "Defining Throttling Autoscaling Rules" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680908(v=PandP.50).aspx

For a complete example of how the Tailspin Surveys application uses throttling behavior, see Chapter 5, "Making Tailspin Surveys More Elastic" in this guide.

For information about the logging data that the Autoscaling Application Block generates, see the topic "Autoscaling Application Block Logging" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680883(v=PandP.50).aspx

For more information about reading and parsing the Autoscaling Application Block log messages, see the topic "Reading the Autoscaling Application Block Log Messages":
https://msdn.microsoft.com/en-us/library/hh680909(v=PandP.50).aspx

For more information about configuring the Autoscaling Application Block and configuring the logger, see the topic "Entering Configuration Information" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680915(v=PandP.50).aspx

For more information about reading from the rules store, see the "IRulesStore interface" in the API documentation on MSDN:
https://go.microsoft.com/fwlink/?LinkID=234680

For more information about reading from the service information store, see the "IServiceInformationStore interface" in the API documentation on MSDN:
https://go.microsoft.com/fwlink/?LinkID=234681

For more information about reading from the data points store, see the "IDataPointsStore interface" in the API documentation on MSDN:
https://go.microsoft.com/fwlink/?LinkID=234682

For more information about how to configure notifications, see the topic "Using Notifications and Manual Scaling" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680885(v=PandP.50).aspx

For more information about extending and modifying the Autoscaling Application Block, see the topic "Extending and Modifying the Autoscaling Application Block" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680889(v=PandP.50).aspx

For more information about custom actions, see the topic "Creating a Custom Action" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680921(v=PandP.50).aspx

For an example of a custom action and a custom operand, see Chapter 5, "Making Tailspin Surveys More Elastic."

For more information about custom operands, see the topic "Creating a Custom Operand" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680912(v=PandP.50).aspx

For more information about creating a custom rules store, see the topic "Creating a Custom Rules Store" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680933(v=PandP.50).aspx

For more information about creating a custom service information store, see the topic "Creating a Custom Service Information Store" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680884(v=PandP.50).aspx

For more information about creating a custom logger, see the topic "Creating a Custom Logger" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680926(v=PandP.50).aspx

For more information about extending the Enterprise Library, see the "Extensibility Hands-on Labs for Microsoft Enterprise Library 5.0":
https://go.microsoft.com/fwlink/?LinkId=209184

For more information about the WASABiCmdlets, see the topic "Using the WASABiCmdlets Windows PowerShell Cmdlets" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680938(v=PandP.50).aspx

For more information about the Windows Azure PowerShell Cmdlets, see "Windows Azure PowerShell Cmdlets":
https://msdn.microsoft.com/en-us/library/azure/jj156055.aspx

For more information about SCOM, see "System Center Operations Manager" on TechNet:
https://technet.microsoft.com/en-us/systemcenter/om/default.aspx

For more information about writing autoscaling rules, see the topics "Defining Constraint Rules" and "Defining Reactive Rules" on MSDN:
https://msdn.microsoft.com/en-us/library/hh680917(v=PandP.50).aspx
https://msdn.microsoft.com/en-us/library/hh680897(v=PandP.50).aspx

For more information about billing details in Azure, see "Usage Charge Details for Azure Bills":
https://go.microsoft.com/fwlink/?LinkID=234626

The Autoscale Planner worksheet helps you to understand the interactions between different timing values that govern the overall autoscaling process. You can download this worksheet from the Enterprise Library Community site on CodePlex:
https://go.microsoft.com/fwlink/?LinkID=234704

Last built: June 7, 2012
