How Sage Reduces Windows Azure Hosting Costs Using Autoscaling
Sage is a leading global supplier of business management software and services for small and midsized businesses, supporting more than 6 million customers worldwide.
Sage recently launched a new product, Sage Construction Anywhere, hosted in Windows Azure. The launch marks a step in Sage's plan to embrace the benefits of cloud computing such as reliability, scalability, and availability. Sage Construction Anywhere is a cloud-hosted, multi-tenant add-on to Sage's market-leading, on-premises ERP product, "Sage 300 Construction and Real Estate" (Sage 300). It uses the Autoscaling Application Block (which you may have heard referred to as Wasabi) from the Microsoft patterns & practices group to automatically scale Windows Azure roles based both on anticipated levels of demand and on variations in current levels of traffic. The goal of this project was to launch a new product running in Windows Azure, and to achieve a number of significant benefits for users of the Sage 300 product:
- Increase data mobility. To make data that is currently stored in back office systems more readily available to users who are working on-site.
- Expand access to data. To make reports from back office systems available to a wider collection of stakeholders such as clients and subcontractors who are not themselves users of the on-premises Sage 300 product.
- Reduce server costs. In the longer term, as Sage adds more functionality to the cloud-hosted application, Sage's customers will be able to reduce their server requirements, using fewer server resources and therefore lowering their on-premises running costs.
"Customers today, especially on construction sites, are looking for ways to get access to the data that's locked in their back office," says Chad Busche, Principal Software Architect at Sage.
The team working on the project had some additional requirements:
- Control costs. To minimize the costs associated with hosting the product online without compromising the performance of the application.
- Gain cloud experience. To learn more about using the cloud and to understand how best to manage the trade-offs between costs and performance in a cloud environment.
This case study describes how the Autoscaling Application Block helped Sage to manage the costs of running Sage Construction Anywhere in the cloud.
Sage Construction Anywhere
Version 1 of Sage Construction Anywhere provides access to the reports that customers need to run their day-to-day operations on construction sites. Before Sage Construction Anywhere, users at a construction site who needed such information from Sage 300 would typically telephone someone at their head office and ask for a particular report. The person at head office would then generate the report from Sage 300 and email it to the user at the construction site. This inefficient, manual process is also subject to error if the user at the construction site doesn't clearly specify the reports they need.
Sage Construction Anywhere enables users at a construction site to specify the information they need on a website hosted in Windows Azure. The website adds this request to a queue for processing. The customer's on-premises Sage 300 installation regularly checks the queue in the cloud application for outstanding requests, automatically downloads the report request, generates the report, and then uploads the report to the cloud application where it becomes available to the user who originally requested it. Users can access the reports from mobile phones, tablets, and laptops. This approach streamlines the workflow.
The Role of the Autoscaling Application Block
"Although Windows Azure enables elasticity, without the Autoscaling Application Block there would have been only three options for dynamically scaling the Sage Construction Anywhere Windows Azure roles: manually adding and removing role instances, handing the Management API key to a third-party scaling service, or writing the scaling infrastructure from scratch," says Dr. Grigori Melnik, Sr. Program Manager, Microsoft patterns & practices group.
The Autoscaling Application Block is designed to enable automatic scaling behavior in Windows Azure applications and to make it easy to benefit from the elastic supply of resources in Windows Azure: in this case, the Windows Azure role instances hosting the Sage Construction Anywhere application. The Autoscaling Application Block can scale Windows Azure applications either in or out based on two different sets of rules: constraint rules that set limits on the maximum and minimum number of role instances, and reactive rules that dynamically add or remove role instances based on metrics that the Autoscaling Application Block collects from the running role instances. Through these configurable rules, the block can balance specific application performance requirements against running costs by dynamically controlling the number of running role instances.
Rules may be modified while a Windows Azure application is running in order to fine-tune the autoscaling behavior and make adjustments in response to changes in access patterns. The Autoscaling Application Block generates comprehensive diagnostic information that helps with the analysis of its behavior and with further optimization of the rule set.
"Applications that are designed to dynamically grow and shrink their resource use in response to actual and anticipated demand are not only less expensive to operate, but are significantly more efficient with their use of IT resources than traditional applications," says Mark Aggar, Senior Director, Environmental Sustainability at Microsoft.
Adding the Autoscaling Application Block to Sage Construction Anywhere
The team at Sage already had a version of Sage Construction Anywhere running in Windows Azure before they integrated the Autoscaling Application Block. These are the key steps they took to add the block:
- Modified their existing web and worker roles to generate the performance counter data that they planned to use in the reactive rules.
- Generated a certificate to enable the Autoscaling Application Block to securely access the Sage Construction Anywhere application roles using the Windows Azure Service Management API.
- Created an extra small worker role instance to host the Autoscaling Application Block itself.
- Created an initial set of constraint and reactive rules for the application.
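The certificate and the role names from these steps come together in the block's service information store, which tells the Autoscaling Application Block which subscription, certificate, and roles to manage. The following is only a hedged sketch of what such a configuration might look like: the subscription ID, thumbprint, DNS prefix, and account names are placeholders, and the exact element and attribute names should be checked against the block's documentation.

```xml
<serviceModel xmlns="http://schemas.microsoft.com/practices/2011/entlib/autoscaling/serviceModel">
  <subscriptions>
    <!-- Subscription ID, thumbprint, and store details below are placeholders -->
    <subscription name="SageSubscription"
                  subscriptionId="00000000-0000-0000-0000-000000000000"
                  certificateThumbprint="..."
                  certificateStoreName="My"
                  certificateStoreLocation="CurrentUser">
      <services>
        <!-- dnsPrefix and role names are illustrative assumptions -->
        <service dnsPrefix="scahostedservice" slot="Production">
          <roles>
            <role alias="SCA.Role.WebSite" roleName="WebSite"
                  wadStorageAccountName="scastorage"/>
          </roles>
        </service>
      </services>
      <storageAccounts>
        <storageAccount alias="scastorage" connectionString="..."/>
      </storageAccounts>
    </subscription>
  </subscriptions>
</serviceModel>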
"Integrating the block and getting it up and running took a couple of hours. Creating our first rule set was straightforward, taking about 20 or 30 minutes," says Chad Busche, Principal Software Architect at Sage.
The application did not require any changes to be able to scale down safely because the web roles were designed to be "web-farm friendly": the web roles are stateless and do not assume any node affinity, which means that each user request can be handled by any of the active web role instances. Because the Autoscaling Application Block uses the Windows Azure Service Management API when it stops role instances, those instances go through the standard shutdown sequence, allowing any in-progress work to complete before the instance terminates. The team at Sage did not need to make any changes to the application to support either the scale-up or the scale-down operations performed by the Autoscaling Application Block.
For further information about designing stateless applications for Windows Azure, see the MSDN magazine article Patterns For High Availability, Scalability, And Computing Power With Windows Azure and the section Session Data Storage in the book Developing Applications for the Cloud available on MSDN.
The Autoscaling Application Block rules and their effects
Sage Construction Anywhere uses three different role types: one web role type handles the UI, another web role type hosts a web service that handles communications between Sage 300 running in the customer's back office and the cloud, and a worker role type manages the heavy lifting in the application.
When the system initially went live, the number of customers started out low, and the team configured four small instances of each role type to run around the clock. "We could have easily over-provisioned and in fact that's how we started; paying a lot of money to have machines provisioned but just sitting there doing nothing a lot of the time," says Chad Busche, Principal Software Architect at Sage.
When the team at Sage added the Autoscaling Application Block, they used constraint rules to set the minimum number of instances for each role type to two, the maximum number of instances for the web role to ten, and the maximum number of instances for the remaining role types to six as shown below:
<rule name="SCA.WebSite.Rule.Constraint" enabled="true" rank="1">
  <actions>
    <range min="2" max="10" target="SCA.Role.WebSite"/>
  </actions>
</rule>
<rule name="SCA.ConnectorWebService.Rule.Constraint" enabled="true" rank="1">
  <actions>
    <range min="2" max="6" target="SCA.Role.ConnectorWebService"/>
  </actions>
</rule>
<rule name="SCA.Worker.Rule.ConstraintRule" enabled="true" rank="1">
  <actions>
    <range min="2" max="6" target="SCA.Role.Worker"/>
  </actions>
</rule>
A minimum of two instances of each role type remains running at all times because the Windows Azure SLA applies only when at least two instances of a role type are running. Sage set the maximum values to cap the number of role instances that the Autoscaling Application Block can start, and therefore to control the potential operating costs of the application by ensuring that the block cannot continue to add role instances indefinitely.
To handle spikes in demand, the team at Sage originally had someone monitor the application during business hours to add additional role instances reactively and then remove those instances when traffic fell back to normal levels. This approach is tedious, inefficient, and error-prone.
The Autoscaling Application Block enabled them to automate this process by using reactive rules based on CPU usage monitoring. As CPU usage rises, additional instances are added; as CPU usage falls, those instances are removed. The number of instances can never exceed the maximum value, or fall below the minimum value, specified in the active constraint rule.
The reactive rules are paired: one rule scales a role out when average CPU usage across the running role instances exceeds a certain threshold; the other rule scales the role in when average CPU usage across the running role instances falls below a certain threshold. For example, the reactive rules for the web role add two new instances if average CPU usage rises to 75% or above, and remove a single instance when average CPU usage falls below 40%. Notice in the following examples for the Sage Construction Anywhere web role how the scale-up rule is more aggressive, adding two instances, than the scale-down rule, which removes only one:
Scale-up rule
<rule name="SCA.WebSite.Rule.ScaleUp" rank="10" enabled="true">
  <when>
    <any>
      <greaterOrEqual operand="SCA_WebSite_Counter_CPU" than="75"/>
    </any>
  </when>
  <actions>
    <scale target="SCA.Role.WebSite" by="2"/>
  </actions>
</rule>
Scale-down rule
<rule name="SCA.WebSite.Rule.ScaleDown" rank="10" enabled="true">
  <when>
    <all>
      <less operand="SCA_WebSite_Counter_CPU" than="40"/>
    </all>
  </when>
  <actions>
    <scale target="SCA.Role.WebSite" by="-1"/>
  </actions>
</rule>
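Both rules reference the operand SCA_WebSite_Counter_CPU, which is defined separately in the rules store. The definition is not shown in this case study; the following is a hedged sketch of what it might look like, where the performance counter name, source role alias, and ten-minute aggregation window are illustrative assumptions rather than Sage's actual settings.

```xml
<operands>
  <!-- Average CPU usage across the web role instances, aggregated over a
       ten-minute window (counter name, source, and timespan are illustrative) -->
  <performanceCounter alias="SCA_WebSite_Counter_CPU"
                      performanceCounterName="\Processor(_Total)\% Processor Time"
                      source="SCA.Role.WebSite"
                      timespan="00:10:00"
                      aggregate="Average"/>
</operands>
```

Defining the operand with an aggregate over a window, rather than an instantaneous reading, helps prevent the rules from reacting to short-lived CPU spikes.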
To test the way that the Autoscaling Application Block responded to spikes in demand, the team at Sage used Visual Studio Load Test to simulate different loads on the application. They used this information to create the initial set of reactive rules for the application.
The rules shown here are the initial set of rules created to support a limited number of early adopter customers. The rules will be revised to work with higher traffic volumes as the number of active customers grows.
The cost estimates below are based on a comparison of the operating costs of Sage Construction Anywhere before the team at Sage added the Autoscaling Application Block with the operating costs after the block was added. Both estimates are based on the application supporting a limited number of early adopter customers.
Before adding the block, Sage configured four instances of each of the three role types to run 24 hours per day. This level of provisioning was designed to accommodate any possible spikes in demand.
After adding the block, the constraint rules kept the number of instances of each role type at two, except when the reactive rules responded to spikes in demand and temporarily added more instances. In practice, with the number of users the system had, the reactive rules were triggered very infrequently. The following calculations ignore the impact of the reactive rules on the number of instance hours; the cost estimate after adding the block is therefore a slight under-estimate.
|  | Before adding autoscaling | After adding autoscaling* |
| --- | --- | --- |
| Total role instances | 12 | 6 |
| Instance hours per month | 8,640 | 4,320 |
| Cost per month ($0.12/small instance hour**) | $1,036.80 | $518.40 |
| Instance hours per year | 103,680 | 51,840 |
| Cost per year ($0.12/small instance hour**) | $12,441.60 | $6,220.80 |
* These are under-estimates that don't take into account the impact of the reactive rules.
** For current Windows Azure pricing, see http://www.windowsazure.com/en-us/pricing/details/
The bottom line is that using the Autoscaling Application Block cut the cost of the role instances by at least half (from twelve always-on instances to six for most of the time), while still meeting the application's SLA.
The impact of this set of autoscaling rules is to reduce the operating costs of the application by enabling Sage to use two instances of each role type for most of the time, while being confident that the system will scale out automatically to handle any unexpected spikes in usage and then scale down as soon as the spike is over.
This initial implementation of autoscaling for the Sage Construction Anywhere application provided the team at Sage with many pointers that will be useful as they expand the functionality of their cloud-hosted applications. These lessons include:
- Monitor traffic: It is important to monitor website traffic in order to analyze it and identify any usage patterns. Constraint rules can then pre-emptively add or remove instances based on those patterns. The team looked at historical data and was able to clearly identify the usage patterns of the application.
- Fine-tune the rules: Revisit the rules regularly to ensure that they are performing optimally. Usage patterns and volumes on websites are expected to change over time and the team at Sage will adjust their autoscaling rules accordingly.
- Use load tests: The team at Sage was able to test the autoscaling behavior of the reactive rules by simulating loads on the application.
- Use a combination of constraint and reactive rules: Sage uses constraint rules to set upper and lower limits on the number of role instances. Sage uses reactive rules to enable the Sage Construction Anywhere application to respond to unanticipated spikes in demand and to scale the application back once the spike is over.
- Design applications to be stateless: Applications that are stateless and don't rely on node-affinity can be safely scaled back by the Autoscaling Application Block.
This initial set of rules was designed to accommodate the number of early adopter users of the Sage Construction Anywhere application. As the number of users increases, the team at Sage plans to make the following changes to their autoscaling rules.
The majority of users access the Sage Construction Anywhere application during US working hours (6 AM East Coast time to 6 PM West Coast time). As the number of users increases, the team at Sage plans to add constraint rules with a timetable that raises the minimum number of role instances during these working hours, while keeping the minimum at two outside them. These new constraint rules will have a rank greater than one to ensure that they override the existing constraint rules; the existing constraint rules will continue to provide default maximum and minimum values for role instance counts.
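Such a timetable-based rule might look something like the following sketch. The rule name, instance counts, and timetable attributes here are illustrative assumptions (the exact schema should be checked against the block's documentation); the higher rank is what lets the rule override the rank-1 default during the recurrence window.

```xml
<!-- Raise the floor to four web role instances during US working hours
     (6 AM Eastern to 6 PM Pacific is a 15-hour window), Monday through Friday.
     Names, counts, and timetable attributes are illustrative assumptions. -->
<rule name="SCA.WebSite.Rule.PeakHours" enabled="true" rank="2">
  <timetable startTime="06:00:00" duration="15:00:00" utcOffset="-05:00">
    <weekly days="Monday Tuesday Wednesday Thursday Friday"/>
  </timetable>
  <actions>
    <range min="4" max="10" target="SCA.Role.WebSite"/>
  </actions>
</rule>
```

Outside the timetable window this rule is inactive, so the original rank-1 constraint rule with its minimum of two instances applies again.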
Sage also plans to investigate whether the reactive rules would better detect load in the application by monitoring the length of the application's pending jobs queue rather than the average CPU usage of the role instances.
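If Sage adopts this approach, the CPU operand in the reactive rules would be replaced by a queue-length operand. The following is a hedged sketch of what that might look like; the queue name, aggregation window, threshold, and rule names are illustrative assumptions, not settings from the case study.

```xml
<!-- Average depth of the pending-jobs queue over a five-minute window
     (queue name, window, and threshold are illustrative assumptions) -->
<queueLength alias="SCA_PendingJobsQueue" queue="pendingjobs"
             timespan="00:05:00" aggregate="Average"/>

<!-- Add a worker role instance when the backlog of queued jobs grows -->
<rule name="SCA.Worker.Rule.ScaleUpOnQueue" rank="10" enabled="true">
  <when>
    <any>
      <greaterOrEqual operand="SCA_PendingJobsQueue" than="20"/>
    </any>
  </when>
  <actions>
    <scale target="SCA.Role.Worker" by="1"/>
  </actions>
</rule>
```

Queue length can be a more direct measure of backlog than CPU usage for a worker role whose job is to drain a queue, because it reflects work that is waiting rather than work that is already in progress.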
In the longer term, as Sage adds more features to Sage Construction Anywhere, it expects application usage patterns to change; this will require changes to the constraint rules and their associated timetables.
Additionally, this first version of Sage Construction Anywhere does not provide any scope for customers to reduce their server requirements, because all of the report generation is still performed on premises. However, subsequent versions of the product will make some functionality that is currently provided on-premises by Sage 300, such as project management, available in the cloud. This will enable Sage's customers to realize cost savings as they require fewer server resources in their datacenters.
Sage is using the Autoscaling Application Block to proactively manage the number of role instances used by the Sage Construction Anywhere application in Windows Azure.
"We found it very easy to add the Autoscaling Application Block to Sage Construction Anywhere — we didn't need to make any significant changes to our application. Using it means that we will be saving money by only provisioning the resources that we actually need," says Chad Busche, Principal Software Architect at Sage.
Based on the impact of the constraint rules in force and given its initial number of users, Sage will potentially save around $6,000 per year in Windows Azure operating costs (based on the $0.12 cost per hour of a small role instance). These savings will rise as the number of users of the application increases and Sage adds more role instances during peak hours to support them.
The Autoscaling Application Block is already actively scaling the Sage Construction Anywhere application, ensuring that there is adequate capacity to meet varying levels of demand around the clock, and making sure that spare capacity is released as soon as demand falls. In doing this, the Autoscaling Application Block is managing the trade-off between the need for capacity to handle spikes in demand and the desire to reduce the running costs of the application.