Building a Scalable, Multi-Tenant Application for Windows Azure
This chapter examines architectural and implementation issues in the Surveys application from the perspective of building a multi-tenant application. Questions such as how to partition the application and how to bill customers for their usage are directly relevant to a multi-tenant architecture. Questions such as how to make the application scalable and how to handle the on-boarding process for new subscribers are relevant to both single-tenant and multi-tenant architectures, but they involve some special considerations in a multi-tenant model.
This chapter describes how Tailspin resolved these questions for the Surveys application. For other applications, different choices may be appropriate.
Partitioning the Application
Chapter 5, "Data Storage in the Surveys Application," describes how the Surveys application data model partitions the data by subscriber. This section describes how the Surveys application uses MVC routing tables and areas to make sure that a subscriber sees only his or her own data.
The Solution
The developers at Tailspin decided to use the path in the application's URL to indicate which subscriber is accessing the application. On the Subscriber website, users must authenticate before they can access the application; on the public Surveys website, the application doesn't require authentication.
The following are three sample paths on the Subscriber website:
- /survey/adatum/newsurvey
- /survey/adatum/newquestion
- /survey/adatum
The following are two example paths on the public Surveys website:
- /survey/adatum/launch-event-feedback
- /survey/adatum/launch-event-feedback/thankyou
The application uses the first element in the path to indicate the different areas of functionality within the application. All the preceding examples relate to surveys, but other areas relate to on-boarding and security. The second element indicates the subscriber name, in these examples "Adatum," and the last element indicates the action that is being performed, such as creating a new survey or adding a question to a survey.
You should take care when you design the path structure for your application that there is no possibility of name clashes that result from a value entered by a subscriber. In the Surveys application, if a subscriber creates a survey named "newsurvey," the path to this survey is the same as the path to the page subscribers use to create new surveys. However, the application hosts surveys on an HTTP endpoint and the page to create surveys on an HTTPS endpoint, so there is no name clash in this particular case.
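For an application that serves both kinds of page from the same endpoint, one way to rule out such clashes is to reject subscriber-supplied names that match a fixed route segment. The following is a minimal sketch under that assumption; the ReservedSlugs class and its list of reserved names are hypothetical and are not part of the Surveys application.

```csharp
using System;
using System.Collections.Generic;

public static class ReservedSlugs
{
    // Assumed list: the fixed path segments that appear in the
    // Subscriber website routes shown in this chapter.
    private static readonly HashSet<string> Reserved =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "newsurvey",
            "newquestion"
        };

    // Returns true if a survey slug would collide with a fixed route segment.
    public static bool IsReserved(string surveySlug)
    {
        return Reserved.Contains(surveySlug);
    }
}
```

A survey creation page could call ReservedSlugs.IsReserved before saving a new survey and ask the subscriber to choose a different title when it returns true.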
Note: |
|---|
| The third element in the example paths for the public Surveys website, "launch-event-feedback," is a "sluggified" version of the survey title, originally "Launch Event Feedback," to make it URL friendly. |
Markus says: |
|---|
A slug name is a string where all whitespace and invalid characters are replaced with a hyphen (-). The term comes from the newsprint industry and has nothing to do with those things in your garden! |
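The slug rule described above can be sketched in a few lines. This is an illustration only, not the Tailspin implementation: the SlugHelper name is hypothetical, and the sketch assumes that the title is lower-cased (as the "launch-event-feedback" example suggests) and that a run of several invalid characters collapses into a single hyphen.

```csharp
using System.Text.RegularExpressions;

public static class SlugHelper
{
    // Lower-cases the title, then replaces every run of characters
    // other than a-z and 0-9 with a single hyphen.
    public static string Slugify(string title)
    {
        string lowered = title.Trim().ToLowerInvariant();
        string hyphenated = Regex.Replace(lowered, "[^a-z0-9]+", "-");

        // Remove any hyphen left at the start or end of the slug.
        return hyphenated.Trim('-');
    }
}
```

With these assumptions, SlugHelper.Slugify("Launch Event Feedback") returns "launch-event-feedback". Two different titles can produce the same slug, so an application that relies on slugs as keys would also need a uniqueness check.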
Inside the Implementation
Now is a good time to walk through the code that handles the request routing within the application in more detail. As you go through this section, you may want to download the Microsoft® Visual Studio® development system solution for the Tailspin Surveys application from http://wag.codeplex.com/.
The implementation uses a combination of ASP.NET routing tables and MVC areas to identify the subscriber and map requests to the correct functionality within the application.
The following code example shows how the public Surveys website uses routing tables to determine which survey to display based on the URL.
using System.Web.Mvc;
using System.Web.Routing;

public static class AppRoutes
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.MapRoute(
            "Home",
            string.Empty,
            new { controller = "Surveys", action = "Index" });

        routes.MapRoute(
            "ViewSurvey",
            "survey/{tenant}/{surveySlug}",
            new { controller = "Surveys", action = "Display" });

        routes.MapRoute(
            "ThankYouForFillingTheSurvey",
            "survey/{tenant}/{surveySlug}/thankyou",
            new { controller = "Surveys", action = "ThankYou" });
    }
}
The code extracts the tenant name and survey name from the URL and passes them to the appropriate action method in the SurveysController class. The following code example shows the Display action method that handles HTTP GET requests.
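[HttpGet]
public ActionResult Display(string tenant, string surveySlug)
{
    // Retrieve the survey definition from the store and create the
    // SurveyAnswer model from it.
    var surveyAnswer = CallGetSurveyAndCreateSurveyAnswer(
        this.surveyStore, tenant, surveySlug);

    // Wrap the model in the page view data and pass it to the view.
    var model = new TenantPageViewData<SurveyAnswer>(surveyAnswer);
    model.Title = surveyAnswer.Title;
    return this.View(model);
}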
If the user requests a survey using a URL with a path value of /survey/adatum/launch-event-feedback, the value of the tenant parameter will be "adatum" and the value of the surveySlug parameter will be "launch-event-feedback." This action method uses the parameter values to retrieve the survey definition from the store, populate the model with this data, and pass the model to the view that renders it to the browser.
Markus says: |
|---|
There is also a Display action to handle HTTP POST requests. This controller action is responsible for saving the filled out survey data. |
The Subscriber website is more complex because it must handle authentication and on-boarding new subscribers in addition to enabling subscribers to design new surveys and analyze survey results. Because of this complexity, it uses MVC areas as well as a routing table. The following code from the AppRoutes class in the TailSpin.Web project shows how the application maps top-level requests to the controller classes that handle on-boarding and authentication.
    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.MapRoute(
            "OnBoarding",
            string.Empty,
            new { controller = "OnBoarding", action = "Index" });

        routes.MapRoute(
            "FederationResultProcessing",
            "FederationResult",
            new { controller = "ClaimsAuthentication",
                  action = "FederationResult" });

        routes.MapRoute(
            "FederatedSignout",
            "Signout",
            new { controller = "ClaimsAuthentication",
                  action = "Signout" });
    }
    …
}
The application also defines an MVC area for the core survey functionality. MVC applications register areas by calling the RegisterAllAreas method. In the TailSpin.Web project, you can find this call in the Application_Start method in the Global.asax.cs file. The RegisterAllAreas method searches the application for classes that extend the AreaRegistration class, and then invokes the RegisterArea method on each one. The following code example shows a part of this method in the SurveyAreaRegistration class.
Markus says: |
|---|
MVC areas enable you to group multiple controllers together within the application, making it easier to work with large MVC projects. Each area typically represents a different function within the application. |
public override void RegisterArea(
    AreaRegistrationContext context)
{
    context.MapRoute(
        "MySurveys",
        "survey/{tenant}",
        new { controller = "Surveys", action = "Index" });

    context.MapRoute(
        "NewSurvey",
        "survey/{tenant}/newsurvey",
        new { controller = "Surveys", action = "New" });

    context.MapRoute(
        "NewQuestion",
        "survey/{tenant}/newquestion",
        new { controller = "Surveys", action = "NewQuestion" });

    context.MapRoute(
        "AddQuestion",
        "survey/{tenant}/newquestion/add",
        new { controller = "Surveys", action = "AddQuestion" });
    …
}
Notice how all the routes in this routing table include the tenant name that MVC passes as a parameter to the controller action methods.
On-Boarding for Trials and New Customers
Whenever a new subscriber signs up for the Surveys service, the application must perform configuration tasks to enable the new account. Tailspin wants to automate as much of this process as possible to simplify the on-boarding process for new customers and minimize the costs associated with setting up a new subscriber. The on-boarding process touches many components of the Surveys application, and this section describes how the on-boarding process affects those components.
Basic Subscription Information
The following table describes the basic information that every subscriber provides when they sign up for the Surveys service.
Apart from credit card details, all this information is stored in Windows Azure™ storage; it is used throughout the on-boarding process and while the subscription is active.
Authentication and Authorization Information
The section, "Authentication and Authorization," in Chapter 3, "Accessing the Surveys Application," of this book describes the three alternatives for managing access to the application. Each of these alternatives requires different information from the subscriber as part of the on-boarding process, and each alternative is associated with a different subscription type. For example, the Individual subscription type uses a social identity provider, such as Windows Live® ID or Google ID, for authentication, and the Premium subscription type uses the subscriber's own identity provider.
Provisioning a Trust Relationship with the Subscriber's Identity Provider
One of the features of the Premium subscription type is integration with the subscriber's identity provider. The on-boarding process collects the information needed to configure the trust relationship between the subscriber's Security Token Service (STS) and the Tailspin federation provider (FP) STS. The following table describes this information.
The Surveys application will use this data to add the appropriate configuration information to the Tailspin FP STS. The on-boarding process will also make the Tailspin FP federation metadata available to the subscriber because the subscriber may need it to configure the trust relationship in their STS.
Jana says: |
|---|
The application does not yet implement this functionality. Tailspin could decide to use ADFS, ACS, or a custom STS as its federation provider. As part of the on-boarding process, the Surveys application will have to programmatically create the trust relationship between the Tailspin FP STS and the customer's identity provider, and programmatically add any claims transformation rules to the Tailspin STS. |
Note: |
|---|
| For more information, see the section, "Setup and Physical Deployment," on page 97 of the book, A Guide to Claims-Based Identity and Access Control. You can download a PDF copy of this book at http://msdn.microsoft.com/en-us/library/ff423674.aspx. |
Provisioning Authentication and Authorization for Basic Subscribers
Subscribers to the Standard subscription type cannot integrate the Surveys application with their own STS. Instead, they can define their own users in the Surveys application. During the on-boarding process, they provide details for the administrator account that will have full access to everything in their account, including billing information. They can later define additional users who are members of the Survey Creator role, who can only create surveys and analyze the results.
Provisioning Authentication and Authorization for Individual Subscribers
Individual subscribers use a third-party, social identity, such as a Windows Live ID, OpenID, or Google ID, to authenticate with the Surveys application. During the on-boarding process, they must provide details of the identity they will use. This identity has administrator rights for the account and is the only identity that can be used to access the account.
Geo Location Information
During the on-boarding process, the subscriber selects the geographic location where the Surveys application will host their account. The list of locations to choose from is the list of locations where there are currently Windows Azure data centers. This geographic location identifies the location of the Subscriber website instance that the subscriber will use and where the application stores all the data associated with the account. It is also the default location for hosting the subscriber's surveys, although the subscriber can opt to host individual surveys in alternate geographical locations.
Poe says: |
|---|
You could automatically suggest a location based on the user's IP address by using a service such as http://ipinfodb.com/ip_location_api.php. |
Database Information
During the sign-up process, subscribers can also opt to provision a SQL Azure™ database to store and analyze their survey data. The application creates this database in the same geographical location as the subscriber's account. The application uses the subscriber alias to generate the database name and the database user name. The application also generates a random password. The application saves the database connection string in Windows Azure storage, together with the other subscriber account data.
Note: |
|---|
| The SQL Azure database is still owned and paid for by Tailspin. Tailspin charges subscribers for this service. For more information about how the Surveys application uses SQL Azure, see the section, "Using SQL Azure," in Chapter 5, "Working with Data in the Surveys Application," of this book. |
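The generation of the database name, user name, and password from the subscriber alias could look something like the following sketch. The naming scheme ("surveys_" prefix, "_user" suffix), the password length, and the DatabaseCredentials and DatabaseProvisioning types are all assumptions made for illustration; the text above states only that the names derive from the alias and that the password is random.

```csharp
using System;
using System.Security.Cryptography;

public sealed class DatabaseCredentials
{
    public string DatabaseName { get; set; }
    public string UserName { get; set; }
    public string Password { get; set; }
}

public static class DatabaseProvisioning
{
    // Hypothetical naming scheme derived from the subscriber alias.
    public static DatabaseCredentials CreateFor(string subscriberAlias)
    {
        string alias = subscriberAlias.Trim().ToLowerInvariant();
        return new DatabaseCredentials
        {
            DatabaseName = "surveys_" + alias,
            UserName = alias + "_user",
            Password = GeneratePassword(24)
        };
    }

    // Builds a password from cryptographically strong random bytes,
    // encoded as Base64 so that it is printable.
    private static string GeneratePassword(int byteCount)
    {
        var bytes = new byte[byteCount];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }

        return Convert.ToBase64String(bytes);
    }
}
```

The sketch does not validate the alias. Because the alias ends up in database object names, an application using this approach would need to restrict it to a safe character set before calling CreateFor.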
Billing Customers
Tailspin plans to bill each customer a fixed monthly fee to use the Surveys application. Customers will be able to subscribe to one of several packages, as outlined in the following table.
The advantage of this approach is simplicity for both Tailspin and the subscribers, because the monthly charge is fixed for each subscriber. Tailspin must undertake some market research to estimate the number of monthly subscribers at each level so that they can set appropriate charges for each subscription level.
Bharath says: |
|---|
Tailspin must have good estimates of expected usage to be able to estimate costs, revenue, and profit. |
In the future, Tailspin wants to be able to offer extensions to the basic subscription types. For example, Tailspin wants to enable subscribers to extend the duration of a survey beyond the current maximum, or to increase the number of active surveys beyond the current maximum. To do this, Tailspin will need to be able to capture usage metrics from the application to help it calculate any additional charges incurred by a subscriber. Tailspin expects that forthcoming Windows Azure APIs that expose billing information and storage usage metrics will simplify the implementation of these extensions.
Note: |
|---|
| At the time of writing, the best approach to capturing usage metrics is via logging. Several log files are useful. You can use the Internet Information Services (IIS) logs to determine which tenant generated the web role traffic. Your application can write custom messages to the WADLogsTable. The sys.bandwidth_usage view in the master database of each SQL Azure server shows bandwidth consumption by database. |
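Because every tenant-specific URL in the Surveys application carries the tenant name as its second path element, attributing web role traffic to a tenant can be as simple as counting paths. The following sketch assumes that the request paths (the cs-uri-stem field) have already been extracted from the IIS logs; the TenantTraffic class is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class TenantTraffic
{
    // Counts requests per tenant from URL paths such as
    // "/survey/adatum/newsurvey". Paths that do not start with
    // "survey" followed by a tenant name are ignored.
    public static IDictionary<string, int> CountRequests(
        IEnumerable<string> uriStems)
    {
        var counts = new Dictionary<string, int>(
            StringComparer.OrdinalIgnoreCase);

        foreach (string stem in uriStems)
        {
            string[] parts = stem.Split(
                new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);

            if (parts.Length < 2 ||
                !parts[0].Equals("survey", StringComparison.OrdinalIgnoreCase))
            {
                continue; // Not a tenant-scoped request.
            }

            int current;
            counts.TryGetValue(parts[1], out current);
            counts[parts[1]] = current + 1;
        }

        return counts;
    }
}
```

Request counts are only a rough proxy for usage. A billing calculation would probably also weigh the bytes transferred, which IIS can record in the same log files.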
Customizing the User Interface
A common feature of multi-tenant applications is enabling subscribers to customize the appearance of the application for their customers. The current version of the Surveys application enables subscribers to customize the appearance of their account page by using a custom logo image. Subscribers can upload an image to their account, and the Surveys application saves the image as part of the subscriber's account data in BLOB storage.
Tailspin plans to extend the customization options available to subscribers in future versions of the application. These extensions include customizing the survey pages with the logo and enabling subscribers to upload a cascading style sheets (.css) file to customize the appearance of their survey pages to follow corporate branding schemes.
Tailspin is evaluating the security implications of allowing subscribers to upload custom .css files and plans to limit the cascading style sheets features that the site will support. It will implement a scanning mechanism to verify that the .css files that subscribers upload do not include any of the features that the Surveys site does not support.
Poe says: |
|---|
Cascading style sheets Behaviors are one feature that the Surveys site will not support. |
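The scanning mechanism is not yet implemented, so the following is only a rough sketch of the idea: reject any uploaded style sheet that contains a behavior declaration. The CssScanner class is hypothetical, and a simple pattern match such as this one is easy to evade (for example, through CSS escape sequences), so a production scanner would more likely parse the style sheet and accept only an explicit list of supported properties.

```csharp
using System.Text.RegularExpressions;

public static class CssScanner
{
    // Matches a "behavior" property declaration, including
    // vendor-prefixed forms such as "-ms-behavior".
    private static readonly Regex UnsupportedFeature = new Regex(
        @"\bbehavior\s*:",
        RegexOptions.IgnoreCase);

    // Returns true if the style sheet contains no unsupported feature.
    public static bool IsAllowed(string css)
    {
        return !UnsupportedFeature.IsMatch(css);
    }
}
```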
The current solution allows subscribers to upload an image to a public BLOB container named logos. As part of the upload process, the application adds the URL for the logo image to the tenant's BLOB data stored in the BLOB container named tenants. The TenantController class retrieves the URL and forwards it on to the view.
Scaling Applications by Using Worker Roles
Scalability is an issue for both single-tenant and multi-tenant architectures. Although it may be acceptable to allow certain operations at certain times to utilize most of the available resources in a single-tenant application (for example, calculating aggregate statistics over a large dataset at 2:00 A.M.), this is not an option for most multi-tenant applications where different tenants have different usage patterns.
You can use worker roles in Windows Azure to offload resource-hungry operations from the web roles that handle user interaction. These worker roles can perform tasks asynchronously when the web roles do not require the output from the worker role operations to be immediately available.
Example Scenarios for Worker Roles
The following table describes some example scenarios where you can use worker roles for asynchronous job processing. Not all of these scenarios come from the Surveys application; but, for each scenario, the table specifies how to trigger the job and how many worker role instances it could use.
Note: |
|---|
| You can scale the Update Survey Statistics scenario described in the preceding table by using one queue and one worker role instance for every tenant or even every survey. What is important is that only one worker role instance should process and update data that is mutually exclusive within the dataset. |
Looking at these example scenarios suggests that you can categorize worker roles that perform background processing according to the scheme in the following table.
Triggers for Background Tasks
The trigger for a background task could be a timer or a signal in the form of a message in a queue. Time-based background tasks are appropriate when the task must process a large quantity of data that trickles in little by little. This approach is cheaper and will offer higher throughput than an approach that processes each piece of data as it becomes available. This is because you can batch the operations and reduce the number of storage transactions required to process the data.
If the frequency at which new items of data become available is lower and there is a requirement to process the new data as soon as possible, using a message in a queue as a trigger is appropriate.
You can implement a time-based trigger by using a Timer object in a worker role that executes a task at a fixed time interval. You can implement a message-based trigger in a worker role by creating an infinite loop that polls a message queue for new messages. You can retrieve either a single message or multiple messages from the queue and execute a task to process the message or messages.
Markus says: |
|---|
You can pull multiple messages from a queue in a single transaction. |
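The message-based trigger can be sketched as a single polling step that the worker role calls from its loop. To keep the example self-contained, an in-memory ConcurrentQueue stands in for the Windows Azure storage queue; the QueuePoller class and its members are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public sealed class QueuePoller
{
    // In-memory stand-in for a storage queue.
    private readonly ConcurrentQueue<string> queue;
    private readonly Action<IList<string>> processBatch;
    private readonly int batchSize;

    public QueuePoller(
        ConcurrentQueue<string> queue,
        Action<IList<string>> processBatch,
        int batchSize)
    {
        this.queue = queue;
        this.processBatch = processBatch;
        this.batchSize = batchSize;
    }

    // Retrieves up to batchSize messages and processes them together.
    // Returns the number of messages handled so that the caller can
    // pause before polling again when the queue is empty.
    public int PollOnce()
    {
        var batch = new List<string>();
        string message;
        while (batch.Count < this.batchSize &&
               this.queue.TryDequeue(out message))
        {
            batch.Add(message);
        }

        if (batch.Count > 0)
        {
            this.processBatch(batch);
        }

        return batch.Count;
    }
}
```

The worker role's infinite loop would call PollOnce repeatedly and sleep for a short interval whenever it returns zero. Against a real storage queue, a message should be deleted only after it has been processed successfully, so that a failed batch becomes visible again for another attempt.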
Execution Model
In Windows Azure, you process background tasks by using worker roles. You could have a separate worker role type for each type of background task in your application, but this approach means that you will need at least one separate worker role instance for each type of task. Often, you can make better use of the available compute resources by having one worker role handle multiple types of tasks, especially when you have high volumes of data, because this approach reduces the risk of under-utilizing your compute nodes. This approach, often referred to as role conflation, involves two trade-offs. The first trade-off balances the complexity and cost of implementing role conflation against the potential cost savings that result from reducing the number of running worker role instances. The second trade-off is between the time required to implement and test a solution that uses role conflation and other business priorities, such as time-to-market. With role conflation, you can still scale out the application by starting up additional instances of the worker role. The diagrams in Figure 1 show these two scenarios.
Figure 1
In the scenario where you have multiple instances of a worker role that can all execute the same set of task types, you need to distinguish between the task types where it is safe to execute the task in multiple worker roles simultaneously, and the task types where it is only safe to execute the task in a single worker role at a time.
To ensure that only one copy of a task can run at a time, you must implement a locking mechanism. In Windows Azure, you could use a message on a queue or a lease on a BLOB for this purpose. The diagram in Figure 2 shows that multiple copies of Tasks A and C can run simultaneously, but only one copy of Task B can run at any one time. One copy of Task B acquires a lease on a BLOB and runs; other copies of Task B will not run until they can acquire the lease on the BLOB.
Figure 2
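The lease-based locking shown for Task B can be sketched as follows. The InMemoryLeaseStore class is a hypothetical stand-in that only works inside a single process; in Windows Azure, the lease would be held on a BLOB so that all worker role instances see the same lock. The current time is passed in as a parameter to keep the sketch easy to test.

```csharp
using System;
using System.Collections.Generic;

public sealed class InMemoryLeaseStore
{
    private readonly object sync = new object();
    private readonly Dictionary<string, DateTime> expiryByName =
        new Dictionary<string, DateTime>();

    // Returns true if the caller obtained the lease, or false if
    // another holder's lease on the same name has not yet expired.
    public bool TryAcquire(string name, DateTime utcNow, TimeSpan duration)
    {
        lock (this.sync)
        {
            DateTime expiry;
            if (this.expiryByName.TryGetValue(name, out expiry) &&
                expiry > utcNow)
            {
                return false;
            }

            this.expiryByName[name] = utcNow + duration;
            return true;
        }
    }

    // Releases the lease so that another copy of the task can run.
    // Simplification: this sketch does not check who holds the lease.
    public void Release(string name)
    {
        lock (this.sync)
        {
            this.expiryByName.Remove(name);
        }
    }
}
```

Each copy of Task B would call TryAcquire before starting work and skip the run if the call returns false. Because the lease expires, a copy that crashes while holding it does not block the task permanently.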
The MapReduce Algorithm
For some Windows Azure applications, being limited to a single task instance for certain large calculations may have a significant impact on performance. In these circumstances, the MapReduce algorithm may provide a way to parallelize the calculations across multiple task instances in multiple worker roles.
The original concepts behind MapReduce come from the map and reduce functions that are widely used in functional programming languages such as Haskell, F#, and Erlang. In the current context, MapReduce is a programming model (patented by Google) that enables you to parallelize operations on a large dataset. In the case of the Surveys application, you could use this approach to calculate the summary statistics by using multiple, parallel tasks instead of a single task. The benefit would be to speed up the calculation of the summary statistics, but at the cost of having multiple worker role instances.
Jana says: |
|---|
For the Surveys application, speed is not a critical factor for the calculation of the summary statistics. Tailspin is willing to tolerate a delay while this summary data is calculated, so it does not use MapReduce. |
The following example shows how Tailspin could use this approach if it wants to speed up the calculation of the summary statistics.
This example assumes that the application saves survey responses in BLOBs that contain the data shown in Figure 3.
Figure 3
The following table shows the initial set of data from which the application must calculate the summary statistics. In practice, MapReduce is used to process very large datasets; this example uses a very small dataset to show how MapReduce works. This example also only shows the summarization steps for the first multiple-choice question and the first range question found in the survey answers, but you could easily extend the process to handle all the questions in each survey.
The first stage of MapReduce is to map the data into a format that can be progressively reduced until you obtain the required results. Both the map and reduce phases can be parallelized, which is why MapReduce can improve the performance for calculations over large datasets.
For this example, both the map and reduce phases will divide their input into blocks of three. The map phase in this example uses four parallel tasks, each of which processes three survey result BLOBs, to build the map shown in the following table.
The next phase reduces this data further. In this example, there will be two parallel tasks, one that processes aggregations 1.X, 2.X, and 3.X, and one that processes aggregation 4.X. It's important to realize that each reduce phase only needs to reference the data from the previous phase and not the original data. The following table shows the results of this reduce phase.
In the next phase, there is only one task because there are only two input blocks. The following table shows the results from this reduction phase.
At this point, it's not possible to reduce the data any further, and the summary statistics for all the survey data that the application read during the original map phase have been calculated.
It's now possible to update the summary data based on survey responses received after the initial map phase ran. You process all new survey data using MapReduce and then combine the results from the new data with the old in the final step.
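The phases described in this section can be sketched in code. The example below is an illustration under several assumptions: the Answer, Summary, and SurveyMapReduce types are hypothetical, each answer carries one multiple-choice value and one range value, and PLINQ threads within one process stand in for task instances in separate worker roles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Answer
{
    public string Choice; // Answer to the multiple-choice question.
    public int Range;     // Answer to the range question (1 to 5).
}

public sealed class Summary
{
    public int Count;
    public int RangeTotal;
    public Dictionary<string, int> ChoiceCounts =
        new Dictionary<string, int>();

    public double RangeAverage
    {
        get { return Count == 0 ? 0 : (double)RangeTotal / Count; }
    }
}

public static class SurveyMapReduce
{
    // Map: turn one block of raw answers into a partial summary.
    public static Summary Map(IEnumerable<Answer> block)
    {
        var summary = new Summary();
        foreach (Answer answer in block)
        {
            summary.Count++;
            summary.RangeTotal += answer.Range;
            Add(summary.ChoiceCounts, answer.Choice, 1);
        }
        return summary;
    }

    // Reduce: combine partial summaries without the original answers.
    public static Summary Reduce(IEnumerable<Summary> block)
    {
        var total = new Summary();
        foreach (Summary part in block)
        {
            total.Count += part.Count;
            total.RangeTotal += part.RangeTotal;
            foreach (var pair in part.ChoiceCounts)
            {
                Add(total.ChoiceCounts, pair.Key, pair.Value);
            }
        }
        return total;
    }

    // Maps the blocks in parallel, then reduces repeatedly until a
    // single summary remains.
    public static Summary Run(IList<Answer> answers, int blockSize)
    {
        if (blockSize < 2)
        {
            throw new ArgumentOutOfRangeException("blockSize");
        }

        List<Summary> current = Blocks(answers, blockSize)
            .AsParallel().Select(b => Map(b)).ToList();
        while (current.Count > 1)
        {
            current = Blocks(current, blockSize)
                .AsParallel().Select(b => Reduce(b)).ToList();
        }
        return current.Count == 1 ? current[0] : new Summary();
    }

    private static void Add(Dictionary<string, int> counts, string key, int n)
    {
        int existing;
        counts.TryGetValue(key, out existing);
        counts[key] = existing + n;
    }

    private static IEnumerable<List<T>> Blocks<T>(IEnumerable<T> items, int size)
    {
        var block = new List<T>(size);
        foreach (T item in items)
        {
            block.Add(item);
            if (block.Count == size)
            {
                yield return block;
                block = new List<T>(size);
            }
        }
        if (block.Count > 0)
        {
            yield return block;
        }
    }
}
```

With twelve answers and a block size of three, Run performs four map tasks, then a reduce phase over two blocks, then a final reduce over one block, mirroring the walkthrough above. Folding in later responses amounts to one more call to Reduce over the stored summary and the summary of the new data.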
Scaling the Surveys Application
This section describes how Tailspin designed one functional area of the Surveys application for scalability. Tailspin anticipates that some surveys may have thousands, or even hundreds of thousands of respondents, and Tailspin wants to make sure that the public website remains responsive for all users at all times. At the same time, survey owners want to be able to view summary statistics calculated from the survey responses to date.
Goals and Requirements
In Chapter 3, "Accessing the Surveys Application," you saw how Tailspin uses two websites for the Surveys application: one where subscribers design and administer their surveys, and one where users fill out their survey responses. The Surveys application currently supports three question types: free text, numeric range (values from one to five), and multiple choice. Survey owners must be able to view some basic summary statistics that the application calculates for each survey, such as the total number of responses received, histograms of the multiple-choice results, and aggregations such as averages of the range results. The Surveys application provides a pre-determined set of summary statistics that cannot be customized by subscribers. Subscribers who want to perform a more sophisticated analysis of their survey responses can export the survey data to a SQL Azure instance.
Because of the expected volume of survey response data, Tailspin anticipates that generating the summary statistics will be an expensive operation because of the large number of storage transactions that must occur when the application reads the survey responses. However, Tailspin does not require the summary statistics to be always up to date and based on all of the available survey responses. Tailspin is willing to accept a delay while the application calculates the summary data if this reduces the cost of generating them.
The public site where respondents fill out surveys must always have fast response times when users save their responses, and it must record the responses accurately so that there is no risk of any errors in the data when a subscriber comes to analyze the results.
The developers at Tailspin also want to be able to run comprehensive unit tests on the components that calculate the summary statistics without any dependencies on Windows Azure storage.
Markus says: |
|---|
There are also integration tests that verify the end-to-end behavior of the application using Windows Azure storage. |
The Solution
To meet the requirements, the developers at Tailspin decided to use a worker role to handle the task of generating the summary statistics from the survey results. Using a worker role enables the application to perform this resource-intensive process as a background task, ensuring that the web role responsible for collecting survey answers is not blocked while the application calculates the summary statistics.
Based on the framework for worker roles that the previous section outlined, this asynchronous task is one that will be triggered on a schedule, and it must be run as a single instance process because it updates a single set of results.
The application can use additional tasks in the same worker role to perform any additional processing on the response data; for example, it can generate a list of ordered answers to enable paging through the response data.
To calculate the survey statistics, Tailspin considered two basic approaches. The first approach is for the task in the worker role to retrieve all the survey responses to date at a fixed time interval, recalculate the summary statistics, and then save the summary data over the top of the existing summary data. The second approach is for the task in the worker role to retrieve all the survey response data that the application has saved since the last time the task ran, and use this data to adjust the summary statistics to reflect the new survey results.
The first approach is the simplest to implement, because the second approach requires a mechanism for tracking which survey results are new. The second approach also depends on it being possible to calculate the new summary data from the old summary data and the new survey results without re-reading all the original survey results.
Markus says: |
|---|
You can use a queue to maintain a list of all new survey responses. This task is still triggered on a schedule that determines how often the task should look at the queue for new survey results to process. |
Note: |
|---|
| You can recalculate all the summary data in the Surveys application using the second approach. However, suppose you want one of your pieces of summary data to be a list of the 10 most popular words used in answering a free-text question. In this case, you would always have to process all of the survey answers, unless you also maintained a separate list of all the words used and a count of how often they appeared. This adds to the complexity of the second approach. |
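To see why the second approach works for a statistic such as the average of a range question, note that the stored summary only needs to hold the response count and the running total, not the individual responses. The following sketch illustrates this; the RangeQuestionSummary class is hypothetical and is not the SurveyAnswersSummary class from the Surveys application.

```csharp
using System.Collections.Generic;

public sealed class RangeQuestionSummary
{
    public int ResponseCount { get; private set; }
    public long RatingTotal { get; private set; }

    // The average is derived on demand from the two stored values.
    public double Average
    {
        get
        {
            return ResponseCount == 0
                ? 0
                : (double)RatingTotal / ResponseCount;
        }
    }

    // Folds new ratings into the stored summary without re-reading
    // any of the responses that were processed earlier.
    public void Merge(IEnumerable<int> newRatings)
    {
        foreach (int rating in newRatings)
        {
            ResponseCount++;
            RatingTotal += rating;
        }
    }
}
```

Storing the total rather than the average itself is what makes the merge exact: an average alone cannot be combined with new data unless the number of responses behind it is also known.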
The key difference between the two approaches is cost. The graph in Figure 4 shows the result of an analysis that compares the costs of the two approaches for three different daily volumes of survey answers. The graph shows the first approach on the upper line with the Recalculate label, and the second approach on the lower line with the Merge label.
Figure 4
The graph clearly shows how much cheaper the merge approach is than the recalculate approach after you get past a certain volume of transactions. The difference in cost is due almost entirely to the transaction costs associated with the two different approaches. Tailspin decided to implement the merge approach in the Surveys application.
Note: |
|---|
| The vertical cost scale on the chart is logarithmic. The analysis behind this chart makes a number of "worst-case" assumptions about the way the application processes the survey results. The chart is intended to illustrate the relative difference in cost between the two approaches; it is not intended to give "hard" figures. |
It is possible to optimize the recalculate approach if you decide to sample the survey answers instead of processing every single one when you calculate the summary data. You would need to perform some detailed statistical analysis to determine what proportion of results you need to select to calculate the summary statistics within an acceptable margin of error.
In the Surveys application, it would also be possible to generate the summary statistics by using an approach based on MapReduce. The advantage of this approach is that it is possible to use multiple task instances to calculate the summary statistics. However, Tailspin is willing to accept a delay in calculating the summary statistics, so performance is not critical for this task. For a description of the MapReduce programming model, see the section, "The MapReduce Algorithm," earlier in this chapter.
Inside the Implementation
Now is a good time to walk through the code that implements the asynchronous task that calculates the summary statistics in more detail. As you go through this section, you may want to download the Visual Studio solution for the Tailspin Surveys application from http://wag.codeplex.com/.
Using a Worker Role to Calculate the Summary Statistics
The team at Tailspin decided to implement the asynchronous background task that calculates the summary statistics from the survey results by using a merge approach. Each time the task runs, it processes the survey responses that the application has received since the last time the task ran; it calculates the new summary statistics by merging the new results with the old statistics.
The worker role in the TailSpin.Workers.Surveys project periodically scans the queue for pending survey answers to process.
The following code example from the UpdatingSurveyResultsSummaryCommand class shows how the worker role processes the temporary survey answers and then uses them to recalculate the summary statistics.
private readonly IDictionary<string, SurveyAnswersSummary>
    surveyAnswersSummaryCache;
private readonly ISurveyAnswerStore surveyAnswerStore;
private readonly ISurveyAnswersSummaryStore
    surveyAnswersSummaryStore;

public UpdatingSurveyResultsSummaryCommand(
    IDictionary<string, SurveyAnswersSummary>
        surveyAnswersSummaryCache,
    ISurveyAnswerStore surveyAnswerStore,
    ISurveyAnswersSummaryStore surveyAnswersSummaryStore)
{
    this.surveyAnswersSummaryCache =
        surveyAnswersSummaryCache;
    this.surveyAnswerStore = surveyAnswerStore;
    this.surveyAnswersSummaryStore =
        surveyAnswersSummaryStore;
}

public void PreRun()
{
    this.surveyAnswersSummaryCache.Clear();
}

public void Run(SurveyAnswerStoredMessage message)
{
    this.surveyAnswerStore.AppendSurveyAnswerIdToAnswersList(
        message.Tenant,
        message.SurveySlugName,
        message.SurveyAnswerBlobId);

    var surveyAnswer =
        this.surveyAnswerStore.GetSurveyAnswer(
            message.Tenant,
            message.SurveySlugName,
            message.SurveyAnswerBlobId);

    var keyInCache = string.Format(CultureInfo.InvariantCulture,
        "{0}_{1}", message.Tenant, message.SurveySlugName);
    SurveyAnswersSummary surveyAnswersSummary;

    if (!this.surveyAnswersSummaryCache.ContainsKey(keyInCache))
    {
        surveyAnswersSummary = new
            SurveyAnswersSummary(message.Tenant,
                message.SurveySlugName);
        this.surveyAnswersSummaryCache[keyInCache] =
            surveyAnswersSummary;
    }
    else
    {
        surveyAnswersSummary =
            this.surveyAnswersSummaryCache[keyInCache];
    }

    surveyAnswersSummary.AddNewAnswer(surveyAnswer);
}

public void PostRun()
{
    foreach (var surveyAnswersSummary in
        this.surveyAnswersSummaryCache.Values)
    {
        var surveyAnswersSummaryInStore =
            this.surveyAnswersSummaryStore
                .GetSurveyAnswersSummary(surveyAnswersSummary.Tenant,
                    surveyAnswersSummary.SlugName);

        surveyAnswersSummary.MergeWith(
            surveyAnswersSummaryInStore);

        this.surveyAnswersSummaryStore
            .SaveSurveyAnswersSummary(surveyAnswersSummary);
    }
}
The Surveys application uses the Unity Application Block (Unity) to initialize an instance of the UpdatingSurveyResultsSummaryCommand class and the surveyAnswerStore and surveyAnswersSummaryStore variables. The surveyAnswerStore variable is an instance of the SurveyAnswerStore type that the Run method uses to read the survey responses from BLOB storage. The surveyAnswersSummaryStore variable is an instance of the SurveyAnswersSummaryStore type that the PostRun method uses to read the existing summary data from BLOB storage and write the merged summary data back. The surveyAnswersSummaryCache dictionary holds a SurveyAnswersSummary object for each survey, keyed by tenant and survey slug name.
Note: |
|---|
| Unity is a lightweight, extensible dependency injection container that supports interception, constructor injection, property injection, and method call injection. You can use Unity in a variety of ways to help decouple the components of your applications, to maximize coherence in components, and to simplify design, implementation, testing, and administration of these applications. For more information about Unity and to download the application block, see the patterns & practices Unity page on CodePlex (http://unity.codeplex.com/). |
The PreRun method runs before the task reads any messages from the queue. It clears the temporary cache that holds the summary data for the new survey responses.
The Run method runs once for each new survey response. It uses the message from the queue to locate the new survey response, and then it adds the survey response to the SurveyAnswersSummary object for the appropriate survey by calling the AddNewAnswer method. The AddNewAnswer method updates the summary statistics in that cached SurveyAnswersSummary object; nothing is written to the surveyAnswersSummaryStore instance until the PostRun method runs. The Run method also calls the AppendSurveyAnswerIdToAnswersList method to update the list of survey responses that the application uses for paging.
The PostRun method runs after the task reads all the outstanding answers in the queue. For each survey, it merges the new results with the existing summary statistics, and then it saves the new values back to BLOB storage.
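The order in which the three methods run can be summarized in a few lines. The following is a minimal sketch, assuming an interface shape inferred from the calls shown in this chapter; the IBatchCommandSketch, TracingCommand, and BatchDriver names are hypothetical, and the real IBatchCommand interface in the solution may differ.

```csharp
using System;
using System.Collections.Generic;

// Shape inferred from the calls shown in this chapter; the real
// IBatchCommand<T> interface may declare other members or constraints.
public interface IBatchCommandSketch<T>
{
    void PreRun();

    void Run(T message);

    void PostRun();
}

// Hypothetical command that records the order in which it is invoked.
public sealed class TracingCommand : IBatchCommandSketch<string>
{
    public List<string> Trace { get; } = new List<string>();

    public void PreRun()
    {
        this.Trace.Add("PreRun");
    }

    public void Run(string message)
    {
        this.Trace.Add("Run:" + message);
    }

    public void PostRun()
    {
        this.Trace.Add("PostRun");
    }
}

// Hypothetical driver for one processing cycle: PreRun once, Run once
// per message, and PostRun once after the last message.
public static class BatchDriver
{
    public static void RunOnce<T>(
        IBatchCommandSketch<T> command, IEnumerable<T> messages)
    {
        command.PreRun();
        foreach (var message in messages)
        {
            command.Run(message);
        }

        command.PostRun();
    }
}
```

Separating per-message work (Run) from per-cycle work (PreRun and PostRun) is what lets the summary task write to BLOB storage once per survey per cycle instead of once per response.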
The worker role uses some "plumbing" code developed by Tailspin to invoke the PreRun, Run, and PostRun methods in the UpdatingSurveyResultsSummaryCommand class on a schedule. The following code example shows how the Surveys application uses the "plumbing" code in the Run method in the worker role to run the three methods that comprise the job.
public override void Run()
{
    var updatingSurveyResultsSummaryJob =
        this.container.Resolve
            <UpdatingSurveyResultsSummaryCommand>();
    var surveyAnswerStoredQueue =
        this.container.Resolve
            <IAzureQueue<SurveyAnswerStoredMessage>>();
    BatchProcessingQueueHandler
        .For(surveyAnswerStoredQueue)
        .Every(TimeSpan.FromSeconds(10))
        .Do(updatingSurveyResultsSummaryJob);

    var transferQueue = this.container
        .Resolve<IAzureQueue<SurveyTransferMessage>>();
    var transferCommand = this
        .container.Resolve<TransferSurveysToSqlAzureCommand>();
    QueueHandler
        .For(transferQueue)
        .Every(TimeSpan.FromSeconds(5))
        .Do(transferCommand);

    while (true)
    {
        Thread.Sleep(TimeSpan.FromSeconds(5));
    }
}
This method first uses Unity to instantiate the UpdatingSurveyResultsSummaryCommand object that defines the job and the AzureQueue object that holds notifications of new survey responses.
The method then passes these objects as parameters to the For and Do "plumbing" methods. The Every "plumbing" method specifies how frequently the job should run. These methods cause the plumbing code to invoke the PreRun, Run, and PostRun methods in the UpdatingSurveyResultsSummaryCommand class, passing a message from the queue to the Run method.
The preceding code example also shows how the worker role initializes the task defined in the TransferSurveysToSqlAzureCommand class that dumps survey data to SQL Azure. This task is slightly simpler and only has a Run method.
You should tune the frequency at which these tasks run based on your expected workloads by changing the value passed to the Every method.
Finally, the method uses a while loop to keep the worker role instance alive.
Note: |
|---|
| The For, Every, and Do methods implement a fluent API for instantiating tasks in the worker role. Fluent APIs help to make the code more legible. |
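The chaining style works because each configuration method returns the object being configured. The following is a minimal sketch of that idea; the PollingSchedule class is hypothetical, and unlike the Do method in the Tailspin "plumbing" code, its Do method only records the job and does not start a background task.

```csharp
using System;

// Hypothetical fluent builder; not a type from the Surveys solution.
public sealed class PollingSchedule
{
    private PollingSchedule(string queueName)
    {
        this.QueueName = queueName;

        // Default interval; callers can override it with Every.
        this.Interval = TimeSpan.FromMilliseconds(200);
    }

    public string QueueName { get; }

    public TimeSpan Interval { get; private set; }

    public Action Job { get; private set; }

    // Entry point of the chain: names the queue to poll.
    public static PollingSchedule For(string queueName)
    {
        if (queueName == null)
        {
            throw new ArgumentNullException(nameof(queueName));
        }

        return new PollingSchedule(queueName);
    }

    // Returns the same instance so that the next call can be chained.
    public PollingSchedule Every(TimeSpan interval)
    {
        this.Interval = interval;
        return this;
    }

    // Records the job to run; a real handler would start polling here.
    public PollingSchedule Do(Action job)
    {
        this.Job = job ?? throw new ArgumentNullException(nameof(job));
        return this;
    }
}
```

With this sketch, a call such as PollingSchedule.For("answers").Every(TimeSpan.FromSeconds(10)).Do(job) reads in the same order as the sentence that describes it.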
The Worker Role "Plumbing" Code
The "plumbing" code in the worker role enables you to invoke commands of type IBatchCommand or ICommand by using the Do method, on a Windows Azure queue of type IAzureQueue by using the For method, at a specified interval. Figure 5 shows the key types that make up the "plumbing" code.
Figure 5
Figure 5 shows both a BatchProcessingQueueHandler class and a QueueHandler class. The QueueHandler class runs tasks that implement the simpler ICommand interface instead of the IBatchCommand interface. The discussion that follows focuses on the BatchProcessingQueueHandler class that the application uses to create the summary statistics.
The worker role first invokes the For method in the static BatchProcessingQueueHandler class, which invokes the For method in the BatchProcessingQueueHandler<T> class to return a BatchProcessingQueueHandler<T> instance that contains a reference to the IAzureQueue<T> instance to monitor. The "plumbing" code identifies the queue based on a queue message type that derives from the AzureQueueMessage type. The following code example shows how the For method in the BatchProcessingQueueHandler<T> class instantiates a BatchProcessingQueueHandler<T> instance.
private readonly IAzureQueue<T> queue;
private TimeSpan interval;

protected BatchProcessingQueueHandler(IAzureQueue<T> queue)
{
    this.queue = queue;
    this.interval = TimeSpan.FromMilliseconds(200);
}

public static BatchProcessingQueueHandler<T> For(
    IAzureQueue<T> queue)
{
    if (queue == null)
    {
        throw new ArgumentNullException("queue");
    }

    return new BatchProcessingQueueHandler<T>(queue);
}
Bharath says: |
|---|
| The current implementation uses a single queue, but you could modify the BatchProcessingQueueHandler to read from multiple queues instead of only one. According to the benchmarks published at http://azurescope.cloudapp.net, the maximum write throughput for a queue is between 500 and 700 items per second. If the Surveys application needs to handle more than approximately 2 million survey responses per hour, the application will hit the threshold for writing to a single queue. You could change the application to use multiple queues, perhaps with different queues for each subscriber. |
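The arithmetic behind the 2 million figure is straightforward to check. The following sketch converts a per-second write rate into an hourly capacity and estimates how many queues a given load would need; the QueueCapacity class is hypothetical, and the rates are the benchmark figures quoted above rather than guaranteed limits.

```csharp
using System;

// Hypothetical helper for capacity estimates; not part of the solution.
public static class QueueCapacity
{
    // Converts a sustained write rate into messages per hour.
    public static long MessagesPerHour(int messagesPerSecond)
    {
        return messagesPerSecond * 3600L;
    }

    // Number of queues needed to absorb the given hourly load, assuming
    // the load is spread evenly across the queues.
    public static int QueuesNeeded(
        long responsesPerHour, int messagesPerSecondPerQueue)
    {
        if (messagesPerSecondPerQueue <= 0)
        {
            throw new ArgumentOutOfRangeException(
                nameof(messagesPerSecondPerQueue));
        }

        long perQueue = MessagesPerHour(messagesPerSecondPerQueue);

        // Integer ceiling division.
        return (int)((responsesPerHour + perQueue - 1) / perQueue);
    }
}
```

At 500 items per second a queue accepts 1,800,000 messages per hour, and at 700 items per second it accepts 2,520,000, which brackets the figure of approximately 2 million. A load of 5 million responses per hour at the lower rate would need three queues, and more if the load is not evenly distributed across subscribers.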
Next, the worker role invokes the Every method of the BatchProcessingQueueHandler<T> object to specify how frequently the task should be run.
Next, the worker role invokes the Do method of the BatchProcessingQueueHandler<T> object, passing an IBatchCommand object that identifies the command that the "plumbing" code should execute on each message in the queue. The following code example shows how the Do method uses the Task.Factory.StartNew method from the Task Parallel Library (TPL) to run the PreRun, ProcessMessages, and PostRun methods on the queue at the requested interval.
public virtual void Do(IBatchCommand<T> batchCommand)
{
    Task.Factory.StartNew(() =>
    {
        while (true)
        {
            this.Cycle(batchCommand);
        }
    }, TaskCreationOptions.LongRunning);
}

protected void Cycle(IBatchCommand<T> batchCommand)
{
    try
    {
        batchCommand.PreRun();

        bool continueProcessing;
        do
        {
            var messages = this.queue.GetMessages(32);
            ProcessMessages(this.queue, messages,
                batchCommand.Run);
            continueProcessing = messages.Count() > 0;
        }
        while (continueProcessing);

        batchCommand.PostRun();

        this.Sleep(this.interval);
    }
    catch (TimeoutException)
    {
    }
}
The Cycle method repeatedly pulls up to 32 messages from the queue in a single transaction for processing until there are no more messages left.
The following code example shows the ProcessMessages method in the GenericQueueHandler class.
protected static void ProcessMessages(IAzureQueue<T> queue,
    IEnumerable<T> messages, Action<T> action)
{
    …
    foreach (var message in messages)
    {
        var success = false;

        try
        {
            action(message);
            success = true;
        }
        catch (Exception)
        {
            success = false;
        }
        finally
        {
            if (success || message.DequeueCount > 5)
            {
                queue.DeleteMessage(message);
            }
        }
    }
}
This method uses the action parameter to invoke the custom command on each message in the queue. Finally, the method checks for poison messages by looking at the DequeueCount property of the message; if the application has tried more than five times to process the message, the method deletes the message.
Note: |
|---|
| Instead of deleting poison messages, you should send them to a dead message queue for analysis and troubleshooting. |
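The following is a minimal sketch of that alternative, using in-memory stand-ins so that it is self-contained. The QueueMessageSketch, IQueueSketch, InMemoryQueue, and PoisonMessageHandling types are hypothetical; they are not the IAzureQueue types from the solution, and a production version would write to a second Windows Azure queue.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message type with the one property the check relies on.
public sealed class QueueMessageSketch
{
    public string Body { get; set; }

    public int DequeueCount { get; set; }
}

// Hypothetical queue abstraction; not the IAzureQueue<T> interface.
public interface IQueueSketch
{
    void AddMessage(QueueMessageSketch message);

    void DeleteMessage(QueueMessageSketch message);
}

public sealed class InMemoryQueue : IQueueSketch
{
    public List<QueueMessageSketch> Messages { get; } =
        new List<QueueMessageSketch>();

    public void AddMessage(QueueMessageSketch message)
    {
        this.Messages.Add(message);
    }

    public void DeleteMessage(QueueMessageSketch message)
    {
        this.Messages.Remove(message);
    }
}

public static class PoisonMessageHandling
{
    // Same threshold as the ProcessMessages method shown earlier.
    public const int MaxDequeueCount = 5;

    public static void ProcessMessages(
        IQueueSketch queue,
        IQueueSketch deadLetterQueue,
        IEnumerable<QueueMessageSketch> messages,
        Action<QueueMessageSketch> action)
    {
        foreach (var message in messages)
        {
            bool success;
            try
            {
                action(message);
                success = true;
            }
            catch (Exception)
            {
                success = false;
            }

            if (success)
            {
                queue.DeleteMessage(message);
            }
            else if (message.DequeueCount > MaxDequeueCount)
            {
                // Keep a copy for troubleshooting before removing the
                // poison message from the main queue.
                deadLetterQueue.AddMessage(message);
                queue.DeleteMessage(message);
            }

            // Otherwise the message stays in the queue for another attempt.
        }
    }
}
```

Copying the message before deleting it means that a failure between the two calls leaves a duplicate in the dead message queue instead of losing the message.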
Testing the Worker Role
The implementation of the "plumbing" code in the worker role, and the use of Unity, makes it possible to run unit tests on the worker role components using mock objects instead of Windows Azure queues and BLOBs. The following code from the BatchProcessingQueueHandlerFixture class shows two example unit tests.
[TestMethod]
public void ForCreatesHandlerForGivenQueue()
{
    var mockQueue = new Mock<IAzureQueue<StubMessage>>();

    var queueHandler = BatchProcessingQueueHandler
        .For(mockQueue.Object);

    Assert.IsInstanceOfType(queueHandler,
        typeof(BatchProcessingQueueHandler<StubMessage>));
}

[TestMethod]
public void DoRunsGivenCommandForEachMessage()
{
    var message1 = new StubMessage();
    var message2 = new StubMessage();
    var mockQueue = new Mock<IAzureQueue<StubMessage>>();

    // Return the two messages once and then an empty batch, so that
    // the do/while loop in the Cycle method terminates.
    var batches = new Queue<StubMessage[]>();
    batches.Enqueue(new[] { message1, message2 });
    mockQueue.Setup(q =>
        q.GetMessages(32)).Returns(
        () => batches.Count > 0
            ? batches.Dequeue() : new StubMessage[0]);
    var command = new Mock<IBatchCommand<StubMessage>>();
    var queueHandler =
        new BatchProcessingQueueHandlerStub(mockQueue.Object);

    queueHandler.Do(command.Object);

    command.Verify(c => c.Run(It.IsAny<StubMessage>()),
        Times.Exactly(2));
    command.Verify(c => c.Run(message1));
    command.Verify(c => c.Run(message2));
}

public class StubMessage : AzureQueueMessage
{
}

private class BatchProcessingQueueHandlerStub :
    BatchProcessingQueueHandler<StubMessage>
{
    public BatchProcessingQueueHandlerStub(
        IAzureQueue<StubMessage> queue) : base(queue)
    {
    }

    public override void Do(
        IBatchCommand<StubMessage> batchCommand)
    {
        this.Cycle(batchCommand);
    }
}
The ForCreatesHandlerForGivenQueue unit test verifies that the static For method instantiates a BatchProcessingQueueHandler correctly by using a mock queue. The DoRunsGivenCommandForEachMessage unit test verifies that the Do method causes the command to be executed against every message in the queue by using mock queue and command objects. The test uses a stub handler that overrides the Do method to run a single cycle synchronously instead of starting a background task.
References and Resources
For more information about ASP.NET routing, see "ASP.NET Routing" on MSDN:
http://msdn.microsoft.com/en-us/library/cc668201.aspx
For more information about the URL Rewrite Module for IIS, see "URL Rewrite" on IIS.net:
http://www.iis.net/download/urlrewrite
For more information about fluent APIs, see the entry for "Fluent interface" on Wikipedia:
http://en.wikipedia.org/wiki/Fluent_interface
For more information about the MapReduce algorithm, see the following:
- The entry for "MapReduce" on Wikipedia:
http://en.wikipedia.org/wiki/MapReduce
- The article, "Google patents Map/Reduce," on The H website:
http://www.h-online.com/open/news/item/Google-patents-Map-Reduce-908602.html
For information about the Task Parallel Library, see "Task Parallel Library" on MSDN:
http://msdn.microsoft.com/en-us/library/dd460717.aspx
For information about the advantages of using the Task Parallel library instead of working with the thread pool directly, see the following:
- The article, "Optimize Managed Code for Multi-Core Machines," in MSDN Magazine:
http://msdn.microsoft.com/en-us/magazine/cc163340.aspx
- The blog post, "Choosing Between the Task Parallel Library and the ThreadPool," on the Parallel Programming with .NET blog:
http://blogs.msdn.com/b/pfxteam/archive/2009/10/06/9903475.aspx
Note:
A slug name is a string where all whitespace and invalid characters are replaced with a hyphen (-). The term comes from the newsprint industry and has nothing to do with those things in your garden!
The application does not yet implement this functionality. Tailspin could decide to use ADFS, ACS, or a custom STS as its federation provider. As part of the on-boarding process, the Surveys application will have to programmatically create the trust relationship between the Tailspin FP STS and the customer's identity provider, and programmatically add any claims transformation rules to the Tailspin STS.
You could automatically suggest a location based on the user's IP address by using a service such as
Tailspin must have good estimates of expected usage to be able to estimate costs, revenue, and profit.



