August 2015

Volume 30 Number 8

Cloud-Connected Mobile Apps - Create a Web Service with Azure Web Apps and WebJobs

By Kraig Brockschmidt

Today, many mobile apps are connected to one or more Web services that provide valuable and interesting data. When designing and developing such apps, the easiest approach to take is to make direct REST API calls to those services and then process the response in the client. However, this approach has a number of drawbacks. For one, every network call and every bit of client-side processing consumes precious battery power and bandwidth. Furthermore, extensive client-side processing can take a while to perform, especially on lower-end hardware, making the app less responsive. And different Web services might impose throttling limitations, which means that a purely client-side solution will not readily scale to a larger number of users.

As a result, it makes sense in many scenarios—especially ones that pull data from multiple sources—to create your own back end, to which you can offload certain tasks. As part of our work at Microsoft as a team that develops content for ASP.NET, Microsoft Azure and cross-platform development tools in Visual Studio, we created a specific example of this approach.

In this two-part article, we discuss our approach, some of the challenges we encountered and some of the lessons we learned while developing our application. This application, which we named “Altostratus” (an interesting type of cloud), searches Stack Overflow and Twitter for specific topics that we call “conversations.” We chose these two providers because they both have good Web APIs. The application has two main components:

  • A cloud back end, hosted on Azure. The back end periodically makes requests to the providers and aggregates the data into the form that’s best suited for the client. This avoids throttling concerns, reduces any concerns about latency in the providers, minimizes client-side processing, and reduces the number of network requests from the client. One tradeoff is that the WebJob runs every few hours, so you don’t get real-time data.
  • A lightweight mobile client app, created with Xamarin to run on Windows, Android and iOS (see Figure 1). The mobile client fetches the aggregated data from the back end and presents it to the user. It also keeps a synchronized cache of the data in a local database for a good offline experience and faster startup times.

Figure 1 The Xamarin Mobile Client Running on an Android Tablet (Left), a Windows Phone (Middle) and an iPhone (Right)

Users can optionally sign in to the mobile client using social providers (Google or Facebook). When signed in, users can set preferences that further optimize communications with the client. Specifically, they can select which subject areas to receive and the maximum number of conversations within each subject area. User preferences are stored on the back end, so a user can sign in from any client and get the same experience. To demonstrate this idea, we also created a simple Web client that talks to the same back end.

 As part of this project, we also wanted to use the application lifecycle management (ALM) tools built into Visual Studio and Visual Studio Online to manage sprints and the work backlog, and to perform automated unit testing, continuous integration (CI) and continuous deployment (CD).

This two-part article explores the details of our project. Part 1 focuses on the back end and our use of ALM tools (DevOps). We’ll also talk about some of the challenges we encountered and some of the lessons learned, such as:

  • How to securely automate the deployment of passwords to non-Web apps.
  • How to handle Azure automation time-outs.
  • Efficient and informative background processing.
  • CI/CD limitations and workarounds.

Part 2 will discuss how we used Xamarin to target multiple mobile client platforms, including authentication and maintaining a synchronized client-side cache of the data.

Architecture

Figure 2 shows the high-level architecture for the Altostratus solution.

  • On the back end, we use Azure App Service to host the Web app, and Azure SQL Database to store data in a relational database. We use Entity Framework (EF) for data access.
  • We use Azure WebJobs to run a scheduled background task that aggregates data from Stack Overflow and Twitter and writes it to the database. The WebJob can easily be extended to aggregate data from additional sources.
  • The mobile client is created using Xamarin and communicates with the back end using a simple REST API.
  • The REST API is implemented using ASP.NET Web API, which is a framework for creating HTTP services in the Microsoft .NET Framework.
  • The Web client is a relatively simple JavaScript app. We used the KnockoutJS library for data binding and jQuery for AJAX calls.

Figure 2 Altostratus Architecture

The Database Schema

We use EF Code First to define the database schema and manage the back-end SQL database. As Figure 3 shows, the database stores the following entities:

  • Provider: A data source, such as Twitter or Stack Overflow.
  • Conversation: An item from a provider. For Stack Overflow, this corresponds to a question with answers. For Twitter, it corresponds to a tweet. (It could also be a tweet with replies, but we didn’t implement that feature.)
  • Category: The subject for a conversation, such as “Azure” or “ASP.NET.”
  • Tag: A search string for a specific category and provider. These correspond to tags in Stack Overflow (“azure-web-sites”) and hash tags in Twitter (“#azurewebsites”). The back end uses these to query the providers. The end user doesn’t see them.
  • UserPreference: Stores per-user preferences.
  • UserCategory: Defines a join table for UserPreference and Category.

Figure 3 The Altostratus Data Model
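
To make the entities listed above concrete, here's a minimal sketch of what two of them might look like as EF Code First classes. The property names are assumptions inferred from Figure 3 and the REST responses shown later, not the project's exact code:

// Illustrative sketch only; the actual Altostratus entity classes may differ.
public class Provider
{
  public int ProviderID { get; set; }
  public string Name { get; set; }  // e.g.: Twitter, Stack Overflow
  public ICollection<Tag> Tags { get; set; }
  public ICollection<Conversation> Conversations { get; set; }
}

public class Conversation
{
  public int ConversationID { get; set; }
  public string Url { get; set; }
  public string Title { get; set; }
  public string Body { get; set; }
  public DateTime LastUpdated { get; set; }
  public int ProviderID { get; set; }   // FK to Provider
  public int CategoryID { get; set; }   // FK to Category
  public Provider Provider { get; set; }
  public Category Category { get; set; }
}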

In general, the Create, Read, Update, Delete (CRUD) code in the Altostratus application is typical for EF Code First. One special consideration has to do with handling database seeding and migrations. Code First creates and seeds a new database if no database exists the first time a program tries to access data. Because the first attempt to access data could happen in the WebJob, we have code in the WebJob’s Main method to make EF use the MigrateDatabaseToLatestVersion initializer:

static void Main()
{
  Task task;
  try
  {
    Database.SetInitializer<ApplicationDbContext>(
      new MigrateDatabaseToLatestVersion<ApplicationDbContext,
        Altostratus.DAL.Migrations.Configuration>());

Without this code, the WebJob would use the CreateDatabaseIfNotExists initializer by default, which doesn’t seed the database. The result would be an empty database and errors when the application tries to read data from empty tables.
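
The seeding itself lives in the EF Migrations Configuration class referenced by the initializer. The following is a hypothetical sketch of that class, assuming ApplicationDbContext exposes Providers and Categories DbSets; the seed values are placeholders, not the project's actual data:

internal sealed class Configuration :
  DbMigrationsConfiguration<ApplicationDbContext>
{
  public Configuration()
  {
    AutomaticMigrationsEnabled = false;
  }

  // Seed runs after migrations are applied; AddOrUpdate keeps it idempotent.
  protected override void Seed(ApplicationDbContext context)
  {
    // Placeholder data; the real Seed method populates the project's own
    // providers, categories and tags.
    context.Providers.AddOrUpdate(p => p.Name,
      new Provider { Name = "Twitter" },
      new Provider { Name = "Stack Overflow" });
    context.Categories.AddOrUpdate(c => c.Name,
      new Category { Name = "Azure" },
      new Category { Name = "ASP.NET" });
    context.SaveChanges();
  }
}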

Designing the WebJob

WebJobs provides an ideal solution for running background tasks, work that previously would have required a dedicated Azure Worker Role. You can run WebJobs on an Azure Web app at no additional cost. You can read about the advantages WebJobs provide over worker roles in Troy Hunt’s blog post, “Azure WebJobs Are Awesome and You Should Start Using Them Right Now!” (bit.ly/1c28yAk).

The WebJob for our solution periodically runs three functions: get Twitter data, get Stack Overflow data and purge old data. These functions are independent, but they must be run sequentially because they share the same EF context. After a WebJob completes, the Azure Portal shows the status of each function (see Figure 5). If a function completes, it’s marked with a green Success message; if it throws an exception, it’s marked with a red Failed message.
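
Each of these functions is an ordinary public static method that the WebJobs SDK locates by name when Main calls host.CallAsync. Here's a minimal sketch of what one such function might look like; the body and logging are illustrative only:

public class Functions
{
  // NoAutomaticTrigger marks a function that runs only when invoked
  // explicitly, as Main does with host.CallAsync("Twitter").
  [NoAutomaticTrigger]
  public static async Task Twitter(TextWriter log)
  {
    await log.WriteLineAsync("Querying Twitter for conversations...");
    // Call the Twitter provider and write the results to the database
    // (provider logic elided).
  }
}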

Failures are not infrequent in Altostratus because we use the free Stack Overflow and Twitter provider APIs. Queries, in particular, are limited: If you exceed the query limit, the providers return a throttling error. This was a key reason for creating the back end in the first place. Although it would be straightforward for a mobile app to make requests to these providers directly, a growing number of users could quickly reach the throttling limit. Instead, the back end can make just a few periodic requests to collect and aggregate the data.

One issue we encountered was around WebJob error handling. Normally, if any function in the WebJob throws an exception, the WebJob instance is terminated, the remaining functions don’t run and the entire WebJob run is marked as failed. In order to run all the tasks and show a failure at the function level, the WebJob must catch exceptions. If any function in Main throws, we log the last exception and re-throw, so the WebJob gets marked as Failed. The pseudo code in Figure 4 shows this approach. (The project download contains the complete code.)

Figure 4 Catching and Re-Throwing Exceptions

static void Main()
{
  // host is the WebJobs JobHost created during startup (initialization elided).
  Task task;
  try
  {
    Exception _lastException = null;
    try
    {
      // Aggregate Twitter data; remember any failure but keep going.
      task = host.CallAsync("Twitter");
      task.Wait();
    }
    catch (Exception ex)
    {
      _lastException = ex;
    }
    try
    {
      // Aggregate Stack Overflow data; remember any failure but keep going.
      task = host.CallAsync("StackOverflow");
      task.Wait();
    }
    catch (Exception ex)
    {
      _lastException = ex;
    }
    task = host.CallAsync("Purge Old Data");
    task.Wait();
    if (_lastException != null)
    {
      throw _lastException;
    }
  }
  catch (Exception ex)
  {
    // Log the exception, then re-throw so the WebJob run is marked Failed.
    Console.Error.WriteLine(ex);
    throw;
  }
}

In Figure 4, only the last exception is shown at the WebJob level. At the function level, every exception is logged, so no exceptions are lost. Figure 5 shows the dashboard for a WebJob that ran successfully. You can drill down into each function to see diagnostic output.

Figure 5 Successful WebJob Run

Designing the REST API

The mobile app communicates with the back end through a simple REST API, which we implemented using ASP.NET Web API 2. (Note that in ASP.NET 5, Web API is merged into the MVC 6 framework, making it simpler to incorporate both into a Web app.)

Figure 6 summarizes our REST API.

Figure 6 REST API Summary

GET api/categories Gets the categories.
GET api/conversations?from=iso-8601-date Gets the conversations.
GET api/userpreferences Gets the user’s preferences.
PUT api/userpreferences Updates the user’s preferences.

All responses are in JSON format. For example, Figure 7 shows an HTTP response for conversations.

Figure 7 An HTTP Response for Conversations

HTTP/1.1 200 OK
Content-Length: 93449
Content-Type: application/json; charset=utf-8
Server: Microsoft-IIS/8.0
Date: Tue, 21 Apr 2015 22:38:47 GMT
[
  {
    "Url": "https://twitter.com/815911142/status/590317675412262912",
    "LastUpdated": "2015-04-21T00:54:36",
    "Title": "Tweet by rickraineytx",
    "Body": "Everything you need to know about #AzureWebJobs is here.
      <a href=\"https://t.co/t2bywUQoft\"">https://t.co/t2bywUQoft</a>",
    "ProviderName": "Twitter",
    "CategoryName": "Azure"
  },
  // ... Some results deleted for space
]

For conversations, we don’t require the request to be authenticated. That means the mobile app can always show meaningful data without requiring the user to log in first. But we also wanted to demonstrate having the back end perform additional optimizations when a user logs in. Our client app lets the user select which categories of conversations to display, along with a conversation limit. So if a request is authenticated (meaning the user logged into the app), the back end automatically filters the response based on those preferences. This limits the amount of data that must be processed by the client, which, over time, reduces demands on bandwidth, memory, storage and battery power.

The conversations API also takes an optional “from” parameter in the query string. If specified, this filters the results to include only conversations updated after that date:

GET api/conversations?from=2015-04-20T03:59Z

This minimizes the size of the response. Our mobile client uses this parameter to ask for only the data it needs to synchronize its cache.
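
In the Web API controller, honoring the parameter is a small addition to the EF query. This is a simplified sketch; the signature and names are illustrative, and the preference filtering applied for authenticated users is omitted here:

public async Task<IHttpActionResult> GetConversations(DateTimeOffset? from = null)
{
  IQueryable<Conversation> query = db.Conversations;
  if (from != null)
  {
    // Return only conversations updated after the client's last sync point.
    DateTime cutoff = from.Value.UtcDateTime;
    query = query.Where(c => c.LastUpdated > cutoff);
  }
  var conversations = await query.ToListAsync();
  return Ok(Mapper.Map<IEnumerable<Conversation>,
    IEnumerable<ConversationDTO>>(conversations));
}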

We could have designed the API to use query strings to communicate a user’s preferences on a per-request basis, meaning those preferences would be maintained only in the client. That would have avoided the need for authentication. While this approach would work fine for basic scenarios, we wanted to provide an example that could be extended to more complicated situations where query strings would be insufficient. Storing preferences on the back end also means they’re automatically applied on every client where the same user logs in.

Data Transfer Objects (DTOs)

The REST API is the boundary between the database schema and the wire representation. We didn’t want to serialize the EF models directly:

  • They contain information the client doesn’t need, like foreign keys.
  • They can make the API vulnerable to over-posting. (Over-posting is when a client updates database fields you didn’t intend to expose for updates. It can occur when you convert an HTTP request payload directly into an EF model without validating the input sufficiently. See bit.ly/1It1wl2 for more information.)
  • The “shape” of the EF models is designed for creating database tables and isn’t optimal for the client.

Therefore, we created a set of data transfer objects (DTOs), which are just C# classes that define the format for the REST API responses. For example, here’s our EF model for categories:

public class Category
{
  public int CategoryID { get; set; }
  [StringLength(100)]
  public string Name { get; set; }  // e.g: Azure, ASP.NET
  public ICollection<Tag> Tags { get; set; }
  public ICollection<Conversation> Conversations { get; set; }
}

The category entity has a primary key (CategoryID) and navigation properties (Tags and Conversations). The navigation properties make it easy to follow relations in EF—for example, to find all the tags for a category.

When the client asks for the categories, it just needs a list of category names:

[ "Azure", "ASP.NET" ]

This conversion is easy to perform using a LINQ Select statement in the Web API controller method:

public IEnumerable<string> GetCategories()
{
  return db.Categories.Select(x => x.Name);
}

The UserPreference entity is a bit more complicated:

public class UserPreference
{
  // FK to AspNetUser table Id   
  [Key, DatabaseGenerated(DatabaseGeneratedOption.None)]
  public string ApplicationUser_Id { get; set; }
  public int ConversationLimit { get; set; }    
  public int SortOrder { get; set; }            
  public ICollection<UserCategory> UserCategory { get; set; }
  [ForeignKey("ApplicationUser_Id")]
  public ApplicationUser AppUser { get; set; }
}

ApplicationUser_Id is a foreign key to the user table. UserCategory points to a junction table, which creates a many-to-many relation between user preferences and categories.
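
The junction entity itself can be as simple as a composite key over the two foreign keys plus the navigation properties. This is an illustrative sketch, not the project's exact class:

// Hypothetical join entity between UserPreference and Category.
public class UserCategory
{
  [Key, Column(Order = 0)]
  public string ApplicationUser_Id { get; set; }
  [Key, Column(Order = 1)]
  public int CategoryID { get; set; }
  public UserPreference UserPreference { get; set; }
  public Category Category { get; set; }
}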

Here’s how we want the UserPreference entity to look to the client:

public class UserPreferenceDTO
{
  public int ConversationLimit { get; set; }
  public int SortOrder { get; set; }
  public ICollection<string> Categories { get; set; }
}

This hides the things that are implementation details of the database schema, like foreign keys and junction tables, and flattens category names into a list of strings.

The LINQ expression to convert UserPreference to UserPreferenceDTO is fairly complex, so we used AutoMapper instead. AutoMapper is a library that maps object types. The idea is to define a mapping once, and then use AutoMapper to do the mapping for you.

We configure AutoMapper when the app starts:

Mapper.CreateMap<Conversation, ConversationDTO>();
Mapper.CreateMap<UserPreference, UserPreferenceDTO>()
  .ForMember(dest => dest.Categories,
             opts => opts.MapFrom(
               src => src.UserCategory.Select(
                 x => x.Category.Name).ToList()));
              // This clause maps ICollection<UserCategory> to a flat
              // list of category names.

The first call to CreateMap maps Conversation to ConversationDTO, using AutoMapper’s default mapping conventions. For UserPreference, the mapping is less straightforward, so there’s some extra configuration.
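
For reference, ConversationDTO simply mirrors the JSON in Figure 7. Assuming the Conversation entity exposes Provider and Category navigation properties, AutoMapper's default flattening convention maps nested members such as Provider.Name to a ProviderName property, which is why the first map needs no extra configuration. A sketch, with property names taken from that response:

public class ConversationDTO
{
  public string Url { get; set; }
  public DateTime LastUpdated { get; set; }
  public string Title { get; set; }
  public string Body { get; set; }
  public string ProviderName { get; set; }  // Flattened from Provider.Name
  public string CategoryName { get; set; }  // Flattened from Category.Name
}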

Once AutoMapper is configured, mapping objects is easy:

var prefs = await db.UserPreferences
  .Include(x => x.UserCategory.Select(y => y.Category))  
  .SingleOrDefaultAsync(x => x.ApplicationUser_Id == userId);
var results = AutoMapper.Mapper.Map<UserPreference,
  UserPreferenceDTO>(prefs);

Because the query uses Include, EF retrieves the related UserCategory and Category rows in a single database query; AutoMapper then maps the materialized entity to the DTO.

Authentication and Authorization

We use a social login (Google and Facebook) to authenticate users. (The authentication process will be described in detail in Part 2 of this article.) After Web API authenticates the request, the Web API controller can use this information to authorize requests and to look up the user in the user database.

To restrict a REST API to authorized users, we decorate the controller class with the [Authorize] attribute:

[Authorize]
public class UserPreferencesController : ApiController

Now if a request to api/userpreferences isn’t authorized, Web API automatically returns a 401 error:

HTTP/1.1 401 Unauthorized
Content-Type: application/json; charset=utf-8
WWW-Authenticate: Bearer
Date: Tue, 21 Apr 2015 23:55:47 GMT
Content-Length: 68
{
  "Message": "Authorization has been denied for this request."
}

Notice that Web API added a WWW-Authenticate header to the response. This tells the client what type of authentication scheme is supported—in this case, OAuth2 bearer tokens.

By default, [Authorize] assumes that every authenticated user is also authorized, and only anonymous requests are blocked. You can also limit authorization to users in specified roles (such as Admins), or implement a custom authorization filter for more complex authorization scenarios.
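
For example, restricting a hypothetical admin-only controller to a role takes just a property on the same attribute:

// Only authenticated users in the Admins role can reach this controller.
[Authorize(Roles = "Admins")]
public class AdminController : ApiController
{
}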

The conversations API is a more interesting case: We allow anonymous requests, but apply extra logic when the request is authenticated. The following code checks whether the current request is authenticated and, if so, gets the user preferences from the database:

if (User.Identity.IsAuthenticated)
{
  string userId = User.Identity.GetUserId();
  prefs = await db.UserPreferences
    .Include(x => x.UserCategory.Select(y => y.Category))
    .Where(x => x.ApplicationUser_Id == userId).SingleOrDefaultAsync();
}

If prefs is non-null, we use the preferences to shape the EF query. For anonymous requests, we just run a default query.
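
A simplified sketch of how the preferences might shape that query follows. Property names are assumptions, DefaultLimit is a hypothetical constant, and for brevity this applies a single overall cap rather than the per-category limit the app uses:

IQueryable<Conversation> query = db.Conversations;
int limit = DefaultLimit;  // Hypothetical default for anonymous requests
if (prefs != null)
{
  // Keep only the categories the user selected.
  var categoryIds = prefs.UserCategory.Select(uc => uc.CategoryID).ToList();
  query = query.Where(c => categoryIds.Contains(c.CategoryID));
  limit = prefs.ConversationLimit;
}
var conversations = await query
  .OrderByDescending(c => c.LastUpdated)
  .Take(limit)
  .ToListAsync();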

Data Feeds

As mentioned earlier, our app pulls data from Stack Overflow and Twitter. One of our design principles was to take data from multiple, diverse sources and aggregate it into a single, normalized source. This simplifies the interaction between clients and the back end, because clients don’t need to know the data format for any particular provider. In the back end, we implemented a provider model that makes it easy to aggregate additional sources, without requiring any changes in the Web APIs or clients. Providers expose a consistent interface:

interface IProviderAPI
{
   Task<IEnumerable<Conversation>> GetConversationsAsync(
     Provider provider, Category category,
     IEnumerable<Tag> tags, DateTime from, int maxResults, TextWriter logger);
}

The Stack Overflow and Twitter APIs provide plenty of capability, but with that capability comes complexity. We found that the StacMan and LINQtoTwitter NuGet packages made working with the APIs much easier. LINQtoTwitter and StacMan are well-documented, actively supported, open source and are easy to use with C#.
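
To illustrate how the WebJob might drive this interface, here's a hedged sketch of an aggregation loop. GetProviderImplementation is a hypothetical factory, Tag is assumed to carry a ProviderID, and the parameter plumbing differs from the actual project:

static async Task AggregateAllAsync(ApplicationDbContext db,
  DateTime lastRunTime, int maxResults, TextWriter log)
{
  foreach (Provider provider in db.Providers.ToList())
  {
    // Hypothetical factory returning the StacMan- or LINQtoTwitter-backed
    // IProviderAPI implementation for this provider.
    IProviderAPI api = GetProviderImplementation(provider.Name);
    foreach (Category category in db.Categories.Include(c => c.Tags).ToList())
    {
      var tags = category.Tags.Where(t => t.ProviderID == provider.ProviderID);
      var conversations = await api.GetConversationsAsync(
        provider, category, tags, lastRunTime, maxResults, log);
      // Write or update the returned conversations in the database (elided).
    }
  }
}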

Handling Passwords and Other Secrets

We followed the article, “Best Practices for Deploying Passwords and Other Sensitive Data to ASP.NET and Azure App Service” (bit.ly/1zlNiQI), which mandates never checking passwords into source code. We store secrets only in auxiliary config files on local development machines. To deploy the app to Azure, we use Windows PowerShell or the Azure Portal.

This approach works well for the Web app. We moved the secrets out of the web.config file with the following markup:

<appSettings file="..\..\AppSettingsSecrets.config">
</appSettings>

Moving the file up two levels from the source directory means it’s completely out of the solution directory and won’t get added to source control.

The app.config file used by a console app (WebJobs) doesn’t support relative paths, but it does support absolute paths. You can use an absolute path to move your secrets out of your project directory. The following markup adds the secrets in the C:\secrets\AppSettingsSecrets.config file, and non-sensitive data in the app.config file:

<configuration>
  <appSettings file="C:\secrets\AppSettingsSecrets.config">
    <add key="TwitterMaxThreads" value="24" />
    <add key="StackOverflowMaxThreads" value="24" />
    <add key="MaxDaysForPurge" value="30" />
  </appSettings>
</configuration>
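
At run time, the WebJob reads the merged settings through the standard ConfigurationManager API; keys from the external secrets file appear alongside the non-sensitive ones. A small sketch (TwitterConsumerKey is a hypothetical secret name):

using System.Configuration;

static class JobSettings
{
  // TwitterMaxThreads comes from app.config; the consumer key is merged in
  // from C:\secrets\AppSettingsSecrets.config when the WebJob runs.
  public static int TwitterMaxThreads
  {
    get { return int.Parse(ConfigurationManager.AppSettings["TwitterMaxThreads"]); }
  }
  public static string TwitterConsumerKey
  {
    get { return ConfigurationManager.AppSettings["TwitterConsumerKey"]; }
  }
}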

To handle the secrets in our Windows PowerShell scripts, we use the Export-CliXml cmdlet to export the encrypted secrets to disk and Import-CliXml to read them back.

Automate Everything

A DevOps best practice is to automate everything. We originally wrote Windows PowerShell scripts to create the Azure resources our app requires (Web app, database, storage) and hook up all the resources, such as setting the connection string and the app settings secrets.

The Visual Studio Deployment Wizard does a good job of automating deployment to Azure, but there are still several manual steps to deploy and configure the app:

  • Entering the password of the administrator account on the Azure SQL Database.
  • Entering the app settings secrets for the Web app and WebJob.
  • Entering the WebJob storage account strings to hook up WebJob monitoring.
  • Updating the deployment URL in the Facebook and Google developer consoles, to enable OAuth social logins.

A new deployment URL requires you to update your OAuth provider authentication URL, so there’s no way to automate the last step without using a custom domain name. Our Windows PowerShell scripts create all the Azure resources we need and hook them up, so everything that can be automated is automated.

The first time you deploy a WebJob from Visual Studio, you’re prompted to set up a schedule for running the WebJob. (We run the WebJob every three hours.) At the time of this writing, there’s no way in Windows PowerShell to set a WebJob schedule. After running the Windows PowerShell creation and deployment scripts, you need the one-time additional step of deploying the WebJob from Visual Studio to set up the schedule, or you can set up a schedule on the portal.

We soon discovered that the Windows PowerShell scripts would sometimes time out while attempting to create a resource. Using the traditional Azure PowerShell approach, there’s no simple way to deal with a random resource-creation failure. Deleting any resources that were successfully created requires a non-trivial script that could possibly delete resources you didn’t intend to remove. Even when your script is successful, after you finish testing, you need to delete all the resources the script created. Keeping track of the resources to delete after a test run is non-trivial and error-prone.

Note that resource-creation timeout is not a flaw with Azure. Remotely creating complex resources, such as a data server or Web app, is inherently time-consuming. Cloud apps must be architected from the beginning to deal with timeouts and failure.

Azure Resource Manager (ARM) to the Rescue

ARM allows you to create resources as a group. This lets you easily create all your resources and handle transient faults, so if your script fails to create a resource, you can try again. Additionally, cleanup is easy. You simply delete your resource group, and all the dependent objects are automatically deleted.

Transient faults occur infrequently, and typically only one retry is necessary for the operation to succeed. The following snippet from a Windows PowerShell script shows a simple approach to implement retry logic with linear backoff when using an ARM template:

$cnt = 0
$SleepSeconds = 30
$ProvisioningState = 'Failed'
while ( [string]::Compare($ProvisioningState, 'Failed', $True) -eq 0 -and ($cnt -lt 4) )
{
  My-New-AzureResourceGroup -RGname $RGname `
    -WebSiteName $WebSiteName -HostingPlanName $HostingPlanName
  $RGD = Get-AzureResourceGroupDeployment -ResourceGroupName $RGname
  $ProvisioningState = $RGD.ProvisioningState
  Start-Sleep -s ($SleepSeconds * $cnt)
  $cnt++
}

My-New-AzureResourceGroup is a Windows PowerShell function we wrote that wraps a call to the cmdlet New-AzureResourceGroup, which uses an ARM template to create Azure resources. The New-AzureResourceGroup call will almost always succeed, but the creation of resources specified by the template can time out.

If any resource wasn’t created, the provisioning state is Failed, and the script will sleep and try again. During a retry, resources that were already successfully created aren’t recreated. The preceding script attempts three retries. (Four failures in a row almost certainly indicate a non-transient error.)

The idempotence that ARM provides is extremely useful in scripts that create many resources. We’re not suggesting you need this retry logic in all your deployments, but ARM gives you that option when it’s beneficial.  See “Using Azure PowerShell with Azure Resource Manager” (bit.ly/1GyaMzv).

Build Integration and Automated Deployment

Visual Studio Online made it easy to set up continuous integration (CI). Whenever code is checked into Team Foundation Server (TFS), it automatically triggers a build and runs the unit tests. We also employed continuous delivery (CD). If the build and automated unit tests are successful, the app is automatically deployed to our Azure test site. You can read about setting up CI/CD to Azure at the documentation page, “Continuous Delivery to Azure Using Visual Studio Online” (bit.ly/1OkMkaW).

Early in the dev cycle, when there were three or more of us actively checking in source code, we used the default build definition trigger to kick off the build/test/deploy cycle at 3 a.m., Monday through Friday. This worked well, giving us all a chance to do a quick test on the Web site when we started working each morning. As the code base stabilized and check-ins were less frequent but perhaps more critical, we set the trigger to CI mode, so each check-in would trigger the process. When we got to the “Code clean up” phase, with frequent low-risk changes, we set the trigger to Rolling builds, where the build/test/deployment cycle is triggered at most every hour. Figure 8 shows a build summary that includes deployment and test coverage.

Figure 8 A Build Summary

Wrapping Up

In this article, we looked at some of the considerations when creating a cloud back end that aggregates and processes data and serves it to mobile clients. No individual piece of our sample app is terribly complicated, but there are a lot of moving parts, which is typical of cloud-based solutions. We also looked at how Visual Studio Online made it possible for a small team to run continuous builds and continuous deployment without a dedicated DevOps manager.

In Part 2, we’ll look in detail at the client app and how Xamarin Forms made it easy to target multiple platforms with a minimum of platform-specific code. We’ll also delve into the mysteries of OAuth2 for social login.


Rick Anderson works as a senior programming writer for Microsoft, focusing on ASP.NET MVC, Microsoft Azure and Entity Framework. You can follow him on Twitter at twitter.com/RickAndMSFT.

Kraig Brockschmidt works as a senior content developer for Microsoft and is focused on cross-platform mobile apps. He’s the author of “Programming Windows Store Apps with HTML, CSS and JavaScript” (two editions) from Microsoft Press and blogs on kraigbrockschmidt.com.

Tom Dykstra is a senior content developer at Microsoft, focusing on Microsoft Azure and ASP.NET.

Erik Reitan is a senior content developer at Microsoft. He focuses on Microsoft Azure and ASP.NET. Follow him on Twitter at twitter.com/ReitanErik.

Mike Wasson is a content developer at Microsoft. For many years he documented the Win32 multimedia APIs. He currently writes about Microsoft Azure and ASP.NET.

Thanks to the following technical experts for reviewing this article: Michael Collier, John de Havilland, Brady Gaster, Ryan Jones, Vijay Ramakrishnan and Pranav Rastogi