June 2010

Volume 25 Number 06

AppFabric Cache - Real-world Usage and Integration

By Andrea Colaci | June 2010

Microsoft Windows Server AppFabric, formerly codenamed “Velocity,” provides a distributed cache that you can integrate into both Web and desktop applications. AppFabric caching can improve performance, scalability and availability while, from the developer perspective, behaving like a common memory cache. You can cache any serializable object, including DataSets, DataTables, binary data, XML, custom entities and data transfer objects.

The AppFabric caching client API is simple and easy to use, and the server API provides a full-featured Distributed Resource Manager (DRM) that can manage one or more cache servers (with multiple servers comprising a cache cluster). Each server provides its own memory quota, object serialization and transport, region grouping, tag-based search and expiration. The cache servers also support high availability, a feature that creates object replicas on secondary servers.

The June 2009 issue of MSDN Magazine includes a good introduction to Windows Server AppFabric by Aaron Dunnington (msdn.microsoft.com/magazine/dd861287). In this article I’m going to explain how to integrate AppFabric caching into desktop and Web applications. Along the way, I’ll provide some best practices and give some hints for taking advantage of new features in the Microsoft .NET Framework 4 and ASP.NET 4. You’ll also learn how to solve common issues that arise when using a distributed cache.

All code samples that follow come from a complete demo solution called Velocity Shop, available on CodePlex at velocityshop.codeplex.com.

Note that Windows Server AppFabric, which I’ll discuss in this article, is different from the Windows Azure platform. For more information about Windows Azure technology, see https://www.windowsazure.com/en-us/home/features/overview/.

Getting Started

You can install the current beta 2 Refresh of Windows Server AppFabric in several ways for development. The Web Platform Installer (microsoft.com/web/downloads) lets you easily set up a variety of Web development applications and frameworks through a single configurable installation. As a bonus, the Web Platform Installer is updated to include the new releases of the supported apps and frameworks.

Those who just want to install AppFabric caching will find a link to the latest release on the Windows Server AppFabric page of the Windows Server Developer Center at msdn.microsoft.com/windowsserver/ee695849.

After you complete the setup, AppFabric caching is almost ready for use. The next step is to create a named cache, a logical container used to store data. You do this through the New-Cache cmdlet in Windows PowerShell:

New-Cache -cacheName Catalog

To start using AppFabric caching in your application, just add references to CacheBaseLibrary.dll, CASBase.dll, CASMain.dll and ClientLibrary.dll in your Visual Studio project.

The client library is straightforward. The following code shows how to connect to the cache cluster, get the named cache, and store or retrieve objects:

// Describe the cache host(s) that make up the cluster
DataCacheServerEndpoint[] cacheCluster = new DataCacheServerEndpoint[1];
cacheCluster[0] = new DataCacheServerEndpoint(
  "ServerName", 22233, "DistributedCacheService");
// true = routing client, false = local cache disabled
DataCacheFactory factory = 
  new DataCacheFactory(cacheCluster, true, false);
DataCache cache = factory.GetCache("Catalog");
// Putting a product in cache
cache.Put("Product100", myProduct);
// Getting Product from cache (Get returns object, so cast it back)
Product p = (Product)cache.Get("Product100");

Before you dive into AppFabric caching, it’s a good idea to start with a bit of planning. The first step is to consider how you will set up the cache. This determines which of its features will be available to your app.

To start, you’ll want to set up a named cache for your project, as discussed earlier. With that in place, you can set custom expiration and notification policies. Your objects or collections may require different cache durations and should (or perhaps shouldn’t) be gracefully evicted from the cache when memory pressure is high. To set an absolute expiration timeout for a given named cache, use the TTL parameter with the New-Cache cmdlet.
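
To make the idea of an absolute expiration timeout concrete, here is a minimal in-process sketch. It is not how AppFabric implements expiration; the ExpiringCache class, its members and the injected clock are hypothetical names used only for illustration. The point it shows is that the deadline is fixed when the item is written and is not extended by later reads.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-process stand-in that mimics absolute expiration.
// It is not part of the AppFabric client library.
public class ExpiringCache
{
    private class Entry
    {
        public object Value;
        public DateTime ExpiresAtUtc;
    }

    private readonly Dictionary<string, Entry> _items =
        new Dictionary<string, Entry>();
    private readonly TimeSpan _timeToLive;
    private readonly Func<DateTime> _utcNow;
    private readonly object _sync = new object();

    // timeToLive plays the role of the TTL set on the named cache;
    // utcNow is injected so the behavior can be checked without waiting.
    public ExpiringCache(TimeSpan timeToLive, Func<DateTime> utcNow)
    {
        _timeToLive = timeToLive;
        _utcNow = utcNow;
    }

    public void Put(string key, object value)
    {
        lock (_sync)
        {
            // Absolute expiration: the deadline is fixed at write time.
            _items[key] = new Entry
            {
                Value = value,
                ExpiresAtUtc = _utcNow() + _timeToLive
            };
        }
    }

    public object Get(string key)
    {
        lock (_sync)
        {
            Entry entry;
            if (!_items.TryGetValue(key, out entry))
                return null;
            if (_utcNow() >= entry.ExpiresAtUtc)
            {
                // Expired: drop the item and behave like a cache miss.
                _items.Remove(key);
                return null;
            }
            return entry.Value;
        }
    }
}
```

With a ten-minute timeout, an item written at 12:00 is returned by Get at 12:09 and treated as a miss at 12:10, no matter how often it was read in between.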

Along with the named cache, you may want to configure regions. These behave like subgroups in the cache and can be used for organizing objects and simplifying the process of finding objects in the cache. For example, say my application uses a catalog of consumer electronics devices. I could create a Catalog cache in which I divide my products into regions called Televisions, Cameras and MP3 Players. To create regions, which you can do only at run time, use the DataCache.CreateRegion method and supply a region name:

// Always test if region exists
try {
  cache.CreateRegion("Televisions", false);
}
catch (DataCacheException dcex) {
  // If the region already exists it's OK;
  // otherwise rethrow, preserving the original stack trace
  if (dcex.ErrorCode != DataCacheErrorCode.RegionAlreadyExists) 
    throw;
}

Keep in mind that if a region with the same name already exists, a DataCacheException will be thrown, so you must use a proper try-catch block.

But what if you need to search for products on a feature basis? You’ll find another feature of AppFabric caching useful for that: tag-based search. This feature is available only when using regions, and it lets you attach one or more tags to each item parked in the cache for subsequent searches.

For example, say I want to find all products in the Televisions region that have both the “LED-Panel” and “46-Inches” tags. I use the GetObjectsByAllTags method, specifying a list of DataCacheTags and the region name, and that’s it. Here’s an example of tag-based search:

DataCacheServerEndpoint[] cacheCluster = GetClusterEndpoints();
DataCacheFactory factory = 
  new DataCacheFactory(cacheCluster, true, false);
DataCache cache = factory.GetCache("Catalog");
// An item must carry every tag in the list to be returned
List<DataCacheTag> tags = new List<DataCacheTag> {
  new DataCacheTag("LED-Panel"), new DataCacheTag("46-Inches") };
IEnumerable<KeyValuePair<string, object>> itemsByTag = 
  cache.GetObjectsByAllTags(tags, "Televisions");

When introducing a cache layer into an existing or new application, there are also common points that you must take into account. It helps to identify and classify data types that are good candidates for caching. Here are three such categories:

  • Reference data, which encompasses read-only data that changes infrequently, if at all—for example, a countries list, catalogs of commonly stocked items or product datasheets.
  • Activity data, which includes any data subject to change on a per-user or per-session basis, such as a shopping cart or a wish list.
  • Resource data, which is information that can vary often and is subject to more than one type of user activity, such as product inventory changes made after customers place orders.

This classification is useful when specifying expiration and notification policies for each named cache so you can obtain efficient and rational resource utilization. Remember that even if you can add cache servers to the cluster, memory always remains a finite resource.

Cache Lifecycle

When your application starts, the cache is empty. Users will be hitting the Web site, though. So how do you warm up the cache?

With the cache-aside pattern, cache-enabled applications must be able to switch to persistent storage, such as a SQL Server database, when the requested data is not in cache. This could be an expensive step in data-intensive scenarios, especially when dealing with large amounts of reference data or when the uniqueness of the cache load isn’t guaranteed. There might be a point when more than one thread, after testing the cache for an object, attempts to load data from storage and parks the data in the cache for subsequent requests. So you can incur performance penalties from both fetching non-cached data and caching data more often than necessary.
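
The duplicate-load problem described above can be sketched in a few lines. The following is a minimal in-process illustration, assuming a ConcurrentDictionary as a stand-in for the cache; the CacheAside class and its GetOrLoad method are hypothetical names, not part of the AppFabric client library. It shows one way to make sure that, when several threads miss the cache for the same key at the same time, only one of them goes to persistent storage.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: an in-process stand-in, not the AppFabric API.
public class CacheAside<T>
{
    private readonly ConcurrentDictionary<string, Lazy<T>> _cache =
        new ConcurrentDictionary<string, Lazy<T>>();

    // Returns the cached value, or runs loadFromStorage once per key
    // even if many threads miss the cache at the same time.
    public T GetOrLoad(string key, Func<string, T> loadFromStorage)
    {
        // GetOrAdd may build several Lazy wrappers under contention,
        // but only one is stored and handed back to every caller.
        Lazy<T> lazy = _cache.GetOrAdd(
            key,
            k => new Lazy<T>(
                () => loadFromStorage(k),
                LazyThreadSafetyMode.ExecutionAndPublication));

        // The first caller to read Value runs the loader;
        // the others block until the result is available.
        return lazy.Value;
    }
}
```

A call such as catalog.GetOrLoad("Product100", LoadProductFromDatabase), where LoadProductFromDatabase is your own data-access method, then behaves like a plain cache read for every caller after the first. One design caveat: with this thread-safety mode a loader exception is remembered by the Lazy wrapper, so a production version would need to evict failed entries.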

This is when something like IIS 7.5 service auto start would come in handy. With AppFabric caching you can employ a reader lock to get similar results. To enable service auto start, you have to apply some changes to applicationHost.config, as shown here:

<serviceAutoStartProviders>
  <add name="PrewarmMyApp" 
       type="MyWebApp.PreWarmUp, MyWebApp" />
</serviceAutoStartProviders>

Next you load Reference and Resource data into the named cache by implementing the Preload method with custom code:

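using System.Web.Hosting;
// Namespace and class name must match the type registered
// in serviceAutoStartProviders (MyWebApp.PreWarmUp)
namespace MyWebApp {
  public class PreWarmUp : IProcessHostPreloadClient {
    public void Preload(string[] parameters) {
      // Cache preload code here...
    }
  }
}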

This makes the Web application available only after the Preload method completes.

In server-farm scenarios such as that shown in Figure 1, you also have the problem of each cold-starting application hitting the persistent storage to load its cache. The default ASP.NET cache is tied to the appDomain in which the application runs, so each application takes the hit of loading its cache only once. In AppFabric caching, however, the cache is shared across Web servers, and consequently, across Web applications, so concurrent cache-load attempts should be avoided.

Figure 1 AppFabric Cache in a Server Farm

There are several options for delegating the cache load to a single server. One feature introduced in AppFabric caching beta 1 is the reader lock. This lets you lock a cache-item key before it’s added to the cache. The first preload code that locks the key will be the only one to start loading data associated with it. This is like booking keys before using them during cache load. The technique also enables you to distribute the cache-load operations across Web servers. You can configure each server in advance to load the data associated with specific booked keys.
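
The key-booking idea can be illustrated without committing to a particular locking call. The sketch below is an assumption-laden stand-in: two ConcurrentDictionary instances play the role of the shared cache cluster, and the KeyBookingPreloader class and its members are hypothetical names rather than AppFabric APIs. It shows only the coordination logic, in which each server tries to book a key atomically and loads the data only for the keys it won.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical sketch of "booking" keys before a cache load.
// The dictionaries stand in for state shared through the cluster.
public class KeyBookingPreloader
{
    private readonly ConcurrentDictionary<string, string> _bookings;
    private readonly ConcurrentDictionary<string, object> _cache;

    public KeyBookingPreloader(
        ConcurrentDictionary<string, string> sharedBookings,
        ConcurrentDictionary<string, object> sharedCache)
    {
        _bookings = sharedBookings;
        _cache = sharedCache;
    }

    // Tries to book each key for serverName and loads only the keys
    // it won. Returns the keys this server actually loaded.
    public List<string> Preload(
        string serverName,
        IEnumerable<string> keys,
        Func<string, object> loadFromStorage)
    {
        List<string> loaded = new List<string>();
        foreach (string key in keys)
        {
            // TryAdd is atomic: exactly one caller succeeds per key.
            if (_bookings.TryAdd(key, serverName))
            {
                _cache[key] = loadFromStorage(key);
                loaded.Add(key);
            }
        }
        return loaded;
    }
}
```

If two servers are both asked to preload an overlapping set of keys, each key is loaded from storage once, by whichever server booked it first; the other server skips it.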

Knowing when the cache is loaded is a common synchronization issue with distributed resources like cache clusters. With AppFabric caching there are a number of methods for determining when the cache is ready. One technique is to have servers poll common keys after loading their own booked key data. Other options include subscribing to a cache-ready notification or even performing the preload phase in a separate service or application, because the cache is now distributed and accessible from both Web and desktop applications.
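
The polling technique might look something like the following sketch. The CacheReadiness class and WaitUntilLoaded method are hypothetical names, and the keyExists probe is a caller-supplied delegate (for instance, a cache lookup that returns true once the key is present), so nothing here depends on a specific client API.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper; names are illustrative, not AppFabric APIs.
public static class CacheReadiness
{
    // Polls until every expected key is present or the timeout elapses.
    // Returns true if all keys appeared in time, false otherwise.
    public static bool WaitUntilLoaded(
        IEnumerable<string> expectedKeys,
        Func<string, bool> keyExists,
        TimeSpan pollInterval,
        TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        List<string> pending = new List<string>(expectedKeys);
        while (true)
        {
            // Drop the keys that have appeared since the last poll.
            pending.RemoveAll(k => keyExists(k));
            if (pending.Count == 0)
                return true;
            if (DateTime.UtcNow >= deadline)
                return false;
            Thread.Sleep(pollInterval);
        }
    }
}
```

A server that has finished loading its own booked keys could call this with the keys booked by the other servers and start serving requests only when it returns true. The timeout keeps a server from waiting forever if one of its peers never finishes.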

In addition, you can leverage the .NET Framework 4 System.Threading.Tasks.Parallel class to parallelize cache load operations, as shown in Figure 2.

Figure 2 Parallel Cache Loading

// Load catalog items from the database
SQLServerCatalogProvider catalogProvider = 
  new SQLServerCatalogProvider();
itemsInCategory = 
  catalogProvider.GetItemsByCategoryId(categoryId);
// Create the region that will hold this category's items
_helper.CreateRegion(categoryId);
Parallel.ForEach(itemsInCategory, item => {
  // Put each catalog item in cache, with tags when it has any
  if (string.IsNullOrEmpty(item.Tags))
    _helper.PutInCache(item.ProductId, item, categoryId);
  else
    _helper.PutInCache(item.ProductId, item, categoryId, item.Tags);
});
// Code from Helper class
public void PutInCache(string key, object value, 
  string region, string tags) {
  // Tags arrive as a comma-separated string; store them lowercased
  List<DataCacheTag> itemTags = new List<DataCacheTag>();
  foreach (string t in tags.Split(','))
    itemTags.Add(new DataCacheTag(t.ToLower()));
  _cache.Put(key, value, itemTags, region);
}

A cache-through feature is planned for a future release of AppFabric caching. This would enable you to automatically run custom code to load the cache when data isn’t present, and conversely, to save data to persistent storage when information has been updated in cache.

ASP.NET Integration

The ASP.NET provider model enables developers to choose from three session providers: InProc, StateServer and SQLServer. With AppFabric caching, a fourth session provider is technically available, but be careful not to confuse session with cache. Cache is about improving performance; session is about making an application stateful.

The AppFabric caching session provider for ASP.NET uses its distributed (and potentially highly available) cache as a repository for ASP.NET sessions. This is transparent and available without breaking existing code. Having such a provider enables an ASP.NET session to survive if the Web server crashes or goes offline, because sessions are stored out-of-process in the AppFabric cache.

Once AppFabric caching is installed and configured, you must create a named cache for storing ASP.NET sessions. Then you can enable DataCacheSessionStoreProvider by modifying Web.config as shown in Figure 3.

Figure 3 Enabling ASP.NET Sessions in AppFabric Cache

<?xml version="1.0"?>
<configuration>
  <configSections>
    <section name="dataCacheClient" 
      type="Microsoft.Data.Caching.DataCacheClientSection, CacheBaseLibrary" 
      allowLocation="true" allowDefinition="Everywhere"/>
    <section name="fabric" 
      type="System.Data.Fabric.Common.ConfigFile, FabricCommon" 
      allowLocation="true" allowDefinition="Everywhere"/>
    <!-- Velocity 1 of 3 END -->
  </configSections>
  <dataCacheClient deployment="routing">
    <localCache isEnabled="false"/>
    <hosts>
      <!--List of services -->
      <host name="localhost" cachePort="22233" 
        cacheHostName="DistributedCacheService"/>
    </hosts>
  </dataCacheClient>
  <fabric>
    <section name="logging" path="">
      <collection name="sinks" collectionType="list">
        <!--LOG SINK CONFIGURATION-->
        <!--defaultLevel values: -1=no tracing; 
            0=Errors only; 
            1=Warnings and Errors only; 
            2=Information, Warnings and Errors; 
            3=Verbose (all event information)-->
        <customType 
          className="System.Data.Fabric.Common.EventLogger,FabricCommon" 
          sinkName="System.Data.Fabric.Common.ConsoleSink,FabricCommon" 
          sinkParam="" defaultLevel="-1"/>
        <customType 
          className="System.Data.Fabric.Common.EventLogger,FabricCommon" 
          sinkName="System.Data.Fabric.Common.FileEventSink,FabricCommon" 
          sinkParam="DcacheLog/dd-hh-mm" defaultLevel="-1"/>
        <customType 
          className="System.Data.Fabric.Common.EventLogger,FabricCommon" 
          sinkName="Microsoft.Data.Caching.ETWSink, CacheBaseLibrary" 
          sinkParam="" defaultLevel="-1"/>
      </collection>
    </section>
  </fabric>
<appSettings/>
<connectionStrings/>
<system.web>
  <sessionState mode="Custom" customProvider="Velocity">
    <providers>
      <add name="Velocity" 
        type="Microsoft.Data.Caching.DataCacheSessionStoreProvider, ClientLibrary" 
        cacheName="session"/>
    </providers>
  </sessionState>
...

You can integrate AppFabric caching into existing ASP.NET 3.5 applications because only the server components require the .NET Framework 4; the client library works with both .NET Framework 3.5 and 4.

Another ASP.NET 4 extensibility point is output cache. Since version 1.0, ASP.NET has been able to store the generated output of pages and controls in an in-memory cache. Subsequent requests can get this output from memory instead of generating it again. ASP.NET 4 supports configuration of one or more custom output cache providers. An AppFabric output cache provider will be available after the release of ASP.NET 4, and you can roll your own by extending the OutputCacheProvider class and modifying Web.config to enable it.
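
To give a feel for what rolling your own provider involves, here is a minimal sketch that extends System.Web.Caching.OutputCacheProvider and keeps entries in process memory. The class name InMemoryOutputCacheProvider is hypothetical, and a provider backed by the distributed cache would presumably forward the same four calls to the cache client instead of a dictionary; that part is left out because it depends on the client library version you use.

```csharp
using System;
using System.Collections.Concurrent;
using System.Web.Caching;

// Minimal in-memory provider for illustration only. The class name is
// hypothetical; only the OutputCacheProvider base class is from ASP.NET 4.
public class InMemoryOutputCacheProvider : OutputCacheProvider
{
    private class Item
    {
        public object Entry;
        public DateTime UtcExpiry;
    }

    private readonly ConcurrentDictionary<string, Item> _items =
        new ConcurrentDictionary<string, Item>();

    public override object Get(string key)
    {
        Item item;
        if (!_items.TryGetValue(key, out item))
            return null;
        if (item.UtcExpiry <= DateTime.UtcNow)
        {
            // Expired entries are removed and reported as misses.
            Remove(key);
            return null;
        }
        return item.Entry;
    }

    // Add must not overwrite: if a live entry exists, return it instead.
    public override object Add(string key, object entry, DateTime utcExpiry)
    {
        object existing = Get(key);
        if (existing != null)
            return existing;
        Item stored = _items.GetOrAdd(
            key, new Item { Entry = entry, UtcExpiry = utcExpiry });
        return stored.Entry;
    }

    // Set always overwrites whatever is stored under the key.
    public override void Set(string key, object entry, DateTime utcExpiry)
    {
        _items[key] = new Item { Entry = entry, UtcExpiry = utcExpiry };
    }

    public override void Remove(string key)
    {
        Item ignored;
        _items.TryRemove(key, out ignored);
    }
}
```

To enable a provider like this, register it in the providers collection of the outputCache element under system.web/caching in Web.config and name it in the defaultProvider attribute.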

ORM Integration

Most popular object-relational mapping (ORM) frameworks provide a feature called second-level cache, a repository that stores entities, collections and query results for a configurable amount of time. When an entity is going to be loaded from persistent storage, the ORM first tests second-level cache to check whether the entity is already loaded. If so, an instance of the requested entity is passed to the calling code without hitting the database.

Each ORM implementation has its own strategy for dealing with entity associations, collections and query results, and this is equally true of second-level cache. Depending on the ORM you employ, you may find that your options for customizing and consuming second-level cache are limited, and that they force you into a particular approach in consuming cache changes. For example, if expiration policy and cache dependencies are difficult to customize, object contention may increase and cache efficiency may decrease.

Both NHibernate and Entity Framework can use AppFabric caching as a second-level cache. NHibernate enables this feature through nhibernate.caches.velocity (sourceforge.net/projects/nhcontrib/files/NHibernate.Caches/). Entity Framework enables it through EFCachingProvider by Jaroslaw Kowalski (code.msdn.microsoft.com/EFProviderWrappers).

Putting AppFabric Caching to Work

As you’ve seen, Windows Server AppFabric caching makes it easy to enable cluster-grade caching in any new or existing application. To get the full benefits of a cache, you’ll need to identify the right candidate data objects—but you may have already done that for local server caching. You’ll also need an expiration and notification policy.

AppFabric caching is available with a Windows Server 2008 license and is also available for download (msdn.microsoft.com/windowsserver/ee695849). It will also be present as a Cache role in Windows Azure.

Download AppFabric caching, set up your single or multi-server cache cluster, and if you like, download and run Velocity Shop on CodePlex to quickly evaluate and enjoy all the AppFabric caching features demonstrated inside the solution.


Andrea Colaci is a consultant and trainer with more than 10 years of experience. He has a strong curiosity and a passion for the latest languages and development tools.

Thanks to the following technical expert for reviewing this article: Scott Hanselman