March 2012

Volume 27 Number 03

Forecast: Cloudy - Exploring Cloud Architecture

By Joseph Fultz | March 2012

I decided a while back that at least one of my columns would cover general architectural considerations and some base high-level designs for creating cloud-based solutions. I know from all of the sessions I’ve presented at or attended that if a technical audience doesn’t get to play with some code, the sessions usually aren’t well received. But I’m hoping this article will benefit folks by at least helping to frame their thinking about solution architecture and giving it some basic targets.

I’ll be using Platform as a Service (PaaS)—in particular, Azure—as the contextual corkboard on which to hang the ideas and considerations, but the ideas I want to convey should also apply to Infrastructure as a Service (IaaS) and Software as a Service (SaaS) with little translation. I’ll point out some of the key design considerations when creating a cloud solution, categorize the platform tools Azure provides, illustrate some base designs and provide a decision matrix for selecting your design style.

Cloud Considerations

Seasoned developers and architects might get into a habit of identifying some problem characteristics, matching a pattern and pronouncing a design. I’m going to highlight some typical designs for cloud solutions, but I highly encourage taking an active approach to solving problems versus just using a cookie cutter from the previous batch of solutions that were baked up. The best approach is to design a solution that’s ideal and then start overlaying conditions that the cloud adds and conditions of the enterprise ecosystem. Add them one at a time and solve any associated problems, because just like math or proper debugging, if you add too many variables, you’ll introduce errors whose source can’t be easily ascertained.

What does it mean to design a “cloud solution”? I want to start with a strong emphasis that a Web Role queuing to a Worker Role and then picking the result up off a return queue isn’t the defining style of cloud design. In many regards, using such an architecture as the base from which to launch a design might actually reduce the overall benefit of a cloud implementation. As private cloud infrastructure becomes readily available, the line blurs between cloud and not-cloud because the delineating factor isn’t location. So the focus for differentiation needs to be on what the cloud does rather than what it is or where it is.

Here are the things that using Azure as a means to host my Web or service applications does for me:

  • Frees me from infrastructure acquisition, setup or management
  • Frees me from constrained capacity; computing and storage capacity are available on demand
  • Frees me from steep capital cost and frees me to adjust cost based on consumption

I don’t mean to trivialize it, but that’s it; no need to complicate it. There are certainly a lot of nontrivial features within the cloud platform, but those aren’t what differentiate the cloud from a more traditional hosted solution. Individual features (queuing, storage, structured storage and so on) that are added to a platform such as Azure may be new for the platform, but they aren’t fundamentally new software capabilities. The challenges the cloud shift addresses directly are compute and storage elasticity and capital expense, along with ongoing care and feeding of the hardware infrastructure. This is good news for those designing software because it means designs and patterns remain mostly the same, with a little extra emphasis on two considerations: cost and latency. Certainly these two concerns were important before, but the cloud changes the equation enough to require special attention.

The Cost Factor

As those of you seasoned by cloud implementations already know, the cloud isn’t always cheaper. Fortunately, cost efficiency doesn’t equate to cheap or even free. A cost is efficient when the equation can’t be changed to reduce it without also reducing the services or functionality, meaning that there’s no cost above and beyond what’s necessarily consumed. There’s no doubt that there are cost optimizations to be had for an enterprise switching from a private datacenter to a cloud-based solution set, but to really see that benefit, the enterprise needs to commit to two guiding principles: a shift toward cloud-first solutions and design for cost optimization.

There must be a shift toward cloud computing and away from the private hosting scenario. It’s hard to realize the cost benefit of hosting in the cloud while there’s unused or underused infrastructure available to consume that has already been purchased and is in place. Thus, as systems age or new software comes on board, you should look to the cloud first instead of buying more infrastructure bandwidth. There are times when owning the datacenter is indeed cheaper, but for most enterprises that’s not the reality. Contract and bulk rates aside, generally the closer you get to consumption-based cost, the closer you get to an efficient cost model.

With the first step taken, the next step is to ensure you’re not a cloud glutton. Each application needs to be designed around optimizing for cost. While the first principle is often difficult from a business perspective, the second can be a painful pill to swallow in a technical design. That’s because there will be times when designing to optimize for cost means putting cost ahead of performance or elegance. Usually I spend a good bit of my effort creating equilibrium between performance and elegance, but throwing cost optimization into the picture complicates things further by turning my 2D problem into a 3D one.

Optimizing cost means being minutely aware of the following:

  • Ingress and egress, both to the end consumer and to the corporate backbone
  • Retention policy for data in storage so as to not use unneeded space
  • How to responsibly scale up and scale down compute resources

If a design takes into account these three considerations, then it’s definitely moving in the right direction.
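As a rough illustration of how those three considerations feed a consumption-based bill, here is a minimal Python sketch. The names MonthlyUsage, Rates and monthly_cost, and every rate in it, are hypothetical placeholders rather than actual Azure pricing; the point is only that trimming retained data and scaling down idle instances shows up directly in the total.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    ingress_gb: float      # data moved into the cloud datacenter
    egress_gb: float       # data moved out to consumers or the corporate backbone
    stored_gb: float       # average data retained in cloud storage
    instance_hours: float  # compute instances multiplied by hours running


@dataclass
class Rates:
    # Hypothetical placeholder rates, NOT real Azure prices.
    per_ingress_gb: float = 0.0
    per_egress_gb: float = 0.125
    per_stored_gb: float = 0.125
    per_instance_hour: float = 0.125


def monthly_cost(usage: MonthlyUsage, rates: Rates) -> float:
    """Sum the consumption-based charges for one month."""
    return (
        usage.ingress_gb * rates.per_ingress_gb
        + usage.egress_gb * rates.per_egress_gb
        + usage.stored_gb * rates.per_stored_gb
        + usage.instance_hours * rates.per_instance_hour
    )


rates = Rates()

# Baseline: two instances always on (2 * 720 hours), nothing ever purged.
baseline = MonthlyUsage(ingress_gb=200, egress_gb=500,
                        stored_gb=1000, instance_hours=1440)

# Same traffic, but a retention policy purges old data and one instance
# is shut down for half of the month (720 + 360 hours).
trimmed = MonthlyUsage(ingress_gb=200, egress_gb=500,
                       stored_gb=400, instance_hours=1080)

savings = monthly_cost(baseline, rates) - monthly_cost(trimmed, rates)
```

Swapping in real contract rates and measured usage would turn the same arithmetic into a quick check on whether a design change is worth its cost.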

The Latency Factor

Traditionally, latency becomes an issue when the information that must be served to the end user exists in one or more of these states:

  • It’s in a legacy or otherwise constrained system
  • It exists across systems and must be aggregated first
  • It requires one or more transformations before being handed to the UI
  • It exists outside of the enterprise and must be remotely fetched

For a cloud-based solution, any and all of these items might be true, but it’s the last item that applies without exception to designing enterprise solutions for the cloud. Certainly the other items will likely be factors as well, but a universal truth for enterprises today is that data will have to be somehow synchronized between the cloud solutions and the corporate data stores in the home datacenter.

I’ll cover a few ways of integrating data in the base designs I set forth here, but it really comes down to two primary strategies: data synchronization and real-time service calls back home.

High-Level Design

Thinking about the generalization of design for enterprise solutions, one can usually lump each of the big gears into one of these categories:

  • Compute
  • Storage
  • Integration

This is how I’ll categorize the Azure platform capabilities in this context. The grid in Figure 1 shows a subset of the Azure platform capabilities.

Figure 1 Platform Toolbox Categories

Azure                 Compute         Storage          Integration & Services
Compute               Web Roles
                      Worker Roles
                      VM Role
Storage                               Blob Storage     Queue Storage
                                      Table Storage
Formerly AppFabric                                     Service Bus
                                                       Connect
                                                       Access Control
                                                       Caching
SQL Azure                             Database
                      Reporting
                                                       SQL Azure Data Sync

Focusing on the big bricks of compute, storage and integration, you can easily put together some base designs from which to start and expand or even recombine, as shown in Figure 2.

Figure 2 Basic High-Level Designs

While the designs shown in Figure 2 aren’t representative of all the possible permutations, they’re bigger blocks of design that have their purpose:

  • Direct Data Access: This is usually the most basic implementation where the front-end Web interface not only accesses data directly but also carries the burden of all the work.
  • Services Indirection: This is a service-oriented architecture (SOA) design that’s the next level of indirection and workload balancing, moving data access and business logic off of the roles serving the UI. A good SOA design can not only reduce the fragility of the solution but also reduce the load on the back end and optimize the data responses to the client.
  • Queue Indirection: Work is offloaded to background workers via Azure Storage Queues. This is a first step at optimizing the workload, but it’s generally suitable only when the client can accept that the result of the offloaded work won’t come back in near-real time.
  • Service Bus Integration: Adding in a pub/sub facility to the services infrastructure can really augment a SOA design by moving the layer of indirection away from the endpoints and moving toward topics of interest. This design can provide the most flexible implementation.
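To make the Queue Indirection style concrete, here is a minimal Python sketch. It uses the standard library’s in-process queue.Queue purely as a stand-in for an Azure Storage Queue, and the function names (web_role_submit, worker_role_loop, run_demo) are hypothetical. It shows only the shape of the pattern: the front end enqueues work and returns immediately, and a background worker picks the work up later.

```python
import queue
import threading

work_queue = queue.Queue()  # stand-in for an Azure Storage Queue (assumption)
results = {}                # stand-in for a store the UI would check later


def web_role_submit(order_id, line_totals):
    """Front end: enqueue the work and return without waiting for it."""
    work_queue.put((order_id, line_totals))
    return "accepted"


def worker_role_loop():
    """Background worker: process messages until a None sentinel arrives."""
    while True:
        message = work_queue.get()
        if message is None:
            work_queue.task_done()
            break
        order_id, line_totals = message
        results[order_id] = sum(line_totals)  # the offloaded "work"
        work_queue.task_done()


def run_demo():
    worker = threading.Thread(target=worker_role_loop)
    worker.start()
    status = web_role_submit("order-1", [1, 2, 3])
    work_queue.put(None)  # tell the worker to stop after the queued work
    work_queue.join()     # wait until every message has been processed
    worker.join()
    return status, dict(results)
```

Note that the caller gets "accepted" back before the result exists, which is exactly why this style fits work that doesn’t need an immediate answer.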

Corporate Backplane Design

The base designs don’t address the corporate backplane specifically. Two of the most important pieces of an Azure solution are how it connects to the corporate systems that contain proprietary business logic and how it accesses corporate data. There are some basic designs for the corporate backplane that are used by themselves or in combination, much like the previous designs, as shown in Figure 3.

Figure 3 Corporate Backplane Basic Designs

One basic way to connect your cloud solutions to your corporate ecosystem is to simply use Azure Connect as discussed by Anna Skobodzinski, one of my colleagues from the Microsoft Technology Center (MTC), at bit.ly/wnyqj9. This is depicted in the VPN Style design in Figure 3. The existing systems can then be used as is, but with a latency tax of about a second. Additionally, it’s worth doing some work ahead of time to make sure the systems being exposed to the cloud applications are ready for the additional load. If you need a solution in a hurry, the cloud is a great option; if you need it in a hurry and it has to integrate with existing interfaces, VPN Style integration might be the fastest and easiest way to mitigate risk to your timeline.

The most common type of integration that I come across is the type depicted under Data Integration Style. This requires some means of synchronizing data in one or both directions. There are numerous ways to accomplish this, each with its own set of pros and cons. Using an extract, transform and load (ETL) tool such as SQL Server Integration Services (SSIS) is probably the lowest barrier to entry for most. SSIS as a means to synchronize data also has a very mature ecosystem for monitoring and management. The next-easiest mechanism is to use SQL Azure Data Sync, which is currently available as a preview (bit.ly/p14qC6). It’s built on Sync Framework technology that I covered in my January 2011 (msdn.microsoft.com/magazine/gg535668) and February 2011 (msdn.microsoft.com/magazine/gg598920) columns. Using it will require some changes to the tables being synchronized. With both the SSIS and Sync Framework methods, it’s a good idea to move only the fields of data that are actually needed between the cloud and the corporate premises. By not moving data unnecessarily, you keep down the cost of ingress, egress and overall cloud storage usage.
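The advice to move only the necessary fields can be sketched in a few lines of Python. This is an illustration rather than anything SSIS or Sync Framework does for you: the field list, the sample rows and the helper names (project_for_sync, payload_bytes) are all hypothetical, and the JSON length is just a crude proxy for transfer size.

```python
import json

# Hypothetical column list: only what the cloud application actually reads.
SYNC_FIELDS = ("customer_id", "name", "region")


def project_for_sync(rows, fields=SYNC_FIELDS):
    """Return copies of the rows containing only the fields to synchronize."""
    return [{field: row[field] for field in fields} for row in rows]


def payload_bytes(rows):
    """Approximate transfer size as the UTF-8 length of the JSON encoding."""
    return len(json.dumps(rows).encode("utf-8"))


# Sample corporate rows with wide columns the cloud side never uses.
corporate_rows = [
    {"customer_id": 1, "name": "Contoso", "region": "West",
     "credit_notes": "internal only " * 20, "audit_blob": "x" * 500},
    {"customer_id": 2, "name": "Fabrikam", "region": "East",
     "credit_notes": "internal only " * 20, "audit_blob": "y" * 500},
]

to_cloud = project_for_sync(corporate_rows)
full_size = payload_bytes(corporate_rows)
trimmed_size = payload_bytes(to_cloud)
```

Comparing full_size with trimmed_size on a realistic sample is a quick way to see how much ingress, egress and storage a narrower projection would save.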

Moving the needle forward a little bit and taking some of the work away from the database back end is the Direct Services Style shown in Figure 3, where data needed from corporate systems is fetched via a set of exposed services. The two big upsides here are that the cost of data transport is removed and the data is always current. Less positively, it does introduce latency and put additional strain on systems that might not be ready to be incorporated into a cloud solution. It also implies that most of the intelligence has to live in the corporate systems and services, as they will have the bulk of the data to analyze.

The last of the four patterns in Figure 3 shows the use of a service bus to integrate the cloud and the corporate systems. This Service Bus Style would entail that both the running cloud roles and the corporate systems publish and subscribe to data and events that are relevant to each of them. This has about the same pros and cons as the Direct Services Style of architecture, with the added benefit of maximum flexibility for creating new publishers and consumers of information.
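The publish-and-subscribe relationship described here can be sketched with a tiny in-process broker. The TopicBus class below is a hypothetical stand-in written for illustration, not the Azure Service Bus API, and the topic names are made up. It shows why the style is flexible: publishers address topics rather than endpoints, so a new consumer can be added without changing any publisher.

```python
from collections import defaultdict


class TopicBus:
    """In-process stand-in for a pub/sub service bus (illustration only)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return how many got it."""
        handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = TopicBus()
cloud_inbox = []      # what a cloud role has received
corporate_inbox = []  # what a corporate system has received

# Each side subscribes to the topics that are relevant to it.
bus.subscribe("inventory/updated", cloud_inbox.append)
bus.subscribe("orders/placed", corporate_inbox.append)

# Each side publishes without knowing who, if anyone, is listening.
bus.publish("orders/placed", {"order_id": 42, "sku": "A-100"})
bus.publish("inventory/updated", {"sku": "A-100", "on_hand": 7})

# A new consumer is added later without touching any publisher.
analytics_inbox = []
bus.subscribe("orders/placed", analytics_inbox.append)
delivered = bus.publish("orders/placed", {"order_id": 43, "sku": "B-200"})
```

A real bus adds durability, security and cross-network delivery, all of which this sketch deliberately leaves out.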

The decision matrix is a tool to help identify potential problem areas when considering the use of one of the basic design styles over another. The decision matrix in Figure 4 isn’t an absolute, but rather reflects my experience in working with folks on cloud architecture. There isn’t necessarily a clear winner, and in many cases a real design will incorporate some mix of these elements. For example, one might deploy a new Web application that has a set of REST services working against a back-end database that’s synchronized via an ETL run twice per day. Figure 5 shows an example of such a combined design.
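The freshness trade-off in that twice-a-day ETL example comes down to simple arithmetic, sketched below in Python. The function name is hypothetical, and the calculation assumes the runs are evenly spaced and ignores how long each run takes, so treat the result as a rough lower bound on worst-case staleness.

```python
def worst_case_staleness_hours(runs_per_day: int) -> float:
    """Longest time data can lag behind its source, assuming evenly
    spaced synchronization runs and ignoring the duration of each run."""
    if runs_per_day < 1:
        raise ValueError("runs_per_day must be at least 1")
    return 24 / runs_per_day


twice_daily = worst_case_staleness_hours(2)  # the ETL schedule in the example
hourly = worst_case_staleness_hours(24)      # a more aggressive schedule
```

More frequent runs buy freshness at the price of more data movement, which ties this consideration back to cost.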

Figure 4 Decision Matrix

Criteria (columns): Time to Solution | Solution Complexity | Monitoring & Management | Agility & Scalability | Latency | Freshness

Basic Designs
  Direct Data Access
  Queue Indirection
  Services Indirection
  Service Bus
Backplane Designs
  VPN
  Data Integration
  Services Integration*
  Service Bus

* Service infrastructure already available in corporate infrastructure

Figure 5 Example Combined Design

Additionally, there’s a set of file-based resources in Azure Storage that’s used by the site and is kept in sync by a custom Sync Framework implementation. This high-level design is the end result of collecting information and making some informed guesses, but it’s only the start of the work. It provides the target for end design and implementation but leaves a lot of details to be decided (for example, caching, access control and so on).

Design and Cost Efficiency

I’ve tried to tease out and identify the most important considerations when designing a cloud solution, in particular one to be incorporated into an existing enterprise. Hopefully, boiling off some of the plethora of features and capabilities helps to clear the path for design. The real trick to designing a cloud solution is to take an ideal solution and make the design efficient in the traditional sense while also making it cost efficient. As the cloud movement puts enterprise-level computing within reach of anyone willing to lean over and flip the switch, an elegant design will be one that accounts for consumption cost.

I touched upon a base Web design and a base corporate integration design, but left all of the details for a future drill-down; it would simply take too much space to cover them at once. This is the only column I’ve written that focuses solely on design with no associated implementation, so I hope it proves useful. If you’d like me to drill down into some of the detail design pieces, please comment on this article on msdn.microsoft.com/magazine.


Joseph Fultz is a software architect at Hewlett-Packard Co., working as part of the HP.com Global IT group. Previously he was a software architect for Microsoft, working with its top-tier enterprise and ISV customers defining architecture and designing solutions.

Thanks to the following technical expert for reviewing this article: George Huey