Viewpoint - The Advent of Scale Computing

Oliver Sharp
Microsoft Corporation

November 2008

I believe that we are in the early stages of a transformation in how computing is done.  As I’ve thought about that change, its implications, and how we’re addressing it, the common theme that stands out for me is the idea of “scale” computing.  This is the notion that instead of writing code meant to be deployed on a single machine or a small set of machines that are tightly controlled by the deployer, we are building applications in a different way to run in large scale data centers.

As an industry, we are trying to figure out what this means – how it affects the programming model, the infrastructure, the customer experience, and the business model.  As a division and a company, we’re making a series of product investments that are motivated by this change, and I thought it might be useful to share some thoughts about them.  Our fundamental strategy as a company to address this evolution is “software + services” – providing a platform that spans the premise and the cloud to offer customers something none of our competitors can match.

6 … 12 … 18

These three numbers help capture some of the essence of what is happening.  Let’s start with 6.  Today in the enterprise, there are vast data centers of server machines.  There have been many studies of data center utilization across the industry, and the consensus is that the numbers are amazingly low.  Six percent is a typical figure (some studies put it as high as ten or twelve percent).  That means that the typical enterprise is running well over ten times as many servers as it needs for the capacity it is actually using.  This is one of the reasons that enterprises are so passionate about virtualization and about consolidating their data centers into sets of fungible machines.
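The arithmetic behind that "ten times" claim is easy to check.  The sketch below is only a back-of-the-envelope calculation: the function name is made up for illustration, and it assumes (unrealistically) that load could be packed perfectly onto fully utilized machines.

```python
def overprovision_factor(utilization: float) -> float:
    """Ratio of servers running to servers strictly needed.

    Assumption (not from any study): the same load could be packed
    perfectly onto machines running at 100% utilization.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return 1.0 / utilization


# The utilization figures mentioned in the text: 6%, 10%, and 12%.
for pct in (6, 10, 12):
    factor = overprovision_factor(pct / 100)
    print(f"{pct}% utilization -> about {factor:.1f}x the servers needed")
```

At six percent the ratio is roughly 17 to 1.  No real data center would run at full utilization, so the practical saving is smaller, but the order of magnitude is what matters here.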

VM technology is taking off, and every vendor (including us) is aggressively investing in it to meet the demand that customers have to transform their backend environments.  You need to write and manage your applications differently if you want to be able to deploy them onto a set of virtualized machines that are flexibly added and removed as needed.  You need virtualized environments and ways to manage them efficiently at (massive) scale without that effort becoming a prohibitive cost – after all, people are almost always vastly more expensive than hardware and software.

The second number, 12, represents the truly awe-inspiring levels of investment that major vendors are making in their data centers.  Between the data centers we have and the ones that are under construction, Microsoft is on its way to having twelve million square feet of server facilities.  Google spent $2.4 billion on data centers in 2007 alone.  Many of our customers do not really aspire to be in the data center business.  The smaller ones are particularly unenthusiastic about it.  They look at the facilities that companies like ours are constructing and would like to figure out how to take advantage of them.

The last number, 18, is something I heard from the manager of one of Microsoft’s highest traffic sites in the early days of the Web when traffic was going up exponentially.  I asked him what kept him up at night, and he answered “18”.  What did he mean by that?  It was the number of days until his servers were maxed out.  Every day he was bringing in new servers, and even more importantly thinking about what to do when he ran out of power and cooling in his current data centers.  He fantasized about the day when he could drive that number to 60 or more, and had enough breathing room to relax a little.  It was a vivid example for me of a life most of our customers don’t want to live.

So these three numbers have great relevance for a wide range of customers.  Enterprises want to consolidate their backend systems onto a more efficient and more easily managed infrastructure.  Some of them would like to get out of the business of building large data centers altogether: they know that they are drastically over-spending (the “six” problem) and/or they wake up in a cold sweat worrying about the hassle of managing and scaling them (the “eighteen” problem).  The more radical ones (typically the smaller and newer ones with less entrenched infrastructure) have moved entirely to the cloud so that they can let somebody else do the capital investment and manage the operational aspects – they have zero initial cost and can pay more as they need more capacity, knowing that an effectively infinite amount is available on tap.  We’ve seen a few ISVs who were traditional packaged product companies move to the cloud, and we’re seeing many of the new startups adopting that approach, leveraging services like those offered by Amazon and (soon) Azure.

Programming Model

Once upon a time, I was in graduate school and doing research on parallel programming.  It was a problem that was quite popular at the time – how to take applications (dense matrix scientific simulations, in my case) that were written in a traditional way and convert them through compilation to leverage hundreds of processors at runtime.  We were successful in analyzing the code and decomposing it so that it could take advantage of those compute resources, but the process was very brittle.  Tiny changes in the source code would radically and unpredictably change the runtime behavior, because the code’s pattern of access to state would be transformed.  The people working on this problem mostly concluded that it was not a good approach.

I took away from that experience a strong appreciation of the importance of matching your programming model to the nature of the runtime you are targeting.  If you write code with a set of assumptions about that runtime that are radically different than the reality, it is probably going to end badly.  We can’t reason about code well enough (especially if it is imperative code, where seemingly everything interesting is Turing undecidable) to automagically transform it.  I’m convinced that if we are going to write code to run well in this highly scaled environment, we have to write it differently than we do today.

So how are we going to write these apps?  We’re at an early stage in understanding it, but some models have begun to emerge.  One approach is the “federated, idempotent, compensated” application.  A great (extreme) example of this is the Electronic Funds Transfer (EFT) system.  It is one of the first massively scaled applications ever written and is still one of the biggest.  It executes financial transactions across the world, across continents and currencies, and ensures a very high degree of consistency across financial institutions that run on wildly different platforms.  How does it work?  It doesn’t use ACID transactions.  It doesn’t have some massive database center buried under the Siberian ice cap and protected by an international military force.  Instead, it is based on the idea that failures are normal and you need a way to deal with them.  It assumes that messages will be lost or retransmitted routinely.  So the system must be idempotent – a repeated message will be recognized and the repeats ignored.  It also relies on a technique called compensation.  The idea is that the different buckets of state around the world periodically exchange messages and agree that the money is available and has been transferred.  If they disagree, they notice the discrepancy and fix it (compensate for it).  This process is what it actually means for a check to “clear”.  The problem is that applications built this way are still hard to write today – the EFT system merited a massive level of effort and investment across the world.  We want to make it possible for relatively unsophisticated developers to build such applications.
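To make the two ideas concrete, here is a toy sketch of idempotent message handling plus a periodic compensation pass.  It is in no way a description of how real funds-transfer systems are built.  The `Ledger` class, the message ids, and the `reconcile` function are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """One 'bucket of state' (a hypothetical stand-in for an institution)."""
    name: str
    balance: int = 0
    log: dict = field(default_factory=dict)  # message id -> amount applied

    def apply(self, msg_id: str, amount: int) -> bool:
        """Apply a transfer message at most once; repeats are ignored."""
        if msg_id in self.log:
            return False  # duplicate or retransmission: idempotent no-op
        self.log[msg_id] = amount
        self.balance += amount
        return True


def reconcile(sender: Ledger, receiver: Ledger) -> list:
    """Compare logs and compensate for transfers the receiver never saw."""
    compensated = []
    for msg_id, amount in sender.log.items():
        if msg_id not in receiver.log:
            # The sender recorded a debit (negative amount); the receiver
            # is owed the matching credit.
            receiver.apply(msg_id, -amount)
            compensated.append(msg_id)
    return compensated


a, b = Ledger("A", balance=100), Ledger("B")

a.apply("t1", -30)
b.apply("t1", +30)
b.apply("t1", +30)   # retransmitted message: recognized and ignored

a.apply("t2", -20)   # the matching message to B is lost in transit

fixed = reconcile(a, b)  # the periodic exchange notices and repairs the gap
print(fixed, a.balance, b.balance)
```

Neither ledger ever takes a lock on the other, and nothing rolls back.  Consistency comes from recognizing repeats and from the later reconciliation step, which is the property the paragraph above describes.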

Another model is one pioneered by Google, based on map and reduce operations (their system is called, logically enough, MapReduce).  They took some very powerful ideas that have been around for quite a while in the programming language community (they were implemented decades ago in APL and reused by some of the parallel machine vendors like Thinking Machines, for example).  Google is using them to let people write applications against truly staggering amounts of sparse data stored by unreliable servers.
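The shape of the programming model can be shown in a few lines.  The following is a single-process sketch of the map and reduce idea only; it is not Google's API, and the names `map_reduce`, `word_mapper`, and `count_reducer` are invented for this example.  A real system would spread both phases across many machines and cope with their failures.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over every record, group the emitted values by key,
    then collapse each group with reducer."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for each word in a line of text."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, ones):
    """Sum the counts emitted for a single word."""
    return sum(ones)


lines = ["the cloud", "the premise and the cloud"]
counts = map_reduce(lines, word_mapper, count_reducer)
print(counts)
```

The mapper sees one record at a time and the reducer sees one key at a time, so neither depends on where the other pieces run.  That independence is what lets the work be divided among many unreliable machines.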

A third model is for applications to be model-based.  This was a basic concept that enabled the Web – you write your content in a declarative notation (HTML) and then it can be hosted on one server or a hundred or a thousand.  That worked extremely well for content and for session-based state.  We’re trying to enable a similar approach for long-running stateful workflows.  The extensive work we’re doing on the modeling platform helps to enable that model through tools, language, and runtime support.

There are others – this is just a sampling – and I am sure that additional solutions will emerge as we and the industry grapple with the problem of making it easy for people to write high-scale applications.

Heirloom Silver vs. Paper Plates

The way that you build infrastructure for massive scale is also different.  The enterprise has been doing high scale for a long time, but the Internet and the cloud have ushered in a level of scalability like nothing that has ever been seen before.  The traditional model for reliability and scale was to have extremely expensive systems managed by highly expert people.  Think mainframes, EMC disk drives, database systems with geo-scale replication/failover, etc.  These “heirloom silver” systems cost the earth and managing them is very costly – you lovingly protect them and polish them to keep their luster.  They provide extremely high guarantees of consistency, performance, and reliability … and only a few organizations can afford their cost.

The Web model for scalability is very different.  The largest scale web server farms typically have a standard configuration running on quite cheap hardware.  When a machine dies, they dump it and replace it with another.  Nobody thinks hard about individual machines – that would cost way too much.  This is the “paper plate” model.  A phrase I’ve heard, which captures the spirit really well, is “reboot, replate, replace”.  If the machine is misbehaving, you reboot it.  If that doesn’t work, you sand it down to metal and put the image back on it.  If that doesn’t work, you recycle it.
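The "reboot, replate, replace" ladder is simple enough to write down.  The sketch below is an illustration only: the machine names, fault labels, and helper functions are all hypothetical, and a real fleet manager would call out to actual power-control and imaging tools rather than edit a dictionary.

```python
def remediate(machine, is_healthy, reboot, reimage):
    """Try the cheapest fix first; return the step that worked.

    Falls through to 'replace' when neither a reboot nor a fresh
    image brings the machine back.
    """
    for step, action in (("reboot", reboot), ("replate", reimage)):
        action(machine)
        if is_healthy(machine):
            return step
    return "replace"


# Simulated fleet state (made up for this example): machine -> current fault.
faults = {
    "web-01": "hung-process",
    "web-02": "corrupt-image",
    "web-03": "bad-disk",
}


def reboot(machine):
    # A reboot only clears transient software problems.
    if faults[machine] == "hung-process":
        faults[machine] = None


def reimage(machine):
    # Putting a fresh image on the machine clears any software problem.
    if faults[machine] in ("hung-process", "corrupt-image"):
        faults[machine] = None


def is_healthy(machine):
    return faults[machine] is None


results = {m: remediate(m, is_healthy, reboot, reimage) for m in list(faults)}
print(results)
```

Nothing in the loop tries to diagnose the machine.  Each step is cheap and generic, so nobody has to think hard about any individual box.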

The advantages of the paper plate model are obvious: far lower cost for the machines and for the people who manage them, and a much easier path to arbitrary scale (if you get it right).  We have to figure out, as an industry, how to make it easy to write applications that run on paper plate machines and don’t rely on the kinds of features that heirloom silver offered.  An interesting analogy for me is TCP/IP – it offers a very simple programming model on top of packets.  Packets scale beautifully, are very friendly to heterogeneous infrastructure, but are a miserable programming experience.  With TCP on top, we got all the wonderful benefits of packet routing but still kept a programming model that mortals could use.  That’s the kind of thing we need to figure out across the board for writing scale apps in a paper plate world.
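As a reminder of what that layering buys, here is a toy sketch of one piece of it: turning packets that arrive out of order or more than once into a clean, ordered byte stream.  It is a deliberate simplification and not how TCP itself is implemented.  Real TCP numbers bytes rather than packets and also handles acknowledgement, retransmission, and flow control; the `reassemble` function is a made-up name.

```python
def reassemble(packets):
    """Rebuild an ordered byte stream from (sequence_number, payload) pairs.

    Packets may arrive out of order or duplicated; sequence numbers are
    assumed to start at 0 and count whole packets.
    """
    pending = {}   # packets received ahead of their turn
    next_seq = 0   # the next sequence number the stream is waiting for
    stream = []
    for seq, payload in packets:
        if seq < next_seq or seq in pending:
            continue  # duplicate: already delivered or already buffered
        pending[seq] = payload
        # Deliver everything that is now contiguous.
        while next_seq in pending:
            stream.append(pending.pop(next_seq))
            next_seq += 1
    return b"".join(stream)


arrived = [(1, b"lo "), (0, b"hel"), (1, b"lo "), (2, b"world")]
message = reassemble(arrived)
print(message)
```

The caller sees only the final stream.  All the mess of ordering and duplication stays below the abstraction, which is the property the analogy is pointing at.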

Our Approach

So how is Microsoft dealing with this transition?  Our basic approach is “software + services”.  This is the idea of building a platform that works in the cloud, that works on premise, and that lets customers easily write applications involving both together.  Our approach is quite different from that of companies like Google and Amazon, which are trying to build a pure cloud platform without targeting the premise at all.  There are two main reasons we are taking this approach: customer need, and our strategic opportunity.

Let’s start with the customers.  We know that they will be using the premise and the cloud together for decades.  Almost all of their software currently runs on the premise – cloud adoption is still very limited.  While we believe that the cloud is an exciting opportunity and an important change to computing, it will take many years before it represents a major fraction of the computation that our customers are doing.  Continuing to provide better and better solutions on the premise addresses a whole host of urgent customer requirements and is a vital part of continuing to grow our multi-billion dollar server business.

The strategic opportunity is also compelling.  We have very high levels of adoption in the premise as well as massive investments in the cloud, unlike anyone else in the industry.  We have the chance to build a platform that spans both, something that nobody else can do as well as we can.  By taking our current tools and frameworks and programming abstractions to the cloud, we can give customers a huge jumpstart on being productive in building cloud-based solutions.  By sharing those abstractions, we let them build mixed applications (which, as noted above, will be the reality for a very long time) in a way that nobody else can match.  And a really important aspect of this is that customers can change their minds about how they want to deploy and host their applications – they can use their own servers on premise, or hire a hosting company, or use Microsoft’s hosting facilities.

We’re at a major transition point for the industry.  It’s very early in the evolution of cloud platforms – along with the other vendors in the industry, we’re figuring out how to build them, operate them at high scale with high reliability, and provide the tools and frameworks that will let customers actually get the benefit from the huge investments we’re collectively making.  New platforms don’t come along all that often, and it will be very interesting to watch this one evolve from a cool idea into the mainstream.
