
Designing for Scalability

Visual Studio .NET 2003

Good design is the foundation of a highly scalable application. At no other point in an application's lifecycle can a decision have a greater impact on its scalability than during the design phase.

The Scalability Pyramid

As the scalability pyramid indicates, fast hardware, software, and tuning are only a small part of the scalability equation. Design forms the base of the pyramid and has the greatest influence on scalability; each layer above the base has progressively less ability to affect it. The point of the pyramid is that smart design can add more scalability to an application than hardware can.

When designing for scalability, the primary goal is to ensure efficient resource management. Designing for scalability is not limited to any particular tier or component of an application. Application architects must consider scalability at all levels, from the user interface to the data store. The five commandments of designing for scalability below can be useful when making design choices.

The Five Commandments of Designing for Scalability

Do Not Wait

A process should never wait longer than necessary. Each time slice during which a process holds a resource is a time slice during which no other process can use that resource. You can divide processing into two categories: synchronous and asynchronous.

There are times when applications must perform actions synchronously. Some actions must wait for a result before continuing, or must verify that an action succeeded in order to ensure atomicity; that is, all of the actions associated with an operation must succeed or fail as a unit before another operation begins. However, when applications are limited in this manner, resources become a source of contention that negatively impacts scalability.

One way to achieve scalability is to perform operations asynchronously. When you operate asynchronously, long-running operations are queued and completed later by a separate process.

For example, some e-commerce sites perform credit card validation during the checkout process. This can become a bottleneck on a high-volume e-commerce site if there is difficulty with the validation service. For e-commerce sites that must physically ship a product to fulfill an order, this process is a good candidate for asynchronous operation. Because possession of the product does not shift from the seller to the buyer during the online transaction, the retailer can complete the credit card validation offline. The application can then send e-mail to the customer confirming the order after validation of the credit card transaction.
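One way to sketch this on the .NET platform is to hand the order off to a message queue during checkout and let a separate process validate the card and send the confirmation e-mail. In the following sketch, the Order class and the MSMQ queue path are hypothetical, and the private queue is assumed to exist already; this is one possible approach, not a prescribed implementation.

    using System.Messaging;

    // A minimal sketch: the Order class and queue path are placeholders.
    public class Order
    {
        public int OrderId;
        public decimal Total;
    }

    public class CheckoutProcessor
    {
        public void SubmitOrder(Order order)
        {
            // Queue the order for offline credit card validation instead of
            // blocking the checkout request on the validation service.
            using (MessageQueue queue = new MessageQueue(@".\Private$\OrderValidation"))
            {
                // A separate process reads this queue, validates the card,
                // and sends the confirmation e-mail to the customer.
                queue.Send(order, "Pending credit card validation");
            }
        }
    }

The checkout page returns to the customer as soon as the message is queued, so a slow or unavailable validation service no longer ties up Web server resources.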

Do Not Fight for Resources

Contention for resources is the root cause of all scalability problems. It should come as no surprise that an application without enough memory, processor cycles, bandwidth, or database connections to meet demand cannot scale.

Regardless of design, all distributed applications possess a finite amount of resources. Besides expediting processes by not waiting on long-running operations, you can take other steps to avoid resource contention.

You should order resource usage from plentiful to scarce. For example, when a transaction involves resources that are scarce and therefore subject to contention, use those resources as late as possible. That way, a transaction that aborts early does not prevent or delay a successful process from using them.

Acquire resources as late as possible and then release them as soon as possible. The shorter the amount of time that a process is using a resource, the sooner the resource will be available to another process. For example, return database connections to the pool as soon as possible.
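The following sketch illustrates this with ADO.NET; the connection string and query are placeholders. The connection is opened only when it is actually needed, and the using blocks guarantee that it is closed and returned to the pool as soon as the command completes, even if an exception occurs.

    using System.Data.SqlClient;

    public class ProductData
    {
        // A minimal sketch: acquire the connection as late as possible and
        // release it as soon as possible.
        public static int GetProductCount(string connectionString)
        {
            // Do any preparatory work before the connection is opened.
            const string query = "SELECT COUNT(*) FROM Products";

            using (SqlConnection connection = new SqlConnection(connectionString))
            using (SqlCommand command = new SqlCommand(query, connection))
            {
                connection.Open();
                return (int)command.ExecuteScalar();
            }   // Dispose closes the connection here, returning it to the pool.
        }
    }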

If possible, do not use a contended resource at all. For example, a process sometimes incurs the cost of a transaction for a function that does not require one. Place methods that require a transaction in a component separate from those that do not; as a result, you avoid creating a transaction when it is not needed.
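With COM+ serviced components, for example, you might declare the transaction requirement only on the component whose methods actually need it, along the lines of the following sketch. The class names are hypothetical, and a real assembly would still need the usual COM+ registration.

    using System.EnterpriseServices;

    // Only OrderProcessor pays the cost of a COM+ transaction.
    [Transaction(TransactionOption.Required)]
    public class OrderProcessor : ServicedComponent
    {
        public void PlaceOrder(int productId, int quantity)
        {
            // ... transactional work against the database ...
            ContextUtil.SetComplete();
        }
    }

    // Read-only lookups live in a separate component that never
    // creates a transaction.
    [Transaction(TransactionOption.Disabled)]
    public class ProductCatalog : ServicedComponent
    {
        public string GetDescription(int productId)
        {
            return "Sample description for product " + productId;
        }
    }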

Design for Commutability

Designing for commutability is one of the most commonly overlooked ways to reduce resource contention. Two or more operations are said to be commutative if they can be applied in any order and still produce the same result. Operations that you can perform in the absence of a transaction are typically good candidates.

For example, a busy e-commerce site that continuously updates the inventory of its products could experience contention for record locks as products come and go. To prevent this, each inventory increment and decrement could become a record in a separate inventory transaction table. Periodically, the database sums the rows of this table for each product and then updates the product records with the net change in inventory.
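A sketch of the insert side of this design follows; the InventoryTransaction table, its columns, and the connection string are assumptions for illustration. Because every adjustment is a new row, concurrent operations never compete for a lock on the product record, and a periodic job can apply the summed deltas to the product table.

    using System.Data;
    using System.Data.SqlClient;

    public class InventoryData
    {
        // A minimal sketch: record each inventory change as its own row.
        // Inserts commute, so the order in which they arrive does not matter.
        public static void RecordInventoryChange(
            string connectionString, int productId, int quantityChange)
        {
            const string sql =
                "INSERT INTO InventoryTransaction (ProductId, QuantityChange, CreatedOn) " +
                "VALUES (@ProductId, @QuantityChange, GETDATE())";

            using (SqlConnection connection = new SqlConnection(connectionString))
            using (SqlCommand command = new SqlCommand(sql, connection))
            {
                command.Parameters.Add("@ProductId", SqlDbType.Int).Value = productId;
                command.Parameters.Add("@QuantityChange", SqlDbType.Int).Value = quantityChange;
                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }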

Design for Interchangeability

Whenever you can generalize a resource, you make it interchangeable. In contrast, each time you add detailed state to a resource, you make it less interchangeable.

Resource pooling schemes take advantage of interchangeable resources. COM+ component pooling and ODBC connection pooling are both examples of resource pooling of interchangeable resources.

For example, if a database connection is unique to a specific user, you cannot pool the connection for other users. Instead, database connections that are to be pooled should use role-based security, which associates connections with a common set of credentials. For connection pooling to work, all details in the connection string must be the same. Also, database connections should be explicitly closed to ensure their return to the pool as soon as possible. Relying on automatic disconnection to return the connection to the pool is a poor programming practice.
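The sketch below illustrates the point; the server, database, and application account names are placeholders. Because every request builds exactly the same string, all requests draw from a single pool, whereas a string built from each user's own credentials would create a separate, mostly idle pool per user.

    using System.Data.SqlClient;

    public class ConnectionFactory
    {
        // A minimal sketch: one shared application account, so the connection
        // string is identical for every caller and the connections pool together.
        private const string SharedConnectionString =
            "Server=SQL01;Database=Store;User ID=StoreApp;Password=AppSecret;";

        public static SqlConnection Create()
        {
            // Callers should wrap the returned connection in a using block so
            // that it is explicitly closed and returned to the pool.
            return new SqlConnection(SharedConnectionString);
        }
    }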

The concept of interchangeability supports the argument to move state out of your components. Requiring components to maintain state between method calls defeats interchangeability and ultimately hurts scalability. Instead, each method call should be self-contained. Store state outside the component when it is needed across method calls. A good place to keep state is in a database. When calling a method of a stateless component, any state required by that method can either be passed in as a parameter or read from the database. At the end of the method call, preserve any state by returning it to the method caller or writing it back to the database.
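The following sketch shows the shape of such a stateless method; the class and its inputs are hypothetical. Nothing is kept in a field between calls: the state arrives as parameters and the result is handed back to the caller (in a real component it might instead be read from and written back to the database).

    public class OrderCalculator
    {
        // A minimal sketch of a stateless method: no instance state survives
        // between calls, so any pooled instance can service any caller.
        public decimal CalculateTotal(decimal[] lineItemPrices, decimal taxRate)
        {
            decimal subtotal = 0m;
            foreach (decimal price in lineItemPrices)
            {
                subtotal += price;
            }
            // The result is returned to the caller rather than stored in a field.
            return subtotal * (1m + taxRate);
        }
    }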

Interchangeability extends beyond resource pooling. Server-side page caching for a Web application will most likely increase its scalability. Although personalization can give a user a unique experience, it comes at the expense of creating a custom presentation that you cannot reuse for another user.

Partition Resources and Activities

Finally, you should partition resources and activities. By minimizing relationships between resources and between activities, you minimize the risk of creating bottlenecks resulting from one participant of the relationship taking longer than the other. Two resources that depend on one another will live and die together.

Partitioning of activities can help ease the load that you place on high-cost resources. For example, using SSL entails a significant amount of overhead to provide a secure connection, so it is sensible to use SSL only for pages that actually require the increased security. In addition, you can dedicate Web servers to handling SSL sessions.

Transactions provide another opportunity for partitioning of activities. By separating methods that do not require transactions from those that do, you do not needlessly impose the overhead required for a transaction on methods that do not require one.

However, partitioning is not always a good choice. Partitioning can make your system more complex. Dividing resources that have dependencies can add costly overhead to an operation.
