Performance Testing Guidance for Web Applications
J.D. Meier, Carlos Farre, Prashant Bansode, Scott Barber, and Dennis Rea
Microsoft Corporation
September 2007
Objectives
- Understand the key concepts of stress testing.
- Learn how to stress-test a Web application.
Overview
Stress testing is a type of performance testing
focused on determining an application’s robustness, availability, and
reliability under extreme conditions. The goal of stress testing is to identify
application issues that arise or become apparent only under extreme conditions.
These conditions can include heavy loads, high concurrency, or limited computational
resources. Proper stress testing is useful in finding synchronization and
timing bugs, interlock problems, priority problems, and resource loss bugs. The
idea is to stress a system to the breaking point in order to find bugs that
will make that break potentially harmful. The system is not expected to process
the overload without adequate resources, but to behave (e.g., fail) in an acceptable
manner (e.g., not corrupting or losing data).
Stress tests typically involve simulating one or more key
production scenarios under a variety of stressful conditions. For example, you
might deploy your application on a server that is already running a
processor-intensive application; in this way, your application is immediately
“starved” of processor resources and must compete with the other application
for processor cycles. You can also stress-test a single Web page or even a
single item such as a stored procedure or class.
This chapter presents a high-level introduction to
stress-testing a Web application. Stress testing can help you identify
application issues that surface only under extreme conditions.
Examples of Stress Conditions
Examples of stress conditions include:
- Excessive volume in terms of either users or data; examples
might include a denial of service (DoS) attack or a situation where a
widely viewed news item prompts a large number of users to visit a Web
site during a three-minute period.
- Resource reduction such as a disk drive failure.
- Unexpected sequencing.
- Unexpected outages/outage recovery.
Examples of Stress-Related Symptoms
Examples of stress-related symptoms include:
- Data is lost or corrupted.
- Resource utilization remains unacceptably high after the
stress is removed.
- Application components fail to respond.
- Unhandled exceptions are presented to the end user.
How to Use This Chapter
Use this chapter to understand
the key concepts of stress testing and the steps involved in stress-testing a Web
application. To get the most from this chapter:
- Use the “Input” and “Output” sections to
understand the key inputs for stress-testing a Web application and the key
outcomes of this type of testing.
- Use the “Approach for Stress Testing”
section to get an overview of the approach for stress-testing
a Web application, and as a quick reference guide for you and your team.
- Use the various steps sections to understand the details
of each step involved in stress-testing a Web application.
- Use the “Usage Scenarios for Stress Testing” section to
understand various real-world scenarios where stress testing is employed.
Input
To perform stress testing, you are likely to use as
reference one or more of the following items:
- Results from previous stress tests
- Application usage characteristics (scenarios)
- Concerns about those scenarios under extreme conditions
- Workload profile characteristics
- Current peak load capacity (obtained from load testing)
- Hardware and network architecture and data
- Disaster-risk assessment (e.g., likelihood of blackouts,
earthquakes, etc.)
Output
Output from a stress test may include:
- Measures of the application under stressful conditions
- Symptoms of the application under stress
- Information the team can use to address robustness,
availability, and reliability
Approach for Stress Testing
The following steps are involved in stress-testing a Web
application:
- Step 1 - Identify test objectives. Identify the
objectives of stress testing in terms of the desired outcomes of the
testing activity.
- Step 2 - Identify key scenario(s). Identify the
application scenario or cases that need to be stress-tested to identify
potential problems.
- Step 3 - Identify the workload. Identify the
workload that you want to apply to the scenarios identified during the
“Identify key scenario(s)” step. This is based on the workload and peak load
capacity inputs.
- Step 4 - Identify metrics. Identify the metrics
that you want to collect about the application’s performance. Base these
metrics on the potential problems identified for the scenarios you
identified during the “Identify key scenario(s)” step.
- Step 5 - Create test cases. Create the test cases in
which you define steps for running a single test, as well as your expected
results.
- Step 6 - Simulate load. Use test tools to simulate
the required load for each test case and capture the metric data results.
- Step 7 - Analyze results. Analyze the metric data
captured during the test.
These steps are graphically represented below; the following
sections discuss each step in detail.
Figure 18.1 Stress Testing Steps
Step 1 - Identify Test Objectives
Asking yourself or others the following questions can help
in identifying the desired outcomes of your stress testing:
- Is the purpose of the test to identify the ways the
system can possibly fail catastrophically in production?
- Is it to provide information to the team in order to
build defenses against catastrophic failures?
- Is it to identify how the application behaves when
system resources such as memory, disk space, network bandwidth, or processor
cycles are depleted?
- Is it to ensure that functionality does not break under
stress? For example, there may be cases where operational performance
metrics meet the objectives, but the functionality of the application is
failing to meet them — orders are not inserted in the database, the
application is not returning the complete product information in searches,
form controls are not being populated properly, redirects to custom error
pages are occurring during the stress testing, and so on.
Step 2 - Identify Key Scenario(s)
To get the most value out of a stress test, the test needs
to focus on the behavior of the usage scenario or scenarios that matter most to
the overall success of the application. To identify these scenarios, you
generally start by defining a single scenario that you want to stress-test in
order to identify a potential performance issue. Consider these guidelines when
choosing appropriate scenarios:
- Select scenarios based on how critical they are to overall
application performance.
- Try to test those operations that are most likely to
affect performance. These might include operations that perform intensive
locking and synchronization, long transactions, and disk-intensive
input/output (I/O) operations.
- Base your scenario selection on the specific areas of your
application identified as potential bottlenecks by load-testing data. Although
you should have fine-tuned and removed the bottlenecks after load testing,
you should still stress-test the system in these areas to verify how well
your changes handle extreme stress levels.
Examples of scenarios that may need to be stress-tested separately
from other usage scenarios for a typical e-commerce application include the
following:
- An order-processing scenario that updates the inventory
for a particular product. This functionality has the potential to exhibit
locking and synchronization problems.
- A scenario that pages through search results based on user
queries. If a user specifies a particularly wide query, there could be a
large impact on memory utilization. For example, memory utilization could
be affected if a query returns an entire data table.
Step 3 - Identify the Workload
The load you apply to a particular scenario should stress
the system sufficiently beyond threshold limits to enable you to observe the
consequences of the stress condition. One method to determine the load at which
an application begins to exhibit signs of stress is to incrementally increase
the load and observe the application behavior under various load conditions.
The key is to systematically test with various workloads until you create a
significant failure. These variations may be accomplished by adding more users,
reducing delay times, adding or reducing the number and type of user activities
represented, or adjusting test data.
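The incremental approach described above can be sketched as a simple step-load loop. This is a minimal illustration rather than a prescribed implementation: `run_at_load` is a hypothetical hook standing in for whatever your test tool does to apply a given number of simulated users, and the start, step, and maximum values are assumptions chosen only for the example.

```python
from typing import Callable, Iterator, Optional


def step_loads(start: int, step: int, maximum: int) -> Iterator[int]:
    """Yield user counts from start up to maximum (inclusive) in fixed steps."""
    users = start
    while users <= maximum:
        yield users
        users += step


def find_breaking_load(
    run_at_load: Callable[[int], bool],
    start: int,
    step: int,
    maximum: int,
) -> Optional[int]:
    """Return the first load at which run_at_load reports a failure, else None.

    run_at_load is a hypothetical hook: it should apply the given number of
    simulated users and return True if the application behaved acceptably.
    """
    for users in step_loads(start, step, maximum):
        if not run_at_load(users):
            return users
    return None


if __name__ == "__main__":
    # Stand-in for a real test run: pretend the system fails above 450 users.
    def fake_run(users: int) -> bool:
        return users <= 450

    print(find_breaking_load(fake_run, start=100, step=100, maximum=1000))
```

The same loop could vary delay times or test data instead of user counts; only the meaning of the stepped value changes.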
For example, a stress test could be designed to simulate
every registered user of the application attempting to log on during one
30-second period. This would simulate a situation where the application
suddenly became available again after a period of downtime and all users were
anxiously refreshing their browsers, waiting for the application to come back
online. Although this situation does not occur frequently in the real world, it
does happen often enough for there to be real value in learning how the
application will respond if it does.
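To make the logon example concrete, the sketch below spreads a number of logon attempts at random across a fixed window and reports the average arrival rate. The user count of 6,000 is a hypothetical figure, not one taken from this chapter, and the function names are illustrative only.

```python
import random
from typing import List


def logon_start_times(user_count: int, window_seconds: float, seed: int = 0) -> List[float]:
    """Return sorted start offsets (in seconds) spreading logons across the window."""
    rng = random.Random(seed)  # seeded so that a test run can be repeated
    return sorted(rng.uniform(0.0, window_seconds) for _ in range(user_count))


def average_arrival_rate(user_count: int, window_seconds: float) -> float:
    """Average number of logon attempts per second over the window."""
    return user_count / window_seconds


if __name__ == "__main__":
    # Hypothetical: 6,000 registered users all logging on within 30 seconds.
    offsets = logon_start_times(6000, 30.0)
    print(len(offsets), average_arrival_rate(6000, 30.0))
```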
Remember to represent the workload with accurate and
realistic test data — type and volume, different user logins, product IDs,
product categories, and so on — allowing you to simulate important failures
such as deadlocks or resource consumption.
The following activities are generally useful in identifying
appropriate workloads for stress testing:
- Identify the distribution of work. For each key
scenario, identify the distribution of work to be simulated. The
distribution is based on the number and type of users executing the
scenario during the stress test.
- Estimate peak user loads. Identify the maximum
expected number of users during peak load conditions for the application.
Using the work distribution you identified for each scenario, calculate
the percentage of user load per key scenario.
- Identify the anti-profile. As an alternative, you
can start by applying an anti-profile to the normal workload. In an anti-profile,
the workload distributions are inverted for the scenario under
consideration. For example, if the normal load for the order-processing
scenario is 10 percent of the total workload, the anti-profile would be 90
percent of the total workload. The remaining load can be distributed among
the other scenarios. Using an anti-profile can serve as a valuable
starting point for your stress tests because it ensures that the critical
scenarios are subjected to loads beyond the normal load conditions.
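The anti-profile calculation can be sketched as follows. The chapter says only that the remaining load "can be distributed among the other scenarios"; distributing it in proportion to each scenario's normal share is an assumption made for this example. The scenario names other than order processing, and the peak of 1,000 users, are also hypothetical.

```python
from typing import Dict


def anti_profile(normal: Dict[str, float], scenario: str) -> Dict[str, float]:
    """Invert the share of one scenario and spread the rest over the others.

    Assumption: the remaining load is split among the other scenarios in
    proportion to their normal shares.
    """
    if scenario not in normal:
        raise KeyError(scenario)
    total = sum(normal.values())
    others_total = total - normal[scenario]
    if others_total == 0:
        return dict(normal)  # only one scenario, nothing to invert against
    inverted = total - normal[scenario]   # e.g. 10 percent becomes 90 percent
    remaining = total - inverted          # load left for the other scenarios
    result = {}
    for name, share in normal.items():
        if name == scenario:
            result[name] = inverted
        else:
            result[name] = remaining * share / others_total
    return result


def users_per_scenario(profile: Dict[str, float], peak_users: int) -> Dict[str, int]:
    """Convert percentage shares into a user count for each scenario."""
    total = sum(profile.values())
    return {name: round(peak_users * share / total) for name, share in profile.items()}


if __name__ == "__main__":
    normal = {"order_processing": 10.0, "search": 60.0, "browse": 30.0}
    stressed = anti_profile(normal, "order_processing")
    print(stressed)
    print(users_per_scenario(stressed, peak_users=1000))
```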
Step 4 - Identify Metrics
When identified and captured correctly, metrics provide
information about how well or poorly your application is performing as compared
to your performance objectives. In addition, metrics can help you identify
problem areas and bottlenecks within your application.
Using the desired performance characteristics identified during
the “Identify objectives” step, identify metrics to be captured that focus on
potential pitfalls for each scenario. The metrics can be related to both
performance and throughput goals as well as providing information about
potential problems; for example, custom performance counters that have been
embedded in the application.
When identifying metrics, you will use either direct
objectives or indicators that are directly or indirectly related to those
objectives. The following table describes performance metrics in terms of related
performance objectives.
| Category | Performance metrics |
| Base set of metrics | |
| Processor | |
| Process | Memory consumption; Processor utilization; Process recycles |
| Memory | Memory available; Memory utilization |
| Disk | |
| Network | |
| Transactions/business metrics | Transactions/sec; Transactions succeeded; Transactions failed; Orders succeeded; Orders failed |
| Threading | Contentions per second; Deadlocks; Thread allocation |
| Response times | |
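As an illustration of how the transaction and business metrics named above might be derived from raw test output, the sketch below aggregates a list of recorded transactions. The `Transaction` record and its fields are hypothetical; a real test tool will have its own result format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transaction:
    kind: str        # hypothetical label, e.g. "order" or "search"
    succeeded: bool


def summarize(transactions: List[Transaction], duration_seconds: float) -> Dict[str, float]:
    """Compute transaction and order metrics for one test run."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    succeeded = sum(1 for t in transactions if t.succeeded)
    orders = [t for t in transactions if t.kind == "order"]
    orders_succeeded = sum(1 for t in orders if t.succeeded)
    return {
        "transactions_per_sec": len(transactions) / duration_seconds,
        "transactions_succeeded": succeeded,
        "transactions_failed": len(transactions) - succeeded,
        "orders_succeeded": orders_succeeded,
        "orders_failed": len(orders) - orders_succeeded,
    }


if __name__ == "__main__":
    run = [
        Transaction("order", True),
        Transaction("order", False),
        Transaction("search", True),
        Transaction("search", True),
    ]
    print(summarize(run, duration_seconds=2.0))
```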
Step 5 - Create Test Cases
Identifying workload profiles and key scenarios generally
does not provide all of the information necessary to implement and execute test
cases. Additional inputs for completely designing a stress test include
performance objectives, workload characteristics, test data, test environments,
and identified metrics. Each test design should mention the expected results
and/or the key data of interest to be collected, in such a way that each test
case can be marked as a “pass,” “fail,” or “inconclusive” after execution.
The following is an example of a test case based on the
order-placement scenario.
Test 1 – Place Order Scenario
- Workload: 1,000 simultaneous users.
- Think time: Use a random think time between 1 and
10 seconds in the test script after each operation.
- Test Duration: Run the test for two days.
Expected results:
- Application hosting process should not recycle because of
deadlock or memory consumption.
- Throughput should not fall below 35 requests per second.
- Response time should not be greater than 7 seconds for 95
percent of total transactions completed.
- “Server busy” errors should not be more than 10 percent of
the total response because of contention-related issues.
- Order transactions should not fail during test execution.
Database entries should match the “Transactions succeeded” count.
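The expected results above can be expressed as automated checks so that a run is marked “pass,” “fail,” or “inconclusive.” The sketch below is one possible encoding. The `RunResult` fields are hypothetical names for data your tooling would need to supply, the 95th percentile uses the nearest-rank method as an assumption, and treating a run with no recorded responses as inconclusive is likewise a choice made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunResult:
    process_recycles: int          # recycles caused by deadlock or memory
    min_requests_per_sec: float    # lowest throughput observed in the run
    response_times: List[float]    # seconds, one per completed transaction
    server_busy_errors: int
    total_responses: int
    orders_failed: int
    transactions_succeeded: int
    database_entries: int


def percentile_95(values: List[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def evaluate(result: RunResult) -> str:
    """Return "pass", "fail", or "inconclusive" for the Place Order test."""
    if not result.response_times or result.total_responses == 0:
        return "inconclusive"  # no data collected, so nothing can be judged
    checks = [
        result.process_recycles == 0,
        result.min_requests_per_sec >= 35.0,
        percentile_95(result.response_times) <= 7.0,
        result.server_busy_errors / result.total_responses <= 0.10,
        result.orders_failed == 0,
        result.database_entries == result.transactions_succeeded,
    ]
    return "pass" if all(checks) else "fail"
```

Keeping each expected result as a separate entry in `checks` makes it straightforward to report which criterion failed if you extend the sketch.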
Step 6 - Simulate Load
After you have completed the previous steps to an
appropriate degree, you should be ready to simulate the load and execute the
stress test. Typically, test execution follows these steps:
- Validate that the test environment matches the
configuration that you were expecting and/or designed your test for.
- Ensure that both the test and the test environment are
correctly configured for metrics collection.
- Before running the test, execute a quick “smoke test” to
make sure that the test script and remote performance counters are working
correctly.
- Reset the system (unless your scenario is to do otherwise)
and start a formal test execution.
Note: Make sure that the client (a.k.a. load
generator) computers that you use to generate load are not overly stressed. Utilization
of resources such as processor and memory should remain low enough to ensure
that the load-generation environment is not itself a bottleneck.
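For a rough sense of what a load generator does, the sketch below runs a number of concurrent simulated users, each pausing for a random think time between operations. It is a toy illustration under stated assumptions, not a substitute for a dedicated load-testing tool: `operation` is a hypothetical callable standing in for a real request, and threads in a single Python process will themselves become a bottleneck long before realistic stress levels are reached.

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def simulate_users(
    operation: Callable[[], bool],
    user_count: int,
    operations_per_user: int,
    think_time_range: Tuple[float, float],
    seed: int = 0,
) -> List[Tuple[float, bool]]:
    """Run operation from concurrent threads; return (elapsed_seconds, ok) samples."""
    samples: List[Tuple[float, bool]] = []
    lock = threading.Lock()

    def user_session(user_index: int) -> None:
        rng = random.Random(seed + user_index)  # per-user think-time sequence
        for _ in range(operations_per_user):
            started = time.perf_counter()
            try:
                ok = operation()
            except Exception:
                ok = False  # an unhandled error counts as a failed operation
            elapsed = time.perf_counter() - started
            with lock:
                samples.append((elapsed, ok))
            time.sleep(rng.uniform(*think_time_range))  # random think time

    with ThreadPoolExecutor(max_workers=user_count) as pool:
        list(pool.map(user_session, range(user_count)))
    return samples


if __name__ == "__main__":
    # Quick "smoke test" with a stand-in operation and very short think times.
    results = simulate_users(lambda: True, user_count=20, operations_per_user=3,
                             think_time_range=(0.0, 0.01))
    print(len(results), all(ok for _, ok in results))
```

Running a small configuration like the one under `__main__` first mirrors the smoke-test step: it confirms that the script and result collection work before a formal run.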
Step 7 - Analyze Results
Analyze the captured data and compare the results against
the accepted level for each metric. If the results indicate that your required
performance levels have not been attained, analyze and fix the cause of the
bottleneck. To address observed issues, you might need to do one or more of the
following:
- Perform a design review.
- Perform a code review.
- Run stress tests in environments where it is possible to
debug possible causes of failures during test execution.
In situations where performance issues are observed, but
only under conditions that are deemed too unlikely to warrant tuning
at the current time, you may want to consider conducting additional tests to
identify an early indicator for the issue in order to avoid unwanted surprises.
Usage Scenarios for Stress Testing
The following are examples of how stress testing is applied
in practice:
- Application stress testing. This type of test
typically focuses on more than one transaction on the system under stress,
without the isolation of components. With application stress testing, you
are likely to uncover defects related to data locking and blocking,
network congestion, and performance bottlenecks on different components or
methods across the entire application. Because the test scope is a single
application, it is common to use this type of stress testing after a
robust application load-testing effort, or as a last test phase for
capacity planning. It is also common to find defects related to race
conditions and general memory leaks from shared code or components.
- Transactional stress testing. Transactional stress
tests aim at working at a transactional level with load volumes that go
beyond those of the anticipated production operations. These tests are
focused on validating behavior under stressful conditions, such as high
load with the same resource constraints as when testing the entire application.
Because the test isolates an individual transaction, or group of
transactions, it allows for a very specific understanding of throughput
capacities and other characteristics for individual components without the
added complication of inter-component interactions that occurs in testing
at the application level. These tests are useful for tuning, optimizing,
and finding error conditions at the specific component level.
- Systemic stress testing. In this type of test,
stress or extreme load conditions are generated across multiple
applications running on the same system, thereby pushing the boundaries of
the applications’ expected capabilities to an extreme. The goal of
systemic stress testing is to uncover defects in situations where
different applications block one another and compete for system resources
such as memory, processor cycles, disk space, and network bandwidth. This
type of testing is also known as integration stress testing or consolidation
stress testing. In large-scale systemic stress tests, you stress all
of the applications together in the same consolidated environment. Some
organizations choose to perform this type of testing in a larger test lab
facility, sometimes with the hardware or software vendor’s assistance.
Exploratory Stress Testing
Exploratory stress testing is an approach to
subjecting a system, application, or component to a set of unusual parameters
or conditions that are unlikely to occur in the real world but are nevertheless
possible. In general, exploratory testing can be viewed as an interactive
process of simultaneous learning, test design, and test execution. Most often,
exploratory stress tests are designed by modifying existing tests and/or
working with application/system administrators to create unlikely but possible
conditions in the system. This type of stress testing is seldom conducted in
isolation because its usual purpose is to determine whether more systematic stress
testing of a particular failure mode is called for. The following are
some examples of exploratory stress tests to determine the answer to “How will
the system respond if…?”
- All of the users logged on at the same time.
- The load balancer suddenly failed.
- All of the servers started their scheduled virus scan at
the same time during a period of peak load.
- The database went offline during peak usage.
Summary
Stress testing allows you to identify potential application issues
that surface only under extreme conditions. Such conditions range from exhaustion
of system resources such as memory, processor cycles, network bandwidth, and
disk capacity to excessive load due to unpredictable usage patterns, common in
Web applications.
Stress testing centers on objectives and key user
scenarios with an emphasis on the robustness, reliability, and stability of the
application. The effectiveness of stress testing relies on applying the correct
methodology and being able to effectively analyze testing results. Applying the
correct methodology is dependent on the capacity for reproducing workload
conditions for both user load and volume of data, reproducing key scenarios,
and interpreting the key performance metrics.