Performance Tuning Overview
Performance tuning is the main activity associated with performance management. Reduced to its most basic level, tuning consists of finding and eliminating bottlenecks. A bottleneck is a condition that occurs, and is revealed, when a piece of hardware or software in a server approaches the limits of its capacity.
Before starting the performance tuning cycle, you must do some preparatory work that establishes the framework for ongoing performance tuning activities. You should:
- Identify constraints — A site's business case determines priorities, which in turn establish boundaries. Constraints, such as maintainability and budget limits, are factors that you cannot alter in the search for higher performance. You must focus performance work on factors that are not constrained.
- Specify the load — This involves determining what services the site's clients require and the level of demand for those services. The most common metrics for specifying load are the number of clients, client think time (the amount of time between the receipt of a reply by a client and the subsequent submission of a new request), and load distribution (steady or fluctuating, average, and peak load).
- Set performance goals — Performance goals must be explicit, which involves identifying the metrics used for tuning as well as their corresponding benchmark values. Total system throughput and response time are two common metrics used to measure performance. After identifying the performance metrics, you must establish quantifiable and reasonable benchmark values for each one.
Note: Because performance and capacity are so closely related, the constraints, load, and goals that you identify are also applicable to capacity planning.
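One way to make the load specification and the performance goals explicit is to record them as data. The following is a minimal sketch in Python; the class names, field names, and numeric values are hypothetical and do not come from any particular tool. It also applies the interactive response time law for closed systems (throughput = clients / (response time + think time)) to estimate the throughput that a given client population would generate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadSpec:
    """Hypothetical description of the load a site must handle."""
    average_clients: int   # typical number of concurrent clients
    peak_clients: int      # number of concurrent clients at peak load
    think_time_s: float    # seconds between a reply and the next request


@dataclass(frozen=True)
class PerformanceGoals:
    """Hypothetical benchmark values for two common metrics."""
    min_throughput_rps: float    # requests per second
    max_response_time_s: float   # seconds


def estimated_throughput(clients, think_time_s, response_time_s):
    """Interactive response time law: X = N / (R + Z)."""
    return clients / (response_time_s + think_time_s)


def goals_met(goals, throughput_rps, response_time_s):
    """True when both measured values satisfy the stated goals."""
    return (throughput_rps >= goals.min_throughput_rps
            and response_time_s <= goals.max_response_time_s)


load = LoadSpec(average_clients=200, peak_clients=400, think_time_s=9.5)
goals = PerformanceGoals(min_throughput_rps=15.0, max_response_time_s=1.0)

# 200 clients, 0.5 s response time, 9.5 s think time: 200 / 10.0
throughput = estimated_throughput(load.average_clients, load.think_time_s, 0.5)
print(throughput, goals_met(goals, throughput, 0.5))  # 20.0 True
```

Writing the goals down as explicit values in this way gives every later test run something concrete to be compared against.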
After establishing the boundaries and expectations for performance tuning, you can begin the tuning cycle, which is an iterative series of controlled performance experiments.
The Tuning Cycle
Repeat the four phases of the tuning cycle (collecting data, analyzing the results, configuring a change, and testing) until you achieve the performance goals that you established prior to starting the tuning process.
The Collecting phase is the starting point of any tuning exercise. During this phase, you are simply gathering data with the collection of performance counters that you have chosen for a specific part of the system. These counters could be for the network, the server, or the back-end database.
Regardless of the part of the system you are tuning, you require a baseline measurement against which to compare performance changes. You need to establish a pattern of system behavior when the system is idling as well as when the system is executing specific tasks. Therefore, you can use your first data-gathering pass to establish a baseline set of values for the system's behavior. The baseline establishes the typical counter values that you would expect to see when the system is behaving satisfactorily.
Note: Baseline performance is a subjective standard. You must set a baseline that is appropriate for your work environment and that best reflects your system's workload and service demands.
After you have collected the performance data that you require for tuning the selected part of the system, you need to analyze the data to determine bottlenecks. Remember that a performance number is only an indicator. It does not necessarily identify the actual bottleneck, because a performance problem can often be traced back to multiple sources. It is also common for problems in one system component to result from problems in another component. A memory shortage is a good example of this: it is indicated by increased disk and processor use.
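A baseline can be summarized as a few statistics per counter, captured separately for the idle state and for the loaded state. The following is a minimal sketch that uses only the Python standard library; the sample values are made up for illustration, and in practice they would come from a monitoring log. The three-standard-deviation tolerance is an assumption, not a recommended setting.

```python
import statistics


def summarize(samples):
    """Return baseline statistics for one counter's samples."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),
        "max": max(samples),
    }


def deviates_from_baseline(value, baseline, tolerance=3.0):
    """Flag a value more than `tolerance` standard deviations from the mean."""
    return abs(value - baseline["mean"]) > tolerance * baseline["stdev"]


# Hypothetical processor-utilization samples (percent)
idle_samples = [2.0, 3.0, 2.5, 3.5, 4.0]
load_samples = [40.0, 45.0, 50.0, 55.0, 60.0]

baseline = {
    "idle": summarize(idle_samples),
    "load": summarize(load_samples),
}

print(baseline["load"]["mean"])                          # 50.0
print(deviates_from_baseline(95.0, baseline["load"]))    # True
print(deviates_from_baseline(52.0, baseline["load"]))    # False
```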
The following points, taken from the Microsoft Windows 2000 Resource Kit, provide guidelines for interpreting counter values and eliminating false or misleading data that might cause you to set inappropriate target values for tuning.
- Monitoring processes of the same name — Watch for unusually large values for one instance and not the other. Sometimes, the System Monitor misrepresents data for separate instances of processes with the same name by reporting the combined values of the instances as the value of a single instance. You can work around this by tracking processes by the process identifier.
- Monitoring several threads — When you are monitoring several threads and one of them stops, the data for one thread might appear to be reported for another. This is because of the way threads are numbered. You can get around this by including the thread identifiers of the process's threads in your log or display. Use the Thread/Thread ID counter for this purpose.
- Intermittent spikes in data values — Do not give too much weight to occasional spikes in data. These might be due to the startup of a process and are not an accurate reflection of counter values for that process over time. Counters that average, in particular, can cause the effect of spikes to linger over time.
- Monitoring over an extended period — We recommend using graphs instead of reports or histograms because the latter views only show the last values and averages. As a result, you might not get an accurate picture of values when you are looking for spikes.
- Excluding start-up events — Unless you have a specific reason for including start-up events in your data, exclude these events because the temporarily high values they produce tend to skew overall performance results.
- Zero values or missing data — Investigate all occurrences of zero values or missing data. These can hamper your ability to establish a meaningful baseline.
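Several of these guidelines can be applied mechanically before analysis. The sketch below assumes that samples arrive as a list of numbers, with `None` marking a missing reading. It drops a warm-up window, sets aside zero or missing values for investigation, and uses the median so that an occasional spike does not dominate the summary. The function name, the warm-up length, and the data are illustrative assumptions.

```python
import statistics


def clean_samples(samples, warmup=2):
    """Drop start-up samples and separate suspicious readings.

    Returns (usable, suspicious_indexes). The indexes are positions in
    the original list, after the warm-up window, that hold zero or
    missing values.
    """
    usable = []
    suspicious = []
    for index, value in enumerate(samples):
        if index < warmup:
            continue                      # exclude start-up events
        if value is None or value == 0:
            suspicious.append(index)      # investigate; do not average in
            continue
        usable.append(value)
    return usable, suspicious


# Hypothetical counter readings: two start-up samples, one gap, one zero, one spike
raw = [98.0, 91.0, 30.0, 32.0, None, 31.0, 0, 95.0, 29.0]
usable, suspicious = clean_samples(raw)

print(usable)                      # [30.0, 32.0, 31.0, 95.0, 29.0]
print(suspicious)                  # [4, 6]
print(statistics.median(usable))   # 31.0
```

In this data the single spike of 95.0 pulls the mean well above the typical reading, while the median stays at 31.0.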
After you have collected your data and completed the analysis of the results, you can determine which part of the system is the best candidate for a configuration change and implement this change.
The cardinal rule for implementing changes is: implement only one configuration change at a time. A problem that appears to relate to a single component might be the result of bottlenecks involving several components. For this reason it is important to address problems individually. If you make multiple changes simultaneously, it may be impossible to accurately assess the impact of each change.
After implementing a configuration change, you must complete the appropriate level of testing to determine the impact of the change on the system that you are tuning. At this point, it is a matter of determining whether or not the change:
- Improved performance — Did the change improve performance, and if so, by how much?
- Degraded performance — Did the change cause a bottleneck somewhere else?
- Had no impact on performance — Did the change have any noticeable impact at all on performance?
If performance improves to the anticipated level, you can stop. If not, you must step through the tuning cycle again.
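The three outcomes listed above can be told apart by comparing the same metric before and after the change. The following sketch assumes a metric where higher is better, such as throughput, and treats differences within a noise threshold as no impact. The 5 percent threshold and the function name are assumptions to adjust for your own measurements.

```python
def classify_change(before, after, noise_fraction=0.05):
    """Classify one configuration change by its effect on a metric.

    Assumes higher values are better (for example, requests per second).
    Changes smaller than `noise_fraction` of the baseline count as noise.
    """
    if before <= 0:
        raise ValueError("baseline measurement must be positive")
    relative = (after - before) / before
    if relative > noise_fraction:
        return "improved", relative
    if relative < -noise_fraction:
        return "degraded", relative
    return "no impact", relative


# Hypothetical throughput measurements (requests per second)
print(classify_change(200.0, 250.0))   # ('improved', 0.25)
print(classify_change(200.0, 204.0))   # ('no impact', 0.02)
print(classify_change(200.0, 150.0))   # ('degraded', -0.25)
```

For a metric where lower is better, such as response time, the two comparisons would be reversed.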
Tip: You can obtain the monitoring results produced by your testing from monitoring log files, which are exportable to Microsoft Excel, and from the Event log.
When testing, be sure to:
- Check the correctness and performance of the application that you are using for testing by looking for memory leaks and inordinate delays in response to client requests.
- Ensure that all tests are working correctly.
- Make sure you can repeat all tests by using the same transaction mix and the same clients generating the same load.
- Document changes and results.
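Repeatability is easier when the transaction mix is generated rather than improvised. As a sketch, a seeded pseudo-random generator can produce the same sequence of transactions on every run. The transaction names, weights, and seed below are hypothetical.

```python
import random


def build_transaction_mix(weights, count, seed):
    """Return a reproducible list of transaction names.

    `weights` maps a transaction name to its relative frequency.
    On the same Python version, the same weights, count, and seed
    yield the same list.
    """
    rng = random.Random(seed)             # private generator with a fixed seed
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=count)


# Hypothetical mix: 70% browse, 20% search, 10% checkout
mix = {"browse": 70, "search": 20, "checkout": 10}

run_a = build_transaction_mix(mix, count=1000, seed=42)
run_b = build_transaction_mix(mix, count=1000, seed=42)

print(run_a == run_b)   # True: both runs replay the same transactions
print(run_a[:5])
```

Recording the seed alongside each documented change makes it possible to replay exactly the same load when a result needs to be confirmed.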