TN_1200: Using the Visual Studio Team System Profiler: Summary View
Ian Huff, Software Design Engineer
Microsoft Corporation
Included with Visual Studio Team System Developer (and Suite) edition is a powerful new profiler for finding performance issues in your native, managed or ASP.NET applications. The profiler can run in both sampling mode (which looks at program state in some periodic cycle) and instrumentation mode (which looks at every function exit and entry point). The performance sessions that the profiler generates have several different views to help you to diagnose performance issues. This TechNote will take a look at the information that you can glean from the summary view of the performance report.
For a demo application, I downloaded a rational numbers class from GotDotNet. This class has a function that creates and factors several large rational numbers. The first step is to add a performance session to the solution. To do this, click the “Performance Wizard…” option under Tools->Performance Tools. This brings up the wizard, where you can select the rational number class as the profiling target and select the profiling method (don’t worry about the method for now, as we can change it later). On completing the wizard, you will see the performance explorer open with the rational number class selected to be profiled. For my profiling scenario, I launched the application by clicking the launch button in the top-left of the performance explorer, clicked the performance button (in the launched application, not in the IDE) to run the performance function, and then closed the app after the results from the performance function were reported. If you want to, you can download the project and add a performance session to it to follow along with the analysis in this TechNote.
After running your first profiling session, you will have a .VSP report file in your performance explorer window under the “Reports” folder. The IDE automatically opens the VSP file for you after a performance session finishes; alternatively, you can open the file manually by double-clicking on it. While the VSP file is opening, a progress bar keeps you informed of the loading progress. VSP files, especially those from instrumentation mode, can be very large, so if the files are taking longer to load than you would like, switch to sampling mode or choose a smaller scenario to profile.
The VSP file opens like a new source file page in the IDE (shown below). A set of analysis views can be selected with the buttons along the bottom of the VSP file. The view opened by default is the summary view, which gives you a quick and easy way to see where the biggest performance issues are cropping up in your program. The summary view looks different in instrumentation mode, sampling mode and object allocation mode, so I will cover the three modes separately.
[Figure: a VSP report opened in the IDE, showing the summary view]
When first profiling an application it is often best to start out in sampling mode. Sampling mode will perturb your application less, and since it only collects application state at specific intervals (as opposed to collecting at every function entry and exit point like instrumentation) it collects a much smaller amount of data. By sampling first you can narrow down the number of binaries that you need to look at in instrumentation mode. The screen below shows the summary view from a typical sampling run of our performance session.
[Figure: summary view from a sampling run of the demo application]
The summary view for sampling lists the top three (this number can be adjusted in the Tools->Options->Performance page) functions in terms of inclusive samples and exclusive samples. If you are unfamiliar with inclusive and exclusive, an inclusive sample is a sample that was in the listed function or one of its sub-functions, while an exclusive sample is a sample that was taken just in the listed function. While seeing the inclusive numbers can be useful, the first step when looking for a performance issue is often to look at the function with the highest number of exclusive samples; that function is where most of your processing power is going. By looking at the top exclusive functions, we see that the Rational.reduce function is the big offender, as almost eighty percent of the samples taken were taken in that function. The only real thing that the top inclusive functions list tells us is that all 847 of the samples came in some sub-function of the PerfButton_Click function. This lets us know that we are getting very few samples in areas outside of the performance scenario that we wanted to profile.
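To make the inclusive/exclusive distinction concrete, here is a minimal sketch of how the two counts could be derived from sampled call stacks. This is not how the profiler itself is implemented; the `summarize_samples` helper, the stack layout, the sample counts, and the `Rational.add` function name are all assumptions made up for illustration.

```python
from collections import Counter

def summarize_samples(stacks):
    """Count inclusive and exclusive samples per function.

    Each stack is a list of function names, outermost caller first,
    so the last entry is the function that was executing when the
    sample was taken.
    """
    inclusive = Counter()
    exclusive = Counter()
    for stack in stacks:
        # Only the function at the top of the stack gets the exclusive sample.
        exclusive[stack[-1]] += 1
        # Every distinct function on the stack gets one inclusive sample
        # (set() avoids counting a recursive function twice).
        for name in set(stack):
            inclusive[name] += 1
    return inclusive, exclusive

# Hypothetical data: 10 samples, all taken somewhere under PerfButton_Click.
stacks = (
    [["PerfButton_Click", "Rational.reduce"]] * 8
    + [["PerfButton_Click", "Rational.add"]]   # Rational.add is a made-up name
    + [["PerfButton_Click"]]
)

inclusive, exclusive = summarize_samples(stacks)
for name, count in exclusive.most_common(3):
    print(f"{name}: {count} exclusive, {inclusive[name]} inclusive")
```

In this made-up data set, PerfButton_Click has every sample as an inclusive sample but only one exclusive sample, which mirrors the pattern described above: the inclusive list confirms the scenario was captured, and the exclusive list points at the function doing the work.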
In instrumentation mode, the summary view contains three different sections: “most called functions,” “functions with most individual work” and “functions taking longest.” The “most called functions” section lists the functions that were called the highest number of times, along with the total number of calls for each of them. “Functions with most individual work” lists the functions with the most time spent inside the function itself and not in any sub-function (since we capture all function entries and exits in instrumentation mode, we can tell the exact time spent in a function). Finally, “functions taking longest” shows the functions that took the most time when all time spent in their sub-functions is included (comparable to inclusive samples in sampling mode). The picture below shows the instrumentation summary view for our demo application. The nice thing about instrumentation is that it captures every function entry and exit, so you get exact data to analyze, not just samples.
[Figure: summary view from an instrumentation run of the demo application]
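The three instrumentation sections described above can be sketched as a small computation over enter/exit events. This is only an illustration under stated assumptions: the event format, the `summarize_events` helper, and the timestamps are hypothetical and do not reflect the profiler's actual data format.

```python
from collections import Counter, defaultdict

def summarize_events(events):
    """Compute call counts, exclusive time and inclusive time per function.

    events is a list of (kind, function, timestamp) tuples, where kind is
    "enter" or "exit" and the events are properly nested.
    """
    calls = Counter()               # "most called functions"
    exclusive = defaultdict(int)    # "functions with most individual work"
    inclusive = defaultdict(int)    # "functions taking longest"
    stack = []                      # frames: [name, enter_time, child_time]
    for kind, name, timestamp in events:
        if kind == "enter":
            calls[name] += 1
            stack.append([name, timestamp, 0])
        else:
            frame_name, start, child_time = stack.pop()
            if frame_name != name:
                raise ValueError("mismatched enter/exit events")
            elapsed = timestamp - start
            # Note: a recursive function would have nested time counted
            # more than once here; this sketch ignores recursion.
            inclusive[name] += elapsed
            exclusive[name] += elapsed - child_time
            if stack:
                # Charge this call's full duration to the caller's children.
                stack[-1][2] += elapsed
    return calls, dict(exclusive), dict(inclusive)

# Hypothetical trace with made-up timestamps (arbitrary time units).
events = [
    ("enter", "PerfButton_Click", 0),
    ("enter", "Rational.reduce", 1),
    ("enter", "System.Array.GetUpperBound", 2),
    ("exit", "System.Array.GetUpperBound", 3),
    ("enter", "System.Array.GetUpperBound", 4),
    ("exit", "System.Array.GetUpperBound", 5),
    ("exit", "Rational.reduce", 9),
    ("exit", "PerfButton_Click", 10),
]

calls, exclusive_time, inclusive_time = summarize_events(events)
print(calls.most_common(1))
print(max(exclusive_time, key=exclusive_time.get))
print(max(inclusive_time, key=inclusive_time.get))
```

Each of the three results answers a different question, which is why a function can top one list without appearing in the others.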
Looking at “functions with most individual work,” we can confirm what we learned from sampling: Rational.reduce is the function that is using the most CPU time. But we can also glean another useful bit of information from this page that we couldn’t see in sampling mode. We can see in “most called functions” that System.Array.GetUpperBound is being called almost 87,000 times, much more than any other function. You may also notice the 0x2b000001 listed at the bottom of “functions with most individual work.” This is simply an unresolved function, as I did not have all of my symbol paths configured correctly before this profiling run.
The final type of summary view is the one for profiler runs that have object allocation collection enabled. Object allocation collection only works for .NET applications; it captures all allocations and de-allocations of managed types and can tell you which function performed each allocation. You can turn on object allocation by right-clicking on a performance session and checking the collection option on the first property page. Pictured below is the summary screen for an object allocation run. First, it lists the functions that allocated the highest percentage of bytes relative to the total number of bytes allocated during the profiling run. Then it lists the types that accounted for the highest percentage of bytes allocated. Finally, it lists the types that had the most instances allocated during the profiling run. From this view we can see that most of our memory is being used by strings and string builders in this profiling run. If we are running into memory issues, we can use some of the more detailed allocation views to see which functions are allocating all of the strings.
[Figure: summary view from an object allocation run of the demo application]
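As a rough illustration of the three lists in the allocation summary, the sketch below aggregates a handful of allocation records. The record format, the `summarize_allocations` helper, the byte sizes, and the `Rational.ToString` function name are all hypothetical; the real profiler collects this data inside the .NET runtime.

```python
from collections import Counter

def summarize_allocations(records):
    """Aggregate (function, type_name, size_in_bytes) allocation records."""
    bytes_by_function = Counter()
    bytes_by_type = Counter()
    instances_by_type = Counter()
    for function, type_name, size in records:
        bytes_by_function[function] += size
        bytes_by_type[type_name] += size
        instances_by_type[type_name] += 1
    total = sum(bytes_by_type.values())

    def as_percent(counter):
        # Share of all allocated bytes; an empty counter yields an empty dict.
        return {key: 100.0 * value / total for key, value in counter.items()}

    return as_percent(bytes_by_function), as_percent(bytes_by_type), instances_by_type

# Hypothetical records; Rational.ToString and all sizes are made up.
records = [
    ("Rational.ToString", "System.String", 40),
    ("Rational.ToString", "System.String", 20),
    ("Rational.ToString", "System.Text.StringBuilder", 20),
    ("Rational.reduce", "System.Int32[]", 20),
]

percent_by_function, percent_by_type, instances = summarize_allocations(records)
print(percent_by_function)
print(percent_by_type)
print(instances.most_common(1))
```

Keeping bytes and instance counts separate matters because a type with many small instances and a type with a few large ones show up in different lists.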
So, by using the summary views, we were able to isolate two possible performance issues in our code. First, we learned that Rational.reduce is the function that takes up the majority of processing time. Second, we learned that System.Array.GetUpperBound is being called a very large number of times. These are the kinds of top-level performance issues that you can find with the summary view without having to drill into any of the more detailed views of the performance report.