TN_1201: Using the Visual Studio Team System Profiler: Function View, Caller / Callee View and Calltree View
Ian Huff, Software Design Engineer
Microsoft Corporation
Included with Visual Studio Team System Developer (and Suite) edition is a powerful new profiler for finding performance issues in your native, managed or ASP.NET applications. The profiler can run in both sampling mode (which looks at program state in some periodic cycle) and instrumentation mode (which looks at every function exit and entry point). The performance sessions that the profiler generates have several different views to help you to diagnose performance issues. This TechNote will take a look at the information that you can glean from some of the more detailed views that the profiler provides.
For a demo application, I downloaded a rational numbers class off of GotDotNet. This class has a function to create and to factor several large rational numbers. The first step was to add the performance session to the solution. To do this, click the “Performance Wizard…” option under Tools->Performance Tools. This will bring up the wizard where you can select the rational number class as the profiling target and select the profiling method (don’t worry about method for now as we can change it later). On completing the wizard you will see the performance explorer open with the rational number class selected to be profiled. For my profiling scenario I launched the application by clicking the launch button in the top-left of the performance explorer, clicked the performance button (in the launched application, not in the IDE) to launch the performance function above and then I closed the app after the results from the performance function were reported. If you want to, you can download the project and add a performance session to it to follow along with the analysis in this TechNote.
After running your first profiling session, you will now have a .VSP report file in your performance explorer window under the “Reports” folder. The IDE will automatically open the VSP file for you after running a performance session or, alternatively, you can open up the file manually by double clicking on it. While opening the VSP file, a progress bar will keep you informed of the file loading progress. VSP files, especially those from instrumentation mode, can be very large, so if the files are taking longer to load then you like, switch to sampling mode or choose a smaller scenario to profile.
The different performance views can all be accessed by the buttons at the bottom of any open VSP file. The second view that you see in the VSP file (after summary view) is the function view. The function view contains a list of all the functions that were run (in instrumentation mode) or all the functions in which we sampled at any point (in sampling mode). In the columns next to each function we display different type of aggregate data for each function. Since the data in these columns is aggregated, you cannot find the execution time of any specific instance of a function. However, you can add max and min columns (Just right click on a column header and select “Add / Remove columns” to add or remove columns from the view) to find the maximum and minimum instance execution times for a function. Below I have a screenshot of function view in instrumentation mode, sorted descending by elapsed exclusive time. Elapsed exclusive time measures the time spent in just this function (inclusive also adds in time spent in subchildren) so the top function in exclusive time is your big CPU killer. The default columns provide a good set of columns to diagnose performance issues, especially elapsed exclusive time. Also, you can get a very large selection of other columns from the Add / Remove columns option such as the min and max columns mentioned above. Unlike summary view, which only shows the top functions in a few select view categories, function view shows all the functions and lets you sort by any available column. Also, if you only want to examine specific modules, the function view allows you to group the function by module, just right-click on any function and select “Group by Module” which will put all the functions under module nodes in the view.
.jpg)
By looking at this view we can see that Rational.Reduce and Array.GetUpperBound are our two main performance issues. They are at the top of the elapsed exclusive time heap and there are over 864,000 calls to Array.GetUpperBound. Perhaps it would help our analysis to see where the two trouble functions fit into the overall structure of the program. For this task, we will turn to the next two views, caller/callee view and calltree view.
Function view looks at aggregate data for a function over the programs entire run, while caller/callee view focuses on the functions that call a specific function (callers) and on the functions that the specific function calls (callees). Since we want to focus on our hotspots, right click on the Rational.reduce function in function view, and select “Show in caller/callee view.” Selecting this jumps you to the caller/callee view, with Rational.reduce selected as the current function, as shown in the screenshot below. The profiler does a good job of helping you to maintain your context while switching between views.
.jpg)
The caller/callee view is divided into three panes. The top pane contains all the callers of the current function. The middle pane is the currently selected function. Finally (and perhaps obviously), the bottom pane contains all the callees of the current function. In function view, the columns total all the data for the functions, but in caller/callee the columns contain only the data for the specific current function. So for the function Prime.IsSmallPrime in the bottom pane, the value of 14997 in number of calls is the number of times IsSmallPrime was called from Rational.reduce, not the total number of times it was called in the entire profiling run. Double-clicking a function in the top or bottom pane will select that as the new current function, so you can trace a specific execution path up or down the calltree. Looking at the caller / callee view for Rational.reduce we see that Array.GetUpperBound is always being called from Rational.reduce. Perhaps our performance issue is being caused by Array.GetUpperBound being called too many times from Rational.reduce?
The calltree view provides an alternative to caller / callee view for viewing profiling data along with program structure. In calltree view, the functions are placed in a tree structure corresponding to how they were called. Pictured below is the calltree for our example, with the nodes expanded to show the Rational.reduce and the Array.GetUpperBound function. Just as in caller/callee we can see that GetUpperBound is being called many times, and always from Rational.reduce. The call tree provides an easy way to browse the structure of a program, using the common UI paradigm of expanding tree nodes.
.jpg)
So now we think that we have isolated a possible performance issue, that being the number of times Array.GetUpperBound is called from the Rational.reduce function. Now if we want to jump to the possible problem code all we have to do is right click on any instance of Rational.Reduce in our views and select “View Code.” Doing this will jump us to the Rational.Reduce function in the correct code file; where we can then discover and fix our performance issue. In this case, the issue was a call to Array.GetUpperBound in Rational.Reduce being called every time the for loop executed, slowing down the program by around 150%.
Using the profiler views we were able to drill down and easily diagnose a performance issue in a code base that we were not familiar with. Try giving the profiler a run at your applications to see what it turns up.