Understanding Performance Terms
Before you use performance tools to detect and correct performance bottlenecks in your application, you should familiarize yourself with the various terms used to describe data collected during profiling. The performance data gathered depends on the profiling method you choose - sampling or instrumentation.
Sampling, in which the application is periodically interrupted, has the advantage of low overhead, so the application behaves more like it would in the real world. During sampling, the performance data collection infrastructure periodically interrupts the application as it runs, determines which function is running, and increments the sample count of that function. It also stores information about the call stack leading up to the function call. The drawback to this approach is that it yields only relative performance data, and only for the functions that were sampled. A function you wanted to sample might never be sampled, in which case no information is available about it.
You must choose an appropriate sampling event to use during sampling. For example, the CPU cycle sample event shows only the locations in the application that consume CPU cycles. If the application is blocked waiting for disk, network, page faults, and so on, the resulting information will not help you determine the actual problem. To detect problems with page faults, use page fault as the sample event.
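The sampling process described above can be illustrated with a small simulation. This is only a sketch, not how any real profiler is implemented: the timeline, the function names (parse, log, render), and the helper sample_timeline are all hypothetical. It shows how sample counts accumulate and how a short-lived function can fall between two samples and go unreported.

```python
from collections import Counter


def sample_timeline(timeline, interval):
    """Look at the timeline every `interval` ticks and count what is running.

    `timeline` is a hypothetical list with one entry per tick, naming the
    function that is executing during that tick.
    """
    counts = Counter()
    for tick in range(0, len(timeline), interval):
        counts[timeline[tick]] += 1  # increment the sample count of that function
    return counts


# Hypothetical 100-tick run: parse runs for 41 ticks, log for 3, render for 56.
timeline = ["parse"] * 41 + ["log"] * 3 + ["render"] * 56

counts = sample_timeline(timeline, interval=10)
for name in ("parse", "log", "render"):
    print(name, counts[name])
```

With an interval of 10 ticks, parse and render each receive 5 samples, while log runs entirely between two samples and receives none. The counts are relative: they say that parse and render were seen equally often, not how long either one ran.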
Instrumentation provides the advantage of gathering exact performance data for specific sections of the application. During instrumentation, enter and exit probes are inserted into the application's functions. These probes report back to the data collection infrastructure and allow users to capture exact amounts of time and other metrics that a function took to run.
Probes are not inserted in inline functions. Therefore, the number of calls in the report will not agree with the actual number of times the code block executed. To determine the exact number of times the code block executed, compile the code with inline expansion disabled. However, compiling the code this way affects code optimization.
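The idea of enter and exit probes can be sketched with a decorator that wraps a function and records a call count and the time spent in it. This is an illustration of the concept only, under the assumption that a simple in-process dictionary stands in for the data collection infrastructure; the names instrument, stats, and square are hypothetical.

```python
import functools
import time
from collections import defaultdict

# Hypothetical stand-in for the data collection infrastructure.
stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def instrument(func):
    """Wrap `func` with an enter probe and an exit probe."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # enter probe
        try:
            return func(*args, **kwargs)
        finally:
            # Exit probe: runs even if the function raises.
            entry = stats[func.__name__]
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start

    return wrapper


@instrument
def square(x):
    return x * x


results = [square(n) for n in range(5)]
print(stats["square"]["calls"], "calls recorded for square")
```

Unlike sampling, every call passes through the probes, so the call count is exact. The cost is that the probes themselves add overhead to each call, and a function that is never wrapped (as with an inlined function) never appears in the data.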
After you profile your application, a report is generated. The performance report file contains the data collected during profiling. The following list provides terms that you will need to understand before analyzing the report:
Application time shows the time spent in the direct execution of the profiled code. It excludes performance data that contain calls to the operating system and time that was spent waiting for other threads to execute (transition events).
Elapsed time shows the total system time spent executing the profiled code. It includes performance data that contain transition events.
The term exclusive refers to only those samples taken in the function, and does not include samples taken in other functions called by it.
The term inclusive refers to the samples taken in the function, and includes the samples taken in other functions called by it.
The number of transition events that occurred while profiling the application.
A change in the location of processor event execution between ring 3 (user mode) and ring 0 (kernel mode). Transition events represent time spent outside the direct execution of the application code. Transition events can be time spent in threads that are not part of the profiled item, or time spent executing calls from the profiled item to the operating system.
Memory and type instances allocated while profiling the application. The two types of allocation reported are exclusive and inclusive.
Bytes allocated while profiling the application. The two types of byte allocation reported are exclusive and inclusive.
The term root refers to a function that calls and/or is called by one or more functions. Information about root functions appears in the Caller/Callee report and view. In that view, the root function is listed in the middle part.
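Two of the distinctions in the list above, exclusive versus inclusive samples and application time versus elapsed time, can be made concrete with a short sketch. The data here is invented for illustration: the call stacks, the function names, the interval kinds ("app" and "transition"), and the helpers count_samples and summarize_time are all assumptions, not part of any profiler's report format.

```python
from collections import Counter


def count_samples(stacks):
    """Return (exclusive, inclusive) sample counts.

    Each stack is a list of function names, outermost caller first,
    and the last entry is the function that was running when sampled.
    """
    exclusive = Counter()
    inclusive = Counter()
    for stack in stacks:
        exclusive[stack[-1]] += 1  # only the function actually running
        for name in set(stack):  # every function on the stack, once per sample
            inclusive[name] += 1
    return exclusive, inclusive


def summarize_time(intervals):
    """Return (application_time, elapsed_time) from (kind, milliseconds) pairs."""
    elapsed = sum(ms for _, ms in intervals)
    application = sum(ms for kind, ms in intervals if kind == "app")
    return application, elapsed


# Hypothetical sampled call stacks.
stacks = [
    ["main"],
    ["main", "load"],
    ["main", "load", "parse"],
    ["main", "load", "parse"],
    ["main", "draw"],
]
exclusive, inclusive = count_samples(stacks)

# Hypothetical intervals: time in profiled code versus transition events.
intervals = [("app", 12.0), ("transition", 3.0), ("app", 5.0), ("transition", 1.5)]
application, elapsed = summarize_time(intervals)

print("load exclusive:", exclusive["load"], "inclusive:", inclusive["load"])
print("application ms:", application, "elapsed ms:", elapsed)
```

In this data, load has 1 exclusive sample but 3 inclusive samples, because two of the samples on its stack were taken inside parse. Application time is 17.0 ms and elapsed time is 21.5 ms; the 4.5 ms difference is the time attributed to transition events.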