
Analyze Driver Performance

Updated: March 2, 2004


Note: Kernrate is deprecated and will be removed from future releases of Windows. Instead, use the Windows Performance Analysis Tools. For more information, see the Windows Performance Analysis Developer Center.

On This Page


Profile Driver Performance Using Kernrate

Analyze Kernrate Data Using KrView

Best Practices for Using Kernrate and KrView

Download KrView and Kernrate

Other Performance Analysis Tools


This paper provides information about using Kernrate and KrView on the Microsoft Windows family of operating systems. Driver writers can use these tools to collect and analyze driver performance data.

The first step toward improving any driver's performance is to analyze that driver's performance. If you don't know where your driver is spending time, you can't effectively tune it. In general, you should focus your tuning efforts on the most frequently exercised and most time-consuming code paths. One way to identify these areas is to gather data with Kernrate and analyze it with KrView.

Using these tools, you can:

  • Identify CPU usage patterns in kernel-mode or user-mode code.

  • Determine which functions consume the most CPU time and are therefore candidates for performance tuning.

  • Collect data for individual processors in a multiprocessor system.

  • Separate kernel-mode and user-mode usage.

Kernrate is a command-line tool that supports all Windows hardware architectures and runs on Windows 2000 and later versions. Kernrate is shipped with the Microsoft Windows Server 2003 Resource Kit.

KrView is a companion tool that organizes the Kernrate data and displays it graphically in Microsoft Office Excel spreadsheets. KrView requires a version of Kernrate that supports the -yr option. Microsoft Excel must be installed on the computer on which you run KrView, but is not required on the computer on which you run Kernrate. (The current version of KrView requires Microsoft Office Excel XP or later. Support for Microsoft Office Excel 2000 is planned but is not yet available.)

Profile Driver Performance Using Kernrate

Before running Kernrate, make sure that you have the symbol table (.pdb file) for the current build of your driver. Kernrate uses the symbol table to map addresses to the names of individual functions.

If possible, you should also have the symbol tables for any other modules your driver calls or is called by. Having these symbol tables lets you find out which functions are calling your driver the most and which external functions your driver uses. For example, if your driver frequently acquires system spin locks, the time spent in those locking functions will be assigned to the system functions associated with them, and not to your driver. After changing your driver, you can compare “before” and “after” profiles to see the effects in other modules as well as the direct effects in the driver itself.

Kernrate can gather data on the entire system or on one or more specific processes, depending on the options you specify. It can also “zoom in” on one or more modules to generate detailed, function-level data.

Kernrate divides the address space of each module into “buckets” and then records CPU event occurrences within each bucket. After profiling is complete, Kernrate translates the bucket addresses into symbols. Using KrView, you can see how much CPU time each function in the module consumed.
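The bucket idea can be illustrated with a minimal Python sketch. It is not Kernrate's implementation: the symbol table, the sampled offsets, and the rule of attributing each bucket to the function at the bucket's first byte are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import Counter

BUCKET_SIZE = 16  # bytes; the default bucket size described for the -b option

# Hypothetical symbol table: (start offset within the module, function name),
# sorted by start offset. Real values would come from the driver's .pdb file.
SYMBOLS = [(0x000, "DriverEntry"), (0x040, "DispatchRead"), (0x0A0, "DispatchWrite")]


def bucket_index(offset, bucket_size=BUCKET_SIZE):
    """Return the index of the bucket that contains a module offset."""
    return offset // bucket_size


def symbol_for(offset, symbols=SYMBOLS):
    """Return the name of the function whose range contains the offset."""
    starts = [start for start, _ in symbols]
    return symbols[bisect_right(starts, offset) - 1][1]


def profile(sample_offsets, bucket_size=BUCKET_SIZE):
    """Count hits per bucket, then translate each bucket into a symbol."""
    buckets = Counter(bucket_index(o, bucket_size) for o in sample_offsets)
    per_function = Counter()
    for index, hits in buckets.items():
        # Simplification: attribute the bucket to the function at its first byte.
        per_function[symbol_for(index * bucket_size)] += hits
    return per_function


samples = [0x044, 0x048, 0x04C, 0x0A4, 0x004]  # made-up sampled offsets
hits = profile(samples)
for name, count in hits.most_common():
    print(name, count)
```

Recording hits per bucket first and resolving symbols afterward mirrors the order described above: translation to function names happens only after profiling is complete.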

To get an accurate picture of driver performance, consider profiling in stages:

  1. Create a baseline profile by running Kernrate on a standard system configuration with your driver installed. This approach will give you a general picture of system performance at the module level for the kernel, hardware abstraction layer (HAL), and drivers.

  2. Refine the profile by zooming in on CPU-intensive modules (including those in your driver) to identify any CPU-intensive functions.

  3. If you find routines that consume an inordinate amount of CPU time, further investigate these routines by reducing the bucket size to help you identify hot spots, such as loops.
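The three stages above could be scripted roughly as follows. This Python sketch only builds command-line strings; it uses the options described in this paper (-v, -f, -m, -ts, -j, -b, -yr, -z), while the output file names, the module names, and the symbol path are hypothetical placeholders.

```python
def staged_commands(zoom_modules, symbol_path):
    """Return one Kernrate command line per profiling stage."""
    common = "kernrate -v 3 -f -m -ts"
    zoom = " ".join(f"-z {name}" for name in zoom_modules)
    symbols = f'-j "{symbol_path}"'
    return [
        # Stage 1: system-wide baseline at module level.
        f"{common} -yr stage1_baseline.kv",
        # Stage 2: zoom in on the CPU-intensive modules.
        f"{common} {symbols} -yr stage2_zoom.kv {zoom}",
        # Stage 3: same zoom with the minimum bucket size to expose hot spots.
        f"{common} -b 4 {symbols} -yr stage3_hotspots.kv {zoom}",
    ]


commands = staged_commands(["driverio", "driverpnp"], r"c:\mydriver\symbols")
for line in commands:
    print(line)
```

Writing each stage to its own .kv file keeps the profiles separate so that they can be compared later.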

As you identify potential performance issues, try running Kernrate on a variety of test machines and with a variety of hardware and software configurations. Performance characteristics can change significantly from system to system because of interactions among components. What looks like a bottleneck in one configuration might disappear in another. As a general rule, the more data you gather, the clearer picture you'll have of how your driver performs. Use KrView's comparison feature to compare profiles made on different test configurations.
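To make the idea of comparing profiles concrete, here is a small Python sketch. It does not reproduce KrView's comparison feature; it assumes that each profile has already been reduced to a mapping from function name to percentage of CPU time, and the numbers are invented.

```python
def compare_profiles(before, after):
    """Return (function, before %, after %, change) rows, largest change first."""
    names = set(before) | set(after)
    rows = [
        (name,
         before.get(name, 0.0),
         after.get(name, 0.0),
         after.get(name, 0.0) - before.get(name, 0.0))
        for name in names
    ]
    # Sort by magnitude of change, then by name for a stable order.
    return sorted(rows, key=lambda row: (-abs(row[3]), row[0]))


# Hypothetical per-function CPU percentages from two test configurations.
config_a = {"DispatchRead": 12.5, "DispatchWrite": 4.0, "KeAcquireSpinLock": 9.0}
config_b = {"DispatchRead": 6.0, "DispatchWrite": 4.5, "KeAcquireSpinLock": 2.0}

rows = compare_profiles(config_a, config_b)
for name, a, b, change in rows:
    print(f"{name:20s} {a:6.1f} {b:6.1f} {change:+6.1f}")
```

Including functions from other modules (here a system spin lock routine) in the comparison shows indirect effects that a driver-only view would miss.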

Kernrate has numerous options that control the extent and detail of the generated profile. (For a complete list, see the Kernrate documentation.) Options that are particularly useful in profiling drivers are listed below.

Option        Description

-a

Generates profile data for both user mode and kernel mode. Required when profiling both the system and one or more user-mode processes.

-b

Sets the size of a bucket, in bytes. The default is 16 bytes and the minimum is 4 bytes. Bucket size must be a power of 2.

-f

Processes the collected data at high priority. This option does not affect the gathering of data, but can speed the processing of the data on a busy machine after Kernrate profiling is complete.

-j "symbolpath"

Specifies the path to the symbol file; the quotation marks are required. By default, Kernrate uses the debugger's path to the binary file and looks for a matching symbol file. With this option, Kernrate looks first in symbolpath.

-m

Generates per-CPU profiles on multiprocessor machines. By default, Kernrate generates an average profile for all CPUs.

-n processname

Profiles the first 8 processes named processname. Specify multiple -n options to profile multiple processes with different names.

-o processname {cmdline}

Creates and profiles processname, passing cmdline as parameters for process creation.

-p PID

Profiles the process with process ID PID. Specify multiple -p options to profile multiple processes.

-r

Gathers raw data for each bucket in zoomed modules. If possible, gets source line information for each bucket.

-rd

Gathers raw data for each bucket in zoomed modules and provides disassembly information in hexadecimal. If possible, gets source line information for each bucket.

-s

Specifies the duration of the profile, in seconds.

-ts

Includes a summary of kernel-mode and user-mode CPU usage for all processes and lists the services in each process.

-v 1 -v 2 or -v 3

Reports the path that Kernrate used to load the symbols (-v 1), the extent to which the buckets are shared, and the percentage of CPU usage system-wide (-v 2). (Values for the -v option are ORed together; thus -v 3 is equivalent to -v 1 -v 2.)

-x

Reports on critical sections (on a per-process basis) and on executive resources (system-wide) that have a high contention rate.

-yr filename.kv

Creates an output file for processing by KrView. Required for using KrView.

-z modulename

Specifies the name of a module to zoom in on. Specify multiple -z options to zoom on multiple modules.
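
Two of the rules in this list lend themselves to a quick check before you launch a profile: the -b constraints (a power of 2, at least 4 bytes) and the way -v values are ORed together. The following Python sketch is a hypothetical pre-flight helper, not part of Kernrate; the constant and function names are invented for illustration.

```python
# Verbosity values as described for the -v option.
VERBOSE_SYMBOL_PATH = 1   # -v 1: report the path used to load symbols
VERBOSE_BUCKET_CPU = 2    # -v 2: report bucket sharing and system-wide CPU usage


def is_valid_bucket_size(size):
    """Return True if size is a power of 2 and at least 4 bytes."""
    return size >= 4 and (size & (size - 1)) == 0


def verbosity_option(*flags):
    """Combine verbosity flags with bitwise OR into a single -v option."""
    value = 0
    for flag in flags:
        value |= flag
    return f"-v {value}"


print(is_valid_bucket_size(16), is_valid_bucket_size(12))
print(verbosity_option(VERBOSE_SYMBOL_PATH, VERBOSE_BUCKET_CPU))
```

The expression `size & (size - 1)` is zero only for powers of 2, which is why it can stand in for a loop over candidate sizes.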


Use a command line like the following to gather a baseline system profile on a multiprocessor system:

kernrate  -v 3  -b 4  -f  -m  -ts  -yr baseline.kv

This command runs Kernrate on a multiprocessor system and:

  • Reports the path to the symbol file, percentage of CPU usage, and detailed information about each bucket (-v 3).

  • Sets bucket size to 4 bytes for zoomed modules (-b 4).

  • Processes the collected data at high priority (-f).

  • Gathers profiling data for each processor (-m).

  • Includes a summary of kernel-mode and user-mode CPU usage and lists the services in each process (-ts).

  • Generates an output file named baseline.kv for input to KrView (-yr baseline.kv).

After you create the baseline profile for the system, you can use a command line like the following to zoom in on modules within your driver:

kernrate -v 3 -b 4 -f -m -ts -j "c:\mydriver\symbols" -yr driverbase.kv -z driverio -z driverpnp -z driverwmi

This command line causes Kernrate to gather the same profile data as the preceding example. In addition, it looks for the symbol table for the driver in the directory c:\mydriver\symbols and gathers detailed data for the modules “driverio,” “driverpnp,” and “driverwmi” that are loaded by the system.

Analyze Kernrate Data Using KrView

After you collect Kernrate data, you can use KrView to analyze it. KrView can display data from a single Kernrate sample or can compare up to 64 Kernrate samples at once.

KrView consists of an Excel workbook that defines macros for organizing and displaying Kernrate data. The KrView installation program adds the KrView toolbar to the Excel interface; you can then run KrView by either clicking on the KrView toolbar icon within Excel or by opening the KrView.xls workbook. The KrView.xls workbook sets defaults that apply each time you run KrView.

When you process a Kernrate output file, KrView creates a new workbook that contains the processed data. The number and contents of the worksheets in the workbook depend on the options you specified when you ran Kernrate.

Suppose you run Kernrate with the command line shown in the preceding example:

kernrate -v 3 -b 4 -f -m -ts -j "c:\mydriver\symbols" -yr driverbase.kv -z driverio -z driverpnp -z driverwmi

When KrView processes the input file driverbase.kv, it creates an Excel workbook named driverbase_KrView.xls by default. The workbook includes worksheets that summarize module data for both user-mode and kernel-mode (privileged) modules and for the top “hot” (that is, most CPU-intensive) modules in the profile. If the profile covers more than one CPU, as in the example, each of the summary worksheets includes a Per CPU Detail button. Click this button to show CPU usage split out by CPU number.

For driver writers, the individual worksheet for each zoomed module is of primary interest. In this example, these worksheets are named Driverio, Driverpnp, and Driverwmi. Each displays the percentage of CPU time used by functions in that module. By choosing from the drop-down menu at the upper right on one of these worksheets, you can change the number of functions shown in the chart. You can also use the pivot tables below the chart to add or remove data for individual functions or CPUs from the chart.

Clicking on Raw Data at the top of any chart displays the raw profiling data used to build the chart. You can use this information to determine if the profile size is adequate. If you see that a function takes much less CPU time than you expected, check the number of hits reported in the raw data. If the number of hits is relatively small, you might try changing the sampling rate to get a more adequate profile. You might also try running the profile with a different mix of applications.
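
One way to judge whether a hit count is "relatively small" is to estimate the statistical uncertainty of the resulting percentage. The Python sketch below applies the standard binomial standard-error formula. This is a general rule of thumb, not something Kernrate or KrView computes, and the hit counts are invented.

```python
import math


def share_with_error(hits, total_hits):
    """Return (share, standard error) of a function's CPU share, as fractions."""
    share = hits / total_hits
    std_error = math.sqrt(share * (1.0 - share) / total_hits)
    return share, std_error


# Two hypothetical profiles in which the function has the same 2.5% share,
# but the second profile contains 100 times as many samples.
small = share_with_error(25, 1_000)
large = share_with_error(2_500, 100_000)
for label, (share, err) in (("small", small), ("large", large)):
    print(f"{label}: share {share:.1%}, standard error {err:.2%}")
```

Under this model, collecting 100 times as many hits shrinks the uncertainty by a factor of 10, which is why a function with only a handful of hits deserves a second, larger profile before you draw conclusions.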

Note
Kernrate can generate profiling data down to a resolution of 4 bytes. That is, it can divide the address space into 4-byte units and record hits within each 4-byte range. However, KrView does not yet have the capability to display at resolutions smaller than a function. Matching the bucket addresses to specific lines of code within a function requires getting an assembly output when you compile, using the debugger, or disassembling the binary.

Best Practices for Using Kernrate and KrView

  • Always have available the symbol table for the current driver binary and as many other modules as possible. Put the driver binary and symbol (.pdb) file in the same directory and use the -j option to specify the path to this directory. This will help ensure that Kernrate uses the correct executable and symbol file pair for the driver build being profiled.

  • Profile the entire system before focusing on your driver. Doing so can help you determine whether performance issues are occurring independent of your driver and can possibly help you later to identify interactions between your driver and other system components or applications.

  • Use the -v 1 and -v 2 options (or combine them into -v 3) to get the percentage of total CPU usage for each module, the proportion of buckets that are shared among functions, and the location at which Kernrate found the executable and symbol (.pdb) files that it used to resolve symbols.

  • Set a small bucket size to reduce the chance that a bucket contains hits for more than one function.

  • To get the best overall picture of driver performance, run Kernrate against as many driver operations and hardware configurations as possible.
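
The effect of bucket size on shared buckets can be shown with a short Python sketch. The function layout is hypothetical, and the check is only an illustration of why smaller buckets are less likely to straddle a function boundary; it is not how Kernrate reports sharing.

```python
# Hypothetical function ranges within a module: name -> (start, end),
# where end is exclusive.
FUNCTIONS = {
    "ReadFast": (0x00, 0x1C),
    "ReadSlow": (0x1C, 0x34),
    "Cleanup": (0x34, 0x40),
}


def shared_buckets(functions, bucket_size):
    """Return the indexes of buckets that overlap more than one function."""
    owners = {}
    for name, (start, end) in functions.items():
        first = start // bucket_size
        last = (end - 1) // bucket_size
        for index in range(first, last + 1):
            owners.setdefault(index, set()).add(name)
    return sorted(i for i, names in owners.items() if len(names) > 1)


shared_16 = shared_buckets(FUNCTIONS, 16)
shared_4 = shared_buckets(FUNCTIONS, 4)
print("16-byte buckets shared:", shared_16)
print("4-byte buckets shared:", shared_4)
```

In this layout the function boundaries fall on 4-byte multiples, so 4-byte buckets never span two functions, whereas two of the four 16-byte buckets do.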

Download KrView and Kernrate

Download KrView, the latest version of Kernrate, and the documentation for both tools from KrView - the Kernrate Viewer. This download package includes versions of the Kernrate executable for Windows 2000 and both the 32-bit and 64-bit versions of Windows XP.

KrView is a new tool, and Microsoft encourages your feedback and suggestions. To respond, send email to krview@microsoft.com.

Other Performance Analysis Tools

  • Event tracing for Windows, supported on Windows 2000 and later systems. See “WPP Software Tracing” in the “Driver Development Tools”\Tools for Testing Drivers section of the Windows DDK and “Event Tracing” in the Performance Monitoring section of the Platform SDK.

  • Sysmon, available on Windows 2000 and later versions.

  • Perfmon, available on all Windows systems. On Windows 2000 and later, Sysmon replaces Perfmon, but Perfmon is provided for backwards compatibility with existing performance configuration files.
