Testing
Perform Code Coverage Analysis with .NET to Ensure Thorough Application Testing
James McCaffrey

This article discusses:
  • The importance of code coverage analysis
  • A custom code coverage tool
  • The role of code coverage in the development process
  • Profiling in .NET
This article uses the following technologies:
C++, C#, Microsoft .NET Framework
The basic idea behind code coverage is straightforward. During product development, a large number of test cases are created and run to ferret out bugs in the system. Code coverage analysis monitors which parts of the product's code are exercised by the collection of test cases. Ideally, the test cases should execute 100 percent of the product's code. If there are segments of product code that are never run during testing, then the product has not been thoroughly tested.
Although the idea of code coverage is simple enough, actually performing code coverage analysis in a non-.NET environment can be very time consuming, difficult, and expensive. However, the Microsoft® .NET environment provides software developers and testers with a simple and effective new way to perform code coverage analysis. In the days before .NET, code coverage was one of the most frustrating parts of my job as a tester. Recently, though, I discovered that the .NET environment provides a much improved process for performing code coverage. In this article, I will provide you with all the code you need to perform .NET code coverage analysis. I call this system Fundamental Function code coverage. Adding code coverage analysis skills to your toolset will help you keep a watchful eye on your code, whether you are a developer, tester, or program manager.
In the sections that follow, I will walk through an example of code coverage at a high level so you can understand the principles involved. Then I will discuss the details of the implementation. I will conclude with a discussion of how code coverage analysis fits into the software product development cycle.

Overview of Code Coverage Analysis
Because traditional code coverage analysis is so tricky, I'll walk through a concrete example at a high level. Code coverage analysis has three phases: preparing the coverage environment, enabling coverage profiling and running tests, and then disabling coverage profiling and generating a report.
The product I will analyze is a Windows®-based application that references a dummy class library that contains several dummy constructors and methods. Figure 1 illustrates how my code coverage environment is prepared.
Figure 1 Methods Under Coverage 
First, you build a special profiling DLL which, when enabled later, will watch the execution of code through the common language runtime (CLR) and log all entries to specified methods in the product being tested. After creating the profiler, you identify which .NET assemblies you want to include under coverage. In this case, I'm looking at the DummyApp.exe and DummyLib.dll assemblies.
Next you determine which constructors, methods, and properties are in the assemblies under coverage. As you can see, there are 12 of these under coverage in Figure 1.
Figure 2 shows the second phase of code coverage: enabling the profiling DLL. Now any .NET managed code that runs in the environment will be monitored and recorded.
Figure 2 Enabling and Using a Profiler 
Next, you run either a single test case or a suite of tests. In this example, I just clicked the Methods button of the dummy application to simulate testing it. When performing code coverage analysis, I'm usually not concerned with whether tests pass or fail, but rather how much of the codebase of the product under test is touched by the tests. In other words, even if every test fails, but code coverage is 100 percent, I know I'll need to fix my code but I'm happy that all the methods in the product are being tested. That said, you need to be careful about test failures and their effects on code coverage. If you don't do coverage using the tests as they're supposed to run, you can't be sure that you covered the right code. After running the test, close the dummy application to flush to a text file all coverage information that the profiler gathered.
The third phase of code coverage is illustrated in Figure 3. I disabled coverage profiling so that it wouldn't continue gathering information. Then I ran the report generator, which collects the information about all methods that could potentially be touched (which was generated in Phase 1) and gathers the information about the methods that were actually touched (generated in Phase 2). It then compares these two lists to determine the resulting code coverage percentage. In this example, 8 of the potential 12 methods were exercised by the test, for a 67 percent coverage rate. I can use the report data to determine which methods have not been exercised and create new test cases.
Figure 3 Report Generation 
In other systems the preparation of code coverage, during which time you determine which methods you will monitor, and the reporting part, where you analyze the results, are often performed by two different programs. In the Fundamental Function system presented here, I put both of these functionalities into a single program named FFcover.exe and controlled them by -m and -r command-line arguments.
The system described in this article monitors code coverage at the method level. With it I determine if a particular constructor, method, or property has been entered, but I do not learn about the path of execution inside the method. More granular code coverage systems operate at lower levels. For example, one system called Basic Block coverage operates at the block level (testing whether execution enters a particular block of statements).

The Profiling DLL
As mentioned in the previous section, at the foundation of code coverage is a profiling DLL that can monitor all the activity that passes through the CLR. As you might expect, this is very sophisticated code. Fortunately, the .NET Framework comes with two complete example profiler source code sets. I was able to use one of them with very few changes to create my profiler—ProfilerFF.dll.
In the example shown in Figure 2, the profiler works behind the scenes to capture method names that were entered when the dummy test application ran in the coverage-enabled environment. That data is logged to a text file named ffcoverXXX.log, where the Xs represent a time-stamp value. One line of the resulting log file looks like the following:
0x00000d38;void DummyLib.Dummy::Bar(int32&,float64&,String[])
Here the data following the thread ID indicates that the Bar method in the Dummy class, which is part of the DummyLib namespace and which accepts three parameters and returns void, was entered during the test.
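To make the line format concrete, here is a minimal C++ sketch that splits one log line into its thread ID and method signature. The LogEntry struct and the parseLogLine function are hypothetical names introduced for illustration; they are not part of the profiler sample.

```cpp
#include <string>

// One parsed line of profiler output, for example:
// 0x00000d38;void DummyLib.Dummy::Bar(int32&,float64&,String[])
struct LogEntry {
    bool isData;            // false for comments or other non-data lines
    std::string threadId;   // text before the first semicolon
    std::string signature;  // text after the first semicolon
};

// Splits a log line at the first semicolon. A line counts as data only if
// it starts with "0x" (the thread ID) and contains a semicolon.
LogEntry parseLogLine(const std::string& line) {
    LogEntry entry{false, "", ""};
    if (line.compare(0, 2, "0x") != 0) return entry;
    std::string::size_type semi = line.find(';');
    if (semi == std::string::npos) return entry;
    entry.isData = true;
    entry.threadId = line.substr(0, semi);
    entry.signature = line.substr(semi + 1);
    return entry;
}
```

Calling parseLogLine on the sample line above should yield the thread ID 0x00000d38 and the signature that follows the semicolon; a line that does not start with 0x is reported as non-data.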
Let's examine the profiler. The base profiler source code from which I derived my code coverage profiler is located by default in the C:\Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Tool Developers Guide\Samples\profiler\hst_profiler folder. HST stands for hot spot tracker to indicate that a DLL built from that code can monitor how much time is spent in methods.
I first created a C:\Coverage root folder on my machine and then copied the hst_profiler and Include folders to it. The build process will require that these two folders have the same parent folder. Next, I renamed the hst_profiler folder to FFprofiler to reflect its new functionality.
Before editing, I renamed three files in the FFprofiler folder: ProfilerHST.cpp to ProfilerFF.cpp, ProfilerHST.def to ProfilerFF.def, and ProfilerHST.mak to ProfilerFF.mak. It's clear that these three files ending with HST are specific to the timing profiler, so renaming them makes sense. The files ProfilerInfo.cpp, ProfilerInfo.h, ProfilerCallback.cpp, and ProfilerCallback.h are, for the most part, generic to any profiling DLL, so it is reasonable to leave their names alone. I also renamed EnableProfiler.bat to enable.bat to make it easier to type on a shell command line.
The first file I edited is ProfilerCallback.h. At the beginning of the file, I made a few minor changes:
extern const GUID __declspec( selectany ) CLSID_PROFILER = 
{ 0x5ac86959, 0x7927, 0x4177, { 0x98, 0x02, 0xa5, 0x40, 0xf1, 0x96, 
                                0x78, 0x74 } };

#define THREADING_MODEL     "Both"
#define PROGID_PREFIX       "ProfilerFF"
#define COCLASS_DESCRIPTION "Fundamental Function Code Coverage"
#define PROFILER_GUID       "{5AC86959-7927-4177-9802-A540F1967874}"
Every profiler has a GUID identifier so that different profilers can be active in the same environment, monitoring different CLR activity. I used the Visual Studio® .NET Create GUID tool to generate a new GUID and pasted it at the two places shown. I also changed the values of PROGID_PREFIX and COCLASS_DESCRIPTION to reflect code coverage functionality.
The second change I made was in the ProfilerFF.def definition file. Again, this was just a minor change to make it conform to the other edits, so that the first line below is changed to the second:
LIBRARY         ProfilerHST
LIBRARY         ProfilerFF
The third set of edits was in the ProfilerFF.mak makefile. I will cover those changes in the next section when I build the profiler DLL. The fourth and last file in the FFprofiler folder that I edited is ProfilerInfo.cpp. Most of the changes were in the FunctionTimingInfo::Dump function. The original Dump function outputs a semicolon-delimited text file containing the name of the function entered, the period of time the execution engine spent there, the number of times the function was entered, and other useful information. For most types of code coverage, you only need to know which methods were entered. In fact, additional information just makes parsing more difficult later. The portion of the code that records which methods were actually entered during a test in a coverage-enabled environment is shown in Figure 4.
The Dump code first checks that the function name is not an empty string (its first character is not the null terminator), or in other words that there is a function name to log:
if ( pFunctionInfo->m_functionName[0] != NULL )
The next if conditional acts as a filter to specify which methods to log. The following condition is true only for functions (such as .NET methods, constructors, and properties) whose full names contain DummyApp or DummyLib:
if ( wcsstr(pFunctionInfo->m_functionName,L"DummyApp") != NULL || 
     wcsstr(pFunctionInfo->m_functionName,L"DummyLib") != NULL )
Because the full name of a .NET method begins with its namespace, this filter will capture all methods in the DummyApp.exe and DummyLib.dll assemblies. If you don't add such a filter, the profiler will capture all system-related methods too, and there are a lot of them. This can, of course, be changed to suit your needs.
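The filter can be isolated into a small helper, sketched here in standard C++. The function names isUnderCoverage and startsWithNamespace are hypothetical. Note that wcsstr finds its argument anywhere in the name, so the first function is a "contains" test; the second is a stricter prefix variant, offered as an assumption about what you might want rather than as the article's code.

```cpp
#include <cstddef>
#include <cwchar>

// Returns true when the function name mentions one of the namespaces under
// coverage. std::wcsstr matches the substring anywhere in the name.
bool isUnderCoverage(const wchar_t* functionName) {
    if (functionName == nullptr || functionName[0] == L'\0') return false;
    return std::wcsstr(functionName, L"DummyApp") != nullptr ||
           std::wcsstr(functionName, L"DummyLib") != nullptr;
}

// Stricter variant (an assumption, not the article's code): true only when
// the name begins with the given namespace followed by a dot.
bool startsWithNamespace(const wchar_t* functionName, const wchar_t* ns) {
    std::size_t n = std::wcslen(ns);
    return std::wcsncmp(functionName, ns, n) == 0 && functionName[n] == L'.';
}
```

With the substring test, a system method whose name happens to contain DummyApp would also be logged; the prefix variant avoids that at the cost of one call per namespace.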
All actual logging is done using a LOG_TO_FILE macro defined in the Include\basehdr.h file. The following code writes an ID associated with the thread of execution to a text file specified in the file Include\basehlp.hpp (which I will examine in a moment):
LOG_TO_FILE( ("0x%08x;", m_win32ThreadID) )
Having a thread ID is useful when the application under test is multithreaded. It also gives you an easy way to determine if a line of output is actual data or other information such as comments.
It would be wonderful if you could identify a method using only its namespace prefix and its name, but because of overloading you need return type and parameter information to differentiate methods with the same name. Here's the code that logs the method return type and method name:
if ( pFunctionInfo->m_returnTypeStr[0] != NULL )               
    LOG_TO_FILE( ("%S ", pFunctionInfo->m_returnTypeStr) )

LOG_TO_FILE( ("%S(", pFunctionInfo->m_functionName) )
Dealing with data type names is a major issue. The problem is that there are three sets of type names: those used by the profiler (for example, int32), those used by the CLR and the .NET Framework (System.Int32), and those used by C# and other .NET-compliant languages (int). For now it is enough for you to realize that the profiler has one set of type names that will later have to be mapped to .NET type names.
The code that logs the function parameters uses an old C/C++ parsing technique, as shown here:
parameter = wcstok( pFunctionInfo->m_functionParameters, separator );
while ( parameter != NULL )
{
  LOG_TO_FILE( ("%S", parameter) )               
  parameter = wcstok( NULL, separator );
  if ( parameter != NULL )
    LOG_TO_FILE( (",") )
}  
The wcstok function is the wide-character version of the venerable strtok function. The code logs each parameter to a file, separated by the comma character. There is nothing particularly special about the comma—any character could have been used, but commas are standard delimiters.
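Because the signature of wcstok differs between older Microsoft compilers (two arguments) and standard C++ (three arguments), the same tokenize-and-join step can be sketched portably with std::wstring searches instead. The joinParameters function is a hypothetical name, and the separator set is passed in because the profiler's actual separator is not shown here.

```cpp
#include <string>

// Splits `parameters` on any character in `separators`, skipping empty
// tokens as wcstok does, and joins the tokens with commas.
// Unlike wcstok, the input string is not modified.
std::wstring joinParameters(const std::wstring& parameters,
                            const std::wstring& separators) {
    std::wstring result;
    std::wstring::size_type start = parameters.find_first_not_of(separators);
    while (start != std::wstring::npos) {
        std::wstring::size_type end = parameters.find_first_of(separators, start);
        if (!result.empty()) result += L',';
        result += parameters.substr(start, end - start);
        start = parameters.find_first_not_of(separators, end);
    }
    return result;
}
```

For example, assuming a space separator, the input "int32& float64& String[]" would be joined as "int32&,float64&,String[]", with no trailing comma to strip.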
Of the 11 files in the Include folder, only one needs to be edited. The basehlp.hpp file specifies the path and file name of the text file that the profiler writes the log data to. The original line was:
strcpy( logfile, "output.log" );
I replaced it with this:
time_t ltime;
time( &ltime );
char buffer[20];
_i64toa( ltime, buffer, 10 );
    
strcpy( logfile, "C:\\Coverage\\FFdataFiles\\ffcover" );
strcat( logfile, buffer );
strcat( logfile, ".log" );
The new code gets a Unix-style timestamp and uses it to create a file name like ffcover0123456789.log so that multiple log files can reside in the same folder. The location of the files is hardcoded, which is usually a bad idea but in this case leads to a simpler design.
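The same file-naming idea can be sketched with std::string, which avoids fixed-size buffers. The makeLogPath function is a hypothetical helper, and taking the directory as a parameter is an assumption made for illustration; the article's version hardcodes it.

```cpp
#include <ctime>
#include <string>

// Builds a log path of the form <directory>\ffcover<timestamp>.log.
// The directory parameter is an assumption; the article hardcodes
// C:\Coverage\FFdataFiles.
std::string makeLogPath(const std::string& directory, std::time_t timestamp) {
    return directory + "\\ffcover" +
           std::to_string(static_cast<long long>(timestamp)) + ".log";
}
```

A call such as makeLogPath("C:\\Coverage\\FFdataFiles", std::time(nullptr)) would produce a name based on the current time. Because the timestamp has one-second resolution, two runs that start within the same second would share a file name.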

Building ProfilerFF.dll
Building a DLL of any size is a nontrivial task. The system in this article is predominantly command-line oriented, so I will build my profiling DLL from the command line. Because the Visual Studio .NET GUI environment is so powerful, it is quite possible that you have never built a program or DLL this way, so I will explain the process in detail. As you'll see, it is quite easy once you've seen an example.
Although it is possible to invoke the compiler (cl.exe) and linker (link.exe) directly from the command line, there are so many options available that developers frequently employ a utility program to manage all of them. The nmake.exe program uses a data file (usually with a .mak extension) to determine the recipe for creating your project using build tools. Again, it is possible to call the nmake.exe program directly but there are many user environment variables, such as PATH and INCLUDE, that need to be set, so it is common to write a .bat file to manage them. In short, this .bat file sets up environment variables and calls nmake.exe, which in turn reads compiler options from a .mak file then calls the compiler and linker to produce the resulting DLL.
I created the short .bat file named build.bat with the key statements shown in Figure 5. The first two SET statements tell the shell environment where required files such as mspdb70.dll and executables such as nmake.exe reside. You will probably have to modify the part of the .bat file that sets user environment variables, depending on the PATH variable your shell inherits from the system environment variables. Try to build the DLL, and if you get an error message about a missing file, find the file and add another SET path statement.
The next three SET statements tell the shell where to look for files such as cor.h that are found in the various C++ #include statements. The final two SET statements tell the shell where library files used by the linker are located. Again, you may have to modify or add statements depending on your environment.
Finally, nmake.exe is called with the compiler and linker options specified in ProfilerFF.mak. The ProfilerFF.mak data file consists of 14 file references such as $(INTDIR)\ProfilerFF.obj, which specifies the location and name of the intermediate .obj file created during compilation.
With everything ready to go, you can build the code coverage profiling DLL from a command shell using this command:
C:\Coverage\FFprofiler>build.bat
You can see the shell command in Figure 1. Because nmake.exe looks for its makefile in the current directory, you need to invoke build.bat from the FFprofiler folder. It is unlikely that the build will work on your first attempt, but a few edits to the build.bat file as I've described will lead to a successful build.

Preparing for Code Coverage
Before you enable the code coverage profiler, you need to specify which .NET assemblies to include in the coverage monitoring, and determine which methods in those assemblies you want to watch. The profiling DLL will create a list of all methods that were actually entered, and you can compare it to the list you provided to determine your code coverage percentage.
My system manages these lists by storing data in simple text files. The first step is to manually create an ffcoverAssemblies.txt file and save it in the Coverage\FFdataFiles folder. For the example shown in Figure 1, Figure 2, and Figure 3, the contents of this file are:
C:\Coverage\AppsToTest\DummyApp\bin\Debug\DummyApp.exe
C:\Coverage\AppsToTest\DummyApp\DummyLib\bin\Debug\DummyLib.dll
These are the .NET assemblies that make up my system under test. Determining which assemblies are referenced in a large product can be a difficult task. You can use the .NET Framework Configuration tool, mscorcfg.msc, to determine assembly dependencies. You can also write a little utility program that uses the System.Reflection.Assembly.GetReferencedAssemblies method to find dependencies.
After you have created the ffcoverAssemblies.txt file, the next step is to find all the methods, constructors, and properties in the assemblies. I put this code, along with code to do the reporting, in one project named FFcover. Here's where the powerful methods in the System.Reflection namespace come into play. The algorithm is shown in Figure 6 in C#-like pseudocode.
In principle this is easy, but there are a few tricks along the way. I decided to read all of the assemblies into an ArrayList container and work from it, rather than use a file-oriented approach:
ArrayList al = new ArrayList();  // assemblies under coverage
StreamReader sr = new 
    StreamReader("C:\\Coverage\\FFdataFiles\\ffcoverAssemblies.txt");

string line;
while ( (line = sr.ReadLine()) != null )  // read assemblies to test
{
  Assembly a = Assembly.LoadFrom(line.Trim());
  al.Add(a);  // store assembly in array list al
}
Notice that information such as file paths is hardcoded and error checking has been omitted; this has been done for simplicity and clarity only. In a production system, plenty of error checking would need to be added and the hardcoded strings should instead be passed as parameters.
I get the types (classes, enumerations, and arrays) in each assembly using the following loop:
foreach (Assembly a in al)
{
  Type[] types = a.GetTypes();
•••
Then for each class I get the constructors, methods, and properties that belong to the class. Because the code for getting constructor, method, and property information from a class is similar, let's look at the code for methods only. You start by setting up a BindingFlags variable that acts as a filter for the kinds of methods you want to catch, as you can see in the following code:
foreach (Type type in types)
{
  BindingFlags flags = 
  BindingFlags.Public | 
  BindingFlags.NonPublic | 
  BindingFlags.Instance | 
  BindingFlags.Static |
  BindingFlags.DeclaredOnly;
•••
Depending on your system, you may want to modify these flags to include or exclude different kinds of methods. Next, you get all methods in the current class into an array:
MethodInfo[] methodinfos = type.GetMethods(flags);
You then iterate through each method, building up its signature in the following format:
return-type Namespace.Class::MethodName(param1,param2,. . . )
Starting with an empty string, you append the return type, a blank space, the full name of the Type (that is, the class name preceded by its namespace), and the method name:
foreach (MethodInfo method in methodinfos)
{
  string s = "" + method.ReturnType.FullName + " " + type.FullName + 
      "::" + method.Name + "(";
•••
After you get the method return type and name, you must get the parameter types. Remember that this is necessary because with overloading you can have multiple methods with the same name and return types, but with different parameter lists:
ParameterInfo[] parameterinfos = method.GetParameters();
foreach (ParameterInfo p in parameterinfos)
{
  if (p.ParameterType.FullName != "System.Decimal" &&
      p.ParameterType.FullName != "System.EventArgs")
  {
    s += p.ParameterType.FullName + ",";
  }
}
During the development of this code coverage system I noticed that the coverage profiler does not capture the System.Decimal data type or System.EventArgs parameters. This required a design decision: I could either modify the profiler source code to catch these or modify the C# analysis code. I chose the second approach, mostly because I wanted to change as little as possible in the profiler. However, given more time it would have been better to change the profiler source code itself.
After the method string has been built up, there is a critical processing step:
s = Mapped(s);  // map .NET type names to profiler type names
The entire string is modified using a Mapped function that replaces each .NET type name (System.Int32, for example) with the type name used by the profiler (int32). The two representations must match because the profiler outputs its own representation of data types for methods actually hit during testing. Here's the Mapped function:
static string Mapped(string s)
{
    string t = s;

    while (t.IndexOf("System.Int32") >= 0) 
        t = t.Replace("System.Int32", "int32");

  // etc.

    while (t.IndexOf("System.String") >= 0) 
        t = t.Replace("System.String", "String");

    return t;

}  
The mappings I used handle the C# data types because the application under test was written in C#. Depending on the system you're analyzing, you may have to add mapping statements to map other data types. The technique I used here, a while loop around each replacement, isn't very elegant; String.Replace already replaces every occurrence in one call, so the loops are redundant. If you're a fan of regular expressions, you can recast the Mapped function using them instead. A table lookup would also be appropriate.
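The table-lookup alternative can be sketched as follows, written in C++ (the profiler's language) rather than the C# of FFcover. The names replaceAll and mapTypeNames are hypothetical. Only the System.Int32 and System.String entries come from the article; the other entries are assumptions inferred from the type names that appear in the sample log output.

```cpp
#include <string>
#include <utility>
#include <vector>

// Replaces every occurrence of `from` (must be non-empty) in `text` with `to`.
void replaceAll(std::string& text, const std::string& from, const std::string& to) {
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
}

// Maps .NET type names in a signature to profiler type names via a table.
std::string mapTypeNames(std::string signature) {
    static const std::vector<std::pair<std::string, std::string>> table = {
        {"System.Int32", "int32"},     // from the article
        {"System.String", "String"},   // from the article
        {"System.Int64", "int64"},     // assumed from the sample log
        {"System.Double", "float64"},  // assumed from the sample log
        {"System.Void", "void"},       // assumed from the sample log
    };
    for (const auto& entry : table)
        replaceAll(signature, entry.first, entry.second);
    return signature;
}
```

Adding a mapping then means adding one table row. If one key were ever a prefix of another, the longer key would need to come first in the table.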
Finally, you strip off any trailing comma characters, append a closing parenthesis character, and write the result to the ffcoverMethods.txt file in the FFdataFiles folder:
if (s[s.Length-1] == ',') s = s.Remove(s.Length-1,1);
s += ")";

Console.WriteLine(s);
sw.WriteLine(s);
sw.Flush();
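Putting the steps together, the whole signature format can be sketched in one small C++ function. The buildSignature name and its parameters are hypothetical, and the inputs are assumed to be already mapped to profiler type names. This version inserts commas between parameters instead of stripping a trailing comma afterward.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds "return-type Namespace.Class::MethodName(param1,param2,...)".
// All type names are assumed to be in profiler format already.
std::string buildSignature(const std::string& returnType,
                           const std::string& typeName,
                           const std::string& methodName,
                           const std::vector<std::string>& parameterTypes) {
    std::string s = returnType + " " + typeName + "::" + methodName + "(";
    for (std::size_t i = 0; i < parameterTypes.size(); ++i) {
        if (i > 0) s += ",";  // separator only between parameters
        s += parameterTypes[i];
    }
    s += ")";
    return s;
}
```

Given the return type void, the type DummyLib.Dummy, the method Bar, and the parameters int32&, float64&, and String[], the function should produce the same text that the profiler logs for that method.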

Running Tests Under Code Coverage
After the code coverage profiling DLL has been built, and information has been gathered about which methods in the product should be hit while testing, you can run a test or a suite of tests under code coverage. The steps are easy: enable the code coverage environment, run a test or a suite of tests, and disable the code coverage environment.
The code coverage environment is activated with a simple enable.bat file that registers the profiler and sets three shell variables. The key statements in enable.bat are:
@set DBG_PRF_LOG=0x1
@set Cor_Enable_Profiling=0x1
@regsvr32 /s debug\ProfilerFF.dll
@set COR_PROFILER={5AC86959-7927-4177-9802-A540F1967874}
The first set statement tells ProfilerFF.dll to send output to the log file whose name is built in the Include\basehlp.hpp file. Recall that this file is named ffcoverXXX.log, where the Xs are a timestamp. The second statement sets the value of the environment variable that the CLR checks to determine whether it should load a profiler (0x1 represents true). The third statement registers the profiling DLL. The last statement sets an ID value for the profiler that is running in the current environment. It is the GUID specified in the ProfilerCallback.h file.
After calling enable.bat, you simply run a test or a suite of tests just as you would normally. Your tests can be manual or automated. All .NET activity is monitored, and when the application terminates, the information about which methods were actually entered is saved as an ffcoverXXX.log file in the FFdataFiles folder. Here are a few lines of output from the example shown at the beginning of this article:
0x00000c58;void DummyApp.Form1::Main()
0x00000c58;void DummyApp.Form1::button2_Click(Object)
0x00000c58;void DummyLib.Dummy::.ctor(int32,int64)
0x00000c58;void DummyLib.Dummy::Bar(int32&,float64&,String[])
Each line of data is preceded by an arbitrary thread ID in case your application is multithreaded. Notice that parameters such as int32 are in profiler format and not in .NET format (System.Int32).
If you run a second test, a second log file with a different timestamp will be saved. This is a common scenario in code coverage—you run a sequence of tests to obtain the cumulative coverage ratio. After you are finished running your tests, you want to turn off code coverage. You do this with a disable.bat file that has just one important statement:
@set Cor_Enable_Profiling=0x0
This statement effectively instructs the profiler not to monitor activity anymore. The relationships between the various parts of the Fundamental Function code coverage system are illustrated in the diagram in Figure 7.
Figure 7 Code Coverage System 

Reporting Code Coverage
After you have built a profiling DLL, created a list of all methods in the system under test that can potentially be entered during testing, and produced a list that contains all the methods that were actually entered, you are ready to create a results report. I organized my reporting code algorithm around three hash tables:
Hashtable ht1 = new Hashtable();  // holds all potential methods
Hashtable ht2 = new Hashtable();  // holds methods actually hit
Hashtable ht3 = new Hashtable();  // holds methods missed
In pseudocode, the algorithm looks like this:
load ht1 from ffcoverMethods.txt with methods that could be hit
load ht2 from all *.log files with methods actually hit

for each method in ht1 loop
{
  if method is not in ht2, add method to ht3
}

percent coverage = ht2.Count / ht1.Count * 100
Loading the first hash table with names of methods that could be entered is easy because I already have this data in the ffcoverMethods.txt file. I simply open the file for reading, traverse through it, and add them like so:
if (!ht1.ContainsKey(line))
  ht1.Add(line, line);
Loading the second hash table with the methods that were actually entered during the test requires two steps. First I examine the FFdataFiles folder to find all files with names of the form ffcoverXXX.log, as shown here:
DirectoryInfo dir = new 
    DirectoryInfo("C:\\Coverage\\FFdataFiles\\");
FileInfo[] files = dir.GetFiles("ff*.log");  
foreach (FileInfo file in files)
{
  al1.Add(file.FullName);  
}
I then take each file, open it for reading, and add data lines to the hash table. The main processing loop is shown in Figure 8.
Because I stored a thread ID along with method data, the StartsWith method makes it easy to tell when I've hit a line of data. When I'm at a data line, I use the String.Split method to parse out the method data from the thread ID, check to make sure that method hit is not already in the hash table, and then store it into the hash table.
Although not absolutely necessary, I like to determine which methods were not hit. This gives me feedback to help design new tests that should be added to increase coverage. To do this, I iterate through the first hash table of methods that could be hit and if any method is not in the hash table of methods actually hit, it must have been missed. Because hash tables are designed for quick data lookups, you might not have seen the code to traverse all data in a hash table before:
foreach(DictionaryEntry e in ht1)  
{
  if (!ht2.ContainsValue(e.Value))  // if not in hit table
    ht3.Add(e.Value, e.Value);      // it was missed
} 
Finally, I do a little math to determine the percentage of methods hit, as shown in the following code:
int numtotal = ht1.Count;
int numhit = ht2.Count;
int nummissed = ht3.Count;
double percent = (double)numhit/(double)numtotal * 100;
percent = Math.Round(percent, 0);
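The last two steps can also be combined into a small helper. This is a sketch under two assumptions that the article does not state: both tables are keyed by method name, and the hit table may contain entries that are not in the potential-methods list. For that reason it derives the hit count from the missed list rather than from the hit table's Count, and it guards against an empty method list. The class name CoverageSummary is hypothetical.

```csharp
using System;
using System.Collections;

static class CoverageSummary
{
    // Returns the method names in 'all' that do not appear as keys in 'hit'.
    // Assumes both tables are keyed by method name.
    public static ArrayList Missed(Hashtable all, Hashtable hit)
    {
        ArrayList missed = new ArrayList();
        foreach (DictionaryEntry e in all)
        {
            if (!hit.ContainsKey(e.Key))   // hashed key lookup
                missed.Add(e.Key);
        }
        missed.Sort();                     // stable ordering for the report
        return missed;
    }

    // Percentage of potential methods that were hit, rounded to a whole number.
    public static double PercentHit(int total, int missed)
    {
        if (total == 0)
            return 0.0;                    // avoid dividing by zero
        int hit = total - missed;
        return Math.Round((double)hit / total * 100, 0);
    }
}
```

With three potential methods and one missed, PercentHit(3, 1) rounds 66.67 to 67.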
I now have all the information I need to produce a code coverage analysis report.

Before and After .NET
If you know a little bit about code coverage in environments other than .NET, it's easier to understand how the Fundamental Function code coverage technique presented in this article fits into the overall production cycle. The main difficulty in traditional environments is that you cannot work with normal product code. You must first obtain a special build of the product you want to analyze. Then you must process that release, inserting code hooks to notify the profiler when methods or blocks of statements have been entered. This process is usually called instrumentation. This is not only a serious technical challenge, but on large products the management issues are difficult too.
Because all .NET code is executed in the CLR, you can trap method entries there using code built in the usual way, as demonstrated in this article. The overall result is that .NET code coverage is much easier to perform. In large projects before .NET, you would typically assign one full-time engineer just to manage the code coverage process, and even then results were inconsistent at best. The code coverage system presented in this article has been used successfully on several large .NET projects and is simple enough to be used by team members on an ad hoc basis.
In terms of the overall product development cycle, because Fundamental Function code coverage is so easy to perform, you can employ it earlier and more often in the development cycle. Pre-.NET code coverage is normally so time-consuming that it isn't feasible in early stages of development when the product is unstable (just when you need code coverage the most). Additionally, the simplicity allows daily code coverage analysis as part of the code check-in and build processes. In many product groups, before checking in new code, developers run a small set of developer regression tests to ensure that the new code has not broken old functionality. It is also very common for each new build of a product to be subjected to a set of build verification tests to ensure basic functionality. In both cases, these sets of tests can be analyzed with code coverage analysis to determine their effectiveness, even in a rapidly changing environment.
Compared with code coverage techniques before .NET, Fundamental Function coverage does have some disadvantages. First, the coverage is not as granular as other techniques. I have found that this is not a serious drawback. By breaking up large methods into smaller methods (which is good design anyway), relatively little coverage information is lost. Additionally, I have noticed that the typical method in my .NET projects tends to have approximately 25 percent fewer lines of code because of various efficiencies. Second, the ad hoc nature of the technique presented here means you don't get detailed reports. In my experience, this is a small price to pay for the greatly increased flexibility that this system provides. It's better to get very good code coverage today than very, very good code coverage tomorrow.

Conclusion
Even though code coverage analysis is a critical part of testing any significant software product, it is often so time-consuming in development environments before .NET that it is simply not done. Without a code coverage analysis, you can never be sure that your test cases touch all the product code. It is not unusual for a test team to believe they have full coverage, then run code coverage analysis and discover that their initial suite of tests hits less than 75 percent of the code base. In the old days, it may have been acceptable to push the product out the door and worry about additional testing later, but with the increased focus on security, this is no longer a viable option.

James McCaffrey works for Volt Information Sciences Inc., where he manages technical training for software engineers working at the Microsoft Redmond, WA campus. He has worked on several Microsoft products including Internet Explorer and MSN Search. James can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.

© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.