November 2014

Volume 29 Number 11


Application Instrumentation: Application Analysis with Pin

Hadi Brais | November 2014

Program analysis is a fundamental step in the development process. It involves analyzing a program to determine how it will behave at run time. There are two types of program analysis: static and dynamic.

You’d perform a static analysis without running the target program, usually during source code compilation. Visual Studio provides a number of excellent tools for static analysis. Most modern compilers automatically perform static analysis to ensure the program honors the language’s semantic rules and to safely optimize the code. Although static analysis isn’t always accurate, its main benefit is pointing out potential problems with code before you run it, reducing the number of debugging sessions and saving precious time.

You’d perform a dynamic program analysis while running the target program. When the program ends, the dynamic analyzer produces a profile with behavioral information. In the Microsoft .NET Framework, the just-in-time (JIT) compiler performs dynamic analysis at run time to further optimize the code and to ensure it won’t do anything that violates the type system.

The primary advantage of static analysis over dynamic analysis is it ensures 100 percent code coverage. To ensure such high code coverage with dynamic analysis, you usually need to run the program many times, each time with different input so the analysis takes different paths. The primary advantage of dynamic analysis is it can produce detailed and accurate information. When you develop and run a .NET application or secure C++ application, both kinds of analysis will be automatically performed under the hood to ensure that the code honors the rules of the framework.
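To make the coverage point concrete, here is a minimal sketch in plain C++ (no Pin involved). The function Classify and its path labels are hypothetical, invented only for this illustration. Each path records its label the way a dynamic analyzer would note which code actually ran, so a single input exercises only one of the three paths.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical function under analysis: three paths, selected by the input.
// Each path records its (made-up) label in `covered`.
int Classify(int x, std::set<std::string>& covered) {
  if (x < 0) {
    covered.insert("negative");
    return -1;
  }
  if (x == 0) {
    covered.insert("zero");
    return 0;
  }
  covered.insert("positive");
  return 1;
}

// Runs Classify on each input and returns how many of the three
// paths were observed at least once.
std::size_t PathsCovered(const int* inputs, std::size_t count) {
  std::set<std::string> covered;
  for (std::size_t i = 0; i < count; ++i) {
    Classify(inputs[i], covered);
  }
  return covered.size();
}
```

Calling PathsCovered with the single input 5 observes one path, while the inputs 5, 0 and -3 together observe all three. A static analyzer would see all three paths without running the function at all.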

The focus in this article will be on dynamic program analysis, also known as profiling. There are many ways to profile a program, such as using framework events, OS hooks and dynamic instrumentation. While Visual Studio provides a profiling framework, its dynamic instrumentation capabilities are currently limited. For all but the simplest dynamic instrumentation scenarios, you’ll need a more advanced framework. That’s where Pin comes into play.

What Is Pin?

Pin is a dynamic binary instrumentation framework developed by Intel Corp. that lets you build program analysis tools called Pintools for Windows and Linux platforms. You can use these tools to monitor and record the behavior of a program while it’s running. Then you can effectively evaluate many important aspects of the program such as its correctness, performance and security.

You can integrate the Pin framework with Microsoft Visual Studio to easily build and debug Pintools. In this article, I’ll show how to use Pin with Visual Studio to develop and debug a simple yet useful Pintool. The Pintool will detect critical memory issues such as memory leaking and double freeing allocated memory in a C/C++ program.

To better understand the nature of Pin, look at the complete definition term by term:

  • A framework is a collection of code upon which you write a program. It typically includes a runtime component that partially controls program execution (such as startup and termination).
  • Instrumentation is the process of analyzing a program by adding or modifying code—or both.
  • Binary indicates the code being added or modified is machine code in binary form.
  • Dynamic indicates the instrumentation processes are performed at run time, while the program is executing.

The complete phrase “dynamic binary instrumentation” is a mouthful, so people usually use the acronym DBI. Pin is a DBI framework.

You can use Pin on Windows (IA32 and Intel64), Linux (IA32 and Intel64), Mac OS X (IA32 and Intel64) and Android (IA32). Pin also supports the Intel Xeon Phi microprocessor for supercomputers. It not only supports Windows, but also seamlessly integrates with Visual Studio. You can write Pintools in Visual Studio and debug them with the Visual Studio Debugger. You can even develop debugging extensions for Pin to use seamlessly from Visual Studio.

Get Started with Pin

Although Pin is proprietary software, you can download and use it free of charge for non-commercial use. Pin doesn’t yet support Visual Studio 2013, so I’ll use Visual Studio 2012. If you’ve installed both Visual Studio 2012 and 2013, you can create and open Visual Studio 2012 projects from 2013 and use the C++ libraries and tools of Visual Studio 2012 from 2013.

Download Pin from intel.ly/1ysiBs4. Besides the documentation and the binaries, Pin includes source code for a large collection of sample Pintools you’ll find in source/tools. From the MyPinTool folder, open the MyPinTool solution in Visual Studio.

Examine the project properties in detail to determine the proper Pintool configuration. All Pintools are DLL files. Therefore, the project Configuration Type should be set to Dynamic Library (.dll). You’ll also have to specify all headers, files, libraries and a number of preprocessor symbols required by the Pin header files. Set the entry point to Ptrace_DllMainCRTStartup@12 to properly initialize the C runtime. Specify the /export:main switch to export the main function.

You can either use the properly configured MyPinTool project or create a new project and configure it yourself. You can also create a property sheet containing the required configuration details and import that into your Pintool project.

Pin Granularity

Pin lets you insert code into specific places in the program you’re instrumenting—typically just before or after executing a particular instruction or function. For example, you might want to record all dynamic memory allocations to detect memory leaks.

There are three main levels of granularity to Pin: routine, instruction and image. Pin also has one more, less obvious level: trace granularity. A trace is a straight-line instruction sequence with exactly one entry point. It usually ends with an unconditional branch, such as a call, a return or an unconditional jump. A trace may include multiple exit points as long as they’re conditional branches. If Pin detects a branch to a location within a trace, it ends that trace at that location and starts a new one, so the single-entry property always holds.

Pin offers these instrumentation granularities to help you choose the appropriate trade-off between performance and level of detail. Instrumenting at the instruction level might result in severe performance degradation, because there could be billions of instructions. On the other hand, instrumenting at the function level might be too general and, therefore, it might increase the complexity of the analysis code. Traces help you instrument without compromising performance or detail.
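The trace definition can be illustrated with a toy model. The sketch below is not Pin code: the Kind enumeration and SplitIntoTraces are hypothetical names, and the model only applies the rule that a trace ends at an unconditional branch. It ignores the rule about splitting a trace when a branch targets its middle, and it works on a static list of instructions rather than on a running program.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction categories, invented for this sketch.
enum class Kind { Plain, CondBranch, UncondBranch };

// Splits a straight-line instruction sequence into traces and returns the
// length of each one. A trace ends right after an unconditional branch;
// conditional branches are only extra exit points and do not end it.
std::vector<std::size_t> SplitIntoTraces(const std::vector<Kind>& code) {
  std::vector<std::size_t> lengths;
  std::size_t current = 0;
  for (Kind k : code) {
    ++current;
    if (k == Kind::UncondBranch) {
      lengths.push_back(current);
      current = 0;
    }
  }
  // Instructions left over after the last unconditional branch
  // form a final trace.
  if (current > 0) lengths.push_back(current);
  return lengths;
}
```

For the sequence Plain, CondBranch, Plain, UncondBranch, Plain, UncondBranch, Plain, the function reports three traces of lengths 4, 2 and 1. The conditional branch in the first trace adds an exit point but doesn't end the trace.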

Write a Pintool

Now it’s time to write a useful Pintool. The purpose of this Pintool example is to detect memory deallocation problems common to C/C++ programs. The simple Pintool I’m going to write can diagnose an existing program without having to modify the source code or recompile it, because Pin performs its work at run time. Here are the problems the Pintool will detect:

  • Memory leaks: Memory allocated, but not freed.
  • Double freeing: Memory deallocated more than once.
  • Freeing unallocated memory: Deallocating memory that hasn’t been allocated (such as calling free and passing NULL to it).

To simplify the code, I’ll assume the following:

  • The main function of the program is called main. I won’t consider other variants.
  • The only functions that allocate and free memory are new/malloc and delete/free, respectively. I won’t consider calloc and realloc, for example.
  • The program consists of one executable file.

Once you understand the code, you can modify it and make the tool much more practical.

Define the Solution

To detect those memory problems, the Pintool must monitor calls to the allocation and deallocation functions. Because the new operator calls malloc internally, and the delete operator calls free internally, I can just monitor the calls to malloc and free.

Whenever the program calls malloc, I’ll record the returned address (either NULL or the address of the allocated memory region). Whenever it calls free, I’ll match the address of the memory being freed with my records. If it has been allocated but not freed, I’ll mark it as freed. However, if it has been allocated and freed, that would be an attempt to free it again, which indicates a problem. Finally, if there’s no record the memory being freed has been allocated, that would be an attempt to free unallocated memory. When the program terminates, I’ll again check records for those memory regions that have been allocated but not freed to detect memory leaks.
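Before involving Pin at all, the bookkeeping just described can be sketched as a small stand-alone class. This is only an illustration under a few assumptions: the name AllocationTracker is hypothetical, std::uintptr_t stands in for the address type, and diagnostics are returned as strings instead of being written to a file.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper that models the record keeping described above.
class AllocationTracker {
 public:
  // Call with the address malloc returned. A null result is not recorded.
  void OnMalloc(std::uintptr_t addr) {
    if (addr == 0) return;
    freed_[addr] = false;  // Live; also covers reuse of a freed address.
  }

  // Call with the address passed to free. Returns a diagnostic message,
  // or an empty string when the free is valid.
  std::string OnFree(std::uintptr_t addr) {
    std::map<std::uintptr_t, bool>::iterator it = freed_.find(addr);
    if (it == freed_.end()) return "free of unallocated memory";
    if (it->second) return "double free";
    it->second = true;  // Mark as freed.
    return "";
  }

  // Addresses that were allocated but never freed.
  std::vector<std::uintptr_t> Leaks() const {
    std::vector<std::uintptr_t> live;
    for (std::map<std::uintptr_t, bool>::const_iterator it = freed_.begin();
         it != freed_.end(); ++it) {
      if (!it->second) live.push_back(it->first);
    }
    return live;
  }

 private:
  // Maps an address to true once it has been freed.
  std::map<std::uintptr_t, bool> freed_;
};
```

Keeping freed addresses in the map, rather than erasing them, is what makes a second free of the same address distinguishable from a free of memory that was never allocated.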

Choose a Granularity

Pin can instrument a program at four granularities: image, routine, trace and instruction. Which is best for this Pintool? While any of the granularities will do the job, I need to choose the one that incurs the least performance overhead. In this case, the image granularity would be the best. Once the image of the program is loaded, the Pintool can locate the malloc and free code within the image and insert the analysis code. This way, instrumentation overhead will be per-image instead of, for example, per-instruction.

To use the Pin API, I must include the pin.H header file in the code. The Pintool will be writing the results to a file, so I also have to include the fstream header file. I’ll use the map STL type to keep track of the memory being allocated and deallocated. This type is defined in the map header file. I’ll also use the cerr stream to show informative messages:

#include "pin.H"
#include <iostream>
#include <fstream>
#include <map>

I will define three symbols to hold the names of the functions malloc, free and main:

#define MALLOC "malloc"
#define FREE "free"
#define MAIN "main"

These are the required global variables:

bool Record = false;
map<ADDRINT, bool> MallocMap;
ofstream OutFile;
string ProgramImage;
KNOB<string> OutFileName(KNOB_MODE_WRITEONCE, 
  "Pintool", "o", "memtrace.txt",
  "Memory trace file name");

The Record variable indicates whether I’m inside the main function. The MallocMap variable holds the state of each allocated memory region. The ADDRINT type is defined by pin.H and represents a memory address. If the value associated with a memory address is TRUE, it has been deallocated.

The ProgramImage variable holds the name of the program image. The last variable is a KNOB. This represents a command-line switch to the Pintool. Pin makes it easy to define switches for a Pintool. For each switch, define a KNOB variable. The template type parameter string represents the type of the values that the switch will take. Here, the KNOB lets you specify the name of the output file of the Pintool through the “o” switch. The default value is memtrace.txt.

Next, I have to define the analysis routines executed at specific points in the code sequence. I need an analysis function, as defined in Figure 1, called just after malloc returns to record the address of the allocated memory. This function takes the address returned by malloc and returns nothing.

Figure 1 The RecordMalloc Analysis Routine Called Every Time Malloc Returns

VOID RecordMalloc(ADDRINT addr) {
  if (!Record) return;
  if (addr == NULL) {
    cerr << "Heap full!" << endl;
    return;
  }
  map<ADDRINT, bool>::iterator it = MallocMap.find(addr);
  if (it != MallocMap.end()) {
    if (it->second) {
      // Allocating a previously allocated and freed memory.
      it->second = false;
    }
    else {
      // Malloc should not allocate memory that has
      // already been allocated but not freed.
      cerr << "Impossible!" << endl;
    }
  }
  else {
    // First time allocating at this address.
    MallocMap.insert(pair<ADDRINT, bool>(addr, false));
  }
}

This function will be called every time malloc returns. However, I’m only interested in the memory if it’s part of the instrumented program. So I’ll record the address only when Record is TRUE. If the address is NULL, the allocation failed, so I’ll report that on cerr and record nothing.

Then the function determines whether the address is already in MallocMap. If it is, then it must have been previously allocated and deallocated and, therefore, it’s now being reused. If the address isn’t in MallocMap, I’ll insert it with FALSE as the value indicating it hasn’t been freed.

I’ll define another analysis routine, shown in Figure 2, that I’ll have called just before free is called to record the address of the memory region being freed. Using MallocMap, I can easily detect if the memory being freed has already been freed or it hasn’t been allocated.

Figure 2 The RecordFree Analysis Routine

VOID RecordFree(ADDRINT addr) {
  if (!Record) return;
  map<ADDRINT, bool>::iterator it = MallocMap.find(addr);
  if (it != MallocMap.end()) {
    if (it->second) {
      // Double freeing.
      OutFile << "Object at address " << hex << addr
        << " has been freed more than once." << endl;
    }
    else {
      it->second = true; // Mark as freed.
    }
  }
  else {
    // Freeing unallocated memory.
    OutFile << "Freeing unallocated memory at " 
      << hex << addr << "." << endl;
  }
}

Next, I’ll need two more analysis routines to mark the entry to and the return from the main function:

VOID RecordMainBegin() {
  Record = true;
}
VOID RecordMainEnd() {
  Record = false;
}

Analysis routines define the code that Pin inserts into the program. I also have to tell Pin where to insert these routines. That’s the purpose of instrumentation routines. I defined an instrumentation routine as shown in Figure 3. This routine is called every time an image is loaded in the running process. When the program image is loaded, I’ll tell Pin to insert the analysis routines at the appropriate points.

Figure 3 The Image Instrumentation Routine

VOID Image(IMG img, VOID *v) {
  if (IMG_Name(img) == ProgramImage) {
    RTN mallocRtn = RTN_FindByName(img, MALLOC);
    if (mallocRtn.is_valid()) {
      RTN_Open(mallocRtn);
      RTN_InsertCall(mallocRtn, IPOINT_AFTER, (AFUNPTR)RecordMalloc,
        IARG_FUNCRET_EXITPOINT_VALUE,
        IARG_END);
      RTN_Close(mallocRtn);
    }
    RTN freeRtn = RTN_FindByName(img, FREE);
    if (freeRtn.is_valid()) {
      RTN_Open(freeRtn);
      RTN_InsertCall(freeRtn, IPOINT_BEFORE, (AFUNPTR)RecordFree,
        IARG_FUNCARG_ENTRYPOINT_VALUE, 0,
        IARG_END);
      RTN_Close(freeRtn);
    }
    RTN mainRtn = RTN_FindByName(img, MAIN);
    if (mainRtn.is_valid()) {
      RTN_Open(mainRtn);
      RTN_InsertCall(mainRtn, IPOINT_BEFORE, (AFUNPTR)RecordMainBegin,
        IARG_END);
      RTN_InsertCall(mainRtn, IPOINT_AFTER, (AFUNPTR)RecordMainEnd,
        IARG_END);
      RTN_Close(mainRtn);
    }
  }
}

The IMG object represents the executable image. All Pin functions that operate at the image level start with IMG_*. For example, IMG_Name returns the name of the specified image. Similarly, all Pin functions that operate at the routine level start with RTN_*. For example, RTN_FindByName accepts an image and a C-style string and returns an RTN object representing the routine for which I’m looking. If the requested routine is defined in the image, the returned RTN object would be valid. Once I find the malloc, free and main routines, I can insert analysis routines at the appropriate points using the RTN_InsertCall function.

This function accepts three mandatory arguments followed by a variable number of arguments:

  • The first is the routine I want to instrument.
  • The second is an enumeration of type IPOINT that specifies where to insert the analysis routine.
  • The third is the analysis routine to be inserted.

Then I can specify a list of arguments to be passed to the analysis routine. This list must be terminated by IARG_END. To pass the return value of the malloc function to the analysis routine, I’ll specify IARG_FUNCRET_EXITPOINT_VALUE. To pass the argument of the free function to the analysis routine, I’ll specify IARG_FUNCARG_ENTRYPOINT_VALUE followed by the index of the argument of the free function. All these values starting with IARG_* are defined by the IARG_TYPE enumeration. The call to RTN_InsertCall has to be wrapped by calls to RTN_Open and RTN_Close so the Pintool can insert the analysis routines.
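The reason the list must end with IARG_END is that a C-style variadic function has no way of knowing how many arguments it received. The sketch below shows the same convention in isolation. It is not Pin’s implementation: the ArgTag values, ArgSpec and CollectArgs are hypothetical names made up for this example, with ARG_END playing the role of IARG_END and ARG_PARAM, like IARG_FUNCARG_ENTRYPOINT_VALUE, being followed by an argument index.

```cpp
#include <cstdarg>
#include <vector>

// Hypothetical argument tags for this sketch only.
enum ArgTag { ARG_RETURN_VALUE, ARG_PARAM, ARG_END };

// One parsed entry; index is -1 when the tag takes no extra value.
struct ArgSpec {
  ArgTag tag;
  int index;
};

// Reads tags until it sees ARG_END. ARG_PARAM consumes one extra int
// (the parameter index); ARG_RETURN_VALUE consumes nothing.
std::vector<ArgSpec> CollectArgs(int firstTag, ...) {
  std::vector<ArgSpec> specs;
  va_list ap;
  va_start(ap, firstTag);
  int tag = firstTag;
  while (tag != ARG_END) {
    ArgSpec spec = { static_cast<ArgTag>(tag), -1 };
    if (tag == ARG_PARAM) spec.index = va_arg(ap, int);
    specs.push_back(spec);
    tag = va_arg(ap, int);  // Next tag, or the terminating ARG_END.
  }
  va_end(ap);
  return specs;
}
```

A call such as CollectArgs(ARG_PARAM, 0, ARG_END) yields one entry with index 0. Leaving out the terminator would make the loop read past the arguments that were actually passed, which is undefined behavior, so the sentinel is not optional.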

Now that I’ve defined my analysis and instrumentation routines, I’ll have to define a finalization routine. This will be called upon termination of the instrumented program. It accepts two arguments, one being the code argument that holds the value returned from the main function of the program. The other will be discussed later. I’ve used a range-based for loop to make the code more readable:

VOID Fini(INT32 code, VOID *v) {
  for (pair<ADDRINT, bool> p : MallocMap) {
    if (!p.second) {
      // Unfreed memory.
      OutFile << "Memory at " << hex << p.first
        << " allocated but not freed." << endl;
    }
  }
  OutFile.close();
}

All I have to do in the finalization routine is to iterate over MallocMap and detect those allocations that haven’t been freed. The return from Fini marks the end of the instrumentation process.

The last part of the code is the main function of the Pintool. In the main function, PIN_Init is called to have Pin parse the command line and initialize the Knobs. Because I’m searching for functions using their names, Pin has to load the symbol table of the program image. I can do this by calling PIN_InitSymbols. The function IMG_AddInstrumentFunction registers the instrumentation function Image to be called every time an image is loaded.

Also, the finalization function is registered using PIN_AddFiniFunction. Note that the second argument to these functions is passed to the v parameter. I can use this parameter to pass any additional information to instrumentation functions. Finally, PIN_StartProgram is called to start the program I’m analyzing. This function actually never returns to the main function. Once it’s called, Pin takes over everything:

int main(int argc, char *argv[]) {
  PIN_Init(argc, argv);
  ProgramImage = argv[6]; // Assume that the image name is always at index 6.
  PIN_InitSymbols();
  OutFile.open(OutFileName.Value().c_str());
  IMG_AddInstrumentFunction(Image, NULL);
  PIN_AddFiniFunction(Fini, NULL);
  PIN_StartProgram();
  return 0;
}

Assembling all these pieces of code constitutes a fully functional Pintool.

Run the Pintool

You should be able to build this project without any errors. You’ll also need a program to test the Pintool. You can use the following test program:

#include <cstdlib> // malloc and free
#include <new>
void foo(char* y) {
  int *x = (int*)malloc(4);
}
int main(int argc, char* argv[]) {
  free(NULL);
  foo(new char[10]);
  return 0;
}

Clearly, this program is suffering from two memory leaks and one unnecessary call to free, indicating a problem with the program logic. Create another project that includes the test program. Build the project to produce an EXE file.

The final step to run the Pintool is to add Pin as an external tool to Visual Studio. From the Tools menu, select External Tools. A dialog box will open as shown in Figure 4. Click the Add button to add a new external tool. The Title should be Pin and the Command should be the path to the pin.exe file. The Arguments field holds the arguments to be passed to pin.exe. The -t switch specifies the path to the Pintool DLL. Specify the program to be instrumented after the two hyphens. Click OK and you should be able to run Pin from the Tools menu.

Figure 4 Add Pin to Visual Studio Using the External Tools Dialog Box

While the program is running, the Output window will print anything you write to the cerr and cout streams. The cerr stream usually carries informative messages from the Pintool during execution. Once Pin terminates, you can view the results by opening the file the Pintool has created. By default, this is called memtrace.txt. When you open the file, you should see something like this:

Freeing unallocated memory at 0.
Memory at 9e5108 allocated but not freed.
Memory at 9e5120 allocated but not freed.

If you have more complex programs that adhere to the Pintool assumptions, you should instrument them using the Pintool, as you might find other memory issues of which you were unaware.

Debug the Pintool

When developing a Pintool, you’ll stumble through a number of bugs. You can seamlessly debug it with the Visual Studio Debugger by adding the -pause_tool switch. The value of this switch specifies the number of seconds Pin will wait before it actually runs the Pintool. This lets you attach the Visual Studio Debugger to the process running the Pintool (which is the same as the process running the instrumented program). Then you can debug your Pintool normally.

The Pintool I’ve developed here assumes the name of the image is at index 6 of the argv array. So if you add the -pause_tool switch and its value, the image name will move to index 8. You can automate this by writing a bit more code.
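One way to avoid the hard-coded index is to search the command line for the two hyphens, because the program to be instrumented is specified right after them. The helper below is a minimal sketch of that idea; FindProgramImage is a hypothetical name and isn’t part of the Pin API. It assumes, as the Pintool already does, that argv holds the full Pin command line.

```cpp
#include <cstring>
#include <string>

// Returns the argument that follows the first "--" on the command line,
// or an empty string if there is no "--" or nothing comes after it.
std::string FindProgramImage(int argc, const char* const argv[]) {
  for (int i = 0; i + 1 < argc; ++i) {
    if (std::strcmp(argv[i], "--") == 0) return argv[i + 1];
  }
  return std::string();
}
```

In the Pintool’s main function, the assignment ProgramImage = argv[6] could then become ProgramImage = FindProgramImage(argc, argv), and the result would stay correct no matter how many switches precede the two hyphens.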

Wrapping Up

To further develop your skills, you can enhance the Pintool so it can detect other kinds of memory problems such as dangling pointers and wild pointers. Also, the Pintool output isn’t very useful because it doesn’t point out which part of the code is causing the problem. It would be nice to print the name of the variable causing the problem and the name of the function in which the variable is declared. This would help you easily locate and fix the bug in the source code. While printing function names is easy, printing variable names is more challenging because of the lack of support from Pin.
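As a rough idea of what reporting a function name involves, the sketch below resolves an address to the function that contains it, using a table of function start addresses. It is a stand-alone model, not Pin code: SymbolTable and FunctionContaining are hypothetical names, the table would have to be filled from the image’s symbols, and the model assumes each function extends up to the start of the next one.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical symbol table: function start address -> function name.
typedef std::map<std::uintptr_t, std::string> SymbolTable;

// Returns the name of the function whose start address is the greatest
// one not exceeding addr, or an empty string if addr lies before the
// first known function.
std::string FunctionContaining(const SymbolTable& symbols,
                               std::uintptr_t addr) {
  // upper_bound finds the first function that starts after addr,
  // so the function containing addr is the entry just before it.
  SymbolTable::const_iterator it = symbols.upper_bound(addr);
  if (it == symbols.begin()) return std::string();
  --it;
  return it->second;
}
```

To use something like this in the Pintool, the analysis routines would also need the address of the instruction that called malloc or free, which the version presented in this article doesn’t record.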

There are a lot of interactions happening between Pin, the Pintool and the instrumented program. It’s important to understand these interactions when developing advanced Pintools. For now, you should work through the examples provided with Pin to gain a better understanding of its power.


Hadi Brais is a Ph.D. scholar at the Indian Institute of Technology Delhi (IITD), researching optimizing compiler design for the next-generation memory technology. He spends most of his time writing code in C/C++/C# and digging deep into the CLR and CRT. He blogs at hadibrais.wordpress.com. Reach him at hadi.b@live.com.

Thanks to the following technical expert for reviewing this article: Preeti Ranjan Panda