.NET Internals

Rewrite MSIL Code on the Fly with the .NET Framework Profiling API

Aleksandr Mikunov

Code download available at: NETProfilingAPI.exe (2,901 KB)

This article assumes you're familiar with the CLR and C#

Level of Difficulty: 1 2 3

SUMMARY

In this article, the author shows how to dynamically rewrite Microsoft Intermediate Language code on the fly using the Profiling API of the CLR. Unlike approaches based on Reflection.Emit, this scheme works with the existing assemblies and doesn't require the creation of proxy or dynamic assemblies. The need for IL code rewriting emerges when you want to make your changes transparent to the client and preserve the identity of classes. This technique can be used for creation of interceptors, pre- and post-processing method calls, and code instrumentation and verification.

Contents

Internal Representation of Methods by the CLR
Basics of MSIL Code Rewriting
More Advanced Techniques
Profiler Example
Conclusion

As a developer you have probably encountered a scenario where one application needs to preprocess and post-process calls to another application (whose source code is not available) to perform validation of input and output parameters, enforce application-specific business rules, or do some call tracing. You may also have had to cache method calls, returning a cached result instead of calling the original method when the method is called with the same parameters within a certain period of time. In such cases, you could dynamically generate a new proxy assembly that does the preprocessing, calls the original assembly's method, and performs the post-processing.

The most obvious drawback with this scheme is that the client has to reference the proxy assembly instead of the original. Thus, generally speaking, the identity of the classes is not preserved and the client application has to be modified.

Another drawback is that you must work with dynamic assemblies that are created using Reflection.Emit. When a dynamic assembly is saved to disk and reloaded, it is no longer dynamic and therefore is treated like any other assembly. The infrastructure of Reflection.Emit will no longer allow you to modify its modules and classes nor change its intermediate language (IL) code. If you want to dynamically change a method's IL code within an already existing assembly or modify an assembly's types and methods, you need to find some other way.

In this article, I will describe a simple and powerful approach which allows you to overcome many (if not all) of the limitations of Reflection.Emit. This technique will allow you to modify existing assemblies and add new types and methods to already loaded assemblies. You will also be able to rewrite the IL code of a given method on the fly.

Unlike the traditional approaches based on the Reflection API and dynamically emitted assemblies, this scheme works with existing assemblies and doesn't require the creation of proxies or dynamic assemblies. Thus, you preserve the identity of the classes whose methods are instrumented and your changes are transparent to the client applications. This approach will give you further insight into the underpinnings of IL code and into how the Profiling API works, but note that it's not suited for deployment in a production environment. The reason is that it's based on the Profiling API, which, once registered, circumvents the CLR security infrastructure. It also prevents the use of other profilers that are trying to perform real performance analysis on the app.

The basic idea of this approach is as follows. When the CLR loads a class and executes its method, the method's IL code is compiled to native instructions during the just-in-time (JIT) compilation process. The Profiling API provided as part of the CLR allows you to intercept this process. Before a method gets JIT-compiled, you can modify its IL code. In the simplest scenario, you can insert the customized prologue and epilogue into the method's IL and give the resulting IL back to the JIT compiler. If you desire, the newly generated IL could do some additional work before and after the original method's code is called.

You can also add new local variables, guarded blocks, and exception handlers to the method. More advanced techniques could dynamically add new data members and methods to the loaded class or create new types.

Figure 1 Reflection API versus IL Code Rewriting

Figure 1 summarizes the important differences between approaches based on the Reflection API and the approach based on dynamic IL code rewriting. The techniques I describe here were tested on both the Microsoft® .NET Framework 1.0 and the Shared Source CLI (SSCLI) Beta Refresh ("Rotor").

Internal Representation of Methods by the CLR

Before I delve into the details of IL code rewriting, I want to review several important aspects of the CLR, such as the internal representation of methods and structured exception handling (SEH) tables. Let's start with a simple "Hello world!" program:

// C#
using System;

class MainApp
{
    public static void Main()
    {
        Console.WriteLine( "Hello World!" );
    }
}

What follows is the IL code for this code snippet, which I got from running the IL Disassembler utility (ildasm.exe). I've also included opcodes for the IL instructions:

// MSIL
.method public hidebysig static void Main() cil managed
{
    /* 72 | (70)000001 */ ldstr "Hello World!"
    /* 28 | (0A)000002 */ call void System.Console::WriteLine(string)
    /* 2A |            */ ret
}

When the program gets compiled to IL, information about classes and methods in the hello.exe assembly is stored in the metadata tables. In particular, the Method metadata table holds information about the assembly's methods.

Each entry in this table provides the CLR with important information about a method. This includes the relative virtual address (RVA) of the method's IL code, that is, the offset from the address where the image (EXE or DLL) was loaded. The entry also describes the method's properties such as managed/unmanaged, private/public, static/instance, virtual, and abstract. Finally, it contains the method's name, signature, and a pointer to the Param table, which specifies additional information about the method's parameters.

The image file also contains a special header called the runtime header, which stores information about the application entry point. To view this information, launch the IL Disassembler utility and then click the View / COR header menu. The runtime header for the "Hello world!" assembly should look like this:

CLR Header:
 72       Header Size
 2        Major Runtime Version
 0        Minor Runtime Version
 1        Flags
 6000001  Entrypoint Token
 207c     [208     ] address [size] of Metadata Directory:
 •••

Now let's see how the CLR handles method-specific metadata during loading and JIT compilation. When the assembly is loaded, the CLR examines the runtime header to determine the application entry point and will recognize that this is a method coded by the metadata token 0x06000001 (tokens of the form 0x06XXXXXX are used for methods). It will also recognize that the method's information is stored in the first row of the Method table that is shown in Figure 2. Using this record, the CLR will be able to locate the memory address that holds the actual method body and also obtain the method's description.
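To make the token layout concrete, here is a minimal sketch of how a metadata token splits into a table identifier and a row number. The helper names tokenTable and tokenRow are my own and are not taken from the SDK headers.

```cpp
#include <cstdint>

// Table identifier stored in the top byte of a metadata token
// (0x06 is the Method table, as described in the text).
constexpr std::uint32_t tokenTable(std::uint32_t token) {
    return token >> 24;
}

// 1-based row number stored in the low three bytes of the token.
constexpr std::uint32_t tokenRow(std::uint32_t token) {
    return token & 0x00FFFFFFu;
}
```

Applied to the entry point token 0x06000001, these helpers yield table 0x06 and row 1, which is why the CLR looks at the first row of the Method table.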

Figure 2 Main Method Stored in the Method Table

Column     Value           Note
RVA        0x2050          Offset from the start of the EXE file
ImplFlags  0x0             IL = 0x0, Managed = 0x0
Flags      0x96            Public = 0x6, Static = 0x10, Hide by name + signature = 0x80
Name       Some value      An index pointing to the "Main" string
Signature  0x00 0x00 0x01  No params, return type is void
ParamList  Some value

If you added the method's RVA to the load address (the address where the executable file for the "Hello World!" assembly was loaded), you would see a memory dump layout similar to the one shown in Figure 3. On my machine, the load address was 0x06EA1000, so the in-memory address of the method body is 0x06EA1000 + 0x2050, or 0x06EA3050. You could also open the hello.exe file (provided in the code download for this article) with any binary editor and locate the method's body at file offset 0x250.

Figure 3 Memory Dump

Although the offset of a method in the assembly file (file pointer) usually differs from its RVA, you can follow these steps to locate a method's body within the file on disk. First, run ildasm.exe with the /Adv switch. Next, open the assembly and select the View | MetaInfo | Show! menu item (or press Ctrl-M). The tool will generate an output window with the assembly's metadata information (Modules, Types, Methods, and so on). You can now find the method you are interested in and read its RVA.

Run the dumpbin.exe utility with the /ALL switch (the utility is located in the Vc7\bin folder under your Visual Studio® .NET installation) and in the output produced by the tool you should find information about the section named .text (this section contains the IL code of the assembly). You will see both the value of the section's file pointer and the section dump called "raw data." Finally, use the method's RVA provided by ILDASM to find the method's body within the dump.
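The last step is simple arithmetic, and it can be sketched as follows. The SectionInfo type and the rvaToFileOffset function are hypothetical names of mine; the three fields stand for the values that dumpbin prints for a section.

```cpp
#include <cstdint>

// Hypothetical description of one PE section, holding the values
// that dumpbin reports for it.
struct SectionInfo {
    std::uint32_t virtualAddress;    // RVA at which the section is mapped
    std::uint32_t virtualSize;       // size of the section in memory
    std::uint32_t pointerToRawData;  // file pointer of the section's raw data
};

// Converts an RVA to a file offset. Returns false if the RVA does not
// fall inside the given section.
bool rvaToFileOffset(const SectionInfo& s, std::uint32_t rva, std::uint32_t& offset) {
    if (rva < s.virtualAddress || rva - s.virtualAddress >= s.virtualSize) {
        return false;
    }
    offset = rva - s.virtualAddress + s.pointerToRawData;
    return true;
}
```

For example, assuming a .text section mapped at RVA 0x2000 with a file pointer of 0x200 (illustrative values, chosen because they agree with the numbers quoted earlier), the Main method's RVA of 0x2050 maps to file offset 0x250.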

As you can see in Figure 3, the RVA column points to the method body, which consists of a method header, IL code, and possibly some additional sections, as shown in Figure 4. Let's look closely at these structures.

Figure 4 IL Method Body Layout

Currently, there are two types of method headers: tiny and fat. The tiny header is used when the method is smaller than 64 bytes, when its stack depth won't exceed 8 slots (one slot for each item on the stack regardless of the item's size), and when it contains no local variables or SEH sections.

The structure of the tiny header is declared in the CorHdr.h file, which is located in the \FrameworkSDK\include folder under your Visual Studio .NET installation (see Figure 5). The Flags_CodeSize field has the following binary form: XXXXXX10b. The upper 6 bits are used to store the IL code size in bytes (the header size is not counted) and the lower 2 bits hold the tiny header type code (0x02). If you need to calculate the size of a tiny method, you would read the Flags_CodeSize byte and then shift it right by 2 bits.
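The bit layout of the tiny header can be illustrated with a short sketch. The function names are mine, and the code works on the raw byte rather than on the CorHdr.h structures.

```cpp
#include <cassert>
#include <cstdint>

// Upper 6 bits of the tiny header byte hold the IL code size in bytes.
inline unsigned tinyCodeSize(std::uint8_t flagsCodeSize) {
    return flagsCodeSize >> 2;
}

// Builds the single header byte for a method that qualifies for a tiny
// header: code size in the upper 6 bits, type code 0x02 in the lower 2.
inline std::uint8_t makeTinyHeader(unsigned codeSize) {
    assert(codeSize < 64);  // tiny headers cannot describe larger methods
    return static_cast<std::uint8_t>((codeSize << 2) | 0x02u);
}
```

A hypothetical 11-byte method would therefore carry the header byte 0x2E (00101110b), and shifting that byte right by 2 bits gives back 11.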

Figure 5 Tiny and Fat Headers

// CorHdr.h
// -----------------------------------------------------------------------
// tiny method header
typedef struct IMAGE_COR_ILMETHOD_TINY
{
    BYTE Flags_CodeSize;
} IMAGE_COR_ILMETHOD_TINY;

// -----------------------------------------------------------------------
// fat method header
typedef struct IMAGE_COR_ILMETHOD_FAT
{
    unsigned Flags    : 12;     // Flags
    unsigned Size     : 4;      // size in DWords of this structure
                                // (currently 3)
    unsigned MaxStack : 16;     // maximum number of items (I4, I, I8,
                                // obj ...) on the operand stack
    DWORD CodeSize;             // size of the code
    mdSignature LocalVarSigTok; // token that indicates the signature of
                                // the local vars (0 means none)
} IMAGE_COR_ILMETHOD_FAT;

typedef union IMAGE_COR_ILMETHOD
{
    IMAGE_COR_ILMETHOD_TINY Tiny;
    IMAGE_COR_ILMETHOD_FAT Fat;
} IMAGE_COR_ILMETHOD;

The fat header has a more complex structure and must be DWORD aligned. Unlike the tiny header, the fat header has a special field to store the size of the IL code. The CorHdr.h file also defines a convenient union IMAGE_COR_ILMETHOD (also shown in Figure 5). Using this union, you can easily write these two functions to determine the type of a given header (see Figure 6).

Figure 6 IsTinyHeader and IsFatHeader Functions

// these flags are declared in CorHdr.h, CorILMethodFlags enum
// •••
// CorILMethod_FormatShift = 3,
// CorILMethod_FormatMask  = ((1 << CorILMethod_FormatShift) - 1),
// CorILMethod_TinyFormat  = 0x0002,
// CorILMethod_FatFormat   = 0x0003,
// •••
BOOL IsTinyHeader( const IMAGE_COR_ILMETHOD* pMethodHeader )
{
    return ( BOOL )( (pMethodHeader->Tiny.Flags_CodeSize &
                      (CorILMethod_FormatMask >> 1)) == CorILMethod_TinyFormat );
}

BOOL IsFatHeader( const IMAGE_COR_ILMETHOD* pMethodHeader )
{
    return ( BOOL )( (pMethodHeader->Fat.Flags &
                      CorILMethod_FormatMask) == CorILMethod_FatFormat );
}

Now let's go back to the Main method's memory layout. As was shown in Figure 3, the Main method has a fat header:

// method header
0x13 0x30 0x01 0x00 0x0b 0x00 0x00 0x00 0x00 0x00 0x00 0x00

This gives the layout shown in Figure 7 (don't forget about Intel's reversed byte-order).

Figure 7 Fat Header of the Main Method

Fat Header Entry (Size)                     Value   Note
Header type, Flags, and header size (WORD)  0x3013  (0011000000010011) The upper 4 bits (0011) hold the header size in DWORDs; that is, 3. The next 10 bits (0000000100) hold the Flags value (0x4), which means that local variables must be initialized. The lower 2 bits (11) indicate the header type (Fat).
MaxStack (WORD)                             0x1     Maximum stack size in slots (items).
CodeSize (DWORD)                            0x0b    IL code size in bytes (without method header).
LocalVarSigTok (DWORD)                      0x0     Token of the local variables signature. It's equal to zero since no local variables are present.
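The decoding in Figure 7 can be reproduced with a short sketch that reads the 12 header bytes in little-endian order. FatHeaderFields, readLE, and decodeFatHeader are illustrative names of mine, and the split of the first WORD into 2 + 10 + 4 bits follows the presentation used in Figure 7 rather than the 12-bit Flags bitfield of CorHdr.h.

```cpp
#include <cstddef>
#include <cstdint>

// Fields of a fat method header, split the way Figure 7 presents them.
struct FatHeaderFields {
    unsigned format;              // lower 2 bits of the first WORD (3 = fat)
    unsigned flags;               // next 10 bits
    unsigned sizeInDwords;        // upper 4 bits
    unsigned maxStack;            // second WORD
    std::uint32_t codeSize;       // first DWORD after the two WORDs
    std::uint32_t localVarSigTok; // last DWORD
};

// Reads an n-byte little-endian value (n <= 4).
static std::uint32_t readLE(const std::uint8_t* p, std::size_t n) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    }
    return v;
}

// Decodes the 12 bytes of a fat header.
FatHeaderFields decodeFatHeader(const std::uint8_t* h) {
    const std::uint32_t first = readLE(h, 2);
    FatHeaderFields f;
    f.format = first & 0x3u;
    f.flags = (first >> 2) & 0x3FFu;
    f.sizeInDwords = first >> 12;
    f.maxStack = readLE(h + 2, 2);
    f.codeSize = readLE(h + 4, 4);
    f.localVarSigTok = readLE(h + 8, 4);
    return f;
}
```

Fed with the header bytes of the Main method, this yields format 3, flags 0x4, a header size of 3 DWORDs, MaxStack 1, CodeSize 0x0b, and a zero local variable signature token.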

The method's IL code, which consists of a sequence of IL instructions, is located immediately after the method's header:

// IL code of the Main method
0x72 0x01 0x00 0x00 0x70 0x28 0x02 0x00 0x00 0x0a 0x2a   // <== this is the last "ret" instruction

As you can see, the code size is 11 bytes (0x0b), as specified by the method header. You may also notice that the last byte is 0x2a, which is the IL opcode for the return instruction.

Using IL opcode values and the IL code parsing technique, which I'll explain later in this article, you could easily generate a listing for the Main method similar to the one produced by ILDASM. Since the size of the fat header in double words is 3 (12 bytes), a fat method's IL code is always DWORD aligned. This is obviously not the case for tiny headers.

So far, you can see that the internal representation of methods is relatively simple. The picture gets more complicated, however, when a method uses exception handling. In this case, there should be a way to make this information available to the execution engine. For this purpose, the compiler that translates source code to IL (the C# compiler in this case) generates special SEH tables. The compiler also sets the Flags field in the method header to 0x02 (the CorILMethod_MoreSects bit, declared as 0x08 in CorHdr.h, which reads as 0x02 once the two header-type bits are shifted out) to tell the runtime that the method body has extra sections.

Although I won't go into all the details of SEH tables until later when I discuss IL code rewriting, it's important to understand that these tables simply contain DWORD-aligned sections that follow the method's IL code body, beginning with the section header. This is then followed by a sequence of exception handling clauses, as was shown in Figure 4. Each exception handling clause is also DWORD aligned. Because of the alignment, there is usually some gap (up to 3 bytes) between the last IL instruction and the first SEH section header. The section header and each exception handling clause can have either a small or fat format. They are described in CorHdr.h and are also shown in Figure 8.

Figure 8 Small and Fat SEH Section Headers

// CorHdr.h
typedef struct IMAGE_COR_ILMETHOD_SECT_SMALL
{
    BYTE Kind;
    BYTE DataSize;
} IMAGE_COR_ILMETHOD_SECT_SMALL;

typedef struct IMAGE_COR_ILMETHOD_SECT_FAT
{
    unsigned Kind     : 8;
    unsigned DataSize : 24;
} IMAGE_COR_ILMETHOD_SECT_FAT;

typedef struct IMAGE_COR_ILMETHOD_SECT_EH_CLAUSE_FAT
{
    CorExceptionFlag Flags;
    DWORD TryOffset;
    DWORD TryLength;        // relative to start of try block
    DWORD HandlerOffset;
    DWORD HandlerLength;    // relative to start of handler
    union
    {
        DWORD ClassToken;   // use for type-based exception handlers
        DWORD FilterOffset; // use for filter-based exception handlers
                            // (COR_ILEXCEPTION_FILTER is set)
    };
} IMAGE_COR_ILMETHOD_SECT_EH_CLAUSE_FAT;

typedef struct IMAGE_COR_ILMETHOD_SECT_EH_CLAUSE_SMALL
{
#ifdef _WIN64
    unsigned Flags : 16;
#else // !_WIN64
    CorExceptionFlag Flags : 16;
#endif
    unsigned TryOffset     : 16;
    unsigned TryLength     : 8;  // relative to start of try block
    unsigned HandlerOffset : 16;
    unsigned HandlerLength : 8;  // relative to start of handler
    union
    {
        DWORD ClassToken;
        DWORD FilterOffset;
    };
} IMAGE_COR_ILMETHOD_SECT_EH_CLAUSE_SMALL;

Let's take a look at the SEH section header. The Kind field is always 1 byte in size and it holds a set of binary flags which are defined by the CorILMethodSect enumerator (see CorHdr.h for details). The IL compiler always sets the CorILMethod_Sect_EHTable flag (0x01) which tells the runtime that this is an SEH header. For the fat section header, the Kind byte also contains the CorILMethod_Sect_FatFormat (0x40) value. The rest of the flags are either optional or not currently used. The most typical values for the Kind field are 0x41 and 0x01.

The DataSize holds the total size in bytes of the section header and any related exception handler clauses. If, for example, you have one fat SEH section and a sequence of 14 fat clauses, DataSize will be set to the following value:

sizeof(FAT section header) + 14 * sizeof(FAT exception handler clause)
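As a rough sketch, the same calculation can be written for both formats. The constants reflect the structure sizes shown in Figure 8 (a section header occupies one DWORD in either format, because the small header is padded with 2 bytes); the function names are my own.

```cpp
#include <cstdint>

constexpr std::uint32_t kSectionHeaderSize = 4;   // small header is padded to one DWORD
constexpr std::uint32_t kSmallClauseSize   = 12;  // IMAGE_COR_ILMETHOD_SECT_EH_CLAUSE_SMALL
constexpr std::uint32_t kFatClauseSize     = 24;  // IMAGE_COR_ILMETHOD_SECT_EH_CLAUSE_FAT

// DataSize value for a section that holds clauseCount clauses.
constexpr std::uint32_t sehDataSize(std::uint32_t clauseCount, bool fat) {
    return kSectionHeaderSize + clauseCount * (fat ? kFatClauseSize : kSmallClauseSize);
}

// The small section header stores DataSize in a single byte, so the
// total must not exceed 255.
constexpr bool fitsSmallSection(std::uint32_t clauseCount) {
    return sehDataSize(clauseCount, false) <= 0xFFu;
}
```

For the example above, one fat section with 14 fat clauses gives 4 + 14 * 24 = 340 bytes. Note that the byte-sized DataSize is only one of the constraints on the small format; the offsets and lengths of every clause must also fit into the narrower small-clause fields.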

Each exception handler clause provides a complete detailed description of the exception handler, such as the offsets from the beginning of the method body of both the protected code block (the try block) and the exception handler itself, along with the sizes of the try and handler blocks.

The types of exception handler clauses are described by the Flags field, which can have one of the following values declared in the CorExceptionFlag enumeration (see CorHdr.h for details):

  • The COR_ILEXCEPTION_CLAUSE_NONE value (0x0) corresponds to the try/catch blocks.
  • The COR_ILEXCEPTION_CLAUSE_FILTER value (0x1) corresponds to the filters.
  • COR_ILEXCEPTION_CLAUSE_FINALLY (0x2) is used for the try/finally blocks.
  • COR_ILEXCEPTION_CLAUSE_FAULT (0x4) is used for fault blocks, which behave like finally blocks but run only when an exception is thrown.

Basics of MSIL Code Rewriting

Now let's turn to the basic technique of IL code rewriting; namely, the insertion of prologues and epilogues into methods. This approach makes use of the Profiling API and the internal CLR structures that I've described so far. The application I'll develop is a profiler DLL, which is integrated with the CLR through the ICorProfilerCallback and ICorProfilerInfo interfaces. The runtime uses ICorProfilerCallback to notify the profiler of the various events such as class loads and unloads, JIT compilation, garbage collection, and threading. The profiler calls the ICorProfilerInfo interface (the profiler host) to get detailed information about the runtime's internal structures and to modify them if needed.

I'm particularly interested in the events related to the JIT compilation process. Before a method gets JIT-compiled, the runtime calls the profiler's ICorProfilerCallback::JITCompilationStarted method, allowing you to make changes in the method's IL.

Now let's take a closer look at the profiler's implementation details. Some of these are common to all profilers. For example, at initialization time, I store a pointer to the ICorProfilerInfo interface (implemented by the CLR) and register for the events I'm interested in. The Profiling API also requires that you implement standard COM entry points such as DllGetClassObject, DllCanUnloadNow, DllRegisterServer, and DllUnregisterServer functions. Note that although profilers are implemented as COM DLLs, the CLR itself doesn't use the COM API.

Those of you who have the SSCLI can look at the Rotor implementation of the method that loads a profiler DLL. You'll see that the runtime implements its own version of the CoCreateInstance function. The code loads the profiler DLL, gets the class object, and calls the IClassFactory interface to instantiate the object, which then implements the ICorProfilerCallback interface. (This implementation is located in the file \sscli\clr\src\profile\ee\profile.cpp.)

When looking at the Rotor implementation, you will also notice that unlike the .NET Framework, Rotor uses the environment variable COR_PROFILER_DLL to find the complete path to the profiler DLL. Before you can use the DLL in the Rotor environment you should set this variable. Now let's see how the profiler handles the JIT compilation events.

The CLR calls ICorProfilerCallback::JITCompilationStarted to notify the code profiler that the JIT compiler is starting to compile a function. Here's the description of this callback:

HRESULT ICorProfilerCallback::JITCompilationStarted(
    FunctionID functionId,
    BOOL fIsSafeToBlock )

The CLR passes the ID of the function being JIT-compiled to the profiler. In Rotor this ID is actually a pointer to the runtime's internal structure called MethodDesc, and can be used to query the profiler host for information about the function, but it should only be used as an opaque identifier. The second parameter tells you whether it's safe to perform a time-consuming operation in the code. The Profiling API documentation states that ignoring this parameter won't harm the runtime. In addition, I've noticed in the Rotor source code that this parameter is always set to TRUE, so I'll ignore it.

The JITCompilationStarted event is a safe place to modify the method before it gets JIT-compiled. For this purpose, the core profiler host, ICorProfilerInfo, provides two very powerful functions, GetILFunctionBody and SetILFunctionBody:

HRESULT GetILFunctionBody(
    ModuleID moduleId,        // ModuleID of the given module.
    mdMethodDef methodId,     // Metadata token for method.
    LPCBYTE *ppMethodHeader,  // Pointer to the IL method body.
    ULONG *pcbMethodSize      // Pointer to the size of the method
);

HRESULT SetILFunctionBody(
    ModuleID moduleId,        // ModuleID of the given module.
    mdMethodDef method,       // Metadata token for method.
    LPCBYTE pbNewILMethod,    // Pointer to the new IL method body.
    ULONG cbNewMethod         // Size of the new method.
);

The parameters are self-explanatory. The first method (GetILFunctionBody) returns a pointer to the method body for a given method's metadata token and a module ID (metadata is created on a per-module basis). The second function allows you to modify the existing method using a newly created method body. The Profiling API also requires that you allocate memory for the new method using the special IMethodMalloc interface, which can be obtained from the CLR through an ICorProfilerInfo::GetILFunctionBodyAllocator call.

My implementation of ICorProfilerCallback::JITCompilationStarted performs the following main steps. It modifies a method header (both fat and tiny). It then adds a prologue, which implies that you have to add a set of new IL instructions to the beginning of the method so the original IL code gets shifted. It adds an epilogue to the end. Since the original IL code changes, you also have to modify SEH tables, which store the relative values of offsets and lengths of guarded blocks and exception handlers.

Note that the JIT compilation process also performs a verification process that examines the code and attempts to determine whether or not the code is safe and incapable of performing any hidden hacks. As long as you add verifiable IL instructions, the original code also remains verifiable and will thus pass the verification process. Let's take a closer look at these four steps.

Changes in the method header are relatively simple—since you modify the size of the IL code, you'll have to change the CodeSize field accordingly. Epilogues and prologues may also use the method's stack, declare their own local variables, and add SEH sections. Thus, three fields—Flags, MaxStack, and LocalVarSigTok—will have to be modified as well.
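One way to picture the header update is as a function that serializes a fresh fat header from the new field values. The following is only a sketch under my own naming (makeFatHeader and the k-prefixed constants); the constant values are the ones I believe CorHdr.h assigns to CorILMethod_FatFormat, CorILMethod_MoreSects, and CorILMethod_InitLocals, and they agree with the header bytes shown in this article.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Flag bits within the 12-bit Flags field of the fat header.
constexpr std::uint16_t kFatFormat  = 0x0003;  // header type code
constexpr std::uint16_t kMoreSects  = 0x0008;  // extra sections follow the IL code
constexpr std::uint16_t kInitLocals = 0x0010;  // local variables must be initialized

// Serializes a 12-byte fat method header in little-endian byte order.
std::array<std::uint8_t, 12> makeFatHeader(std::uint16_t flags,
                                           std::uint16_t maxStack,
                                           std::uint32_t codeSize,
                                           std::uint32_t localVarSigTok) {
    // Upper 4 bits: header size in DWORDs (3); lower 12 bits: flags.
    const std::uint32_t first = (3u << 12) | ((flags | kFatFormat) & 0x0FFFu);
    std::array<std::uint8_t, 12> h{};
    auto put = [&h](std::size_t at, std::uint32_t value, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            h[at + i] = static_cast<std::uint8_t>(value >> (8 * i));
        }
    };
    put(0, first, 2);
    put(2, maxStack, 2);
    put(4, codeSize, 4);
    put(8, localVarSigTok, 4);
    return h;
}
```

Calling makeFatHeader(kInitLocals, 1, 0x0b, 0) reproduces the header of the Main method (0x13 0x30 0x01 0x00 0x0b ...). The same function could serve when a tiny header has to be promoted to a fat one because the rewritten method no longer meets the tiny-header limits.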

Inserting a prologue is a straightforward procedure. The only requirement is that the method stack should be empty before the old IL gets executed. This rule has a very simple explanation—the original method won't know anything about your code and assumes that its stack is empty. You also have to restore the values of the method's arguments and initial values for any local variables you've used. In other words, the evaluation stack of the method and local variables should be restored to their initial state.

Insertion of an epilogue requires a lot of work and, in the general case, is extremely complicated. First, you have to identify all the points where the method may return. Typically, there are three return scenarios:

  • The ret (return) instruction, which causes the method to return to the call site.
  • The throw instruction, which pops an exception object from the stack and throws it as a managed exception.
  • The rethrow instruction, which throws the already-caught exception again and can only be used within SEH handlers.

Combinations of these three methods are also possible, such as in the example shown in Figure 9. In this particular case there's no easy way to emit the epilogue. Even if you replaced the ret statement with a branching instruction, you would still have to modify the original error handler (which starts at line IL_0012). The less obvious problem is that since the ret is at least 1 byte shorter than any of the branching instructions, you will also have to modify the bne.un.s instruction at IL_0008 (the bne.un.s instruction takes two values from the stack and branches if the first value is not equal to the second one; the "s" postfix means that it is a short form of the instruction).

Figure 9 Method Using Ret and Throw to Exit

// MSIL
.method ••• SomeMethod( ••• ) cil managed
{
    •••
    IL_0008: bne.un.s IL_0012 // go to the error handler
    •••
    IL_0011: ret              // normal exit
    IL_0012: •••              // error handler begins here
    •••
    // create an exception object
    IL_0048: newobj instance void System.Exception::.ctor(•••)
    // and throw it
    IL_004d: throw
} // end of SomeMethod

For the sake of simplicity, I will consider only those cases in which a method returns via ret. I will have to perform a few steps such as parsing the IL code and identifying all the return instructions, then replacing them with my branching instructions in order to transfer control to the epilogue. Since the new branching instructions I've added are at least 1 byte longer, I will have to correct the original branching instructions as well. As you can see, the simplest scenario is when the IL code has only one ret instruction at the end. In this case, I could replace the ret opcode with the nop instruction, causing the method's execution flow to be passed to the epilogue. In order to do this, I have to be able to parse, restructure, and modify the IL code. This will require some familiarity with the basics of IL code parsing.

IL instruction opcodes are declared in the \FrameworkSDK\include\opcode.def file, which also contains information about instruction sizes and the intermediate language parameters they take. Initially you set the instruction pointer (IP) to point to the very first byte of the IL code. Next, you try to identify the first IL instruction using IL opcodes and add the size of instruction and its parameters to the IP. After doing this, the process will repeat until you reach the end of the method. It looks like a straightforward procedure, except for one thing: the number of IL instructions is about 300, so you must have a huge switch statement with more than 250 cases. Let's see how it all works for this simple input:

// IL code (DWORD aligned)
0x14 0x0E 0x00 0x28 0x01 0x00 0x00 0x0A
0x26 0xDE 0x0D 0x26 0x72 0x73 0x00 0x00
0x70 0x28 0x02 0x00 0x00 0x0A 0xDE 0x00
0x2A

The first byte of this code is 0x14, corresponding to the ldnull instruction, which loads a null object reference on the stack. This instruction doesn't take any parameters and its size is 1 byte. You will need to increase the IP value by 1 to get the next opcode, which is 0x0E. This is the opcode for the ldarg.s instruction, which loads a method argument value on the stack. The number of this argument is specified by the instruction's parameter. Since it's in short-parameter form (as noted by the .s suffix), this parameter is a 1-byte integer in the range 0 through 255 (or -128 through 127, depending on the parameter type). In other words, you have to skip the next byte (0x00) in order to move to the next instruction, whose opcode is 0x28. This is the call instruction, which takes a 4-byte token as a parameter. In order to read the next IL instruction, I'll skip the next 5 bytes (one for the call opcode, 0x28, and 4 bytes for the token, 0x0A000001).
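The walk described above can be sketched in a few lines. This is a deliberately tiny illustration that knows only the single-byte opcodes appearing in this article's samples (a real parser needs the full table from opcode.def, including two-byte opcodes and the variable-length switch instruction); operandSize and walkIL are names I made up for it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Operand size in bytes for the few opcodes used in this article's
// samples. Returns -1 for any opcode the sketch does not know.
int operandSize(std::uint8_t opcode) {
    switch (opcode) {
        case 0x00:  // nop
        case 0x02:  // ldarg.0
        case 0x14:  // ldnull
        case 0x26:  // pop
        case 0x2A:  // ret
            return 0;
        case 0x0E:  // ldarg.s  (1-byte argument number)
        case 0xDE:  // leave.s  (1-byte relative target)
            return 1;
        case 0x28:  // call     (4-byte token)
        case 0x72:  // ldstr    (4-byte token)
            return 4;
        default:
            return -1;
    }
}

// Walks the IL stream instruction by instruction, recording the offset
// of every instruction and of every ret. Returns false on an unknown
// opcode or a truncated instruction.
bool walkIL(const std::vector<std::uint8_t>& code,
            std::vector<std::size_t>& instructionOffsets,
            std::vector<std::size_t>& retOffsets) {
    std::size_t ip = 0;
    while (ip < code.size()) {
        const int operand = operandSize(code[ip]);
        if (operand < 0) return false;
        const std::size_t length = 1 + static_cast<std::size_t>(operand);
        if (ip + length > code.size()) return false;
        instructionOffsets.push_back(ip);
        if (code[ip] == 0x2A) retOffsets.push_back(ip);
        ip += length;
    }
    return true;
}
```

Walking the 25-byte sample finds ten instructions and a single ret at offset 24 (0x18). Stepping over operands matters: a byte with the value 0x2A inside a token or branch operand is not a return instruction, so a plain byte search would be unreliable.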

The full implementation is provided by the VerifyReturn function and can be found in the source code which accompanies the article. SEH tables also require some extra work.

In a typical situation you have to modify the sizes of the try blocks and their relative offsets since the original IL code and its size are changing. Then the offsets of the handlers have to be shifted accordingly. Depending on the application's rewrite logic, the handler sizes themselves may also need to be changed.

The simplest rewrite scenario is when you add a prologue before the very first try block and an epilogue right after the very last handler. In such cases, all you have to do is modify the offsets of the try and handler blocks. To illustrate this, let's take a look at Figure 10. (The complete source code and IL code can be found in the cc.il file provided with the article.) Note that I'm using the short-parameter form of the leave instruction to exit the guarded and handler blocks. According to the CLR exception handling rules, the leave instruction is the only correct way to leave SEH blocks. Branching into or out of the SEH block is illegal.

Figure 10 Method with a Simple SEH Block

// IL assembly language
.class public auto ansi CCC extends [mscorlib]System.Object
{
    •••
    .method public static void TestException( int32 )
    {
        .try
        {
            ldnull        // purposely!
            ldarg.s 0     // load param #1 on the stack
            call vararg int32 printf( string, •••, int32 )
            pop           // remove ret value from the stack
            leave.s brOK
        }
        catch [mscorlib]System.Exception
        {
            pop
            ldstr "Exception!"
            call void [mscorlib]System.Console::WriteLine( string )
            leave.s brOK
        }
        brOK: ret
    } // TestException
    •••
} // CCC Class

// IL code and opcodes for TestException.
.method /*06000004*/ public static void TestException(int32 A_0) cil managed
{
    // Code size 25 (0x19)
    .maxstack 8
    .try
    {
        IL_0000: /* 14 |            */ ldnull
        IL_0001: /* 0E | 00         */ ldarg.s A_0
        IL_0003: /* 28 | (0A)000001 */ call vararg int32 printf(string, •••, int32)
        IL_0008: /* 26 |            */ pop
        IL_0009: /* DE | 0D         */ leave.s IL_0018
    } // end .try
    catch [mscorlib]System.Exception/* 01000002 */
    {
        IL_000b: /* 26 |            */ pop
        IL_000c: /* 72 | (70)000073 */ ldstr "Exception!"
        IL_0011: /* 28 | (0A)000002 */ call void [mscorlib]System.Console::WriteLine(string)
        IL_0016: /* DE | 00         */ leave.s IL_0018
    } // end handler
    IL_0018: /* 2A |            */ ret
} // end of method CCC::TestException

If you located the TestException method body in the cc.exe file (this method's RVA is 0x000020a0) using the steps I described earlier, you should have seen the memory layout for the method header, shown in Figure 11. Pay attention to the Flags field in the method header, which is now set to 0x02 to tell the CLR that an SEH table is present. Now take a look at the method's IL code and SEH sections shown in Figure 12 (I've also added some comments to underscore SEH-related details).

Figure 12 IL Code and SEH Header

// method header
•••
// IL code (DWORD aligned)
0x14 0x0E 0x00 0x28 0x01 0x00 0x00 0x0A
0x26 0xDE 0x0D 0x26 0x72 0x73 0x00 0x00
0x70 0x28 0x02 0x00 0x00 0x0A 0xDE 0x00
0x2A                 <== this is the "ret" instruction
// SEH header follows the code (after alignment padding) and is also DWORD aligned
0x01 0x10 0x00 0x00  <== small SEH header, padded with 2 bytes
// small EH clause is located immediately after the SEH header
0x00 0x00 0x00 0x00 0x0B 0x0B 0x00 0x0D 0x02 0x00 0x00 0x01

Figure 11 Method Header for TestException

Fat Header Entry (Size) | Value | Note
Header type, Flags, and header size (WORD) | 0x300B (0011000000001011) | Flags field is set to 0x2 (10b) to tell the runtime that the method body has extra sections
MaxStack (WORD) | 0x8 | Maximum stack size in slots (items)
CodeSize (DWORD) | 0x19 | IL code size (without method header)
LocalVarSigTok (DWORD) | 0x0 | No local variables present
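To make the first entry of Figure 11 more concrete, the following sketch pulls the individual fields out of the value 0x300B. The struct and function names are hypothetical and are not taken from the profiler sample. The two flag masks follow the CorILMethod_MoreSects and CorILMethod_InitLocals values in corhdr.h, which number the bits within the whole WORD; the table's "0x2" counts from the bit after the two header-type bits, so it corresponds to mask 0x8 here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type; not part of the profiler sample.
struct FatHeaderWord {
    bool     moreSects;     // extra sections (such as SEH tables) follow the IL
    bool     initLocals;    // local variables are initialized on entry
    unsigned sizeInDwords;  // header size in DWORDs, stored in the top 4 bits
};

constexpr uint16_t kTypeMask   = 0x0003;  // low two bits select the header type
constexpr uint16_t kFatFormat  = 0x0003;  // 11b = fat header
constexpr uint16_t kMoreSects  = 0x0008;  // CorILMethod_MoreSects
constexpr uint16_t kInitLocals = 0x0010;  // CorILMethod_InitLocals

// Decodes the first WORD of a fat method header.
inline FatHeaderWord DecodeFatHeaderWord(uint16_t w) {
    assert((w & kTypeMask) == kFatFormat);  // only meaningful for fat headers
    FatHeaderWord h;
    h.moreSects    = (w & kMoreSects) != 0;
    h.initLocals   = (w & kInitLocals) != 0;
    h.sizeInDwords = static_cast<unsigned>(w >> 12);
    return h;
}
```

For 0x300B this reports a 3-DWORD (12-byte) header with extra sections and without the init-locals flag, which matches the TestException method.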

Due to the DWORD alignment, there's a 3-byte gap (delta) between the last IL instruction and the SEH section header. The first byte of the section header (the Kind field) is 0x01, which means that this is an exception handler section in small format. The small exception handler clause is located right after the section header and is 12 bytes in size.
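The arithmetic behind the 3-byte gap, and the field boundaries inside the 12-byte clause, can be checked with a small sketch. It assumes the 12-byte fat header and 25-byte code size from Figure 11 and little-endian fields; the struct and helper names are hypothetical and only illustrate the layout (the SDK's own structures live in corhlpr.h).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical plain struct mirroring the small clause layout described above.
struct SmallClause {
    uint16_t flags;
    uint16_t tryOffset;
    uint8_t  tryLength;
    uint16_t handlerOffset;
    uint8_t  handlerLength;
    uint32_t classToken;
};

// Number of padding bytes needed to reach the next 4-byte boundary.
inline size_t PadToDword(size_t offset) { return (4 - offset % 4) % 4; }

inline uint16_t ReadU16(const uint8_t* p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

inline uint32_t ReadU32(const uint8_t* p) {
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Decodes one 12-byte small exception handler clause.
inline SmallClause DecodeSmallClause(const uint8_t* p) {
    assert(p != nullptr);
    SmallClause c;
    c.flags         = ReadU16(p);
    c.tryOffset     = ReadU16(p + 2);
    c.tryLength     = p[4];
    c.handlerOffset = ReadU16(p + 5);
    c.handlerLength = p[7];
    c.classToken    = ReadU32(p + 8);
    return c;
}

// Section bytes from Figure 12: 4-byte section header followed by one clause.
static const uint8_t kFigure12Section[16] = {
    0x01, 0x10, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x0B, 0x0B, 0x00, 0x0D, 0x02, 0x00, 0x00, 0x01};
```

With a 12-byte header and 25 bytes of IL, the code ends at offset 37, so PadToDword(37) gives the 3 bytes of padding. The second header byte (0x10) is the section size: 4 bytes of header plus one 12-byte clause.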

As you can see, the exception type metadata token is 0x01000002, which refers to the second row in the TypeRef table. This table contains a row for each class defined in another module. In the test application (cc.exe), this token corresponds to the System.Exception class (which is the base class for all CLR exceptions). This can be seen by generating metainfo output using the ILDASM tool.
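As a small illustration of how such a token splits into a table identifier and a row number, consider the sketch below. The function names are hypothetical; corhdr.h offers comparable macros (TypeFromToken and RidFromToken).

```cpp
#include <cassert>
#include <cstdint>

// The high byte of a metadata token identifies the table.
constexpr uint8_t kTableTypeRef = 0x01;

// Returns the table identifier stored in the top byte of the token.
inline uint8_t TokenTable(uint32_t token) {
    return static_cast<uint8_t>(token >> 24);
}

// Returns the 1-based row number stored in the low three bytes of the token.
inline uint32_t TokenRid(uint32_t token) {
    return token & 0x00FFFFFFu;
}
```

For 0x01000002 this yields table 0x01 (TypeRef) and row 2.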

Let's assume that I'm going to add a prologue, which simply loads the value of the method's argument (an int32) onto the stack and then removes it:

ldarg.0 /*0x02*/
pop     /*0x26*/

The epilogue is also very simple—I replaced the ret instruction with a nop and then added a new return statement:

nop /*0x00*/
ret /*0x2A*/

So, the resulting IL code looks like this, with the prologue (0x02 0x26) as the first two bytes and the epilogue (0x00 0x2A) as the last two:

0x02 0x26 0x14 0x0E 0x00 0x28 0x01 0x00 0x00 0x0A 0x26 0xDE 0x0D 0x26 0x72 0x73 0x00 0x00 0x70 0x28 0x02 0x00 0x00 0x0A 0xDE 0x00 0x00 0x00 0x2A
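The byte sequence above can be reproduced with a few lines of code. This is only a sketch under the same assumption the example makes, namely that the original code ends in a single one-byte ret; WrapIl and TestExceptionIl are hypothetical names, not functions from the profiler sample.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

constexpr uint8_t kOpNop = 0x00;
constexpr uint8_t kOpRet = 0x2A;

// Returns prologue + original code (final ret turned into nop) + epilogue.
// Assumes the original code ends with a one-byte ret instruction.
inline Bytes WrapIl(const Bytes& prologue, const Bytes& code, const Bytes& epilogue) {
    assert(!code.empty() && code.back() == kOpRet);
    Bytes out;
    out.reserve(prologue.size() + code.size() + epilogue.size());
    out.insert(out.end(), prologue.begin(), prologue.end());
    out.insert(out.end(), code.begin(), code.end());
    out.back() = kOpNop;  // fall through into the epilogue instead of returning
    out.insert(out.end(), epilogue.begin(), epilogue.end());
    return out;
}

// The 25 bytes of IL for TestException, as listed in Figure 12.
inline Bytes TestExceptionIl() {
    return {0x14, 0x0E, 0x00, 0x28, 0x01, 0x00, 0x00, 0x0A, 0x26,
            0xDE, 0x0D, 0x26, 0x72, 0x73, 0x00, 0x00, 0x70, 0x28,
            0x02, 0x00, 0x00, 0x0A, 0xDE, 0x00, 0x2A};
}
```

Calling WrapIl({0x02, 0x26}, TestExceptionIl(), {0x00, 0x2A}) produces 29 bytes. Because the leave.s operands are relative offsets and both branch targets move together with the code, the operands themselves do not need to change.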

Since I added 2 more bytes to the beginning of the method, the try and catch blocks are both shifted. I also have to modify the TryOffset and HandlerOffset fields in the exception handler clause to let the SEH mechanism work correctly. The modified exception handler clause is shown in Figure 13. Note that the SEH header remains the same.

Figure 13 Modified Exception Handler Clause

Exception Handler Clause Entry | Value | Note
Flags | 0x0000 | COR_ILEXCEPTION_CLAUSE_NONE, a typed handler, from the CorExceptionFlag enum (see corhdr.h for details)
TryOffset | 0x0 + 0x02 | Original offset + size of prologue
TryLength | 0xB | Size is the same
HandlerOffset | 0xB + 0x02 | Original offset + size of prologue
HandlerLength | 0xD | Our handler size remains the same
ClassToken | 0x01000002 | Exception type metadata token (in our example, System.Exception)
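The adjustment in Figure 13 amounts to adding the prologue size to both offsets while leaving the lengths alone. The sketch below shows this for a single small clause; it is a simplified illustration with hypothetical names, not the FixSEHSections function from the accompanying source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical plain struct mirroring a small exception handler clause.
struct SmallClause {
    uint16_t flags;
    uint16_t tryOffset;
    uint8_t  tryLength;
    uint16_t handlerOffset;
    uint8_t  handlerLength;
    uint32_t classToken;
};

// Shifts a clause by the size of an inserted prologue. Lengths stay the same
// because the guarded code and the handler themselves are unchanged.
inline SmallClause ShiftClause(SmallClause c, uint16_t prologueSize) {
    // Small clauses store offsets in 16 bits; larger values need a fat clause.
    assert(c.tryOffset + prologueSize <= 0xFFFF);
    assert(c.handlerOffset + prologueSize <= 0xFFFF);
    c.tryOffset     = static_cast<uint16_t>(c.tryOffset + prologueSize);
    c.handlerOffset = static_cast<uint16_t>(c.handlerOffset + prologueSize);
    return c;
}
```

A clause whose Flags value marks it as a filter stores a filter offset in place of the class token, and that offset would need the same shift.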

In my profiler I've implemented a generic function called FixSEHSections which does all this work (it can be found in the source code which accompanies this article). It analyzes SEH tables and fixes their offsets and sizes depending on the types of exception handler clauses. This function also uses several handy structures provided by the .NET Framework SDK and declared in corhlpr.h. In particular, I'm using the COR_ILMETHOD_TINY, COR_ILMETHOD_FAT, COR_ILMETHOD, and COR_ILMETHOD_DECODER structures, which encapsulate all the work with the method bodies.

Now I'm ready to return to the implementation details of the JITCompilationStarted method, which is outlined in Figure 14. The GetMethodInfoByFunctionID function is a helper that gets the method's information and fills out the CMethodInfo info structure. This structure holds a complete set of data for a method. In order to get the method information, GetMethodInfoByFunctionID calls the core profiler's GetTokenAndMetaDataFromFunction function. This function takes the function ID (provided by the runtime via the JITCompilationStarted call) and returns the corresponding method's metadata token and a special IMetaDataImport interface. The metadata import interface can be used to query method details through the GetMethodProps function. More information about these functions can be found in Profiling.doc, located in the \FrameworkSDK\Tool Developers Guide\docs folder.

Figure 14 JITCompilationStarted Method

extern CMethodInfo* pmi;
// return type assumed here; see the accompanying source for the actual declaration
extern HRESULT GetMethodInfoByFunctionID( FunctionID functionID, CMethodInfo* pmi );

HRESULT COurProfiler::JITCompilationStarted( FunctionID functionID,
                                             BOOL fIsSafeToBlock )
{
    HRESULT hr            = S_OK;
    LPCBYTE pMethodHeader = NULL;  // old method body, starting at its header
    ULONG   cbMethodSize  = 0;     // old method size

    // 1. get old method info
    GetMethodInfoByFunctionID( functionID, pmi );

    // 2. generate prologue and epilogue based on method info
    PrepareNewMethodInfo( pmi );

    // 3. get IL code of the old method
    hr = m_pCorProfilerInfo->GetILFunctionBody( pmi->m_moduleID,
                                                pmi->m_tkMethodDef,
                                                &pMethodHeader,
                                                &cbMethodSize );

    // 4. create new function's body by adding prologue and epilogue
    void* pNewMethodBody  = NULL;
    ULONG cbNewMethodSize = 0;
    hr = CreateILFunctionBody( m_pCorProfilerInfo,
                               pmi,                      // method info
                               pMethodHeader,            // old method info
                               cbMethodSize,             // old method size
                               (void**)&pNewMethodBody,  // [out] newly generated IL body
                               cbNewMethodSize );        // [out] new method size

    // 5. set new ILFunctionBody
    hr = m_pCorProfilerInfo->SetILFunctionBody( pmi->m_moduleID,
                                                pmi->m_tkMethodDef,
                                                (LPCBYTE)pNewMethodBody );
    return S_OK;
}

The most difficult part of the JITCompilationStarted implementation is hidden in the CreateILFunctionBody function, which does most of the real work. The part that creates a new method body with the tiny header is outlined in Figure 15. It's self-explanatory. First, it uses prologue and epilogue sizes to calculate the new method size and then it allocates memory for the new method body. Then it copies the prologue and the old method, after which it replaces the last return instruction in the old IL with the nop code (causing the method's execution flow to be passed to the epilogue). Finally, it adds the epilogue.

Figure 15 CreateILFunctionBody Method

static ULONG g_cbFatHeaderSize  = 0xC;  /* 12 bytes = WORD + WORD + DWORD + DWORD */
static ULONG g_cbTinyHeaderSize = 0x1;  /* 1 byte */
static ULONG g_cbMaxTinySize    = 0x40; /* 64 bytes */
static WORD  g_wDefMaxStack     = 0x8;  /* default stack depth in slots */

HRESULT COurProfiler::CreateILFunctionBody(
    ICorProfilerInfo* pCorProfilerInfo,
    CMethodInfo* pmi,                   // method info
    LPCBYTE pMethodHeader,              // old method info, points to method header
    const ULONG cbMethodSize,           // old method size
    /* [out] */void** ppNewMethodBody,  // newly generated IL body,
                                        // points to method header
    /* [out] */ULONG& cbNewMethodSize   // new method size
    )
{
    HRESULT hr = S_OK;

    // 1. get malloc interface
    IMethodMalloc* pMethodMalloc = NULL;
    hr = pCorProfilerInfo->GetILFunctionBodyAllocator( pmi->m_moduleID,
                                                       &pMethodMalloc );

    // 2. calculate sizes and allocate memory for new function
    // 2.1. Get size of our prologue
    ULONG cbPrologSize = GetPrologSize(•••);
    // 2.2. Get epilogue size
    ULONG cbEpilogSize = GetEpilogSize(•••);
    // 2.3. Calculate new method size (including header)
    cbNewMethodSize = g_cbTinyHeaderSize;                       // TINY HEADER
    cbNewMethodSize += ( cbPrologSize + cbEpilogSize );         // our prologue + epilogue
    cbNewMethodSize += ( cbMethodSize - g_cbTinyHeaderSize );   // old method size w/o TINY HEADER
    ULONG cbNewMethodSizeWithoutHeader = ( cbNewMethodSize - g_cbTinyHeaderSize );
    ULONG cbMethodSizeWithoutHeader    = ( cbMethodSize - g_cbTinyHeaderSize );
    // 2.4. allocate memory
    void* pNewMethodBody = pMethodMalloc->Alloc( cbNewMethodSize );
    pMethodMalloc->Release();

    // 3. copy over new function
    // 3.1. "Create" a header and specify the new method size
    BYTE tinyHeader = (BYTE)cbNewMethodSizeWithoutHeader;
    // shift left 2 bits ( so, XXXXXXXXb becomes XXXXXX00b )
    // and set first 2 bits holding tiny header type ( XXXXXX10b = 0x2 )
    tinyHeader = ( tinyHeader << 2 ) | CorILMethod_TinyFormat;
    // 3.2. copy header
    memcpy( pNewMethodBody, (void*)&tinyHeader, sizeof(BYTE) );
    // 3.3. copy our prologue
    memcpy( (BYTE*)pNewMethodBody + g_cbTinyHeaderSize,
            (void*)Prolog, cbPrologSize );
    // 3.4. copy old method body
    memcpy( (BYTE*)pNewMethodBody + g_cbTinyHeaderSize + cbPrologSize,
            (BYTE*)pMethodHeader + g_cbTinyHeaderSize,
            cbMethodSizeWithoutHeader );
    // 3.5. replace last 'RET' statement with 'NOP'
    BYTE OpCodeNop = g_ILcodes[ CEE_NOP ].OpCode2;
    memcpy( (BYTE*)pNewMethodBody + g_cbTinyHeaderSize + cbPrologSize
                + cbMethodSizeWithoutHeader - 1,
            (void*)&OpCodeNop, sizeof(BYTE) );
    // 3.6. copy our epilogue
    memcpy( (BYTE*)pNewMethodBody + g_cbTinyHeaderSize + cbPrologSize
                + cbMethodSizeWithoutHeader,
            (void*)Epilog, cbEpilogSize );

    // everything went OK
    *ppNewMethodBody = pNewMethodBody;
    return S_OK;
}

The same function for a method with the fat header is much more complicated. First, you need to take care of the method header—the MaxStack and CodeSize fields have to be modified, which is the easiest part. Second, you have to walk through all SEH headers and clauses and make sure they still contain valid offsets and sizes, despite the fact that the original IL code gets shifted. You also have to correctly calculate the gap between the IL and the first SEH header and make sure that all the SEH sections are properly aligned. You can find the complete version of the function in the source code which accompanies this article.
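One case worth keeping in mind is a tiny method that stops being tiny once the prologue and epilogue are added. The tiny header stores the code size in its upper six bits, so the instrumented code has to stay below 64 bytes; past that point the body would need a fat header instead. The sketch below shows a possible guard together with the header encoding. The helper names are hypothetical and are not part of the accompanying source.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint8_t  kTinyFormat      = 0x2;   // same value as CorILMethod_TinyFormat
constexpr uint32_t kMaxTinyCodeSize = 0x3F;  // six size bits: at most 63 bytes of IL

// True if codeSize bytes of IL can still be described by a tiny header.
inline bool FitsTinyHeader(uint32_t codeSize) {
    return codeSize <= kMaxTinyCodeSize;
}

// Encodes the one-byte tiny header: code size in the upper six bits,
// header type (10b) in the lower two.
inline uint8_t MakeTinyHeader(uint32_t codeSize) {
    assert(FitsTinyHeader(codeSize));
    return static_cast<uint8_t>((codeSize << 2) | kTinyFormat);
}

// Recovers the code size from a tiny header byte.
inline uint32_t TinyCodeSize(uint8_t header) {
    return header >> 2;
}
```

A caller could test FitsTinyHeader(cbPrologSize + cbEpilogSize + old code size) before choosing between the tiny and the fat code path.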

More Advanced Techniques

I will now consider more complex techniques such as adding local variables and exception handlers to a method and creating new methods at run time. I'll start with local variables.

The local variables of a method are coded as a signature and the corresponding record is stored in the StandAloneSig metadata table as 0x11XXXXXX formatted tokens. The table has only one column, which stores offsets into the special metadata stream called the Binary Large Object (BLOB) heap. (StandAloneSig also holds signatures for indirect calls that use the IL instruction calli.)

For example, consider the following method in IL assembly language with two local variables of type int32:

// IL assembly language
.method public static int32 SomeFunction( int32, int32 )
{
  .maxstack 10
  // information about those variables is stored in StandAloneSig
  // "init" means that all local variables
  // must be initialized
  .locals init ( int32 nParam1, int32 nParam2 )
  •••
}

When the method gets compiled to IL, the compiler will create a signature based on the number of local variables (2) and their types (both int32). At the very beginning of the signature the compiler will also add a special byte to identify the type of the signature, which is 0x07 (IMAGE_CEE_CS_CALLCONV_LOCAL_SIG) for local variables.

Taking into account that the int32 type is represented by the ELEMENT_TYPE_I4 value (0x08), the resulting signature would look like this:

0x07 0x02 0x08 0x08
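A signature like this is easy to assemble in code. The sketch below handles only the simple case discussed here: fewer than 128 locals (so the count fits into one byte) and element types that are encoded in a single byte. BuildLocalSig is a hypothetical helper, not part of the accompanying source.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint8_t kLocalSig      = 0x07;  // IMAGE_CEE_CS_CALLCONV_LOCAL_SIG
constexpr uint8_t kElementTypeI4 = 0x08;  // ELEMENT_TYPE_I4

// Builds a local variable signature from single-byte element types.
// Assumes fewer than 128 locals, so the count occupies exactly one byte.
inline std::vector<uint8_t> BuildLocalSig(const std::vector<uint8_t>& elementTypes) {
    assert(elementTypes.size() < 0x80);
    std::vector<uint8_t> sig;
    sig.push_back(kLocalSig);                                  // signature kind
    sig.push_back(static_cast<uint8_t>(elementTypes.size()));  // number of locals
    sig.insert(sig.end(), elementTypes.begin(), elementTypes.end());
    return sig;
}
```

Passing two ELEMENT_TYPE_I4 entries gives back the four bytes shown above.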

Next, the signature will be stored in the BLOB stream and the corresponding record will be created in the StandAloneSig table. The compiler will also update the LocalVarSigTok field of the method header to reference the newly created record. Since I've specified the init keyword in the local variables declaration, the Flags value in the method header should be set to 0x04. At run time this tells the JIT compiler that the local variables have to be initialized to their default (zeroed) values before the method body runs. Otherwise, the code won't pass the runtime verification process.

Thus, in order to add local variables to a method you must modify the StandAloneSig table. This is also true when you modify methods at run time. Before you change the method and call ICorProfilerInfo::SetILFunctionBody you have to generate a new local signature and dynamically add a new record to the StandAloneSig table.

Fortunately, the CLR supports an interface called IMetaDataEmit, through which the profiler can modify the existing metadata at run time (unlike Reflection.Emit). To obtain a pointer to this interface, you'll need to call the ICorProfilerInfo::GetModuleMetaData method with dwOpenFlags of ofRead | ofWrite and an riid of IID_IMetaDataEmit. Here's the function's format:

HRESULT GetModuleMetaData( ModuleID moduleId, DWORD dwOpenFlags, REFIID riid, IUnknown **ppOut )

It will give you back a writeable metadata interface (Metadata Emitter), which is quite powerful. It's defined in Cor.h and has several important methods (see Figure 16).

Figure 16 IMetaDataEmit Interface

// Cor.h
DECLARE_INTERFACE_(IMetaDataEmit, IUnknown)
{
    •••
    STDMETHOD(DefineTypeDef)(           // S_OK or error
        LPCWSTR  szTypeDef,             // [IN] Name of TypeDef
        DWORD    dwTypeDefFlags,        // [IN] CustomAttribute flags
        mdToken  tkExtends,             // [IN] extends this TypeDef or
                                        //      typeref
        mdToken  rtkImplements[],       // [IN] Implements interfaces
        mdTypeDef *ptd) PURE;           // [OUT] Put TypeDef token here

    STDMETHOD(DefineMethod)(            // S_OK or error
        mdTypeDef td,                   // Parent TypeDef
        LPCWSTR  szName,                // Name of member
        DWORD    dwMethodFlags,         // Member attributes
        PCCOR_SIGNATURE pvSigBlob,      // [IN] point to a BLOB value of
                                        //      CLR signature
        ULONG    cbSigBlob,             // [IN] count of bytes in the
                                        //      signature BLOB
        ULONG    ulCodeRVA,
        DWORD    dwImplFlags,
        mdMethodDef *pmd) PURE;         // Put member token here

    STDMETHOD(GetTokenFromSig)(         // S_OK or error
        PCCOR_SIGNATURE pvSig,          // [IN] Signature to define
        ULONG    cbSig,                 // [IN] Size of signature data
        mdSignature *pmsig) PURE;       // [OUT] returned signature token
    •••
}; // IMetaDataEmit

Using this interface, you will be able to create new metadata tokens on the fly. The entire process of the dynamic creation of local variables for a given method can be summarized as follows:

  1. Get the IMetaDataEmit interface by calling ICorProfilerInfo::GetModuleMetaData.
  2. Allocate (as a sequence of bytes) the local variable's signature and fill it out.
  3. Add a signature to the StandAloneSig table using the IMetaDataEmit::GetTokenFromSig method, which returns a metadata token for the newly added record.
  4. Modify LocalVarSigTok to point to the new token. The Flags field has to be set to 0x04 to initialize local variables.
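Step 4 can be sketched as a small patch of the 12-byte fat header. The helper below is hypothetical and assumes a little-endian fat header laid out as in Figure 11; the token would come from IMetaDataEmit::GetTokenFromSig in step 3. The Flags value of 0x04 used in this article is counted from the bit after the two header-type bits, so within the first header WORD it corresponds to mask 0x10 (CorILMethod_InitLocals in corhdr.h).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint16_t kInitLocals           = 0x0010;  // CorILMethod_InitLocals
constexpr uint8_t  kTableStandAloneSig   = 0x11;    // table id of StandAloneSig tokens
constexpr size_t   kLocalVarSigTokOffset = 8;       // after Flags/Size, MaxStack, CodeSize

// Stores sigToken in LocalVarSigTok and sets the init-locals flag.
// header must point to a writable 12-byte fat method header.
inline void SetLocalVarSig(uint8_t* header, uint32_t sigToken) {
    assert(header != nullptr);
    assert((header[0] & 0x3) == 0x3);                 // fat headers only
    assert((sigToken >> 24) == kTableStandAloneSig);  // must be a 0x11XXXXXX token

    uint16_t flagsAndSize = static_cast<uint16_t>(header[0] | (header[1] << 8));
    flagsAndSize |= kInitLocals;
    header[0] = static_cast<uint8_t>(flagsAndSize & 0xFF);
    header[1] = static_cast<uint8_t>(flagsAndSize >> 8);

    for (size_t i = 0; i < 4; ++i) {
        header[kLocalVarSigTokOffset + i] =
            static_cast<uint8_t>((sigToken >> (8 * i)) & 0xFF);
    }
}
```

A tiny header has no LocalVarSigTok field, so a tiny method would first have to be rewritten with a fat header before local variables can be attached.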

The complete implementation is provided by the CreateILFunctionBodyWithLocalVariables function and can be found in the source code that accompanies the article.

Adding a new method requires some extra work and can be done by calling IMetaDataEmit::DefineMethod. Note that any changes to class metadata—adding new methods or whole new classes—should be done as early as possible in the run of the program. The ideal location is during the ModuleLoadFinished callback for the module into which you want to add the new class or method. This process can be described as follows:

  1. Obtain the IMetaDataEmit interface.
  2. Create the local variable signatures and add them to the metadata (as described in the previous section).
  3. Create the new method body (see CreateILFunctionBody for details) and modify its header to point to the correct signature (with the LocalVarSigTok and Flags entries).
  4. Calculate the method's signature, attributes, and implementation flags. Note that unlike the local variable signatures, the method's signature is not coded as a token. Therefore, you don't have to store it as a record in a separate table before you call DefineMethod.
  5. Call the IMetaDataEmit::DefineMethod function, which modifies the Method metadata table and does the actual work, storing the method's signature, attributes, and implementation flags in the Signature, Flags, and ImplFlags columns, respectively.

The source code for the CreateILFunctionBodyWithLocalVariables function will provide additional details.

To understand the process of adding new exception handlers to methods, first consider a generic method with no exception handler as shown here:

// IL assembly language .method ••• void SomeMethod( ••• ) { ••• // method body ••• ret } // SomeMethod

I'm going to modify the method by adding the simplest possible catch handler (see Figure 17). Note that in this example I'm using the short-parameter leave instruction, which requires two bytes:

IL_01a7: /* 26 | */ pop
IL_01a8: /* DE | 00 */ leave.s IL_01aa

The long-parameter instruction would require 5 bytes:

IL_01a7: /* 26 | */ pop
IL_01a8: /* DD | 00000000 */ leave IL_01ad

Since I'm replacing the return statement in the original IL code with the short leave instruction, which is 1 byte longer, the guarded block will also grow by 1 byte.

Figure 17 Method with Dynamically Added SEH Handler

// IL assembly language
.method ••• void SomeMethod ( ••• )
{
  .try
  {
    ••• // original method body without return statement •••
    leave.s brOK  // short form of SEH block exiting instruction
  }
  catch [mscorlib]System.Exception
  {
    pop           // get rid of the reference to the intercepted exception
                  // (in our case it's a System.Exception instance)
    leave.s brOK
  }
  brOK: ret
} // SomeMethod

In order to add an exception handler you will need to:

  1. Set the Flags field in the method header to 0x02, since the method has extra sections.
  2. Find the last return instruction (as you did in the regular case of emitting epilogues and prologues) and replace it with the leave instruction (which exits SEH blocks).
  3. Add the exception handler block (for example, the catch [mscorlib]System.Exception {...} block).
  4. Add a new SEH header if the method doesn't have one.
  5. Add a new exception handler clause (small if the SEH header is small, fat otherwise) and set its fields.
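Steps 2 and 3 and the clause offsets from step 5 can be illustrated with a short sketch for the case without a prologue or epilogue. It assumes a void method whose only ret is the final byte of the code, as in Figure 17, and it uses the short form of leave; AddCatchAll and ClauseLayout are hypothetical names. Writing the method header and the SEH section (steps 1 and 4) is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

constexpr uint8_t kOpPop    = 0x26;
constexpr uint8_t kOpRet    = 0x2A;
constexpr uint8_t kOpLeaveS = 0xDE;

// Offsets and lengths to be written into the exception handler clause.
struct ClauseLayout {
    uint32_t tryOffset;
    uint32_t tryLength;
    uint32_t handlerOffset;
    uint32_t handlerLength;
};

// Wraps code in a try block followed by a minimal catch handler and a ret.
// Assumes a void method whose only ret is the last byte of code.
inline Bytes AddCatchAll(const Bytes& code, ClauseLayout* clause) {
    assert(!code.empty() && code.back() == kOpRet);
    assert(clause != nullptr);
    const Bytes handler = {kOpPop, kOpLeaveS, 0x00};  // pop; leave.s to next instruction

    Bytes out(code.begin(), code.end() - 1);          // original body without ret
    out.push_back(kOpLeaveS);                         // leave.s replaces ret (+1 byte)
    out.push_back(static_cast<uint8_t>(handler.size()));  // skip the handler, land on ret

    clause->tryOffset     = 0;
    clause->tryLength     = static_cast<uint32_t>(out.size());
    clause->handlerOffset = static_cast<uint32_t>(out.size());
    out.insert(out.end(), handler.begin(), handler.end());
    clause->handlerLength = static_cast<uint32_t>(handler.size());

    out.push_back(kOpRet);                            // shared exit point (brOK)
    return out;
}
```

For a two-byte body of nop and ret, the result is 0x00 0xDE 0x03 0x26 0xDE 0x00 0x2A, with a 3-byte guarded block (original size plus 1) followed immediately by a 3-byte handler.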

If you also added a prologue and an epilogue, the corresponding exception handler clause would have the layout shown in Figure 18.

Figure 18 Generic Exception Handler Clause

Exception Handler Clause Entry | Value
Flags | 0x0000
TryOffset | Prologue size + 0x0
TryLength | Short form of leave instruction: Prologue size + Size of the original code + 1; Long form of leave instruction: Prologue size + Size of the original code + 4
HandlerOffset | Short form of leave instruction: Prologue size + Size of the original code + 1; Long form of leave instruction: Prologue size + Size of the original code + 4
HandlerLength | The handler size
ClassToken | Exception type metadata token

Profiler Example

The profiler example provided with the article inserts a simple prologue and an epilogue for each method of a given class. The code illustrates most of the techniques I've discussed in the article. It also produces a log file, so you can see what's going on in the CChecker.log file. The profiler requires the following environment variables to be set:

set Cor_Enable_Profiling=0x1
set COR_PROFILER={97D8965A-8686-4639-9C24-E1F6D13EE105}
set COR_PROFILER_DLL=<your complete path>\CChecker.dll
set CC_PROFILER_ROTOR_APP=YourApp.exe

The COR_PROFILER_DLL and CC_PROFILER_ROTOR_APP variables are specific to Rotor. The first variable sets the full path to your profiler DLL and makes it available for Rotor. The second is used by the profiler itself and specifies the assembly name being instrumented. For example, if you wanted to profile the hello.exe application in the Rotor environment, you would have to set CC_PROFILER_ROTOR_APP like this:

set CC_PROFILER_ROTOR_APP=hello.exe

Conclusion

This article showed some of the advanced features found in the common language runtime. I've explained how to use the Profiling API and the internal structures of the CLR to dynamically modify intermediate language code, preserving the identity of the classes whose methods are instrumented. These techniques demonstrate how to instrument code for gathering metrics that aren't possible with the events provided in the rest of the Profiling API. In addition, the example provided gave you some insight into the internal workings of the CLR and will allow you to tweak your own advanced .NET Framework-targeted applications.

For related articles see:
The Implementation of Model Constraints in .NET
Inside Microsoft .NET IL Assembler by Serge Lidin (Microsoft Press, 2002)
Avoiding DLL Hell: Introducing Application Metadata in the Microsoft .NET Framework
Under the Hood: The .NET Profiling API and the DNProfiler Tool

Aleksandr Mikunov is a senior software engineer at Compuware Corporation. His background is in developing relational databases and n-tier applications on Windows platforms. Reach him at Aleksandr.Mikunov@compuware.com.