This tutorial introduces the MSIL Disassembler (Ildasm.exe), which is included with the .NET Framework SDK. Ildasm.exe parses any .NET Framework .exe or .dll assembly and shows the information in human-readable format. Ildasm.exe shows more than just the Microsoft intermediate language (MSIL) code: it also displays namespaces and types, including their interfaces. You can use Ildasm.exe to examine native .NET Framework assemblies, such as Mscorlib.dll, as well as .NET Framework assemblies provided by others or created by you. Most .NET Framework developers will find Ildasm.exe indispensable.
For this tutorial, use the Visual C# version of the WordCount sample that is included with the SDK. You can also use the Visual Basic version, but the MSIL generated will be different for the two languages and the screen images will also not be identical. WordCount is located in the <FrameworkSDK>\Samples\Applications\WordCount\ directory. To build and run the sample, follow the instructions outlined in the Readme.htm file. This tutorial uses Ildasm.exe to examine the WordCount.exe assembly.
To get started, build the WordCount sample, and load it into Ildasm.exe using the following command line:

ildasm WordCount.exe
This causes the Ildasm.exe window to appear, as shown in the following figure.
The tree in the Ildasm.exe window shows the assembly manifest information contained inside WordCount.exe and the four global class types: App, ArgParser, WordCountArgParser, and WordCounter.
By double-clicking any of the types in the tree, you can see more information about the type. In the following figure, the WordCounter class type has been expanded.
In the previous figure, you can see all the WordCounter members. The following table explains what each graphic symbol means.
|Manifest or a class info item|
Double-clicking the .class public auto ansi beforefieldinit entry shows the following information:
In the previous figure, you can easily see that the WordCounter type is derived from the System.Object type.
The WordCounter type contains another type, called WordOccurrence. You can expand the WordOccurrence type to see its members, as shown in the following figure.
Looking at the tree, you can see that WordOccurrence implements the System.IComparable interface, specifically the CompareTo method. However, the rest of this discussion ignores the WordOccurrence type and concentrates on the WordCounter type instead.
You can see that the WordCounter type contains five private fields: totalBytes, totalChars, totalLines, totalWords, and wordCounter. The first four of these fields are of the int64 type, while the wordCounter field is a reference to a System.Collections.SortedList object.
Following the fields, you can see the methods. The first method, .ctor, is a constructor. This particular type has just one constructor, but other types can have several constructors, each with a different signature. The WordCounter constructor has a return type of void (as all constructors do) and accepts no parameters. If you double-click the constructor method, a new window appears that displays the MSIL code contained within the method, as shown in the following figure.
MSIL code is actually quite easy to read and understand. (For all the details, see the CIL Instruction Set Specification, which is located in the Partition III CIL.doc file in the <FrameworkSDK>\Tool Developers Guide\Docs\ folder.) Toward the top, you can see that this constructor requires 50 bytes of MSIL code. From this number alone, you cannot tell how much native code the JIT compiler will emit, because that size depends on the host CPU and on the compiler that generates the code.
The common language runtime is stack based. So, in order to perform any operation, MSIL code first pushes the operands onto a virtual stack, and then executes the operator. The operator grabs the operands off the stack, performs the required operation, and places the result back on the stack. At any one time, this method has no more than eight operands pushed onto the virtual stack. You can identify this number by looking at the .maxstack attribute that appears just before the MSIL code.
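The push-operands-then-operate pattern described above can be modeled in a few lines. The following is a minimal sketch in Python, not the runtime's actual implementation; the opcode names push, add, and mul and the run function are hypothetical and chosen only for illustration. The sketch also records the deepest point the stack reaches, which is the quantity that a .maxstack declaration bounds.

```python
# A toy stack machine. It is NOT the common language runtime, only a model
# of "push the operands, then execute the operator".
def run(program):
    """Execute (opcode, operand) pairs; return (result, deepest stack depth)."""
    stack = []
    max_depth = 0
    for opcode, operand in program:
        if opcode == "push":          # place an operand on the stack
            stack.append(operand)
        elif opcode == "add":         # pop two operands, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "mul":         # pop two operands, push their product
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        max_depth = max(max_depth, len(stack))
    return stack.pop(), max_depth


# Evaluate (2 + 3) * 4.
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
result, depth = run(program)
print(result, depth)
```

In this model, the expression never needs more than two stack slots, even though it involves three operands, because the add consumes two of them before the third is pushed.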
Now examine the first few MSIL instructions, which are reproduced on the following four lines:
IL_0000: ldarg.0 ; Load the object's 'this' pointer on the stack
IL_0001: ldc.i4.0 ; Load the constant 4-byte value of 0 on the stack
IL_0002: conv.i8 ; Convert the 4-byte 0 to an 8-byte 0
IL_0003: stfld int64 WordCounter::totalLines ; Store the 8-byte 0 in the totalLines field
The instruction at IL_0000 loads the first parameter that was passed to the method onto the virtual stack. Every instance method is always passed the address of the object's memory. This argument is called Argument Zero and is never explicitly shown in the method's signature. So, even though the .ctor method looks like it receives zero arguments, it actually receives one argument. The instruction at IL_0000, then, loads the pointer to this object onto the virtual stack.
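Python happens to make this hidden argument explicit, which can help build intuition for Argument Zero. The sketch below is only an analogy: the class and its add_line method are hypothetical stand-ins and do not reproduce the WordCount sample.

```python
class WordCounter:
    def __init__(self):
        # 'self' plays the role of Argument Zero: the object being initialized.
        self.total_lines = 0

    def add_line(self):
        # Looks like a zero-argument method to callers, but it receives 'self'.
        self.total_lines += 1
        return self.total_lines


counter = WordCounter()
counter.add_line()             # the object is passed implicitly
WordCounter.add_line(counter)  # the same call with the object passed explicitly
print(counter.total_lines)
```

Both calls reach the same method with the same first argument, just as an MSIL instance method always finds the object reference in argument slot 0.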
The instruction at IL_0001 loads a constant 4-byte value of zero onto the virtual stack.
The instruction at IL_0002 takes the value from the top of the stack (the 4-byte zero) and converts it to an 8-byte zero, which replaces the original value on the top of the stack.
At this point, the stack contains two operands: the 8-byte zero and the pointer to this object. The instruction at IL_0003 uses both of these operands to store the value from the top of the stack (the 8-byte zero) into the totalLines field of the object identified on the stack.
The same MSIL instruction sequence is repeated for the totalChars, totalBytes, and totalWords fields.
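To make the stack traffic concrete, the four-instruction sequence can be replayed against a toy model. This is a rough sketch under stated assumptions: the object is represented as a Python dict, typed values as (type, value) tuples, and the helper functions are hypothetical names that mirror the MSIL mnemonics rather than any real API.

```python
def ldarg_0(stack, args):
    stack.append(args[0])            # push Argument Zero (the object reference)


def ldc_i4_0(stack):
    stack.append(("int32", 0))       # push a 4-byte constant zero


def conv_i8(stack):
    _kind, value = stack.pop()       # pop the 4-byte value ...
    stack.append(("int64", value))   # ... and push it back as an 8-byte value


def stfld(stack, field):
    value = stack.pop()              # top of stack: the value to store
    target = stack.pop()             # beneath it: the object reference
    target[field] = value


this = {}    # stands in for the WordCounter object
stack = []   # stands in for the virtual stack
for field in ("totalLines", "totalChars", "totalBytes", "totalWords"):
    ldarg_0(stack, [this])
    ldc_i4_0(stack)
    conv_i8(stack)
    stfld(stack, field)

print(this)
print(stack)
```

After each group of four instructions the model stack is empty again, because stfld consumes both the value and the object reference.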
Initialization of the wordCounter field begins with instruction IL_0020, as shown here:
IL_0020: ldarg.0
IL_0021: newobj instance void [mscorlib]System.Collections.SortedList::.ctor()
IL_0026: stfld class [mscorlib]System.Collections.SortedList WordCounter::wordCounter
The instruction at IL_0020 pushes the this pointer for the WordCounter onto the virtual stack. This operand is not used by the newobj instruction but will be used by the stfld instruction at IL_0026.
The instruction at IL_0021 tells the runtime to create a new System.Collections.SortedList object and to call its constructor with no arguments. When newobj returns, the address of the SortedList object is on the stack. At this point, the stfld instruction at IL_0026 stores the pointer to the SortedList object in the WordCounter object's wordCounter field.
After all the WordCounter object's fields have been initialized, the instruction at IL_002b pushes the this pointer onto the virtual stack, and IL_002c calls the constructor in the base type (System.Object).
Of course, the last instruction at IL_0031 is the return instruction that causes the WordCounter constructor to return to the code that called it. Constructors have to return void, so nothing is placed on the stack before the constructor returns.
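The instruction offsets quoted in this walkthrough follow from the encoded size of each instruction. The sketch below recomputes them, assuming the standard CIL encodings: ldarg.0, ldc.i4.0, conv.i8, and ret are single-byte opcodes, while stfld, newobj, and call are a one-byte opcode followed by a 4-byte metadata token. The instruction list is reconstructed from the description above, not copied from a disassembly.

```python
# Encoded size in bytes of each instruction used by the constructor.
SIZES = {
    "ldarg.0": 1, "ldc.i4.0": 1, "conv.i8": 1, "ret": 1,
    "stfld": 5, "newobj": 5, "call": 5,   # 1-byte opcode + 4-byte token
}

# Four int64 field initializations, then wordCounter, then the base constructor.
body = ["ldarg.0", "ldc.i4.0", "conv.i8", "stfld"] * 4
body += ["ldarg.0", "newobj", "stfld", "ldarg.0", "call", "ret"]

labels = []
position = 0
for opcode in body:
    labels.append((f"IL_{position:04x}", opcode))
    position += SIZES[opcode]

for label, opcode in labels[16:]:
    print(label, opcode)
print("code size:", position)
```

Under these assumptions, the computed labels line up with IL_0020, IL_0021, IL_0026, IL_002b, IL_002c, and IL_0031, and the total matches the 50 bytes reported at the top of the method.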
Now look at another example. Double-click the GetWordsByOccurrenceEnumerator method to see its MSIL code, which is shown in the following figure.
You see that the code for this method is 69 bytes in size and that the method requires four slots on the virtual stack. In addition, this method has three local variables: one is of the System.Collections.SortedList type and the other two are of the System.Collections.IDictionaryEnumerator type. Note that the variable names mentioned in the source code are not emitted to the MSIL code unless the assembly is compiled with the /debug option. If /debug is not used, the variable names V_0, V_1, and V_2 are used instead of sl, de, and CS$00000003$00000000, respectively.
When this method begins execution, the first thing it does is execute the newobj instruction, which creates a new System.Collections.SortedList and calls this object's default constructor. When newobj returns, the address of the created object is on the virtual stack. The stloc.0 instruction (at IL_0005) stores this value in local variable 0, which is of the System.Collections.SortedList type and is named sl (or V_0 without /debug).
At instructions IL_0006 and IL_0007, the WordCounter object's this pointer (in Argument Zero passed to the method) is loaded onto the stack, and the GetWordsAlphabeticallyEnumerator method is called. When the call instruction returns, the address of the enumerator is on the stack. The stloc.1 instruction (at IL_000c) saves this address in local variable 1, which is of the System.Collections.IDictionaryEnumerator type and is named de (or V_1 without /debug).
The br.s instruction at IL_000d causes an unconditional branch to the test condition of the while statement, which begins at instruction IL_0032. At IL_0032, the address of the IDictionaryEnumerator in de (or V_1) is pushed onto the stack and, at IL_0033, its MoveNext method is called. If MoveNext returns true, an entry exists to be enumerated, and the brtrue.s instruction jumps to the instruction at IL_000f.
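The resulting control flow (jump to the test first, then loop back into the body while the test succeeds) can be traced with a small model. This is an illustrative sketch only: the trace_loop function is hypothetical, the labels are the offsets mentioned in the text, and a Python iterator stands in for the IDictionaryEnumerator and its MoveNext method.

```python
def trace_loop(entries):
    """Return the labels visited, in order, for a loop whose test sits at the bottom."""
    remaining = iter(entries)
    done = object()                    # sentinel meaning "MoveNext returned false"
    visited = []
    pc = "IL_000d"
    while pc != "IL_003a":
        visited.append(pc)
        if pc == "IL_000d":            # br.s: unconditional jump to the loop test
            pc = "IL_0032"
        elif pc == "IL_0032":          # MoveNext, then brtrue.s back to the body
            has_entry = next(remaining, done) is not done
            pc = "IL_000f" if has_entry else "IL_003a"
        else:                          # IL_000f: loop body, which runs into the test
            pc = "IL_0032"
    visited.append(pc)                 # code that follows the loop
    return visited


trace = trace_loop(["apple", "pear"])
print(trace)
```

For two entries, the test runs three times and the body twice. With no entries at all, the body is never reached.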
At instructions IL_000f and IL_0010, the addresses of the objects in sl (or V_0) and de (or V_1) are pushed onto the stack. Then, the IDictionaryEnumerator object's get_Value property method is called to get the number of occurrences of the current entry. This number is a 32-bit value that is stored in a boxed System.Int32 object. The code casts this Int32 object to an int value type. Casting a reference type to a value type requires the unbox instruction at IL_0016. When unbox returns, the address of the unboxed value is on the stack. The ldind.i4 instruction (at IL_001b) loads the 4-byte value located at the address currently on the stack and pushes that value onto the stack. In other words, the unboxed 4-byte integer is placed on the stack.
At instruction IL_001c, the value of de (or V_1), which is the address of the IDictionaryEnumerator, is pushed onto the stack, and its get_Key property method is called. When get_Key returns, the address of a System.Object is on the stack. The code knows that the dictionary contains strings, so the compiler casts this Object to a String using the castclass instruction at IL_0022.
The next few instructions (from IL_0027 through IL_002d) create a new WordOccurrence object and pass the object's address to the SortedList's Add method.
At instruction IL_0032, the test condition of the while statement is evaluated again. If MoveNext returns true, the loop executes another iteration. However, if MoveNext returns false, execution falls through the loop and ends up at instruction IL_003a. The instructions from IL_003a through IL_0040 call the SortedList object's GetEnumerator method. The value returned is a System.Collections.IDictionaryEnumerator, which is left on the stack to become the return value of the GetWordsByOccurrenceEnumerator method.
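Taken as a whole, the method walks the alphabetical word list and rebuilds it in a second sorted collection keyed by occurrence. The following Python sketch shows that overall shape. It is an approximation with labeled assumptions: a plain dict stands in for the alphabetical collection, a (count, word) tuple stands in for a WordOccurrence object, and ordering by count and then by word is assumed rather than taken from the sample's CompareTo implementation.

```python
def words_by_occurrence(alphabetical):
    """Return an iterator over (count, word) pairs, ordered by count, then word."""
    sl = []                                           # stands in for the new SortedList
    for word, count in sorted(alphabetical.items()):  # the alphabetical enumerator
        sl.append((count, word))                      # stands in for WordOccurrence
    sl.sort()                                         # a SortedList keeps keys ordered
    return iter(sl)                                   # the enumerator that is returned


counts = {"the": 3, "cat": 1, "a": 1}
ordered = list(words_by_occurrence(counts))
print(ordered)
```

The two int and string conversions in the MSIL (unbox and castclass) have no counterpart here, because Python values carry their types at run time.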