New information has been added to this article since publication.
Refer to the Editor's Update below.

.NET Matters
How Microsoft Uses Reflection
Edited by Stephen Toub

Code download available at: NETMatters0407.exe (153 KB)
In this fourth installment of .NET Matters, I'm taking a breather from the Q&A format. Instead, this month I bring you stories from the trenches; developers in product groups within Microsoft describe how they use .NET technologies to get their jobs done and develop their products. This month, Steve Lamb, a lead software design engineer on the Xbox team, and Adi Oltean, a software design engineer from the core file services team, discuss ways they've used reflection (System.Reflection and System.Reflection.Emit) in their particular products.

Steve Lamb, Xbox Live Team
The Xbox Live™ architecture consists primarily of services written using ASP.NET. The many benefits of ASP.NET and the .NET Framework, including rapid development, easy deployment, and maintainability, made them ideal technologies to meet the stability, security, and flexibility requirements of Xbox Live.
One of the challenges of using managed code on our servers was in developing the network protocol used to talk with Xboxes. These Xbox clients are written primarily in C++ and are optimized for low memory overhead and network bandwidth usage. We needed a simple way of converting data structures written in C# to and from the binary equivalents used by the C++ clients.
In the simplest cases, this can be done by using the functionality provided in the System.Runtime.InteropServices namespace, such as the Marshal.StructureToPtr and Marshal.Copy routines. However, these methods are very limited in the data types that can be converted, there's little control over how a data type is serialized, and there's added risk of causing crashes or leaking memory when improperly used. It's also possible in some scenarios to use unsafe code to read and write byte arrays, casting them to and from the appropriate C# data structures. On our team, however, we try very hard not to use unsafe code as it increases our risks, just as with the Marshal options.
Most of the 200+ protocol structures defined for Xbox Live are anything but simple. Many contain reference types, embedded structures, and variable length arrays and strings. Dates are transmitted as FILETIME values. Strings are UTF8-encoded before transmitting over the network. Most strings and arrays also include a length field specifying the number of bytes in a string or the number of elements in an array.
Writing custom marshaling code for each protocol structure would quickly become prohibitively expensive and error prone. To address this, we developed a fast, flexible, and type-safe way of serializing C# structures and classes, which maintains most of the advantages of custom serialization code without having to actually write custom serialization code. This was done using the powerful Reflection and Reflection.Emit features of the Framework SDK.
By way of example, take a look at the protocol structure written in C++ and C# in Figure 1. These structures may be sent when an Xbox client requests the latest version of a user's Friends list. The "header" structure, XBOX_FRIENDS_LIST, contains a count of the number of friends (cFriends) and is followed by a list of XBOX_FRIEND structures, each describing the user ID and gamertag of that particular friend, and the date they were added. The structure definitions are surrounded by #pragma pack() instructions, which are used to tell the C++ compiler not to add the usual padding to byte-aligned fields in the structure.
This XBOX_FRIENDS_LIST, as defined, would not be serializable by the Marshal.StructureToPtr method, due to the inability to serialize the array of structures in the friends field. Furthermore, the XBOX_FRIEND class could not be correctly serialized by the same Marshal method, as the DateTime member added would not be converted to a FILETIME before conversion.
In order to marshal these structures, the developer would have to write custom serialization methods. Reading these structures from a byte stream would typically utilize the System.IO.BinaryReader object, reading each individual field sequentially from the stream. Appropriate conversions would be performed along the way, such as converting strings from their UTF8 form to Unicode and converting FILETIME values to the corresponding DateTime value. This method would consist of code similar to that in Figure 2.
Custom marshaling code to serialize the structures to a byte stream would consist of essentially the same process, instead using the System.IO.BinaryWriter class. Furthermore, fields such as numFriends and gamertagLen could be set to the correct values just before serialization to ensure accuracy.
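To make the hand-written approach concrete, here is a minimal sketch of custom read and write methods for a single friend record. It is not the code from Figure 2: the FriendRecord class, its field names, and the 16-bit length prefix are assumptions chosen for illustration, and the sketch uses the UTC variants of the FILETIME conversions so that a round trip does not depend on the local time zone.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical record type; the names and the wire layout are illustrative only.
public class FriendRecord
{
    public ulong UserId;
    public string Gamertag;
    public DateTime Added;

    // Writes the fields sequentially, converting as it goes.
    public void Write(BinaryWriter writer)
    {
        byte[] tag = Encoding.UTF8.GetBytes(Gamertag);
        writer.Write(UserId);
        writer.Write((ushort)tag.Length);     // length prefix counts bytes, not characters
        writer.Write(tag);                    // UTF8-encoded string body
        writer.Write(Added.ToFileTimeUtc());  // DateTime to 64-bit FILETIME
    }

    // Reads the fields back in exactly the same order.
    public static FriendRecord Read(BinaryReader reader)
    {
        FriendRecord friend = new FriendRecord();
        friend.UserId = reader.ReadUInt64();
        int tagLength = reader.ReadUInt16();
        friend.Gamertag = Encoding.UTF8.GetString(reader.ReadBytes(tagLength));
        friend.Added = DateTime.FromFileTimeUtc(reader.ReadInt64());
        return friend;
    }
}
```

Every structure needs its own pair of methods like these, and the read and write sides have to be kept in step by hand.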
Obviously, repeating this pattern for the large number of structures we currently use would be tedious, error prone, and difficult to maintain, the antithesis of what we've come to expect when developing in C#. This becomes a great application for Reflection and Reflection.Emit, since each of the fields is serialized based on its individual type. The Xbox team's approach was to define a base class, Serializer, from which any object to be serialized inherits and which provides two methods, ReadFromStream and WriteToStream, to support reading from and writing to a byte stream. These methods, in turn, invoke dynamically generated code which performs the actual calls to the field serialization methods. The type-specific work performed by these methods is generated at run time as a part of the first invocation of the method. This is implemented by declaring delegates for the dynamically generated read and write methods. The delegates for each type are maintained in a static hashtable.
On each invocation of ReadFromStream or WriteToStream, the method checks this hashtable for an existing delegate for the type. If it doesn't exist, one is created and is added to the hashtable. The delegate is then invoked, taking as arguments the specific object instance and the stream to be read from or written to:
public void ReadFromStream(BinaryReader reader) {
    ReadStreamDelegate readStream;
    string key = this.GetType() + ".ReadStream";
    readStream = (ReadStreamDelegate)SerializerDelegates[key];
    if (readStream == null) {
        readStream = GenerateReadStream();
        SerializerDelegates[key] = readStream;
    }
    readStream(this, reader);
}
The process of dynamically generating the method consists of creating a container assembly and class, creating the method, and emitting the Microsoft® intermediate language (MSIL) opcodes to read or write the individual fields of the provided type. The dynamically generated MSIL looks very similar to the MSIL that would be generated by the C# compiler if a custom serialization method were written to resemble the code shown in Figure 2. Reflection is used to discover the fields of the object to be serialized.
[ Editor's Update - 8/17/2004: This solution relies on assumptions concerning the functionality of Type.GetFields, assumptions which are not guaranteed to hold in future versions of the .NET Framework, including v2.0. Please note that GetFields is not guaranteed to return the fields of a struct or a class in the same order in which they're defined in that struct or class. However, in v1.x of the .NET Framework, the C# compiler usually maintains source-order when emitting metadata, and in v1.x the order of FieldInfo instances returned from GetFields will always be the same for a given Type. The solution provided here can be remedied by sorting the FieldInfo objects returned by GetFields into the proper order for serialization and deserialization; one approach for doing so would be to tag each field with a custom attribute specifying the ordering and then use that attribute to sort the FieldInfo instances returned from GetFields.]
For simplicity, we just serialize all of the public instance fields of the type, as you can see here:
ILGenerator gen = methodBuilder.GetILGenerator();
FieldInfo[] fields = this.GetType().GetFields(
    BindingFlags.Public | BindingFlags.Instance);
EmitHeader(gen);
foreach(FieldInfo field in fields) EmitReadField(gen, field);
EmitFooter(gen);
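The Editor's Update suggests tagging each field with an ordering attribute and sorting on it. One way that might look is sketched below; FieldOrderAttribute, FieldOrdering, and OrderedSample are hypothetical names invented for this illustration and are not part of the Xbox team's code.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute that records a field's position on the wire.
[AttributeUsage(AttributeTargets.Field)]
public sealed class FieldOrderAttribute : Attribute
{
    private readonly int _order;
    public FieldOrderAttribute(int order) { _order = order; }
    public int Order { get { return _order; } }
}

public static class FieldOrdering
{
    // Returns the public instance fields sorted by their [FieldOrder] value,
    // so the result no longer depends on the order GetFields happens to use.
    public static FieldInfo[] GetOrderedFields(Type type)
    {
        FieldInfo[] fields = type.GetFields(
            BindingFlags.Public | BindingFlags.Instance);
        int[] keys = new int[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            object[] attrs = fields[i].GetCustomAttributes(
                typeof(FieldOrderAttribute), false);
            if (attrs.Length == 0)
                throw new InvalidOperationException(
                    "Field " + fields[i].Name + " has no [FieldOrder] attribute.");
            keys[i] = ((FieldOrderAttribute)attrs[0]).Order;
        }
        Array.Sort(keys, fields);  // sorts fields by the parallel key array
        return fields;
    }
}

// Sample type declared deliberately out of order.
public class OrderedSample
{
    [FieldOrder(2)] public int Third;
    [FieldOrder(0)] public int First;
    [FieldOrder(1)] public int Second;
}
```

The emit loop could then iterate over the result of GetOrderedFields instead of the raw GetFields array.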
To generate the MSIL for reading a field, we first determine the BinaryReader method that will be invoked. To simplify this code, the team built a hashtable to map field types to the specific method to be called, as shown in Figure 3.
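A lookup table of this kind can be sketched as follows. This is not the code from Figure 3: the ReadMethodTable class is a hypothetical name, it covers only a handful of primitive types, and it uses a generic Dictionary where the original used a hashtable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

public static class ReadMethodTable
{
    // Maps a field type to the BinaryReader method that reads a value of that type.
    public static readonly Dictionary<Type, MethodInfo> ReadMethods = Build();

    private static Dictionary<Type, MethodInfo> Build()
    {
        Dictionary<Type, MethodInfo> map = new Dictionary<Type, MethodInfo>();
        Type reader = typeof(BinaryReader);
        map[typeof(byte)]   = reader.GetMethod("ReadByte",   Type.EmptyTypes);
        map[typeof(short)]  = reader.GetMethod("ReadInt16",  Type.EmptyTypes);
        map[typeof(ushort)] = reader.GetMethod("ReadUInt16", Type.EmptyTypes);
        map[typeof(int)]    = reader.GetMethod("ReadInt32",  Type.EmptyTypes);
        map[typeof(uint)]   = reader.GetMethod("ReadUInt32", Type.EmptyTypes);
        map[typeof(long)]   = reader.GetMethod("ReadInt64",  Type.EmptyTypes);
        map[typeof(ulong)]  = reader.GetMethod("ReadUInt64", Type.EmptyTypes);
        return map;
    }
}
```

Supporting a new primitive type then amounts to adding one entry to the table.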
Generating the MSIL code to read each field is then simply a matter of pushing the BinaryReader onto the stack and then calling the helper method, setting the field to the result:
MethodInfo method = (MethodInfo)
    SerializationMethods.ReadMethods[fieldInfo.FieldType];
gen.Emit(OpCodes.Ldarg_0);
gen.Emit(OpCodes.Ldarg_1);
gen.EmitCall(OpCodes.Callvirt, method, null);
gen.Emit(OpCodes.Stfld, fieldInfo);
Writing primitives to a BinaryWriter takes a similar approach, first pushing the BinaryWriter on the stack, followed by the locally cast object and field as parameters. Reading and writing more complex types such as strings, DateTime values (which must be converted to or from FILETIME for our purposes), and arrays takes a little more code than the basic primitives. However, because dynamically generated MSIL is orders of magnitude more difficult to debug and maintain, the team tries to keep as much of this added complexity as possible in C#. This is done by adding a series of static helper methods to the SerializationMethods class to perform the necessary functions. For example:
public static DateTime ReadDateTime(BinaryReader reader)
{ 
    return DateTime.FromFileTime(reader.ReadInt64());
}
Now, the dynamically generated MSIL for reading more complex types is almost identical to that for the primitive types.
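The following self-contained sketch shows how similar the two cases are once a helper exists. It rests on several assumptions that are not in the original: it uses DynamicMethod (available from .NET Framework 2.0) rather than a container assembly, it takes the field names explicitly so it does not depend on GetFields ordering, it handles only int and DateTime fields on a class, and the Stamp and EmitReadSketch names are hypothetical.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical type to deserialize.
public class Stamp
{
    public int Id;
    public DateTime When;
}

public static class EmitReadSketch
{
    public delegate void ReadStreamDelegate(object target, BinaryReader reader);

    // Static helper that keeps the FILETIME conversion in C# rather than MSIL.
    public static DateTime ReadDateTime(BinaryReader reader)
    {
        return DateTime.FromFileTimeUtc(reader.ReadInt64());
    }

    public static ReadStreamDelegate Generate(Type type, string[] fieldNames)
    {
        DynamicMethod method = new DynamicMethod("Read_" + type.Name, typeof(void),
            new Type[] { typeof(object), typeof(BinaryReader) },
            typeof(EmitReadSketch).Module, true);
        ILGenerator gen = method.GetILGenerator();

        // Header: cast the object argument once and keep it in local 0.
        gen.DeclareLocal(type);
        gen.Emit(OpCodes.Ldarg_0);
        gen.Emit(OpCodes.Castclass, type);
        gen.Emit(OpCodes.Stloc_0);

        foreach (string name in fieldNames)
        {
            FieldInfo field = type.GetField(name);
            if (field == null) throw new ArgumentException("No public field " + name);
            gen.Emit(OpCodes.Ldloc_0);   // target object
            gen.Emit(OpCodes.Ldarg_1);   // BinaryReader
            if (field.FieldType == typeof(int))
                gen.Emit(OpCodes.Callvirt,   // instance method on the reader
                    typeof(BinaryReader).GetMethod("ReadInt32", Type.EmptyTypes));
            else if (field.FieldType == typeof(DateTime))
                gen.Emit(OpCodes.Call,       // static helper, reader is its argument
                    typeof(EmitReadSketch).GetMethod("ReadDateTime"));
            else
                throw new NotSupportedException(field.FieldType.Name);
            gen.Emit(OpCodes.Stfld, field);  // store the result in the field
        }

        gen.Emit(OpCodes.Ret);
        return (ReadStreamDelegate)method.CreateDelegate(typeof(ReadStreamDelegate));
    }
}
```

The only difference between the two branches is the opcode and the target method: the primitive read is an instance call on the BinaryReader, while the DateTime read is a static call that receives the reader as its argument.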
The MSIL code for arrays and strings is slightly more complex. Helper methods to read arrays and strings are defined as taking both the BinaryReader and the length as parameters. The length to use is determined while generating the dynamic code, based on a custom attribute specified on the field declaration. Lengths can either be declared as a constant length or as a dynamic size specified in another field on the object. If it's a constant length, the value is simply pushed on the stack before invoking the helper method. Otherwise, the field's value is loaded onto the stack. The helper method is then invoked:
FieldInfo sizeField = GetLengthField(fieldInfo, out staticSize); 
if (sizeField == null) {
    gen.Emit(OpCodes.Ldc_I4, staticSize); 
} else {
    gen.Emit(OpCodes.Ldloc_0);
    gen.Emit(OpCodes.Ldfld, sizeField);
}
gen.EmitCall(OpCodes.Callvirt, method, null);
Of course, if you're using a field to define the length of another field, you need to ensure that the field storing the length is correct before the object is serialized. When writing arrays and strings which may have variable-length fields, we first generate the code to set the corresponding length field to the correct value. This is done by checking each field on a type for a custom "Serialize" attribute with the SizeField parameter set. As shown in Figure 4, when such a field is found, code is emitted to set the value of the SizeField to the length of the UTF8-encoded string or to the number of elements in the array, depending on the field's type. The code to write arrays and strings is similar to the code that reads them, except it takes as parameters a BinaryWriter instead of a BinaryReader along with the actual array or string.
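To illustrate the idea without the MSIL, the sketch below performs the same length fix-up using plain reflection at call time. It is not the emitted code from Figure 4: the shape of SerializeAttribute, the FriendsPacket type, and the SizeFieldUpdater class are assumptions made for this example.

```csharp
using System;
using System.Reflection;
using System.Text;

// Hypothetical stand-in for the custom "Serialize" attribute.
[AttributeUsage(AttributeTargets.Field)]
public sealed class SerializeAttribute : Attribute
{
    public string SizeField;  // name of the field that stores the length
}

// Hypothetical packet with two variable-length fields.
public class FriendsPacket
{
    public ushort gamertagLen;
    [Serialize(SizeField = "gamertagLen")] public string gamertag;
    public uint numFriends;
    [Serialize(SizeField = "numFriends")] public ulong[] friendIds;
}

public static class SizeFieldUpdater
{
    // Sets every size field to the UTF8 byte count of its string
    // or to the element count of its array.
    public static void Update(object target)
    {
        Type type = target.GetType();
        foreach (FieldInfo field in type.GetFields(
            BindingFlags.Public | BindingFlags.Instance))
        {
            object[] attrs = field.GetCustomAttributes(typeof(SerializeAttribute), false);
            if (attrs.Length == 0) continue;
            string sizeName = ((SerializeAttribute)attrs[0]).SizeField;
            if (sizeName == null) continue;

            FieldInfo sizeField = type.GetField(sizeName);
            if (sizeField == null)
                throw new InvalidOperationException("No size field named " + sizeName);

            object value = field.GetValue(target);
            int length = 0;  // a null string or array is treated as empty
            if (value is string)
                length = Encoding.UTF8.GetByteCount((string)value);
            else if (value is Array)
                length = ((Array)value).Length;

            // Convert to the size field's own integer type before storing.
            sizeField.SetValue(target, Convert.ChangeType(length, sizeField.FieldType));
        }
    }
}
```

Counting UTF8 bytes rather than characters is what keeps the length correct for international strings.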
When reading fields that inherit from Serializer (since Serializer-derived classes can have fields whose types are also derived from Serializer), the MSIL code first creates an instance of the type and then calls ReadFromStream on the new object:
ConstructorInfo ctor =   
    fieldInfo.FieldType.GetConstructor(Type.EmptyTypes);
readMethod = fieldInfo.FieldType.GetMethod("ReadFromStream");
gen.Emit(OpCodes.Ldloc_0);
gen.Emit(OpCodes.Newobj, ctor); 
gen.Emit(OpCodes.Stfld, fieldInfo);
gen.Emit(OpCodes.Ldloc_0);
gen.Emit(OpCodes.Ldfld, fieldInfo);
gen.Emit(OpCodes.Ldarg_1);
gen.Emit(OpCodes.Callvirt, readMethod);
Writing is even simpler (since no new object needs to be created) and we simply invoke the subobject's WriteToStream method. Reading and writing arrays of Serializer-derived objects involve the same process as for a single object, repeated for each element that is found in the array.
This approach for marshaling objects was chosen for the Xbox Live service primarily for its flexibility in handling a large variety of data types and structures. It's fairly easy to add support for new types, and it helps to avoid common bugs such as specifying the incorrect length for an international string. Furthermore, the custom attributes allow us to provide other code-enforcement checks, such as ensuring that variable-length strings can be no longer than a specific length. This provides another means of defense against common attacks such as buffer overruns.
Interestingly, dynamically generated assemblies are also relatively fast. Unscientific tests show that reading and writing structures that have six to eight fields with the dynamically generated methods is almost twice as fast as marshaling using the System.Runtime.InteropServices.Marshal methods. Custom code, however, could be an even faster route than the dynamically generated methods. Thus, the ReadFromStream and WriteToStream methods are declared as virtual; if speed is of the utmost importance, these can always be overridden with custom code. The dynamically generated code, however, has been more than fast enough for the vast majority of the needs of the Live service.
Note that the approach I've outlined here is similar to how XmlSerializer works, although rather than requiring all serializable types to derive from a base class, XmlSerializer takes the Type of the object to be serialized as a constructor parameter. The XmlSerializer's instance methods for serializing and deserializing then delegate to the dynamically created read and write methods for that Type. For more information on the code generated by XmlSerializer, see the last question in the April 2004 .NET Matters column.

Adi Oltean, Core File Services Team
One of the questions most frequently asked by developers is how to write reusable code. There are many answers to this question. Still, writing a reusable set of classes or templates can be quite difficult. If you write a generic "data grid" class, for example, you can't possibly know in advance all of the scenarios in which your class will be used. If your class will be used by thousands of developers, your implementation will invariably not satisfy all of them. Some people will need a faster version of your class, or a thread-safe version, or will require any of the plethora of other features that your implementation does not provide.
Generally, the problem with flexibility resides in the contract between the class/template being used and the code that is using this module. Sometimes this contract is too restrictive because this is how it was designed in the first place. Sometimes, this contract does not address certain usage scenarios, and developers will have a hard time using your class.
One solution to this new challenge is to design reusable code in a more flexible and configurable way. Instead of designing a C++ class, you could instead create a template. This template would accept a number of parameters that can be used to configure the piece of code that is being reused. A standard example of this kind of flexibility is the memory allocation scheme in STL—there, many classes can be configured to use a custom memory allocator (though a default allocator is provided).
C++ templates represent a simple way to design configurable, reusable code. Still, in some scenarios, templates might offer limited configurability. For example, a template has a fixed number of parameters, which can be very restrictive in certain situations.
The .NET Framework adds a new dimension to the usual array of techniques for creating reusable code. The fact that you can attach attributes to methods and classes gives you immense power. While certainly not a parallel, in some sense .NET attributes can be viewed as a more powerful equivalent of C++ template arguments. For example, you could "configure" a .NET class with a custom attribute in the same way you configure a template by specifying a certain argument. You could write a C# class that would look like the following:
[Key(typeof(String)), Value(typeof(String))]
class MyHashTable {
    •••
}
This resembles the following C++ template:
template<class Key, class Value>
class MyHashTableTemplate {
    •••
};
typedef MyHashTableTemplate<string, string> MyHashTable;
Of course, there are many significant differences between these C# and C++ examples, but in both cases the hashtable implementation is able to vary its behavior based on the attributes/template parameters that are supplied.
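As a rough sketch of how a class might discover such a configuration at run time, the code below defines the two attributes and reads them back through reflection. The attribute definitions, the ConfiguredTable class, and the AttributeConfig helper are all hypothetical; the original does not show how Key and Value are declared.

```csharp
using System;

// Hypothetical attribute carrying the key type.
[AttributeUsage(AttributeTargets.Class)]
public sealed class KeyAttribute : Attribute
{
    private readonly Type _keyType;
    public KeyAttribute(Type keyType) { _keyType = keyType; }
    public Type KeyType { get { return _keyType; } }
}

// Hypothetical attribute carrying the value type.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ValueAttribute : Attribute
{
    private readonly Type _valueType;
    public ValueAttribute(Type valueType) { _valueType = valueType; }
    public Type ValueType { get { return _valueType; } }
}

[Key(typeof(string)), Value(typeof(int))]
public class ConfiguredTable
{
}

public static class AttributeConfig
{
    // Returns the configured key type, or null if the class is not tagged.
    public static Type GetKeyType(Type tableType)
    {
        KeyAttribute attr = (KeyAttribute)Attribute.GetCustomAttribute(
            tableType, typeof(KeyAttribute));
        return attr == null ? null : attr.KeyType;
    }

    // Returns the configured value type, or null if the class is not tagged.
    public static Type GetValueType(Type tableType)
    {
        ValueAttribute attr = (ValueAttribute)Attribute.GetCustomAttribute(
            tableType, typeof(ValueAttribute));
        return attr == null ? null : attr.ValueType;
    }
}
```

Unlike a template argument, this lookup happens when the code runs, so a type mismatch shows up at run time rather than at compile time.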
Let's continue this comparison between .NET attributes and C++ templates. The first thing to note is that you can specify a variable number of custom .NET attributes, as opposed to a fixed number of arguments in a template. On the other hand, there might be a performance penalty when you choose the reflection route: C++ templates are expanded at compile time, whereas .NET attributes need to be reflected over at run time. That said, custom .NET attributes are a great extensibility mechanism.
As an example, my team has designed and created a reusable framework for command-line parsing. If you have ever written a command-line application that takes a lot of parameters, you undoubtedly had to write a lot of embedded switch statements with complex parsing rules (depending on whether the parameters are optional, repeating, and so on). Usually you end up with spaghetti code. We developed a simple, straightforward, reusable codebase that parses the command line according to simple grammar-based rules, then invokes the appropriate methods in our classes.
When designing the system, we decided the command-line tool should be able to deal with multiple tasks, depending on the specified command-line options. For every separate task, we would like to invoke a certain callback, and we can establish the link between callbacks and tasks using custom attributes. Here is an example:
// Binding the ListFiles routine to the "/l" option
[Pattern("/l <directoryName>")]
[Description("Lists the directory contents.")]
public int ListFiles([DirectoryPath]string directoryName) {
    Console.WriteLine("Listing files under {0}...", directoryName);
    foreach(FileInfo fi in new DirectoryInfo(directoryName).GetFiles())
    {
        Console.WriteLine(fi.FullName);
    }
    return 0;
}
In this code, I've bound the method ListFiles with the command-line option "/l <directoryName>". When the arguments match the pattern, the infrastructure will extract the parameter "directoryName" and then invoke the method, supplying the parameter parsed from the command line to the method.
A simplified version of our parsing and dispatch routine is shown in Figure 5. We use the Regex class to match the command-line parameters against the bound pattern. If we find a match, we enumerate the method parameters and, for each parameter name, we extract the value of the corresponding Regex named group. With these values we build a parameter array which will be used to invoke the matching method. The GetAttribute routine supplies the magic ingredient to get the value of our custom attributes in the first place. GetAttribute extracts the attribute value in string format using the standard .NET GetCustomAttributes API. The attributes used by our infrastructure (namely PatternAttribute and DescriptionAttribute) are derived from an attribute class GenericTextAttribute. This attribute defines a Text property that stores the value of our attribute:
public class GenericTextAttribute : Attribute {
    public GenericTextAttribute(string textParam) {
        _text = textParam;
    }
    public string Text { get { return _text; } }
    private string _text;
}
As a bonus, the command-line framework implements a usage method that takes advantage of reflection to dynamically create a usage message for the user. This method will be called automatically whenever we invoke our application with the command-line option "/?" and will simply iterate through all methods that are bound to a PatternAttribute attribute, writing the description and the pattern to the console.
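The dispatch and usage routines described above can be sketched together as follows. This is not the code from Figure 5: the pattern-to-regex translation, the CommandLineSketch and SampleCommands classes, and the restriction to string parameters separated by single spaces are all assumptions made to keep the example short.

```csharp
using System;
using System.Reflection;
using System.Text;
using System.Text.RegularExpressions;

public class GenericTextAttribute : Attribute
{
    private readonly string _text;
    public GenericTextAttribute(string textParam) { _text = textParam; }
    public string Text { get { return _text; } }
}

[AttributeUsage(AttributeTargets.Method)]
public sealed class PatternAttribute : GenericTextAttribute
{
    public PatternAttribute(string text) : base(text) { }
}

[AttributeUsage(AttributeTargets.Method)]
public sealed class DescriptionAttribute : GenericTextAttribute
{
    public DescriptionAttribute(string text) : base(text) { }
}

public static class CommandLineSketch
{
    // Returns the attribute's text, or null if the method is not tagged.
    private static string GetAttribute(MethodInfo method, Type attributeType)
    {
        object[] attrs = method.GetCustomAttributes(attributeType, false);
        return attrs.Length == 0 ? null : ((GenericTextAttribute)attrs[0]).Text;
    }

    // Turns "/l <directoryName>" into a regex with one named group per placeholder.
    public static Regex ToRegex(string pattern)
    {
        string escaped = Regex.Escape(pattern);
        string body = Regex.Replace(escaped, @"<(\w+)>", @"(?<${1}>\S+)");
        return new Regex("^" + body + "$");
    }

    // Finds the first method whose pattern matches and invokes it.
    public static object Dispatch(object target, string[] args)
    {
        string commandLine = string.Join(" ", args);
        foreach (MethodInfo method in target.GetType().GetMethods(
            BindingFlags.Public | BindingFlags.Instance))
        {
            string pattern = GetAttribute(method, typeof(PatternAttribute));
            if (pattern == null) continue;
            Match match = ToRegex(pattern).Match(commandLine);
            if (!match.Success) continue;

            // Each parameter takes the value of the named group with the same name.
            ParameterInfo[] parameters = method.GetParameters();
            object[] values = new object[parameters.Length];
            for (int i = 0; i < parameters.Length; i++)
                values[i] = match.Groups[parameters[i].Name].Value;
            return method.Invoke(target, values);
        }
        throw new ArgumentException("No command matches: " + commandLine);
    }

    // Builds one usage line per bound method: pattern, then description.
    public static string Usage(Type commandType)
    {
        StringBuilder sb = new StringBuilder();
        foreach (MethodInfo method in commandType.GetMethods(
            BindingFlags.Public | BindingFlags.Instance))
        {
            string pattern = GetAttribute(method, typeof(PatternAttribute));
            if (pattern == null) continue;
            sb.Append(pattern).Append("  ");
            sb.AppendLine(GetAttribute(method, typeof(DescriptionAttribute)));
        }
        return sb.ToString();
    }
}

// Hypothetical command class used to exercise the sketch.
public class SampleCommands
{
    [Pattern("/l <directoryName>")]
    [Description("Lists the directory contents.")]
    public string ListFiles(string directoryName)
    {
        return "list " + directoryName;
    }
}
```

Because both routines read the same attributes, adding a command means writing one tagged method, and the usage text picks it up without further changes.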
As you have seen, .NET offers many powerful ways to write reusable code. Not only do you have all the traditional techniques like inheritance and components (and generics in the .NET Framework 2.0), but reflection can offer simple and elegant solutions where classic techniques fail to meet these requirements.

Send your questions and comments to  netqa@microsoft.com.

