From the July 2001 issue of MSDN Magazine.

Visual Studio .NET: Managed Extensions Bring .NET CLR Support to C++

Chris Sells
This article assumes you're familiar with C++, COM, and .NET
SUMMARY: If you're a longtime C++ programmer, the thought of migrating to Visual Studio .NET might make you wary at first. However, a new feature of Visual C++, the managed extensions for C++, allows you to build apps with the .NET Framework. When you use managed C++, your code is handled by the common language runtime (CLR). This provides advanced features like garbage collection, interoperability, and more. This article explains why you'd want to use the managed extensions, how to use them, how to mix managed and unmanaged code, and how your managed code can interoperate with programs written in other languages.
The C++ language has been around a long time. It was first developed by Bjarne Stroustrup in 1983 and was subsequently ratified as an ANSI standard in 1997. C++ evolved a great deal during those 14 years to meet the needs of a multi-platform programmer community. However, even before the C++ standard was ratified, Microsoft began extending Visual C++® to meet the specific needs of Windows®-centric programmers, adding extensions to each new version of the compiler.
      Now with the introduction of Microsoft .NET, the C++ compiler team at Microsoft has again allowed C++ programmers to use their language of choice to build and use components for the new platform. Be warned, however, that this is not your father's C++. Things are going to be different.
      (Also be warned that this article is based on the publicly available Beta 1 of the .NET Framework SDK and the Visual Studio .NET tools. Things are guaranteed to change between now and the release of .NET, although the concepts should all be the same.)

.NET Distilled

      As a developer for Windows, you can't help but know about the new .NET platform. But as a refresher, the major features are:
  • Simplified language interoperability with rich type support, including support for cross-language inheritance, typed exceptions, constructors, and shared base types.
  • Garbage collection, providing optimized, automatic memory management.
  • Robust versioning, allowing multiple versions of a component to coexist peacefully on a single machine or in a single process.
  • Microsoft intermediate language (MSIL), which allows for code verification and retargeting.
      These features are implemented by the .NET common language runtime (CLR), which provides these services to individual components. (The CLR is a DLL that executes .NET components.) The components themselves contain the metadata and implementation, itself a mixture of processor-specific code and IL. The metadata provides support for language interoperability, garbage collection, versioning, and, along with the IL, verifiability.
      C# was built from the ground up to support all of the major features of .NET, so it's no surprise that it handles them naturally. For example, a simple C# class looks like this:
  // talker.cs
namespace MsdnMagSamples {
  public class Talker {
    public string Something = "something";
    public void SaySomething() {
        System.Console.WriteLine(Something);
    }
  }
}

This class (in a file called talker.cs) can be compiled like so:
  c:\> csc /t:library /out:talker.dll talker.cs

      You'll notice that C# is very similar to C++, except for one very important difference: once I've compiled the Talker class into a .NET assembly (roughly a DLL or an EXE that exposes .NET components), all that's required to expose this component to all interested .NET clients, regardless of their implementation language, is that leading keyword: public. No specific entry points are required. No mappings between .NET types and C# types are needed. The compiled component provides all the metadata that the .NET runtime needs to expose the Talker class. A Visual Basic .NET client, for example, can use the metadata to access the Talker class like this:
  'talkercli.vb 
Public Module TalkerClient
  Sub Main()
    Dim t as new MsdnMagSamples.Talker
    t.Something = "Hello, World"
    t.SaySomething()
  
    System.Console.WriteLine("Goodnight, Moon")
  End Sub
End Module

This talkercli.vb file can then be compiled with the following command line:
  c:\> vbc /t:exe /out:talkercli.exe /r:talker.dll talkercli.vb

      In contrast, a simple port of the Talker C# class to a C++ DLL will not even allow other C++ clients to access it (without some compiler tricks and assumptions), let alone clients of other languages. Unfortunately, C++ programmers are familiar with this particular limitation. Arguably, the entire history of programming for Windows can be chronicled as an effort to expose components in one language to clients of another. DLLs use C-style functions. COM uses interfaces. As a C++ programmer, implementing DLL entry points wasn't pleasant because it didn't feel object-oriented enough. On the other hand, COM made heavy use of object-orientation, but because no types were standardized except the interface, C++ programmers incurred large code overhead to expose even the simplest functionality to COM clients.
      As with DLLs and COM, Microsoft has continued to support multiple languages by adding support for .NET to their own languages and encouraging other language vendors to do the same. You should expect to find your old Microsoft favorites—like Visual Basic®, C++, and JScript®, as well as something like two dozen third-party and research languages—supporting .NET. In fact, I sat next to a guy doing an APL-to-.NET port at the last PDC. If .NET is going to support APL, you can be pretty sure your language of choice is going to be there.

Managed C++ Clients

      Of course, my language of choice is C++ and Microsoft supports C++ in .NET with something called the managed extensions to C++, more commonly known as managed C++. In managed C++, the code and components generated by using the managed extensions are handled by the CLR; that is, they're garbage collected, allowed access to other managed types, versioned, and so on. For example, a simple managed C++ program can access the C# Talker class, as you can see in Figure 1. The client is compiled like so:
  C:\> cl /CLR talkcli.cpp /link /subsystem:console

      This client sample brings up several interesting points about managed C++. First, notice the use of the new #using directive. (See Figure 2 for a full list of new managed C++ keywords and directives.) The first #using directive tells the managed C++ compiler to pull in the metadata that describes all of the core .NET types. These types are included in the top-level System namespace, which contains a huge number of nested classes and namespaces that separate and classify the .NET Framework. Namespaces in managed C++ work just like they do in C++, so you won't be surprised to learn that System::Console::WriteLine is calling the static WriteLine method on the Console class nested in the System namespace. The second #using directive is pulling in my custom C# component. Here I'm using a "using namespace" statement to save some typing when I access the Talker class, just like you're used to doing in unmanaged C++.
      Second, you'll notice that I am using main as my entry point. Managed C++ programs are still C++ programs, so console apps need a main. Likewise, once you've brought in the Talker class via #using, it's just normal C++ to create an instance via new, set the property, and invoke the method. The only difference is the use of the leading "S" on the string literals to designate them as managed, instead of unmanaged. The managed C++ compiler is perfectly happy to deal with unmanaged string literals, too, and will convert both ANSI and Unicode for you, but it's a little less efficient to do the conversion, so I'm avoiding it. If you're interested, you can use the .NET disassembly tool, ILDASM, on talkcli.exe to see the difference in the generated IL when you use the S prefix and when you don't. In general, ILDASM is a fantastic tool for .NET programming that you should become familiar with.
      Third, to enable the managed extensions and to make your code managed, use the new /CLR option on the compiler. If you'd like to do the same thing in Visual Studio .NET, you can set the Use Managed Extensions option in the project's Property pages.
      In general, pulling .NET types into managed C++ looks and feels much more like using native C++ types than COM and ATL ever did. The real major difference is the one I haven't mentioned yet: what happens when you use new to create a managed type? And why didn't I bother to call delete (you thought I forgot)?

Garbage Collection

      A few years ago, I was avoiding work of some kind or another, so I tried an experiment. I wrote a raw COM client that created a COM object, accessed some interfaces, called some methods, and did something with the results. Of course, along the way I had to explicitly release the resources I'd acquired, such as interface pointers, BSTRs, and so on. To complete the experiment, I ported my code to use smart types—C++ classes that knew how to manage their own resources—so that I didn't have to release my resources manually. By using smart types like CComPtr and CComBSTR, I was able to reduce the number of lines of code in my COM client by something like 40 percent. Of course, you can draw any line you like through a single data point, but I think you would agree that resource management code in C++ is a significant percentage of the code you write.
      Some lucky programmers don't have to deal with this kind of thing. Some languages such as COM-based scripting languages and versions of Visual Basic before .NET use reference counting to manage objects. Each additional reference is another lock on the object and when all references are gone, the object is destroyed. Unfortunately, reference counting has a major problem—reference cycles. Having just shipped a fairly large COM-based project, I can tell you that reference cycle bugs are a very real problem and enormously difficult to track down. Frankly, most folks don't even bother to find these bugs, let alone fix them (itself a difficult job). So, instead of going the reference counting route in .NET, Microsoft went another route: garbage collection.
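      To make the reference cycle problem concrete, here's a minimal sketch in standard (unmanaged) C++. It assumes std::shared_ptr as a stand-in for COM-style reference counting; the Node type and both function names are hypothetical and exist only for this illustration. Two objects that hold counted references to each other are never destroyed, while swapping one link for an uncounted std::weak_ptr lets both go away.

```cpp
#include <memory>

// Hypothetical node type, used only for this illustration.
struct Node {
    std::shared_ptr<Node> strong_peer; // counted reference to the other node
    std::weak_ptr<Node> weak_peer;     // uncounted reference to the other node
    bool* destroyed;                   // set to true when the destructor runs
    explicit Node(bool* flag) : destroyed(flag) {}
    ~Node() { *destroyed = true; }
};

// Two nodes hold counted references to each other.
// Returns whether the first node was destroyed when the local references went away.
bool cycle_is_reclaimed() {
    bool a_gone = false, b_gone = false;
    Node* stranded = nullptr;
    {
        auto a = std::make_shared<Node>(&a_gone);
        auto b = std::make_shared<Node>(&b_gone);
        a->strong_peer = b;
        b->strong_peer = a; // cycle: neither count can reach zero on its own
        stranded = a.get();
    }
    bool reclaimed = a_gone;       // observed before any manual cleanup
    stranded->strong_peer.reset(); // break the cycle by hand so the sketch itself does not leak
    return reclaimed;
}

// Same shape, but one direction uses an uncounted weak reference.
// Returns whether both nodes were destroyed at the end of the scope.
bool weak_link_is_reclaimed() {
    bool a_gone = false, b_gone = false;
    {
        auto a = std::make_shared<Node>(&a_gone);
        auto b = std::make_shared<Node>(&b_gone);
        a->strong_peer = b;
        b->weak_peer = a; // does not keep a alive
    }
    return a_gone && b_gone;
}
```

      In this sketch, cycle_is_reclaimed() returns false and weak_link_is_reclaimed() returns true. The catch, of course, is that someone has to know where the cycle is before they can weaken a link, which is exactly the kind of bug hunt I described above.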
      The .NET CLR uses a garbage collector to periodically walk the list of all objects that have been created, letting them know if they're no longer required by calling their Finalize method, and returning the memory to the managed heap from whence it came. When I used the new keyword in the Talker client, I was allocating memory from the .NET managed heap, which means I never have to remember to call delete. And, because objects are no longer tracked with reference counts, you don't have to worry about reference cycles. With .NET and garbage collection, Microsoft has taken memory leaks and reference cycles off the list of potential bugs in your components.

Resource Management

      There is a problem when the garbage collector does its magic, however. C++ programmers who use smart types or stack-based objects are used to objects going away at scope boundaries. This will no longer happen for managed types.
      If the resources associated with an object are only memory-based, this is no problem. The garbage collector will certainly spring into action when it runs out of memory, and often long before that. However, if your objects hold resources that are not memory-based, such as file handles, database connections, socket connections, and so on, those resources cannot be reclaimed automatically with any determinism. In other words, even though the garbage collector will probably call your managed object's Finalize method at some time in the future, you cannot know when that will happen (unless you force it using System::GC::Collect, which causes all objects to be collected, not just the one you're working with). If the resource being held is one you'd like to use again soon, you can't wait for the garbage collector to run. Instead, the client of the object holding the contentious resource must explicitly tell the object to release it.
      Because of this, most managed types that hold non-memory resources implement something called the Disposer Pattern. The Disposer Pattern requires a client of a managed type to call a specific method, usually called Close or Dispose, when the client is done. To support this pattern in the face of exceptions, managed C++ provides an extension to the try-catch block called __finally, which is guaranteed to be called whether or not there's an exception.
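  void WriteToLog(const char* psz) {
  MyManagedFile* file = 0; // start out null so __finally can test it safely
  try {
    file = new MyManagedFile("log.txt");
    file->WriteLine(psz);
  }
  __finally {
    // Runs whether or not an exception was thrown
    if( file ) file->Close();
  }
}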

      If you're a C++ programmer, I'm guessing your first reaction to this is "ick." Your second reaction is probably to ask, "Why not create the object on the stack and let the destructor call Close?" Unfortunately, managed classes can't be created on the stack. They have to be created on the managed heap so that the garbage collector can manage them. .NET does have something called value types that can be allocated on the stack but, unfortunately, they are not allowed to have destructors. In fact, managed types don't really have destructors (the method that's called when the object goes out of scope) at all in the C++ sense. They only have a Finalize method, which is called by the garbage collector whenever it feels like it (which is what led to this problem in the first place).
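      For comparison, here's a minimal sketch of the scope-based cleanup that unmanaged C++ programmers take for granted. It's written in standard C++, not managed C++, and the ScopedFile class and write_to_log function are hypothetical stand-ins that record events in a vector instead of touching a real file. The point is only that the destructor runs at the scope boundary, even when an exception is on its way out.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a resource holder; it records events
// instead of opening a real file.
class ScopedFile {
public:
    explicit ScopedFile(std::vector<std::string>& events) : events_(events) {
        events_.push_back("open");
    }
    // The destructor plays the role of Close and runs at scope exit.
    ~ScopedFile() { events_.push_back("close"); }
    ScopedFile(const ScopedFile&) = delete;
    ScopedFile& operator=(const ScopedFile&) = delete;

    void WriteLine(const std::string& text) {
        if (text.empty()) throw std::invalid_argument("empty line");
        events_.push_back("write");
    }

private:
    std::vector<std::string>& events_;
};

// Returns the sequence of events so the ordering can be inspected.
std::vector<std::string> write_to_log(const std::string& text) {
    std::vector<std::string> events;
    try {
        ScopedFile file(events); // lives on the stack
        file.WriteLine(text);
    } catch (const std::invalid_argument&) {
        // By the time the handler runs, the stack has been unwound
        // and the destructor has already recorded "close".
        events.push_back("caught");
    }
    return events;
}
```

      Calling write_to_log("hello") records open, write, close, while write_to_log("") records open, close, caught. No __finally block is needed because the compiler inserts the cleanup call for you. That's the convenience managed reference types give up.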
      If your third reaction to this state of affairs is to throw your hands up and go back to unmanaged types, before you go, thank your lucky stars that you picked managed C++ over C# or Visual Basic .NET. Neither of the latter two languages can get away from writing code hampered by the drawbacks I've described. However, the facilities of managed C++ allow you to mix managed and unmanaged types in the same file. You can build an unmanaged smart type that can Finalize managed types for you. In fact, you're going to use something Microsoft provides in the .NET SDK called gcroot (defined in gcroot.h):
  template <class T> struct gcroot {...};

      The gcroot class is an unmanaged template class for caching a pointer to a managed type. (.NET doesn't support managed templates yet.) This is especially useful when you'd like to cache managed types as member data in unmanaged types, which managed C++ doesn't support directly. The gcroot class uses a managed class called System::Runtime::InteropServices::GCHandle to massage the managed pointer back and forth between a managed pointer and an integer, which is how gcroot can cache it. The gcroot class also provides an operator-> to expose the managed pointer. However, its destructor does not call Finalize to let the object know you're done. You can do that with a class derived from gcroot, like this one:
  template <typename T>
struct final_ptr : gcroot<T> {
  final_ptr() {}
  final_ptr(T p) : gcroot<T>(p) {}
  ~final_ptr() {
    T p = operator->();
    if( p ) p->Finalize();
  }
};

This class is just a prototype that's probably going to need some work as the .NET platform winds its way towards release. Check the MSDN® Online developer center (https://msdn.microsoft.com/net) for updates.
      Using this class reduces the client code to the following:
  void WriteToLog(const char* psz) {
  final_ptr<MyManagedFile*> file = new MyManagedFile("log.txt");
  file->WriteLine(psz);
}

      The ability to write and use this helper class shows the power of managed C++ when compared to simpler languages such as C# and Visual Basic .NET, which require more complicated code to handle non-memory resources.

Managed Classes and Interfaces

      When you use the managed extensions to C++ to compile, by default you're going to get managed code (which gives you access to managed types), but unmanaged types. If you'd like your classes to be managed, you need to use the new managed C++ keyword: __gc. Once you've done that, if you'd like to make your class available to the outside world, you can do so using the public keyword. Figure 3 shows an example of implementing your .NET Talker class in managed C++. This class can be compiled like so:
  cl /LD /CLR talker.cpp /o talker.dll

      Three things are interesting enough to point out here. The first is that I gave the managed class a destructor, or at least it looks like I did. Remember that I said that .NET types don't really have a destructor, only an optional Finalize method? Well, because C++ programmers are so used to the notation, the managed C++ team decided to map managed C++ destructors to an implementation of Finalize, plus a call to the base's Finalize method. The C# team did this, too, but remember that in neither case is this really a destructor in the traditional C++ sense.
      The second interesting thing is that I'm exposing public data members directly to .NET clients. Data members are called fields in .NET, which implies that they are data with no code. And, just like public data members are a bad idea for C++ classes, fields are a bad idea for .NET classes because they don't allow values to be calculated, validated, or to be read-only. In C++ you use getter and setter functions to expose data members. In .NET, you want to do the same via properties. A property is a function that exposes data, but in a way that allows the component to add code. A property is designated in managed C++ via the __property keyword and a leading get_ or set_ prefix, like so:
  public __gc class Talker {
private:
  String* m_something;
public:
  __property String* get_Something() {
    return m_something;
  }
  __property void set_Something(String* something) {
    m_something = something;
  }
•••
};

      If you want to calculate output or validate input, you can do so in the get_ or set_ function, respectively. Likewise, if you want to make the property read-only or write-only, you simply remove the set_ or get_ function, respectively. On the client, the notation is the same for both fields and properties:
  t->Something = "Greetings Planet Earth";
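      To show what "adding code" to a property buys you, here's a minimal sketch in standard C++ rather than managed C++, so there's no __property keyword and the client has to call the functions by name. The validation rule (rejecting an empty string) and the get_Length member are assumptions I've made up for the example; they aren't part of the Talker class in Figure 3.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical unmanaged analogue of the Talker class.
class Talker {
public:
    // Getter: exposes the data without exposing the member itself.
    const std::string& get_Something() const { return something_; }

    // Setter: validates input before accepting it (assumed rule).
    void set_Something(std::string value) {
        if (value.empty()) {
            throw std::invalid_argument("Something must not be empty");
        }
        something_ = std::move(value);
    }

    // Read-only, calculated value: a getter with no matching setter.
    std::size_t get_Length() const { return something_.size(); }

private:
    std::string something_ = "something";
};
```

      A client that calls set_Something("") gets an exception and the old value is left in place, which is something a public field can't do for you.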

      Be careful about switching access to data between fields and properties, however. It might seem that it would be easy to start with a field and then switch to a property as your design warrants. Unfortunately, the underlying IL is different for field access and property access, so clients that were bound to your field will raise a runtime exception if that field changes to a property. If you make the change, your clients must be recompiled.
      Take another look at the managed C++ Talker class in Figure 3 and notice that it is being directly exposed to all .NET clients. That's a trick that COM can't do. COM was only allowed to expose functionality via interfaces. A public method on a C++ class that implemented a COM interface wasn't available unless that method was part of the interface. .NET takes away the need for separating exposed functionality into interfaces. However, interfaces are still important for exposing generic functionality. To define a .NET interface in managed C++, you use the __interface keyword along with the __gc keyword, like so:
  public __gc __interface ICanTalk {
  void Talk();
};
public __gc class Talker : public ICanTalk {
•••
// ICanTalk
  void Talk() { SaySomething(); }
};

      Since the client has access to the Talker class, it can call ICanTalk methods along with the other public methods. Or, if the client has a reference to a base type (all managed types derive ultimately from System::Object), it can cast to the type it needs. A managed C++ client can cast via dynamic_cast, which has been updated to support .NET type navigation, or a new cast called __try_cast, which will throw an exception if it fails:
  void MakeTalk(Object* obj) {
  try {
    ICanTalk* canTalk = __try_cast<ICanTalk*>(obj);
    canTalk->Talk();
  }
  catch( InvalidCastException* ) {
    Console::WriteLine(S"Can't talk right now...");
  }
}
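      The choice between a cast you test and a cast that throws has a rough counterpart in standard C++, which I'll sketch here as an analogy only. A dynamic_cast to a pointer type yields a null pointer on failure, while a dynamic_cast to a reference type throws std::bad_cast. The Object, ICanTalk, Talker, and Mute types below are hypothetical unmanaged stand-ins, not the .NET types.

```cpp
#include <string>
#include <typeinfo>

// Hypothetical unmanaged stand-ins for the managed types in the article.
struct Object { virtual ~Object() = default; };
struct ICanTalk {
    virtual ~ICanTalk() = default;
    virtual std::string Talk() = 0;
};
struct Talker : Object, ICanTalk {
    std::string Talk() override { return "something"; }
};
struct Mute : Object {}; // does not implement ICanTalk

// Checked style: the pointer cast yields a null pointer on failure.
std::string talk_checked(Object* obj) {
    ICanTalk* canTalk = dynamic_cast<ICanTalk*>(obj);
    return canTalk ? canTalk->Talk() : "can't talk";
}

// Throwing style: the reference cast throws std::bad_cast on failure.
std::string talk_throwing(Object& obj) {
    try {
        return dynamic_cast<ICanTalk&>(obj).Talk();
    } catch (const std::bad_cast&) {
        return "can't talk";
    }
}
```

      Passing a Talker to either function returns "something", and passing a Mute returns "can't talk". Which style you pick comes down to whether a failed cast is an expected case or a genuine error.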

Mixing Managed and Unmanaged Code

      When you set the /CLR option on the files in your project, you're going to get managed code, which will give you access to managed types. If you'd like a section of your code to remain unmanaged, you may do so using a new #pragma statement:
  // mixed.cpp
...managed code by default...
#pragma unmanaged
...unmanaged code...
#pragma managed
...managed code...

      This #pragma allows you to mix managed and unmanaged code in a single module. While I'll leave the ethical ramifications of this to you to determine, having unmanaged code in your module is no different than pulling in unmanaged libraries or DLLs, which is something that you're bound to want to do for quite a while (even programmers who use Visual Basic still call DLL functions). As soon as you want to call unmanaged code from managed code, you've got to be careful if you're trying to pass pointers to managed types.
      For example, imagine you wanted to call VarI4FromI2, passing in the pointer to a long from the managed heap, like so:
  HRESULT __stdcall VarI4FromI2(short sIn, long* plOut);
__gc struct ShortLong {
  short n;
  long  l;
};
void main() {
  ShortLong*  sl = new ShortLong;
  sl->n = 10;
  VarI4FromI2(sl->n, &sl->l); // Compile-time error
}

Luckily, the compiler will prevent such behavior, because as soon as you're allowed to pass a managed pointer to unmanaged code, the garbage collector loses track of it and the next time it runs, it could easily move the object to which the pointer points.
      To guard against this happening, you must explicitly pin the object inside of a scope such that the garbage collector knows not to move the object. This can be done using the __pin keyword:
  void main() {
  ShortLong*  sl = new ShortLong;
  sl->n = 10;
  long __pin* pn = &sl->l;
  VarI4FromI2(sl->n, pn);
}

Once the pinned variable goes out of scope, the lock is removed on the managed memory and the garbage collector is free to move it around again at will.

Value Types

      So far, I've been talking about defining and using .NET reference types. A reference type is something that's allocated from the managed heap and destroyed by the garbage collector. A .NET value type, on the other hand, is a type that is allocated on the stack (unless it's a member of a reference type) and is destroyed when the stack is unwound. Value types are meant to be used for very simple composite types without the overhead of being managed by the garbage collector. For example, a typical value type would be declared in managed C++ using the __value keyword:
  __value struct Point {
  Point(long _x, long _y) : x(_x), y(_y) {}
  long x;
  long y;
};

      Notice that my value Point type has a constructor. One other constructor that all value types have is the default constructor that sets all members to zero. For example, you can allocate this value type in two ways:
  Point pt1;       // (x, y) == (0, 0)
Point pt2(1, 2); // (x, y) == (1, 2)

      What's particularly interesting about value types is that they can be treated as a reference type as well. This is useful when you'd like to pass a value type to a method that takes a reference type argument, for example, to add it to a collection. To output my Point's x and y values using WriteLine, I might think to do the following:
  Console::WriteLine(S"({0}, {1})", pt.x, pt.y);

      Unfortunately, that doesn't compile, because WriteLine expects a format string and a list of references to objects of type System.Object. (WriteLine uses the base method ToString to request an Object's string format for printing.) However, you can convert value types to reference types by boxing them. The act of boxing a value type allocates the corresponding reference type on the managed heap and copies the values into the new memory. To box a value type under managed C++, use the __box operator:
  Console::WriteLine(S"({0}, {1})", __box(pt.x), __box(pt.y));

      Value types are a way to build simple, efficient types, while boxing allows you to gain the polymorphic benefits of reference types on demand.
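      If it helps to picture what boxing does, here's a minimal sketch in standard C++. It's an analogy under my own assumptions, not how the CLR implements boxing: the Object base class, the Boxed template, and the box function are all hypothetical. What it shows is the part that matters to you as a programmer, which is that the box holds a heap copy, so later changes to the original value don't affect it.

```cpp
#include <memory>
#include <string>

// Hypothetical polymorphic base, standing in for System::Object.
struct Object {
    virtual ~Object() = default;
    virtual std::string ToString() const = 0;
};

// Hypothetical box: a heap object that holds a copy of a value.
template <typename T>
struct Boxed : Object {
    T value; // copied in at boxing time
    explicit Boxed(const T& v) : value(v) {}
    std::string ToString() const override { return std::to_string(value); }
};

// Allocates a box on the heap and copies the value into it.
template <typename T>
std::unique_ptr<Object> box(const T& v) {
    return std::make_unique<Boxed<T>>(v);
}
```

      For example, if you box a long holding 1 and then change the original to 5, calling ToString on the box still gives "1".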

Attributes

      If C++ was built on classes and COM was built on interfaces, the core building block of .NET has got to be metadata. The various managed C++ language features I've shown all depend on metadata in some way. However, managed C++ doesn't expose new keywords or compiler directives to provide access to all of the metadata that you can set on your assemblies or your classes. Frankly, it couldn't do so and leave much room for variable and type names—especially since the available metadata attributes are fully extensible.
      To support all types of metadata now and in the future, managed C++ adds a whole new syntax: attribute blocks. Attribute blocks are designated with square brackets before the type to be attributed. For example, .NET supports something called an indexer, which is really the managed equivalent of an operator overload for the array operation (represented as square brackets in C++ and C#). However, there is no __indexer keyword. Instead, managed C++ requires that a class be marked with an attribute designating the indexer for that class.
  [System::Reflection::DefaultMemberAttribute(S"Item")]
public __gc class MyCollection {
public: // class members are private by default
  __property String* get_Item(int index);
};

The attribute in question, DefaultMemberAttribute, is actually a class defined in the System::Reflection namespace, and the "Item" string is the constructor argument that designates the Item property as the indexer for the MyCollection class.
      In addition to placing attributes on classes (and members of classes), you can also put attributes on assemblies themselves. In managed C++, this is done by using a leading assembly: prefix and terminating the attribute block with a semicolon at global scope. For example, if you wanted to set the title attribute on an assembly, you could do this:
  using namespace System::Reflection;
[assembly:AssemblyTitle(S"My MSDN Magazine Samples")];

      In fact, the compiler team is so enamored of attributes that they added a bunch of attributes to let you write unmanaged code while you wait for .NET. For example, if you use the __interface keyword without the __gc keyword, you'll be generating a COM interface, not a .NET interface. The new compiler provides the same facility for other constructs as well, but you should keep in mind that these attributes are not .NET. They're a language mapping to provide smoother integration between C++ and COM and result in IDL and ATL code being generated behind the scenes. For more information about these unmanaged extensions to C++, see "C++ Attributes: Make COM Programming a Breeze with New Feature in Visual Studio .NET," in the April 2001 issue of MSDN Magazine.

Where Are We?

      Unfortunately, as powerful and flexible as managed C++ is, it's not the native language of .NET, which means that books, articles, courses, code samples, and so on, are not going to be written in managed C++—they're going to be written in C#. But this should come as no surprise. C++ has never been the native language of any popular platform. Unix and Win32® have C. The Mac has Pascal. NeXT had Objective C (of all things). COM has Visual Basic. Only the BeOS has C++ as its native language, and when was the last time you wrote any BeOS code? The fact that .NET favors C# merely means that another language will be translated into the C++ equivalent, as has been done since 1983. Figure 4 shows a list of major constructs you'll see in C# and how they map to the corresponding managed C++ syntax.
      Where do all these new features leave you with your existing code? Managed C++ was developed specifically to provide you with a gentle way to move your existing code into the new world. If you'd like to use managed types, all you have to do is flip the /CLR switch and recompile. Your existing code will continue to work the way you'd expect, including your ATL or MFC projects.
      Maybe you'd like to use managed types from the .NET Framework or maybe you'd like to use some that you or your team has built from scratch in C# or Visual Basic .NET. Either way, the #using directive brings them in. Perhaps you'd like to expose managed wrappers around existing C++ types, like you've been doing with COM wrappers for years. If so, public __gc will get you there.
      In fact, Microsoft has done an amazing job of letting you mix managed and unmanaged types and code in managed C++, letting you decide which of your code to move to .NET and when.
For related articles see:
https://msdn.microsoft.com/msdnmag/issues/01/04/attributes/attributes.asp
For background information see:
https://msdn.microsoft.com/msdnmag/issues/0900/framework/framework.asp
https://msdn.microsoft.com/msdnmag/issues/1000/framework2/framework2.asp
https://msdn.microsoft.com/msdnmag/issues/1100/GCI/GCI.asp
https://msdn.microsoft.com/msdnmag/issues/1200/GCI2/GCI2.asp
Chris Sells is the director of software at DevelopMentor (https://www.develop.com), where he has a lot of C++ COM code that needs to work in the new world of .NET. You can reach him at https://staff.develop.com/csells.