January 2009

Volume 24 Number 01

CLR Inside Out - Best Practices For Managed And Native Code Interoperability

By Jesse Kaplan | January 2009

Contents

When Is Managed-Native Interop Appropriate?
Interop Technologies: Three Choices
Interop Technologies: P/Invoke
Interop Technologies: COM Interop
Interop Technologies: C++/CLI
Considerations for Your Interop Architecture
API Design and Developer Experience
Performance and the Location of Your Interop Boundary
Lifetime Management

In some ways, it may seem odd to see a column like this one appearing in MSDN Magazine in early 2009—managed and native code interoperability has been supported in the Microsoft .NET Framework, in more or less the same form, since version 1.0 in 2002. In addition, you can readily find detailed API- and tool-level docs and thousands of pages of detailed support documents. Yet the missing piece in all of this is comprehensive, high-level architectural guidance describing when to use interop, what architectural considerations you should take into account, and which interop technology to use. This is the gap I'm going to begin to fill here.

When Is Managed-Native Interop Appropriate?

There has not been a great deal written about when it's appropriate to use managed-native interop, and much of what can be found on the subject is conflicting. Sometimes the guidance is not based on actual practical experience, either. So before I begin, let me state that everything I'm writing about today is guidance that we developed based on the experiences the interop team has had helping internal and external customers of all sizes.

In synthesizing this experience, we identified three products that serve as excellent examples of successful interop and that together represent the common ways interop is used. Visual Studio Tools for Office is the managed extensibility toolset for Office and is the first application I think of when I think of interop. It represents the classic use of interop: a large native application that wants to enable managed extensions or add-ins. Next on my list is Windows Media Center, an application that was built from the ground up as a mixed managed and native application. Windows Media Center was developed using mostly managed code, with some portions (those dealing directly with the TV tuner and other hardware drivers) built in native code. Finally there is Expression Design, an application with a large, preexisting native code base that wants to leverage new managed technologies, in this case Windows Presentation Foundation (WPF), to provide a next-generation user experience.

These three applications address the three most common reasons to use interop: to allow managed extensibility of preexisting native applications, to allow most of an application to take advantage of the benefits of managed code while still writing the lowest-level pieces in native code, and to add a differentiated, next-generation user experience to an existing native application.

In the past, available guidance would have suggested simply rewriting the entire application in managed code in these instances. Trying to follow this advice, and seeing how many people refuse to follow it, has made it clear that a rewrite isn't an option for most existing applications. Interop is the vital technology that allows developers to maintain their existing investment in native code while still taking advantage of the new managed environment. If you are planning to rewrite your application for other reasons, managed code may be a good choice for you. But you generally don't want to rewrite your application merely to use new managed technologies and avoid interop.

Interop Technologies: Three Choices

There are three main interop technologies available in the .NET Framework. Which one you choose will be determined in part by the type of API you are using for interop and in part by your requirements and how much control you need over the boundary. Platform Invoke, or P/Invoke, is primarily a managed-to-native interop technology that allows you to call C-style native APIs from managed code. COM interop is a technology that allows you either to consume native COM interfaces from managed code or to expose managed APIs as COM interfaces. Finally there is C++/CLI (formerly known as managed C++), which allows you to create assemblies that contain a mix of managed and native C++ compiled code and is designed to serve as a bridge between managed and native code.

Interop Technologies: P/Invoke

P/Invoke is the simplest of the three technologies, and it was designed primarily for providing managed access to C-style APIs. With P/Invoke you need to wrap each API individually. This can be a very good choice if you have a few APIs to wrap and their signatures are not very complex. P/Invoke gets substantially harder to use, however, if the unmanaged APIs have a lot of arguments that don't have good managed equivalents such as variable-length structures, void *s, overlapping unions, and so on.
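
To make "C-style API" concrete, here is a minimal sketch, written in C++, of the kind of native surface that tends to be straightforward to wrap one declaration at a time: flat functions with C linkage, fixed-size scalar and struct arguments, and a caller-supplied buffer. The names (Point2D, point_distance, format_point) are hypothetical and do not come from any real library. On Windows such functions would typically also be marked for DLL export, which is omitted here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Hypothetical native API, shaped so that each function maps onto one
// simple managed declaration: C linkage, fixed-size arguments only.
extern "C" {

// A fixed-layout struct of plain doubles has an obvious managed equivalent.
struct Point2D {
    double x;
    double y;
};

// Scalars and small structs passed by value are simple to describe
// in a managed signature.
double point_distance(Point2D a, Point2D b) {
    return std::hypot(b.x - a.x, b.y - a.y);
}

// The caller owns the buffer and passes its size, so no memory ownership
// crosses the boundary. Returns the number of characters written
// (excluding the terminator), or -1 if the buffer is too small.
int format_point(Point2D p, char* buffer, int buffer_size) {
    assert(buffer != nullptr && buffer_size > 0);
    int written = std::snprintf(buffer, static_cast<std::size_t>(buffer_size),
                                "(%.1f, %.1f)", p.x, p.y);
    return (written >= 0 && written < buffer_size) ? written : -1;
}

}  // extern "C"
```

By contrast, a signature that took a void * whose meaning depends on another argument, or a structure with a variable-length tail, would need hand-written marshaling logic on the managed side.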

The .NET Framework base class libraries (BCLs) contain many examples of APIs that are really just thick wrappers around large numbers of P/Invoke declarations. Nearly all functionality in the .NET Framework that wraps unmanaged Windows APIs is built using P/Invoke. In fact, even Windows Forms is built almost entirely on the native ComCtl32.dll using P/Invoke.

There are a few very valuable resources that can make using P/Invoke significantly easier. First, the Web site pinvoke.net has a wiki, originally set up by Adam Nathan from the CLR interop team, that has a large number of user-contributed signatures for a wide variety of common Windows APIs.

There is also a very handy Visual Studio add-in that makes it easy to consult pinvoke.net from within Visual Studio. For APIs that are not covered on pinvoke.net, whether they are from your own libraries or someone else's, the interop team has released a P/Invoke signature-generating tool called the P/Invoke Interop Assistant, which automatically creates signatures for native APIs based on a header file. The accompanying screenshot shows the tool in action.

[Screenshot: Creating Signatures in P/Invoke Interop Assistant]

Interop Technologies: COM Interop

COM interop allows you either to consume COM interfaces from managed code or to expose managed APIs as COM interfaces. You can use the TlbImp tool to generate a managed library that exposes managed interfaces for talking to a specific COM type library (tlb). TlbExp performs the opposite task: it generates a COM tlb with interfaces that correspond to the ComVisible types in a managed assembly.

COM interop can be a very good solution for you if you are already using COM inside your application or as its extensibility model. It is also the easiest way to maintain full-fidelity COM semantics between managed and native code. In particular, COM interop is an excellent choice if you are interoperating with a Visual Basic 6.0-based component, as the CLR follows basically the same COM rules Visual Basic 6.0 does.

COM interop is less useful if your application does not already use COM internally, or if you don't need full-fidelity COM semantics and the performance of COM interop is not acceptable for your application.

Microsoft Office is the most prominent example of an application that uses COM interop as its bridge between managed and native code. Office was a great candidate for COM interop, as it has long used COM as its extensibility mechanism and was most commonly used from Visual Basic for Applications (VBA) or Visual Basic 6.0.

Originally Office relied entirely on TlbImp and the thin interop assembly as its managed object model. Over time, however, the Visual Studio Tools for Office (VSTO) product was built into Visual Studio, providing a richer and richer development model that incorporated many of the principles described in this column. When using the VSTO product today, it is sometimes as easy to forget that COM interop is serving as the foundation of VSTO as it is to forget that P/Invoke is the foundation of much of the BCLs.

Interop Technologies: C++/CLI

C++/CLI is designed to be a bridge between the native and managed world, and it allows you to compile both managed and native C++ into the same assembly (even the same class) and make standard C++ calls between the two portions of the assembly. When you use C++/CLI, you choose which portion of the assembly you want to be managed and which you want to be native. The resulting assembly is a mix of MSIL (Microsoft intermediate language, found in all managed assemblies) and native assembly code. C++/CLI is a very powerful interop technology that gives you almost complete control over the interop boundary. The downside is that it forces you to take almost complete control over the boundary.

C++/CLI can be a good bridge if static-type checking is needed, if strict performance is a requirement, and if you need more predictable finalization. If P/Invoke or COM interop meets your needs, they are generally simpler to use, especially if your developers are not familiar with C++.

There are a few things to keep in mind when considering C++/CLI. The first applies if you are planning to use C++/CLI as a faster replacement for COM interop: COM interop is slower than C++/CLI because it does a lot of work on your behalf. If you only loosely use COM in your application and don't require full-fidelity COM interop, then giving up that work in exchange for speed is a good trade-off.

If, however, you use a large portion of the COM spec, you'll likely find that once you add back the pieces of COM semantics you need into your C++/CLI solution, you'll have done a lot of work and will have performance no better than what is provided with COM interop. Several Microsoft teams have gone down this road only to realize this and move back to COM interop.

The second major consideration is that C++/CLI is intended only to be a bridge between the managed and native worlds, not a technology you use to write the bulk of your application. It is certainly possible to do so, but you'll find that developer productivity is much lower than in a pure C++ or pure C#/Visual Basic environment and that your application also runs much slower. So when you use C++/CLI, compile only the files you need with the /clr switch, and use a combination of pure managed and pure native assemblies to build the core functionality of your application.

Considerations for Your Interop Architecture

Once you have decided to use interop in your application and decided which technology to use, there are a few high-level considerations when architecting your solution, including your API design and the developer experience for those coding against the interop boundary. Also take into account where you place your native managed transitions and the performance impact this can have on your application. And finally, you need to consider lifetime management and whether you need to do anything to bridge the gap between the garbage-collected world of the managed environment and the manual/deterministic lifetime management of the native world.

API Design and Developer Experience

When thinking about your API design, there are several questions you need to ask yourself: Who is going to be coding against your interop layer, and should you optimize for improving their experience or for minimizing the cost of building the boundary? Are the developers coding against this boundary the same ones writing the native code? Are they other developers in your company? Are they third-party developers extending your application or using it as a service? What is their sophistication level? Are they comfortable with native paradigms or are they only happy when writing managed code?

Your answers to these questions will help determine where you end up on the continuum between a very thin wrapper over the native code and a rich managed object model that uses native code under the covers. With a thin wrapper, all the native paradigms will bleed through, and developers will be very aware of the boundary and the fact that they are coding against a native API. With a thicker wrapper you can almost completely hide the fact that native code is in the picture—the file system APIs in the BCL are an excellent example of a very thick interop layer that provides a first-class managed-object model.

Performance and the Location of Your Interop Boundary

Before you spend too much time optimizing your application, it's important that you determine whether or not you have an interop performance problem. Many applications use interop in performance-critical sections and need to pay close attention here. But many others use interop only in response to user mouse clicks, where tens, hundreds, or even thousands of interop transitions will not cause a delay their users notice. That said, when you do take a look at the performance of your interop solution, you should have two goals in mind: reducing the number of interop transitions you make and reducing the amount of data passed on each transition.

A given interop transition with a given amount of data crossing between the managed and native world is basically going to have a fixed cost. This fixed cost will be different depending upon your choice of interop technology, but if you made that choice because you needed the features of that technology, then you won't be able to change it. This means your focus should be reducing the chattiness of the boundary and then reducing the amount of data crossing it.

How you accomplish this depends largely upon your application. But a common and adaptable strategy with which many have been successful is to move the isolation boundary by writing a bit of code on the side of the boundary that defines the busy and data-heavy interface. The basic idea is to write an abstraction layer that batches calls into the very busy interface or, even better, to move the portion of your application logic that needs to interact with this API across the boundary and pass only inputs and results across the boundary.
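
The batching idea can be sketched on the native side of the boundary. The C++ below is a minimal illustration under assumed names (sensor_read_one, sensor_read_many, and sensor_average are hypothetical). The first function is the chatty interface, the second batches many reads into one call, and the third moves the logic across the boundary so that only a single result crosses back. A global counter stands in for the number of transitions a managed caller would pay for.

```cpp
#include <cassert>
#include <cstdint>

namespace {
// Stand-in for native data that managed code wants to consume.
const std::int32_t kSamples[] = {4, 8, 15, 16, 23, 42};
const int kSampleCount = sizeof(kSamples) / sizeof(kSamples[0]);
}  // namespace

// Counts calls into the exported functions; in a real application each
// call from managed code would be one interop transition.
int g_boundary_calls = 0;

extern "C" {

// Chatty interface: one transition per sample.
std::int32_t sensor_read_one(int index) {
    assert(index >= 0 && index < kSampleCount);
    ++g_boundary_calls;
    return kSamples[index];
}

// Batched interface: one transition copies up to `capacity` samples into a
// caller-supplied array and returns how many were written.
int sensor_read_many(int start, std::int32_t* out, int capacity) {
    assert(out != nullptr && start >= 0 && capacity >= 0);
    ++g_boundary_calls;
    int written = 0;
    while (written < capacity && start + written < kSampleCount) {
        out[written] = kSamples[start + written];
        ++written;
    }
    return written;
}

// Logic moved across the boundary: only the result crosses back.
double sensor_average() {
    ++g_boundary_calls;
    std::int64_t sum = 0;
    for (int i = 0; i < kSampleCount; ++i) sum += kSamples[i];
    return static_cast<double>(sum) / kSampleCount;
}

}  // extern "C"
```

In this sketch, reading all six samples costs six calls through sensor_read_one but a single call through sensor_read_many. Calling sensor_average also costs one call and passes back only one double, which addresses both goals: fewer transitions and less data per transition.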

Lifetime Management

The differences in lifetime management between the managed and native worlds are often among the biggest challenges for interop customers. The fundamental difference between the garbage collection-based system in the .NET Framework and the manual and deterministic system in the native world can often manifest itself in surprising ways that can be difficult to diagnose.

The first problem you might notice in an interop solution is the length of time some managed objects hold on to their native resources even after the managed world is finished using them. This often causes problems when the native resource is very scarce and needs to be released as soon as its callers are finished using it (database connections are a great example of this).

When these resources are not scarce, you can simply rely on the garbage collector calling an object's finalizer and letting that finalizer release the native resources (either implicitly or explicitly). When resources are scarce, the managed dispose pattern can be very useful. Instead of exposing native objects directly to managed code, you should put at least a thin wrapper around them that implements IDisposable and follows the standard dispose pattern. This way, if you find resource exhaustion to be a problem, you can explicitly dispose of these objects in your managed code and release the resources as soon as you're done with them.

The second lifetime management problem that commonly impacts applications is one that developers often perceive as a stubborn garbage collector: their memory use keeps rising, but for some reason the garbage collector is running infrequently and objects are kept alive. Often they will keep adding calls to GC.Collect to force the issue.

The root of this problem is usually that there are a lot of very small managed objects that hold on to, and keep alive, very large native data structures. The garbage collector is self-tuning and tries to avoid wasting time doing unnecessary or unhelpful collections. In addition to looking at the current memory pressure of the process, it looks at how much memory each garbage collection frees when deciding whether to do another one.

When it runs in this scenario, though, it sees that each collection only frees a small amount of memory (remember it only knows about how much managed memory is freed) and doesn't realize that freeing those small objects can significantly decrease overall pressure. This leads to a situation where fewer and fewer collections happen even as memory use keeps increasing.

The solution to this problem is to give hints to the garbage collector as to the real memory cost of each of these small managed wrappers over the native resources. We added a pair of APIs in the .NET Framework 2.0 that allow you to do just that. You can use the same type of wrappers you used to add the dispose patterns to scarce resources but repurpose them to provide hints to the garbage collector instead of explicitly having to free the resources yourself.

In the constructor for this object you simply call the method GC.AddMemoryPressure and pass in the approximate cost in native memory of the native object. You can then call GC.RemoveMemoryPressure in the object's finalizer method. This pair of calls will help the garbage collector understand the true cost of these objects and the real memory freed when releasing them. Note that it is important to make sure you perfectly balance your calls to Add/RemoveMemoryPressure.

The third common disconnect in lifetime management between the managed and native worlds is not so much about the management of individual resources or objects but rather of whole assemblies or libraries. Native libraries can be easily unloaded when an application is done with them, but managed libraries cannot be unloaded on their own. Instead, the CLR has isolation units called AppDomains that can be individually unloaded and will clean up all assemblies, objects, and even threads running in that domain when unloaded. If you are building a native application and are accustomed to unloading your add-ins when you are done with them, you'll find that using different AppDomains for each of your managed add-ins will give you the same flexibility you had in unloading individual native libraries.

Send your questions and comments to clrinout@microsoft.com.

Jesse Kaplan currently is the Program Manager of Managed/Native Interoperability for the CLR team at Microsoft. His past responsibilities include compatibility and extensibility.