Chapter 7 — Improving Interop Performance

Article
07/14/2010

Improving .NET Application Performance and Scalability

J.D. Meier, Srinath Vasireddy, Ashish Babbar, and Alex Mackman
Microsoft Corporation

May 2004

Summary: This chapter provides performance and scalability guidelines to help design and build .NET Framework components that interoperate with COM components using P/Invoke and COM interop. This chapter shows how to efficiently release unmanaged resources and COM objects in a timely manner. Follow guidelines presented in this chapter to minimize marshaling overhead and thread-switching in your code.

Objectives
Overview
How to Use This Chapter
Architecture
Performance and Scalability Issues
Design Considerations
Implementation Considerations
Marshaling
Marshal.ReleaseComObject
Code Access Security (CAS)
Threading
Monitoring Interop Performance
Summary
Additional Resources

Objectives

Choose which type of interop to use.
Marshal data efficiently.
Release COM objects in a timely manner.
Release unmanaged resources in a timely manner.
Avoid unnecessary thread switches.
Monitor interop performance.

Overview

When you build applications and components for the Microsoft® .NET Framework version 1.1, sometimes you need to write code that calls unmanaged libraries, such as the Microsoft Win32® API and COM components. The common language runtime (CLR) provides several options for interoperability (referred to as interop) between managed code and unmanaged code.

This chapter presents proven strategies and best practices for designing and writing high-performance interop that increases your application's potential to scale. The chapter begins with an overview of interop architecture that highlights the various forms of interop. Next, it describes the main performance and scalability issues associated with interop. An awareness of these issues increases your chances of avoiding common pitfalls. The chapter then provides a set of design guidelines along with specific coding techniques that you can use to optimize your code's interop performance.

Note: Calling managed objects from unmanaged COM clients is outside the scope of this chapter.

How to Use This Chapter

Use this chapter to apply proven strategies and best practices for designing and writing high-performance interop code. To get the most out of this chapter, do the following:

Jump to topics or read from beginning to end. The main headings in this chapter help you locate the topics that interest you. Alternatively, you can read the chapter from beginning to end to gain a thorough appreciation of performance and scalability design issues.
Use the checklist. Use the "Checklist: Interop Performance" checklist in the "Checklists" section of this guide to quickly view and evaluate the guidelines presented in this chapter.
Use the "Architecture" section of this chapter to understand how interop works. By understanding the architecture, you can make better design and implementation choices.
Use the "Design Considerations" section of this chapter to understand the higher-level decisions that will affect implementation choices for interop code.
Read Chapter 13, "Code Review: .NET Application Performance." See the "Interop" section for specific guidance.
Measure your application performance. Read the "Interop" and ".NET Framework Technologies" sections of Chapter 15, "Measuring .NET Application Performance" to learn about the key metrics that can be used to measure application performance. It is important that you be able to measure application performance so that performance issues can be accurately targeted.
Test your application performance. Read Chapter 16, "Testing .NET Application Performance" to learn how to apply performance testing to your application. It is important that you apply a coherent testing process and that you be able to analyze the results.
Tune your application performance. Read Chapter 17, "Tuning .NET Application Performance" to learn how to resolve performance issues identified through the use of tuning metrics.

Architecture

The .NET Framework provides three forms of interop:

Platform Invoke (P/Invoke). This feature allows users of managed languages, such as C# and Microsoft Visual Basic® .NET, to call libraries within standard Microsoft Windows® DLLs, such as the Win32 API, or custom DLLs.
It Just Works (IJW). This feature allows users of Managed Extensions for C++ to directly call libraries within standard Windows DLLs.
COM Interop. This feature allows users of managed languages to activate and interact with COM objects through COM interfaces.

Platform Invoke (P/Invoke)

To call a function in a standard Windows DLL by using P/Invoke, you must add a declaration to map a callable managed method to the target unmanaged function. In C#, you use the [DllImport] attribute to make a P/Invoke declaration; in Visual Basic .NET, you use the Declare statement. For example:

// in C#
[DllImport("kernel32.dll")]
public static extern bool Beep(int frequency, int duration);

' *** in Visual Basic .NET
Declare Function Beep Lib "kernel32" (ByVal frequency As Integer, _
                                      ByVal duration As Integer) As Boolean

After you have written the declaration, you invoke the unmanaged function by calling the managed method defined within the declaration. After the CLR determines that it must dispatch the call to an unmanaged function, the process shown in Figure 7.1 occurs.

Figure 7.1: Calling unmanaged code by using P/Invoke

The process shown in Figure 7.1 consists of the following steps:

The runtime intercepts the call to unmanaged code and identifies the target method in the Export Name Table. If a matching method name is found, the method is invoked. For methods that accept ANSI string parameters, the runtime searches for "methodName" and "methodNameA." For methods that accept Unicode string parameters, it searches for "methodName" and "methodNameW."
Parameters are marshaled. These parameters can be marked as [in], [out], or ref. Blittable types (such as System.Byte and System.Int32) do not need to be marshaled and are passed directly across to the unmanaged code. Non-blittable types (such as System.Array) are marshaled (converted), based on default marshaling rules and marshaling hints that you can specify by using attributes such as [MarshalAs(UnmanagedType.LPStr)].
The native code is executed.
Return values are marshaled back. This includes any parameters marked as ByRef, [out], or [in][out] together with a return value, if there is one.

More Information

For more information about using P/Invoke, see "Platform Invoke Tutorial" in C# Programmer's Reference on MSDN® at https://msdn.microsoft.com/en-us/library/aa288468(VS.71).aspx and "Interoperating with Unmanaged Code" in .NET Framework Developer's Guide on MSDN at https://msdn.microsoft.com/en-us/library/sd10k43k(VS.71).aspx.

IJW and Managed Extensions for C++

IJW provides C++ programmers with a more straightforward way to call functions in standard Windows DLLs. When you use IJW, you do not need to use [DllImport] attribute declarations for the unmanaged APIs. Instead, you just include the appropriate header file and then link to the associated import library. For example:

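#include "stdafx.h"
#using <mscorlib.dll>
using namespace System;
using namespace System::Runtime::InteropServices;
#include <stdio.h>

int main() {
   String * pStr = S"Hello World!";
   // Copy the managed string into unmanaged memory as an ANSI string.
   char* pChars = (char*)Marshal::StringToHGlobalAnsi(pStr).ToPointer();
   // Call the unmanaged CRT function directly; no [DllImport] is needed.
   puts(pChars);
   // FreeHGlobal expects an IntPtr, so wrap the raw pointer explicitly.
   Marshal::FreeHGlobal(IntPtr(pChars));
   return 0;
}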

IJW gives you more control and more potential to optimize the way that parameters are marshaled as calls transition back and forth between managed and unmanaged code. You must implement IJW in code rather than through the use of attributes. Although this requirement adds complexity to interop code, it allows you to work with the IntPtr type and to marshal data manually for maximum efficiency.

More Information

For more information about using IJW, see "Platform Invocation Services" in the Managed Extensions for C++ Migration Guide on MSDN at https://msdn.microsoft.com/en-us/library/aa712982(VS.71).aspx.

COM Interop

COM interop allows you to easily create and instantiate COM components. To use COM interop and to be able to make early-bound calls to COM components, you must generate (or acquire) an interop assembly. (An interop assembly is not required for late binding.) Figure 7.2 shows the COM interop process.

Figure 7.2: COM interop

An interop assembly is an assembly that contains managed types that allow you to program indirectly against unmanaged COM types. You must compile your managed application with a reference to an interop assembly in order to program against COM objects and to interact with them by using early binding.

Primary Interop Assemblies (PIAs)

The company that produces a COM DLL can produce a primary interop assembly (PIA) to make the DLL accessible from managed applications. Only the publisher can produce a PIA, which is digitally signed with a strong name. A PIA offers two major advantages over a standard interop assembly:

It ensures type compatibility by providing unique type identity, because it is signed by its publisher and is labeled with the PrimaryInteropAssembly attribute.
The PIA can be registered so that it is recognized by Microsoft Visual Studio® .NET. Once you have registered a PIA on a development workstation, Visual Studio .NET uses the preexisting PIA instead of generating a new interop assembly when you add a reference to the COM DLL with which it is associated.

For more information, see the MSDN article "Primary Interop Assemblies (PIAs)" at https://msdn.microsoft.com/en-us/library/aa302338.aspx.

Runtime Callable Wrapper

A managed reference is never directly bound to a COM object. Instead, the COM interop layer of the CLR inserts a special proxy, known as the runtime callable wrapper (RCW), between the caller and the object. All calls made to a COM object must go through the RCW. Depending on the apartment models of the .NET thread and the COM object, the RCW can point to a proxy, or it can point directly to the COM object as shown in Figure 7.3.

Figure 7.3: The runtime callable wrapper (RCW)

The RCW is responsible for marshaling parameters as execution flow transitions between managed code and unmanaged code. Calling into unmanaged COM code from managed code is made easy by the CLR; however, it carries a performance cost. The following steps are performed:

Perform data marshaling.
Fix the calling convention.
Protect callee-saved registers.
Switch thread mode to ensure that the garbage collector does not block unmanaged threads.
Erect an exception-handling frame on calls into unmanaged code.
Optionally take control of the thread.

More Information

For more information about using COM interop, see the following resources on MSDN:

"Exposing COM Components to the .NET Framework" in .NET Framework Developer's Guide at https://msdn.microsoft.com/en-us/library/z6tx9dw3.aspx.
"COM Interop Part 1: C# Client Tutorial" in C# Programmer's Reference at https://msdn.microsoft.com/en-us/library/aa645736(VS.71).aspx.
"COM Interop Part 2: C# Server Tutorial" in C# Programmer's Reference at https://msdn.microsoft.com/en-us/library/aa645738(VS.71).aspx.

Performance and Scalability Issues

The main issues that can adversely affect the performance and scalability of your application's interop code are summarized in the list that follows. Most apply to server-side Web applications that call COM components on a per-request basis under load. Subsequent sections of this chapter provide strategies and technical information to help prevent or resolve these issues:

Marshaling parameter types inefficiently. As interop calls transition between managed code and unmanaged code, parameters that are marshaled inefficiently or unnecessarily waste system processing cycles.
Not disposing COM objects for server applications under load. The lifetime of a COM object is managed through reference counting. If your Web application calls COM components on a per-request basis under load, then failure to call Marshal.ReleaseComObject at the appropriate time prolongs the lifetime of the COM object. This adversely affects your server's resource (including memory) utilization, particularly if you use single-threaded apartment (STA) COM objects.
Using chatty interfaces that require excessive round trips. It is inefficient to write managed code against COM objects that expose interfaces requiring multiple round trips to perform a single logical operation. The classic example is a COM object that requires the caller to assign multiple property values before executing a method.
Not disposing of unmanaged resources in a timely manner. The CLR garbage collector executes an object's Finalize method at a time that is not predictable. If time-critical resources are released in a Finalize method, your code can hold on to these resources much longer than it should.
Aggressively pinning short-lived objects. If you unnecessarily extend the life of a buffer beyond the duration of the P/Invoke call, you can fragment the managed heap.
Incurring overhead due to late binding and reflection. Late binding is based on reflection and requires many more processing cycles for a caller to execute a method than does early binding.
Incurring overhead due to unnecessary thread switching. The majority of COM objects are apartment-threaded and can run only under the COM STA model. Failure to match threading models between the calling thread and the COM object can result in cross-apartment calls. Cross-apartment calls usually require a thread switch, which further degrades performance.
Overhead of Unicode to ANSI string conversions. Interop calls to older functions in the Win32 API can result in the conversion of string data between the Unicode format used by the CLR and the older ANSI format. This conversion is costly in terms of processing cycles and should be avoided or reduced as much as possible.

Design Considerations

To help ensure that your application's interop code is optimized for performance, consider the following best practice design guidelines:

Design chunky interfaces to avoid round trips.
Reduce round trips with a facade.
Implement IDisposable if you hold unmanaged resources across client calls.
Reduce or avoid the use of late binding and reflection.

Design Chunky Interfaces to Avoid Round Trips

When you design code that is to be called through P/Invoke or COM interop, design interfaces that reduce the number of calls required to complete a logical unit of work. This guideline is particularly important for interfaces that handle calls to COM components located on a remote server, where the performance impact of using chatty interfaces is significant.

The following code fragment illustrates a chatty component interface that uses property getters and setters and requires the caller to cross the managed/unmanaged code boundary three times, performing data marshaling, security checks, and thread switches each time.

MyComponent.Firstname = "bob";
MyComponent.LastName = "smith";
MyComponent.SaveCustomer();

The following code fragment shows a chunky interface designed to perform the same tasks. The number of round trips between managed and unmanaged code is reduced to one, which significantly reduces the overhead required to complete the logical operation.

MyComponent.SaveCustomer( "bob", "smith");

Reduce Round Trips with a Facade

Often, you cannot design the interfaces of the unmanaged libraries that you use because they are provided for you. If you must use a preexisting unmanaged library with a chatty interface, consider wrapping the calls in a facade. Implement the facade on the boundary side that exposes the chatty interface. For example, given a chatty Win32 API, you would create a Win32 facade. Creating a .NET facade would still incur the same number of managed/unmanaged boundary crossings. The following example shows the logic of a facade that wraps a chatty unmanaged interface. To reduce boundary crossings, this logic must run on the unmanaged side.

public bool MyWrapper( string first, string last )
{
  ChattyComponent myComponent = new ChattyComponent();
  myComponent.Firstname = first;
  myComponent.LastName = last;
  return myComponent.SaveCustomer();
}

Performance is improved because the facade reduces the required number of round trips crossing the managed/unmanaged boundary. You can apply the same principle to calling a chatty interface within a COM DLL created with Microsoft Visual Basic 6. You can create a facade DLL in Visual Basic 6 to reduce the required number of round trips, as shown in the following example.

Function MyWrapper(first As String, last As String ) As Boolean
  Dim myComponent As ChattyComponent
  Set myComponent = New ChattyComponent
  myComponent.Firstname = first
  myComponent.LastName = last
  MyWrapper  = myComponent.SaveCustomer
End Function

Implement IDisposable if You Hold Unmanaged Resources Across Client Calls

Holding shared server resources across remote client calls generally reduces scalability. When you build managed objects, you should acquire and release shared unmanaged resources on a per-request basis whenever possible. The platform can then provide optimizations, such as connection pooling, to reduce the resource-intensive operations for per-request calls.

If you acquire and release resources within a single request, you do not need to explicitly implement IDisposable and provide a Dispose method. However, if you hold on to server resources across client calls, you should implement IDisposable to allow callers to release the resources as soon as they are finished with them.

More Information

For more information, see "Finalize and Dispose Guidelines" in Chapter 5, "Improving Managed Code Performance."
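
The guideline above can be sketched in C# as follows. This is a minimal illustration, not the guide's own implementation: the NativeBuffer class and its members are hypothetical names, and a block of memory from Marshal.AllocHGlobal stands in for whatever unmanaged resource your object holds across calls.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper that holds an unmanaged buffer across client calls.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;   // unmanaged memory obtained from AllocHGlobal

    public NativeBuffer(int sizeInBytes)
    {
        _memory = Marshal.AllocHGlobal(sizeInBytes);
    }

    public bool IsDisposed
    {
        get { return _memory == IntPtr.Zero; }
    }

    // Lets callers release the resource as soon as they are finished with it.
    public void Dispose()
    {
        ReleaseMemory();
        GC.SuppressFinalize(this);   // the finalizer has nothing left to do
    }

    // Safety net only: runs at an unpredictable time if Dispose was never called.
    ~NativeBuffer()
    {
        ReleaseMemory();
    }

    private void ReleaseMemory()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;   // makes repeated calls harmless
        }
    }
}
```

A caller would typically wrap the object in a C# using statement, for example using (NativeBuffer buffer = new NativeBuffer(1024)) { ... }, so that Dispose runs even if an exception is thrown.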

Reduce or Avoid the Use of Late Binding and Reflection

COM objects support two styles of binding: early binding and late binding. You use early binding when you program against the types defined within an interop assembly. You can use late binding to program against a COM object from managed code by using reflection.

To use late binding in C# or C++, you must explicitly program against types defined inside the System.Reflection namespace. In Visual Basic .NET, the compiler adds in automatic support for late binding.

To use late binding from Visual Basic .NET, you must disable the Option Strict compile-time setting and program against the Object type. The following code activates a COM object, by using a string-based ProgID, and calls methods by using late binding.

Option Strict Off
Imports System
Imports System.Runtime.InteropServices
Class MyApp
  Shared Sub Main()
    Dim ComType1 As Type = Type.GetTypeFromProgID("ComLibrary1.Customer")
    Dim obj As Object = Activator.CreateInstance(ComType1)
    '*** call to COM object through late binding
    obj.Load("C123")
    Dim result As String = obj.GetInfo()
  End Sub
End Class
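
For comparison, late binding from C# might look like the following sketch, which programs against System.Reflection explicitly. The LateBinding helper is a hypothetical name introduced for illustration; Type.InvokeMember is the standard reflection call, and the ProgID and method names in the usage comment are the same placeholder values used in the Visual Basic .NET sample.

```csharp
using System;
using System.Reflection;

public static class LateBinding
{
    // Invokes a public instance method by name at run time.
    // When the target is a COM object, the runtime dispatches the call
    // through IDispatch; for a managed object it uses ordinary reflection.
    public static object Invoke(object target, string methodName, params object[] args)
    {
        return target.GetType().InvokeMember(
            methodName,
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
            null,      // use the default binder
            target,
            args);
    }
}

// Usage against a COM object (Windows only; placeholder ProgID from the text):
//   Type comType = Type.GetTypeFromProgID("ComLibrary1.Customer");
//   object customer = Activator.CreateInstance(comType);
//   LateBinding.Invoke(customer, "Load", "C123");
//   string result = (string)LateBinding.Invoke(customer, "GetInfo");
```

Every call pays for the name lookup and for boxing the arguments into an object array, which is part of the overhead that early binding avoids.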

Late binding provides significantly poorer performance than early binding because it requires that a caller discover method bindings at run time by using the IDispatch interface. In addition, it requires the conversion of parameters and return values to the COM VARIANT data type as they are passed between caller and object. For these reasons, you should avoid late binding where possible. However, it does provide a few noteworthy advantages.

When you use late binding from managed code, you eliminate the need to generate and deploy an interop assembly. You also avoid dependencies on GUIDs, such as the CLSID and the default interface identifier (IID) for a COM CoClass. This can be useful if you are working with several different versions of a Visual Basic 6 DLL that has been rebuilt without using the binary compatibility mode of Visual Basic 6. Code that uses late binding works with different builds of a COM DLL, even when the value for the default IID has changed.

ASP.NET and Late Binding

If you have an ASP.NET client, calls such as Server.CreateObject and Server.CreateObjectFromClsid use reflection, which slows performance. If you use the <object> tag to create a COM object, calls to that object are serviced by using late binding as well.

The use of late binding always involves tradeoffs. You gain code that is more flexible and adaptable, but at the expense of type safety, run-time performance, and scalability.

Implementation Considerations

When moving from application design to development, you must consider interoperability between managed and unmanaged code. Key interop performance measures include response time, throughput, and resource management.

Response times can be improved by marshaling parameter types efficiently between managed and unmanaged code. By using blittable parameter types and reducing unnecessary data transfer, interop calls use fewer processing cycles.

Pinning short-lived objects results in inefficient memory management. Garbage collection can be improved by pinning only long-lived objects.

Throughput can be increased by using an appropriate COM threading model, and by eliminating unnecessary thread switches.

By following best practice implementation guidelines, you can increase the performance of interop code. The following sections highlight performance considerations when developing interop code.

Marshaling

Parameters that you pass to and from Win32 APIs and COM interfaces are marshaled between managed and unmanaged code. For certain types, referred to as blittable types, the memory representation of the type is the same for managed and unmanaged code. As a result, blittable types are extremely efficient types for marshaling, because no conversion is required. Non-blittable types require more effort. The degree of effort varies according to the type and size of data.

Marshaling is a potential performance bottleneck for interop. To optimize your code's marshaling performance, follow these guidelines:

Explicitly name the target method you call.
Use blittable types where possible.
Avoid Unicode to ANSI conversions where possible.
Use IntPtr for manual marshaling.
Use [in] and [out] to avoid unnecessary marshaling.
Avoid aggressive pinning of short-lived objects.

Explicitly Name the Target Method You Call

If you call a method and the CLR does not find an exact match, the CLR searches for a suitable match. This search slows performance. Be explicit with the name of the function you want to call. When you use the DllImport attribute, you can set its ExactSpelling field to true to prevent the CLR from searching for a different function name.
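
As a minimal sketch of this guideline, the following declaration names the Unicode export lstrlenW in kernel32.dll explicitly and sets ExactSpelling. The NativeStrings class and NativeLength method are hypothetical names chosen for illustration, and the code runs only on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeStrings
{
    // EntryPoint names the exported function exactly, and ExactSpelling = true
    // tells the runtime not to probe for alternative names such as an
    // "A" or "W" suffixed variant.
    [DllImport("kernel32.dll", EntryPoint = "lstrlenW",
               ExactSpelling = true, CharSet = CharSet.Unicode)]
    private static extern int lstrlenW(string text);

    // Windows only: returns the length, in characters, reported by lstrlenW.
    public static int NativeLength(string text)
    {
        return lstrlenW(text);
    }
}
```

Because the declaration also specifies CharSet.Unicode, the string is passed without a Unicode to ANSI conversion.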

Use Blittable Types Where Possible

Instances of certain types are represented differently in managed code than they are in unmanaged code. These types are known as non-blittable types and require marshaling as they are passed back and forth between managed and unmanaged code. Instances of other types have the same in-memory representation in both managed code and unmanaged code. These types are known as blittable types and do not require conversion as they are passed back and forth. Therefore, the use of blittable parameter types in interop calls requires fewer processing cycles for type conversion than the use of non-blittable parameter types.

If you have the option of choosing what types will be used in an interface, use blittable types. For example, when you are designing interfaces for code that will be called through P/Invoke, try to use blittable data types, such as Byte, SByte, Int32, Int64, and IntPtr.

Tables 7.1 and 7.2 list commonly used blittable and non-blittable types.

Table 7.1: Blittable Types

Single
Double
SByte
Byte
Int16
UInt16
Int32
UInt32
Int64
UInt64
IntPtr
UIntPtr
Formatted types containing only blittable types
Single-dimensional array of blittable types

Table 7.2: Non-Blittable Types

Char
String
Object
Boolean
Single-dimensional array of non-blittable types
Multi-dimensional array of non-blittable types

More Information

For more information about using blittable versus non-blittable types, see "Blittable and Non-Blittable Types" in the .NET Framework Developer's Guide on MSDN at https://msdn.microsoft.com/en-us/library/75dwhxf7.aspx.
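
One way to check whether a structure you have designed is blittable is sketched below. It relies on the documented behavior that GCHandle.Alloc with GCHandleType.Pinned throws an ArgumentException for an instance with non-blittable members. The structure and class names are hypothetical, and the check is a diagnostic aid rather than something to run on a hot path.

```csharp
using System;
using System.Runtime.InteropServices;

// Contains only Int32 fields, so the managed and unmanaged layouts match.
[StructLayout(LayoutKind.Sequential)]
public struct Point2D
{
    public int X;
    public int Y;
}

// Boolean is non-blittable, which makes the whole structure non-blittable.
[StructLayout(LayoutKind.Sequential)]
public struct FlaggedPoint
{
    public int X;
    public bool IsValid;
}

public static class BlittableCheck
{
    // Returns true if the boxed value can be pinned, which the runtime
    // allows only for instances whose members are all blittable.
    public static bool IsBlittable(object boxedValue)
    {
        try
        {
            GCHandle handle = GCHandle.Alloc(boxedValue, GCHandleType.Pinned);
            handle.Free();
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }
}
```

If a structure turns out to be non-blittable only because of a Boolean field, replacing that field with an Int32 or Byte is a common way to make the type blittable.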

Avoid Unicode to ANSI Conversions Where Possible

Converting strings from Unicode to ANSI and vice versa is an expensive operation. The CLR stores string characters in Unicode format. When you call functions in the Win32 API, you should call the Unicode version of the API (for example, GetModuleFileNameW) instead of the ANSI version (for example, GetModuleFileNameA).

When you cannot avoid Unicode to ANSI conversion, you may be able to use IJW and marshal strings manually by using the IntPtr type. For example, if you need to make several calls to a Win32 API function, you may not need to convert a string value between Unicode and ANSI with each call. IJW and manual marshaling allow you to convert the string once and then to make several calls with the string in the ANSI form.
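
The convert-once idea can also be approximated from C# with P/Invoke instead of IJW, as in the following sketch. The AnsiReuse class and SumOfLengths method are hypothetical names; lstrlenA in kernel32.dll stands in for whichever ANSI-only function you need to call repeatedly, and the code runs only on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

public static class AnsiReuse
{
    // Declared with IntPtr so the interop layer performs no string conversion.
    [DllImport("kernel32.dll", EntryPoint = "lstrlenA", ExactSpelling = true)]
    private static extern int lstrlenA(IntPtr ansiText);

    // Windows only: converts the string once, then calls the ANSI function
    // callCount times and returns the sum of the reported lengths.
    public static int SumOfLengths(string text, int callCount)
    {
        // Single Unicode to ANSI conversion into unmanaged memory.
        IntPtr ansi = Marshal.StringToHGlobalAnsi(text);
        try
        {
            int total = 0;
            for (int i = 0; i < callCount; i++)
            {
                total += lstrlenA(ansi);   // no per-call conversion
            }
            return total;
        }
        finally
        {
            Marshal.FreeHGlobal(ansi);     // the caller owns the unmanaged copy
        }
    }
}
```

The trade-off is that your code, not the marshaler, is now responsible for freeing the unmanaged copy of the string.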

Use IntPtr for Manual Marshaling

By declaring parameters and fields as IntPtr, you can boost performance, albeit at the expense of ease of use, type safety, and maintainability. Sometimes it is faster to perform manual marshaling by using methods available on the Marshal class rather than to rely on default interop marshaling. For example, if large arrays of strings need to be passed across an interop boundary, but the managed code needs only a few of those elements, you can declare the array as IntPtr and manually access only those few elements that are required.
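
The string-array case described above might be handled as in the following sketch. All names are hypothetical. The CreateNativeArray and FreeNativeArray helpers only simulate an array of ANSI string pointers that unmanaged code would normally supply, so that the manual access in ReadElement can be shown in isolation.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ManualStringArray
{
    // Reads one element of a native array of ANSI string pointers
    // without marshaling any of the other elements.
    public static string ReadElement(IntPtr nativeArray, int index)
    {
        IntPtr element = Marshal.ReadIntPtr(nativeArray, index * IntPtr.Size);
        return Marshal.PtrToStringAnsi(element);
    }

    // Stand-in for unmanaged code: builds the pointer array in unmanaged memory.
    public static IntPtr CreateNativeArray(string[] values)
    {
        IntPtr array = Marshal.AllocHGlobal(values.Length * IntPtr.Size);
        for (int i = 0; i < values.Length; i++)
        {
            Marshal.WriteIntPtr(array, i * IntPtr.Size,
                                Marshal.StringToHGlobalAnsi(values[i]));
        }
        return array;
    }

    // Frees each string and then the pointer array itself.
    public static void FreeNativeArray(IntPtr array, int length)
    {
        for (int i = 0; i < length; i++)
        {
            Marshal.FreeHGlobal(Marshal.ReadIntPtr(array, i * IntPtr.Size));
        }
        Marshal.FreeHGlobal(array);
    }
}
```

With default marshaling, every string in the array would be converted and copied. Here only the requested element is converted.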

Use [in] and [out] to Avoid Unnecessary Marshaling

Use the [in] and [out] attributes carefully to reduce unnecessary marshaling. The COM interop layer of the CLR uses default rules to decide whether a parameter needs to be marshaled in before the call and out after the call. These rules are based on the level of indirection and type of the parameter. Some of these operations may not be necessary depending on the method's semantics.

Parameters that are passed by reference are marked as [in][out] and are marshaled in both directions. For example:

instance string  marshal( bstr)  FormatNameByRef(
                                    [in][out] string&  marshal( bstr) First,
                                    [in][out] string&  marshal( bstr) Middle,
                                    [in][out] string&  marshal( bstr) Last) 
runtime managed internalcall

If you have control over the design of your COM components, declare each parameter so that data is marshaled only in the direction in which it is needed.

Avoid Aggressive Pinning of Short-Lived Objects

Pinning short-lived objects unnecessarily extends the life of a memory buffer beyond the duration of the P/Invoke call. Pinning prevents the garbage collector from relocating the bytes of the object in the managed heap, or relocating the address of a managed delegate.

The garbage collector often relocates managed objects when it compacts the managed heap. Because the garbage collector cannot move any pinned object, the heap can quickly become fragmented, reducing the available memory.

There is often no need to explicitly pin objects. For example, there is no need to explicitly pin a managed array of primitive types, such as char and int, or to pin strings, StringBuilder objects, or delegate instances before making P/Invoke calls, because the P/Invoke marshaling layer ensures that they are pinned for the duration of the call.

It is acceptable to pin long-lived objects, which are ideally created during application initialization, because they are not moved relative to short-lived objects. It is costly to pin short-lived objects for a long period of time, because compacting occurs most in Generation 0 and the garbage collector cannot relocate pinned objects. This results in inefficient memory management that can adversely affect performance.

More Information

For more information, see "Copying and Pinning" in the .NET Framework Developer's Guide on MSDN at https://msdn.microsoft.com/en-us/library/23acw07k.aspx.
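
When unmanaged code does need a stable address beyond a single call, one option consistent with this guidance is to allocate the buffer once, early in the application's life, and keep it pinned. The following sketch assumes that scenario. PinnedBuffer is a hypothetical class name; GCHandle.Alloc, AddrOfPinnedObject, and Free are the standard runtime calls.

```csharp
using System;
using System.Runtime.InteropServices;

// Long-lived buffer intended to be created during application initialization
// and reused, rather than pinned and unpinned for each request.
public sealed class PinnedBuffer : IDisposable
{
    private readonly byte[] _bytes;
    private GCHandle _handle;

    public PinnedBuffer(int size)
    {
        _bytes = new byte[size];
        // Pin once; the garbage collector will not move the array until Free.
        _handle = GCHandle.Alloc(_bytes, GCHandleType.Pinned);
    }

    // Managed view of the buffer.
    public byte[] Bytes
    {
        get { return _bytes; }
    }

    // Stable address that can be handed to unmanaged code.
    public IntPtr Address
    {
        get { return _handle.AddrOfPinnedObject(); }
    }

    // Unpins the array so the garbage collector can manage it normally again.
    public void Dispose()
    {
        if (_handle.IsAllocated)
        {
            _handle.Free();
        }
    }
}
```

For short-lived arguments to a single P/Invoke call, no such class is needed, because the marshaling layer pins them only for the duration of the call.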

Marshal.ReleaseComObject

COM object lifetime is managed by reference counting. When an object's reference count reaches zero, its destructor code is executed and then the object's memory is freed. Because you do not know when the garbage collector will run, managed RCW objects that hold references to COM objects can prolong the lifetime of the COM object, and can delay the release of unmanaged resources. In server applications, particularly those under heavy load, this can place unwanted resource pressures on your server. To address this issue:

Consider calling ReleaseComObject in server applications.
Do not force garbage collections with GC.Collect.

Consider Calling ReleaseComObject in Server Applications

When you reference a COM object, you actually maintain a reference to an RCW. The RCW holds an internal pointer to the COM object's IUnknown interface. During finalization of the RCW, the CLR finalizer thread calls the RCW's finalizer, which in turn calls IUnknown::Release to decrement the COM object's reference count. When the reference count reaches zero, the COM object is released from memory.

.NET memory management is nondeterministic, which can cause problems when you need to deterministically release COM objects in server applications, such as ASP.NET applications. You can use Marshal.ReleaseComObject to help solve this problem.

Note: You should only call ReleaseComObject when you absolutely have to.

How ReleaseComObject Works

An RCW maintains an internal marshaling count, which is completely separate from the COM object reference count. When you call ReleaseComObject, the RCW's internal marshaling count is decremented. When the internal marshaling count reaches zero, the RCW's single reference count on the underlying COM object is decremented. At this point, the unmanaged COM object is released and its memory is freed, and the RCW becomes eligible for garbage collection.

The CLR creates exactly one RCW for each COM object. The RCW maintains a single reference count on its associated COM object, irrespective of how many interfaces from that COM object have been marshaled into the managed process in which the RCW is located. Figure 7.4 shows the relationship between the RCW, its clients in a managed process, and the associated COM object.

Figure 7.4: RCW's relationship to managed code and an unmanaged COM object

If multiple interface pointers have been marshaled, or if the same interface has been marshaled multiple times by multiple threads, the internal marshaling count in the RCW will be greater than one. In this situation, you need to call ReleaseComObject in a loop. For more information, see "How to Call ReleaseComObject" later in this chapter.
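
The two release styles described in this section can be sketched as follows. The ComRelease class and its method names are hypothetical. Marshal.ReleaseComObject returns the RCW's remaining marshaling count, which is what the loop tests, and the Marshal.IsComObject guard is an added precaution so that the helpers ignore ordinary managed objects.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComRelease
{
    // Single release: suits code that acquires a COM object, uses it, and
    // releases it. Returns the remaining marshaling count, or -1 if the
    // argument is null or is not a COM object.
    public static int ReleaseOnce(object comObject)
    {
        if (comObject == null || !Marshal.IsComObject(comObject))
        {
            return -1;
        }
        return Marshal.ReleaseComObject(comObject);
    }

    // Loop release: for an RCW whose marshaling count may be greater than one.
    // Stops when the count reaches zero and the COM object has been released.
    public static void ReleaseAll(object comObject)
    {
        if (comObject == null || !Marshal.IsComObject(comObject))
        {
            return;
        }
        while (Marshal.ReleaseComObject(comObject) > 0)
        {
            // Keep decrementing until the RCW lets go of the COM object.
        }
    }
}
```

In server code, the call usually belongs in a finally block so that the COM object is released even when the work that uses it throws an exception.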

When to Call ReleaseComObject

Client code that uses a managed object that exposes a Dispose method should call the Dispose method as soon as it is finished with the object to ensure that resources are released as quickly as possible. Knowing when to call ReleaseComObject is trickier. You should call ReleaseComObject when:

You create and destroy COM objects under load from managed code. If there is sufficient load on your application to necessitate quick COM object disposal and recovery of resources, consider ReleaseComObject. This is generally the case for server workloads. For example, you may need to call ReleaseComObject if your ASP.NET page creates and destroys COM objects on a per-request basis.
Your ASP.NET code calls a serviced component that wraps and internally calls a COM component. In this case, you should implement Dispose in your serviced component and your Dispose method should call ReleaseComObject. The ASP.NET code should call your serviced component's Dispose method.
Your COM component relies on an eager release of its interface pointers to IUnknown. One approach is to assume that eager release is unnecessary. Then, if you find that you have scaling problems because a specific COM component must be eagerly released, come back and add the ReleaseComObject for it. In general, if you are calling COM from managed code under load (for example, in server scenarios), you need to consider ReleaseComObject.
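The Dispose-based approach in the second item above can be sketched as follows. ComObjectOwner is a hypothetical name, and the class is a plain IDisposable wrapper rather than a real serviced component. A serviced component would derive from ServicedComponent and apply the same idea in its own Dispose logic. The sketch assumes the wrapper is the only owner of the COM object.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: owns one COM object and releases it deterministically.
public sealed class ComObjectOwner : IDisposable
{
    private object comObject;

    public ComObjectOwner(object comObject)
    {
        if (comObject == null) throw new ArgumentNullException("comObject");
        this.comObject = comObject;
    }

    // The wrapped object; throws once the owner has been disposed.
    public object Target
    {
        get
        {
            if (comObject == null) throw new ObjectDisposedException("ComObjectOwner");
            return comObject;
        }
    }

    public bool IsDisposed { get { return comObject == null; } }

    public void Dispose()
    {
        object target = comObject;
        comObject = null;   // makes repeated Dispose calls harmless

        // ReleaseComObject accepts only RCWs, so skip ordinary managed objects.
        if (target != null && Marshal.IsComObject(target))
        {
            Marshal.ReleaseComObject(target);
        }
    }
}
```

A caller would typically create the owner in a using statement so that Dispose runs, and the COM object is released, even if an exception is thrown while the object is in use.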

When Not to Call ReleaseComObject

You should not call ReleaseComObject in the following circumstances:

If you use the COM object across client calls, do not call ReleaseComObject unless you are completely done. An exception is generated if you try to access an object that is already released.
If you use the COM object from multiple threads (such as when you cache or pool the object), do not call ReleaseComObject until you are completely done. An exception is generated if you try to access an object that is released.

If you do not call ReleaseComObject, the RCWs are cleaned up in one of two ways:

When a garbage collection happens, the finalizer thread releases RCWs that are not in use.
When a COM object is activated or when an interface pointer enters the runtime for the first time. If this occurs on an MTA thread, the runtime cleans up all the RCWs no longer in use in the current context. If this occurs on an STA thread, the runtime cleans up all the RCWs no longer in use in all contexts in that STA.

How to Call ReleaseComObject

When you call ReleaseComObject, follow these guidelines:

Evaluate whether you need a loop to release all interfaces. In most cases, you can simply call ReleaseComObject once. For example, in cases where you acquire a COM object interface pointer, work with it, and then release it, you should not implement a loop. This usage pattern is typical in server applications.

In cases where you have a marshaling count greater than one, you need to use a loop. The marshaling count is incremented every time a pointer to IUnknown for the same COM object is marshaled from unmanaged code into managed code and resolves to the same RCW. Therefore, you need to call ReleaseComObject in a loop until the returned marshaling count equals zero.

For example, if you call an unmanaged method ten times in a loop on the same thread, and the method returns the same object ten times, the underlying wrapper will have a marshaling count of ten. In this case, you must call ReleaseComObject ten times in a loop. This can occur in cases where you use ActiveX controls, where your code might query a contained property multiple times.

A simple approach is to call ReleaseComObject in a loop until its return value (the RCW's remaining marshaling count, not the COM object's reference count) reaches zero, as shown below.
```
while(Marshal.ReleaseComObject(yourComObject)!=0);
```
If any thread subsequently attempts to access the released COM object through the RCW, a NullReferenceException is generated (in the .NET Framework 2.0 and later, the exception is reported as an InvalidComObjectException).

Note At the time of this writing, the .NET Framework 2.0 (code-named "Whidbey") provides a method named FinalReleaseComObject that will bypass the marshaling count logic. This means that you will not need to use a loop to repeatedly call ReleaseComObject.
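As a minimal sketch of how FinalReleaseComObject could replace the loop, the helper below releases every marshaled reference in one call. ComRelease and ReleaseAll are hypothetical names introduced for illustration. The sketch assumes the .NET Framework 2.0 or later; on version 1.1, the loop over ReleaseComObject shown earlier plays the same role.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper; the names are illustrative only.
public static class ComRelease
{
    // Releases every marshaled reference held by the RCW in a single call.
    // Returns false, and releases nothing, if the argument is null or not an RCW.
    public static bool ReleaseAll(object comObject)
    {
        if (comObject == null || !Marshal.IsComObject(comObject))
        {
            return false;
        }

        // Sets the RCW's marshaling count to zero, bypassing the counting logic.
        Marshal.FinalReleaseComObject(comObject);
        return true;
    }
}
```

The IsComObject check matters because ReleaseComObject and FinalReleaseComObject reject objects that are not RCWs. As with the loop, use this only when no other managed code still needs the object.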

Use a finally block. It is good practice to place calls to ReleaseComObject in a finally block as shown in the following example to ensure that it is called, even in the event of an exception.

// Create the COM object
Account act = new Account();
try
{

  // Post money into the account
  act.Post(5, 100);
}
finally
{
  // Make sure that the underlying COM object is immediately freed
  System.Runtime.InteropServices.Marshal.ReleaseComObject(act);

}

Setting objects to null or Nothing. It is common practice for Visual Basic 6 developers to set an object reference to Nothing as follows.
```
Set comObject = Nothing
```
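In managed code, setting a reference to null or Nothing does not release the underlying COM object immediately; it only makes the RCW unreachable so that a later garbage collection can release it. Set references to null or Nothing in the same situations as you would in a purely managed scenario, for example to make a graph of objects unreachable, whether the graph contains managed objects, references to unmanaged objects, or both.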

More Information

For more information about when to call ReleaseComObject when you use serviced components, see Chapter 8, "Improving Enterprise Services Performance."

Do Not Force Garbage Collections with GC.Collect

A common approach for releasing unmanaged objects is to set the RCW reference to null, and call System.GC.Collect followed by System.GC.WaitForPendingFinalizers. This is not recommended for performance reasons, because in many situations it can trigger the garbage collector to run too often. Code written by using this approach significantly compromises the performance and scalability of server applications. You should let the garbage collector determine the appropriate time to perform a collection.

Code Access Security (CAS)

The CLR provides code access security (CAS) as a defensive measure against malicious code. CAS helps to ensure that assemblies have been granted sufficient permissions to be able to perform their work. For example, when code within an assembly attempts to call unmanaged code, CAS runs a security check to ensure that the assembly has been granted the UnmanagedCode permission.

CAS carries out security checks at run time. The checks involve a stack walk to ensure that each method in the current call stack has sufficient rights to perform the requested operation. The stack-walking procedure involves an expensive set of operations. However, the stack walk is important because it protects against luring attacks, in which malicious code in an untrusted assembly coerces code in a trusted assembly to perform sensitive operations on its behalf.

Consider the following measures to improve the performance of calling unmanaged code:

Consider using SuppressUnmanagedCode for performance-critical trusted scenarios
Consider using TLBIMP /unsafe for performance-critical trusted scenarios

Caution These performance optimizations introduce a security risk. To reduce risk, review your APIs. Make sure that you do not expose unmanaged resources to third-party callers and that your code is not susceptible to luring attacks.

Consider Using SuppressUnmanagedCode for Performance-Critical Trusted Scenarios

When designing APIs that do not expose sensitive resources or do not perform security-sensitive operations based on user input, use the SuppressUnmanagedCodeSecurity attribute to eliminate the stack walk associated with the method call. For example:

// in C#
// Requires: using System.Runtime.InteropServices; using System.Security;
[DllImport("kernel32.dll"), SuppressUnmanagedCodeSecurity]
public static extern bool Beep(int frequency, int duration);

Use this technique only for performance-critical code in trusted scenarios. Perform thorough code reviews of such APIs to ensure that they are not susceptible to luring attacks.
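One way to limit the exposure described above is to keep the attributed declaration private and route all callers through a wrapper that validates its input. The sketch below illustrates that idea under some assumptions: Beeper, UnsafeNativeMethods, IsValidRequest, and the MaxDuration cap are hypothetical choices made for this example, while the frequency limits follow the documented range of the Win32 Beep function. Code access security applies to the .NET Framework; the native call itself works only on Windows.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;

public static class Beeper
{
    // Documented Win32 range for the Beep frequency argument, in hertz.
    public const int MinFrequency = 37;
    public const int MaxFrequency = 32767;

    // Hypothetical cap chosen for this sketch, in milliseconds.
    public const int MaxDuration = 5000;

    // Kept private so that only the validating wrapper below can reach it.
    [SuppressUnmanagedCodeSecurity]
    private static class UnsafeNativeMethods
    {
        [DllImport("kernel32.dll", SetLastError = true)]
        [return: MarshalAs(UnmanagedType.Bool)]
        internal static extern bool Beep(uint frequency, uint duration);
    }

    // Returns true only for values the wrapper is willing to pass to native code.
    public static bool IsValidRequest(int frequency, int duration)
    {
        return frequency >= MinFrequency && frequency <= MaxFrequency
            && duration > 0 && duration <= MaxDuration;
    }

    // Validates the arguments, then makes the interop call without a full stack walk.
    public static bool Play(int frequency, int duration)
    {
        if (!IsValidRequest(frequency, duration))
        {
            throw new ArgumentOutOfRangeException("frequency",
                "Frequency or duration is outside the permitted range.");
        }
        return UnsafeNativeMethods.Beep((uint)frequency, (uint)duration);
    }
}
```

Because the P/Invoke declaration is not public, code in other assemblies can reach it only through Play, which narrows what a luring attack could ask the trusted assembly to do.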

More Information

For more information, see "Use SuppressUnmanagedCodeSecurity with Caution" in Chapter 8, "Code Access Security in Practice," in Improving Web Application Security: Threats and Countermeasures on MSDN at https://msdn.microsoft.com/en-us/library/ms994921.aspx.

Consider Using TLBIMP /unsafe for Performance-Critical Trusted Scenarios

You can disable the full CAS stack walk for the unmanaged code permission by building interop assemblies with the TLBIMP /unsafe switch. This switch instructs TLBIMP to generate RCW code that performs link demands, rather than full demands for the unmanaged code permission. The /unsafe switch causes native method declarations to be decorated with SuppressUnmanagedCodeSecurityAttribute, which checks only the immediate caller when an interop call is made.

This technique results in faster calls between managed code and the COM objects created from the associated COM DLL. Use of this command-line switch is shown here.

C:\>tlbimp mycomponent.dll /out:UnSafe_MyComponent.dll /unsafe

Note If your assembly causes stack walks for other types of permission, such stack walks are not suppressed by using the /unsafe switch. Using this switch only suppresses the full stack walk for the unmanaged code permission.

Perform thorough code reviews of such APIs to ensure that they are not susceptible to luring attacks.

Threading

When you call a COM object from managed code, if the COM object's apartment model is incompatible with that of the calling thread, a thread switch occurs and the call is marshaled to the correct apartment. To optimize performance, you need to ensure that the apartment model of the calling thread is compatible.

To keep the overhead of thread switches to a minimum, follow these guidelines:

Reduce or avoid cross-apartment calls.
Use ASPCOMPAT when you call STA objects from ASP.NET.
Use MTAThread when you call free-threaded objects.
Avoid thread switches by using Neutral apartment COM components.

Reduce or Avoid Cross-Apartment Calls

When you call a COM object from a managed application, make sure that the managed code's apartment matches the COM object's apartment type. By using matching apartments, you avoid the thread switch associated with cross-apartment calls.

You should create apartment-threaded objects on a managed thread with an apartment type of STA. You should create free-threaded objects on a managed thread with an apartment type of multithreaded apartment (MTA). Objects marked Both can run on either STA or MTA without penalty. Table 7.3 shows the relationship between the component threading model and an unmanaged thread's apartment type.

Table 7.3: Threading Model and Thread Apartment Type

Component Threading Model	Unmanaged Thread's Apartment Type
Single*	STA
Apartment	STA
Both	Either**
Neutral	Either**
Free	MTA

*Avoid this where possible. A thread switch may still be necessary if your STA thread is not the Main STA (the first STA thread in the process). In addition, you create contention problems if multiple client threads use single-threaded objects in the same process, because the client threads all share this main STA.

** MTA is recommended. Otherwise, problems may occur. For example, an object's finalizer can block while it waits for STA threads, deadlocks can occur, and so on.

The way in which you set the managed thread's apartment type depends on the type of managed application.
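For threads that your own code creates, the following sketch shows one way to pick and set the apartment type. ApartmentHelper and its members are hypothetical names, and the mapping simply restates Table 7.3, choosing MTA for Both and Neutral as the table's footnote recommends. The sketch assumes the .NET Framework 2.0 or later, where Thread.SetApartmentState is available; on version 1.1 you would assign the Thread.ApartmentState property instead. Apartments are a Windows concept, so the Run method is meaningful only on Windows.

```csharp
using System;
using System.Threading;

// Hypothetical helper; the names are illustrative only.
public static class ApartmentHelper
{
    // Maps a COM threading model name to the apartment type suggested by Table 7.3.
    public static ApartmentState RecommendedApartment(string threadingModel)
    {
        switch (threadingModel)
        {
            case "Single":
            case "Apartment":
                return ApartmentState.STA;
            case "Both":
            case "Neutral":   // either apartment works; MTA is the recommended choice
            case "Free":
                return ApartmentState.MTA;
            default:
                throw new ArgumentException("Unknown threading model: " + threadingModel);
        }
    }

    // Runs the work on a new thread in the requested apartment and waits for it to finish.
    public static void Run(ApartmentState state, ThreadStart work)
    {
        Thread thread = new Thread(work);
        thread.SetApartmentState(state);   // must be set before the thread starts
        thread.Start();
        thread.Join();
    }
}
```

For example, code that needs an apartment-threaded component could call ApartmentHelper.Run(ApartmentHelper.RecommendedApartment("Apartment"), work), where work creates and calls the COM object so that both happen on the matching thread.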

Use ASPCOMPAT When You Call STA Objects from ASP.NET

All .NET threads are MTA threads by default. Therefore, cross-apartment calls and thread switches do not occur when you create and call COM objects with an apartment type of Free, Both, or Neutral. However, cross-apartment calls and thread switches occur when you create and call apartment-threaded COM objects. All objects created with Visual Basic 6 and earlier are apartment-threaded. To call an apartment-threaded COM object from an ASP.NET application without a cross-apartment call and a thread switch, mark your ASP.NET pages with the ASPCOMPAT attribute as follows so that the ASP.NET runtime will process your pages using STA threads.

<%@Page language="vb" aspcompat="true" %>

Note that you should not instantiate components in the page constructor, because the constructor executes on an MTA thread before the request is scheduled onto a thread from the STA thread pool. Therefore, instantiating components in the page constructor still incurs an apartment switch along with a thread switch. Instead, instantiate them in event handlers such as Page_Load or Page_Init, which execute on a thread from the STA thread pool.

More Information

For more information, see Knowledge Base article 308095, "PRB: Creating STA Components in the Constructor in ASP.NET ASPCOMPAT Mode Negatively Affects Performance," at https://support.microsoft.com/default.aspx?scid=kb;en-us;308095.

Calling Apartment-Model Objects from Web Services

Web services created using ASP.NET use MTA threads exclusively, and you cannot change that behavior. That means that using apartment-threaded COM objects, such as Visual Basic 6 components, from an ASP.NET Web service always involves cross-apartment calls and thread-switching. Therefore, if possible, you should avoid using apartment-threaded COM objects from Web services created using ASP.NET. The runtime has been optimized in case you must make cross-apartment calls, but they incur significantly more processing overhead than intra-apartment calls.

Use MTAThread When You Call Free-Threaded Objects

Windows Forms applications use STA threads by default. Therefore, no thread switches occur when you create and call methods on apartment-threaded COM objects. However, a thread switch occurs when you call free-threaded COM objects. To address this problem, you can switch the default thread type for a Windows Forms application by using the MTAThread attribute on the entry point method Main as follows.

[System.MTAThread]
static void Main()
{
  Application.Run(new Form1());
}

Avoid Thread Switches by Using Neutral Apartment COM Components

If you are developing a COM component with C++ that you plan to call from managed code, you should try to create a COM component marked as Neutral. Thread-neutral COM objects always use the caller's thread to execute. A lightweight proxy is used, and no thread switching occurs.

Monitoring Interop Performance

You should monitor interop performance to determine the exact impact it has on the performance of your application. To monitor performance, you can use performance counters and the CLR Spy tool.

Use performance counters for P/Invoke and COM Interop.
Use CLR Spy to identify Interop problems.

Use Performance Counters for P/Invoke and COM Interop

You can use the following performance counters from the .NET CLR Interop performance object:

# of CCWs. Indicates the number of COM callable wrappers (CCWs) that are referenced by unmanaged COM code. CCWs are used as proxy objects when unmanaged COM code calls managed .NET objects.
# of marshalling. Indicates how many times P/Invoke and COM interop data marshaling has occurred and counts boundary crossings that occur in both directions. This counter does not count occurrences of marshaling that become inlined. Stubs perform the marshaling, and at times the code is short enough to be inlined.
# of Stubs. Indicates the current number of stubs created by the CLR. Stubs perform data marshaling for P/Invoke and COM interop calls.
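These counters can also be read from code. The following is a minimal sketch that assumes the .NET Framework on Windows, where the System.Diagnostics.PerformanceCounter class is available. InteropCounters and Print are hypothetical names, and the instance name passed in is assumed to match the process name as it appears under the .NET CLR Interop performance object.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; the names are illustrative only.
public static class InteropCounters
{
    public const string Category = ".NET CLR Interop";

    public static readonly string[] CounterNames =
        { "# of CCWs", "# of marshalling", "# of Stubs" };

    // Writes the current raw value of each interop counter for one instance.
    public static void Print(string instanceName)
    {
        foreach (string name in CounterNames)
        {
            // Open each counter read-only and dispose of it after reading.
            using (PerformanceCounter counter =
                new PerformanceCounter(Category, name, instanceName, true))
            {
                Console.WriteLine("{0}: {1}", name, counter.RawValue);
            }
        }
    }
}
```

A process could report on itself by calling InteropCounters.Print(Process.GetCurrentProcess().ProcessName). When several processes share a name, the instance may carry a numeric suffix, so check the instance list in Performance Monitor first.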

Use CLR Spy to Identify Interop Problems

The CLR Spy tool is also useful for monitoring the performance of your interop code in a managed application. CLR Spy was available for download from the GotDotNet site; the content link is no longer available (original URL: https://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=c7b955c7-231a-406c-9fa5-ad09ef3bb37f).

Summary

Your choices for writing interop code include P/Invoke, IJW, and COM interop. P/Invoke provides a way to access functions within standard Windows DLLs from managed languages such as C# and Visual Basic .NET. IJW allows C++ programmers to access functions within standard Windows DLLs with a greater degree of control. Although IJW requires more code due to the need for manual marshaling, it also provides the greatest opportunities for optimizing marshaling in more complex scenarios. COM interop allows you to access COM components from managed code.

This chapter has introduced you to the major areas that you need to consider to optimize your application's use of interop. It has also shown you specific coding techniques that you should use to boost the performance of your interop code.

Additional Resources

For more information about interop performance, see the following resources:

For a printable checklist, see "Checklist: Interop Performance" in the "Checklists" section of this guide.
Chapter 4 — Architecture and Design Review of a .NET Application for Performance and Scalability
Chapter 13 — Code Review: .NET Application Performance
Chapter 14 — Improving SQL Server Performance
Chapter 15 — Measuring .NET Application Performance
Chapter 16 — Testing .NET Application Performance
Chapter 17 — Tuning .NET Application Performance
MSDN article, "An Overview of Managed/Unmanaged Code Interoperability," at https://msdn.microsoft.com/en-us/library/ms973872.aspx.
"Interoperating with Unmanaged Code" in .NET Framework Developer's Guide on MSDN at https://msdn.microsoft.com/en-us/library/sd10k43k(VS.71).aspx.
Microsoft .NET/COM Migration and Interoperability on MSDN at https://msdn.microsoft.com/en-us/library/ms978506.aspx.
Knowledge Base article 816152, "HOW TO: Use COM Components in Visual Studio .NET with Visual C# .NET," at https://support.microsoft.com/default.aspx?scid=kb;en-us;816152.
Knowledge Base article 306801, "HOW TO: Interoperate with a COM Server That Returns Conformant Arrays by using Visual Basic.NET," at https://support.microsoft.com/default.aspx?scid=kb;en-us;306801.
For more information on performing a code review of your interop code, see "Interop" in Chapter 13, "Code Review: .NET Application Performance."
