2012

Volume 27

Windows Runtime and the CLR - Underneath the Hood with .NET and the Windows Runtime

By Shawn Farkas | 2012

The Windows Runtime (WinRT) provides a large set of new APIs to Windows Experience developers. The CLR 4.5, which ships as part of the Microsoft .NET Framework 4.5 in Windows 8, enables developers writing managed code to use the APIs in a natural way, just as though they were another class library. You can add a reference to the Windows Metadata (WinMD) file that defines the APIs you want to call and then call into them just as with a standard managed API. Visual Studio automatically adds a reference to the built-in WinRT APIs to new Windows UI projects, so your app can simply begin using this new API set.

Under the hood, the CLR provides the infrastructure for managed code to consume WinMD files and transition between managed code and the Windows Runtime. In this article, I’ll show some of these details. You’ll come away with a better understanding of what occurs behind the scenes when your managed program calls a WinRT API.

Consuming WinMD Files from Managed Code

WinRT APIs are defined in WinMD files, which are encoded using the file format described in ECMA-335 (bit.ly/sLILI). Although WinMD files and .NET Framework assemblies share a common encoding, they’re not the same. One of the main differences in the metadata stems from the fact that the WinRT type system is independent of the .NET type system.

Programs such as the C# compiler and Visual Studio use the CLR metadata APIs (such as IMetaDataImport) to read .NET Framework assembly metadata and can now read metadata from WinMD files as well. Because the metadata is not exactly the same as a .NET assembly, the CLR metadata reader inserts an adapter between the metadata APIs and the WinMD file being read. This enables WinMD files to be read as though they were .NET assemblies (see Figure 1).

Figure 1 The CLR Inserts a Metadata Adapter Between WinMD Files and the Public Metadata Interface

Running ILDasm helps you understand the modifications that the CLR metadata adapter performs on a WinMD file. By default, ILDasm shows the contents of a WinMD file in its raw form, as it’s encoded on disk without the CLR metadata adapter. However, if you pass ILDasm the /project command-line parameter, it enables the metadata adapter, and you can see the metadata as the CLR and managed tools will read it.

By running copies of ILDasm side by side—one with the /project parameter and one without—you can easily explore the changes that the CLR metadata adapter makes to a WinMD file.

The Windows Runtime and .NET Type Systems

One of the major operations the metadata adapter performs is to merge the WinRT and .NET type systems. At a high level, five different categories of WinRT types can appear in a WinMD file and need to be considered by the CLR. These are listed in Figure 2. Let’s look at each category in more detail.

Figure 2 WinRT Types for a WinMD File

Category | Examples
Standard WinRT types | Windows.Foundation.Collections.PropertySet, Windows.Networking.Sockets.DatagramSocket
Primitive types | Byte, Int32, String, Object
Projected types | Windows.Foundation.Uri, Windows.Foundation.DateTime
Projected interfaces | Windows.Foundation.Collections.IVector<T>, Windows.Foundation.IClosable
Types with .NET helpers | Windows.Storage.Streams.IInputStream, Windows.Foundation.IAsyncInfo

Standard WinRT Types While the CLR has special support for many categories of types exposed by the Windows Runtime, the vast majority of WinRT types are not treated specially by the CLR at all. Instead, these types appear to .NET developers unmodified, and they can be used as a large class library to enable writing Windows Store applications.

Primitive Types This set of primitive types is encoded into a WinMD file using the same ELEMENT_TYPE enumeration that .NET assemblies use. The CLR automatically interprets these primitive types as though they were the .NET equivalents.

For the most part, treating WinRT primitive types as .NET primitive types just works. For instance, a 32-bit integer has the same bit pattern in the Windows Runtime as it does in .NET, so the CLR can treat a WinRT Int32 as a .NET System.Int32 without any trouble. But two notable exceptions are strings and objects.

In the Windows Runtime, strings are represented with the HSTRING data type, which is not the same as a .NET System.String. Similarly, ELEMENT_TYPE_OBJECT means System.Object to .NET, while it means IInspectable* to the Windows Runtime. For both strings and objects, the CLR needs to marshal objects at run time to convert between the WinRT and .NET representations of the types. You’ll see how this marshaling works later in this article.

Projected Types There are some existing fundamental .NET types that have equivalents in the WinRT type system. For example, the Windows Runtime defines a TimeSpan structure and a Uri class, both of which have corresponding types in the .NET Framework.

To avoid forcing .NET developers to convert back and forth between these fundamental data types, the CLR projects the WinRT version to its .NET equivalent. These projections are effectively merge points that the CLR inserts between the .NET and WinRT type systems.

For example, the Windows Runtime SyndicationClient.RetrieveFeedAsync API takes a WinRT Uri as its parameter. Instead of requiring .NET developers to manually create a new Windows.Foundation.Uri instance to pass to this API, the CLR projects the type as a System.Uri, which lets .NET developers use the type they’re more familiar with.

Another example of a projection is the Windows.Foundation.HResult structure, which is projected by the CLR to the System.Exception type. In .NET, developers are used to seeing error information conveyed as an exception rather than as a failure HRESULT, so having a WinRT API such as IAsyncInfo.ErrorCode express error information as an HResult structure won’t feel natural. Instead, the CLR projects HResult to Exception, which makes a WinRT API such as IAsyncInfo.ErrorCode more usable for .NET developers. Here’s an example of the IAsyncInfo ErrorCode property encoded in Windows.winmd:

.class interface public windowsruntime IAsyncInfo
{
  .method public abstract virtual
    instance valuetype Windows.Foundation.HResult
    get_ErrorCode()
}

And here’s the IAsyncInfo ErrorCode property after the CLR projection:

.class interface public windowsruntime IAsyncInfo
{
  .method public abstract virtual
    instance class [System.Runtime]System.Exception
    get_ErrorCode()
}

Projected Interfaces The Windows Runtime also provides a set of fundamental interfaces that have .NET equivalents. The CLR performs type projections on these interfaces as well, again merging the type systems at these fundamental points.

The most common examples of projected interfaces are the WinRT collection interfaces, such as IVector<T>, IIterable<T> and IMap<K,V>. Developers who use .NET are familiar with collection interfaces such as IList<T>, IEnumerable<T> and IDictionary<K,V>. The CLR projects the WinRT collection interfaces to their .NET equivalents and also hides the WinRT interfaces so that developers don’t have to deal with two functionally equivalent sets of types that do the same thing.

In addition to projecting these types when they appear as parameters and return types of methods, the CLR also must project these interfaces when they appear in the interface implementation list of a type. For example, the WinRT PropertySet type implements the WinRT IMap<string, object> interface. The CLR, however, will project PropertySet as a type that implements IDictionary<string, object>. When performing this projection, the members of PropertySet that are used to implement IMap<string, object> are hidden. Instead, .NET developers access PropertySet through corresponding IDictionary<string, object> methods. Here’s a partial view of PropertySet as encoded in Windows.winmd:

.class public windowsruntime PropertySet
  implements IPropertySet,
    class IMap`2<string,object>,
    class IIterable`1<class IKeyValuePair`2<string,object> >
{
  .method public instance uint32 get_Size()
  {
    .override  method instance uint32 class 
      IMap`2<string,object>::get_Size()
  }
}

And here’s a partial view of PropertySet after the CLR projection: 

.class public windowsruntime PropertySet
  implements IPropertySet,
    class IDictionary`2<string,object>,
    class IEnumerable`1<class KeyValuePair`2<string,object> >
{
  .method private instance uint32 get_Size()
  {
  }
}

Notice that three type projections occur: IMap<string,object> to IDictionary<string,object>, IKeyValuePair<string,object> to KeyValuePair<string, object>, and IIterable<IKeyValuePair> to IEnumerable<KeyValuePair>. Also notice that the get_Size method from IMap is hidden.

Types with .NET Framework Helpers The Windows Runtime has several types that don’t have a full merge point into the .NET type system but are important enough to most applications that the .NET Framework provides helper methods to work with them. Two of the best examples are the WinRT stream and async interfaces.

Although the CLR does not project Windows.Storage.Streams.IRandomAccessStream to System.Stream, it does provide a set of extension methods for IRandomAccessStream that allows your code to treat these WinRT streams as though they were .NET streams. For example, you can easily read a WinRT stream with the .NET StreamReader by calling the OpenStreamForReadAsync extension method.

The Windows Runtime provides a set of interfaces representing asynchronous operations, such as the IAsyncInfo interface. In the .NET Framework 4.5, there’s built-in support for awaiting asynchronous operations, which developers want to use with WinRT APIs in the same way they do for .NET APIs.

To enable this, the .NET Framework ships with a set of GetAwaiter extension methods for the WinRT async interfaces. These methods are used by the C# and Visual Basic compilers to enable awaiting WinRT asynchronous operations. Here’s an example:

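using System.IO;
using System.Threading.Tasks;
using Windows.Storage;

private async Task<string> ReadFileAsync(StorageFolder parentFolder,
  string fileName)
{
  // OpenStreamForReadAsync is a .NET extension method that returns a System.IO.Stream
  using (Stream stream = await parentFolder.OpenStreamForReadAsync(fileName))
  using (StreamReader reader = new StreamReader(stream))
  {
    return await reader.ReadToEndAsync();
  }
}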
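
To illustrate how an extension method can make a non-Task type awaitable, here is a minimal, self-contained sketch. It is not the framework's implementation: CallbackOperation<T>, WhenDone and AwaitDemo are hypothetical names invented for this example, standing in for a WinRT-style operation that reports completion through a callback.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical stand-in for a callback-based async operation (not a WinRT type).
public sealed class CallbackOperation<T>
{
    private readonly Task<T> _work;

    public CallbackOperation(Func<T> compute)
    {
        _work = Task.Run(compute);
    }

    // Invokes exactly one of the callbacks once the work has finished.
    public void WhenDone(Action<T> onResult, Action<Exception> onError)
    {
        _work.ContinueWith(t =>
        {
            if (t.IsFaulted)
                onError(t.Exception.InnerException);
            else
                onResult(t.Result);
        });
    }
}

public static class CallbackOperationExtensions
{
    // The compiler looks for a GetAwaiter method (instance or extension)
    // on the awaited expression; this one bridges the callback to a Task.
    public static TaskAwaiter<T> GetAwaiter<T>(this CallbackOperation<T> operation)
    {
        var tcs = new TaskCompletionSource<T>();
        operation.WhenDone(tcs.SetResult, tcs.SetException);
        return tcs.Task.GetAwaiter();
    }
}

public static class AwaitDemo
{
    public static async Task<int> ComputeAsync()
    {
        // Awaiting the custom type compiles only because of the extension above.
        return await new CallbackOperation<int>(() => 6 * 7);
    }
}
```

Because the bridge is an extension method, the awaited type itself needs no changes, which is the property that lets the framework add await support to interfaces it doesn't own.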

Transitioning Between the .NET Framework and the Windows Runtime

The CLR provides a mechanism for managed code to seamlessly call WinRT APIs, and for the Windows Runtime to call back into managed code.

At its lowest level, the Windows Runtime is built on top of COM concepts, so it’s no surprise that the CLR support for calling WinRT APIs is built on top of existing COM interop infrastructure.

One important difference between WinRT interop and COM interop is how much less configuration you have to deal with in the Windows Runtime. WinMD files have rich metadata describing all of the APIs they’re exposing with a well-defined mapping to the .NET type system, so there’s no need to use any MarshalAs attributes in managed code. Similarly, because Windows 8 ships with WinMD files for its WinRT APIs, you don’t need to have a primary interop assembly bundled with your application. Instead, the CLR uses the in-box WinMD files to figure out everything it needs to know about how to call WinRT APIs.

These WinMD files provide the managed type definitions that are used at run time to allow managed developers access to the Windows Runtime. Although the APIs that the CLR reads out of a WinMD file contain a method definition that’s formatted to be easily used from managed code, the underlying WinRT API uses a different API signature (sometimes referred to as the application binary interface, or ABI, signature). One example of a difference between the API and ABI signatures is that, like standard COM, WinRT APIs return HRESULTs, and the return value of a WinRT API is actually an output parameter in the ABI signature. I’ll show an example of how a managed method signature is transformed into a WinRT ABI signature when I look at how the CLR calls a WinRT API later in this article.

Runtime Callable Wrappers and COM Callable Wrappers

When a WinRT object enters the CLR, it needs to be callable as if it were a .NET object. To make this happen, the CLR wraps each WinRT object in a runtime callable wrapper (RCW). The RCW is what managed code interacts with, and is the interface between your code and the WinRT object that your code is using.

Conversely, when managed objects are used from the Windows Runtime, they need to be called as if they were WinRT objects. In this case, managed objects are wrapped in a COM callable wrapper (CCW) when they’re sent to the Windows Runtime. Because the Windows Runtime uses the same infrastructure as COM, it can interact with CCWs to access functionality on the managed object (see Figure 3).

Figure 3 Using Wrappers for Windows Runtime and Managed Objects

Marshaling Stubs

When managed code transitions across any interop boundary, including WinRT boundaries, several things must occur:

  1. Convert managed input parameters into WinRT equivalents, including building CCWs for managed objects.
  2. Find the interface that implements the WinRT method being called from the RCW that the method is being called on.
  3. Call into the WinRT method.
  4. Convert WinRT output parameters (including return values) into managed equivalents.
  5. Convert any failure HRESULTs from the WinRT API into a managed exception.

These operations occur in a marshaling stub, which the CLR generates on your program’s behalf. The marshaling stubs on an RCW are what managed code actually calls before transitioning into a WinRT API. Similarly, the Windows Runtime calls into CLR-generated marshaling stubs on a CCW when it transitions into managed code.

Marshaling stubs provide the bridge that spans the gap between the Windows Runtime and the .NET Framework. Understanding how they work will help you gain a deeper understanding of what happens when your program calls into the Windows Runtime.

An Example Call

Imagine a WinRT API that takes a list of strings and concatenates them, with a separator string between each element. This API might have a managed signature such as:

public string Join(IEnumerable<string> list, string separator)

The CLR has to call the method as it’s defined in the ABI, so it needs to figure out the ABI signature of the method. Thankfully, a set of deterministic transformations can be applied to unambiguously get an ABI signature given an API signature. The first transformation is to replace the projected data types with their WinRT equivalents, which returns the API to the form in which it’s defined in the WinMD file before the metadata adapter loaded it. In this case, IEnumerable<T> is actually a projection of IIterable<T>, so the WinMD view of this function is actually:

public string Join(IIterable<string> list, string separator)

WinRT strings are stored in an HSTRING data type, so to the Windows Runtime, this function actually looks like:

public HSTRING Join(IIterable<HSTRING> list, HSTRING separator)

At the ABI layer, where the call actually occurs, WinRT APIs have HRESULT return values, and the return value from their signature is an output parameter. Additionally, objects are pointers, so the ABI signature for this method would be:

HRESULT Join(__in IIterable<HSTRING>* list, HSTRING separator, __out HSTRING* retval)

All WinRT methods must be part of an interface that an object implements. Our Join method, for instance, might be part of an IConcatenation interface supported by a StringUtilities class. Before making a method call into Join, the CLR must obtain the IConcatenation interface pointer to make the call on.

The job of a marshaling stub is to convert from the original managed call on an RCW to the final WinRT call on a WinRT interface. In this case, the pseudo-code for the marshaling stub might look like Figure 4 (with cleanup calls omitted for clarity).

Figure 4 Example of a Marshaling Stub for Making a Call from the CLR to the Windows Runtime

public string Join(IEnumerable<string> list, string separator)
{
  // Convert the managed parameters to WinRT types
  CCW ccwList = GetCCW(list);
  IIterable<HSTRING>* pCcwIterable = ccwList.QueryInterface(IID_IIterable_HSTRING);
  HSTRING hstringSeparator = StringToHString(separator);
  // The object that managed code calls is actually an RCW
  RCW rcw = this;
  // You need to find the WinRT interface pointer for IConcatenation
  // implemented by the RCW in order to call its Join method
  IConcatenation* pConcat = null;
  HRESULT hrQI = rcw.QueryInterface(IID_IConcatenation, out pConcat);
  if (FAILED(hrQI))
  {
    // Most frequently this is an InvalidCastException due to the WinRT
    // object returning E_NOINTERFACE for the interface that contains
    // the method you're trying to call
    Exception qiError = GetExceptionForHR(hrQI);
    throw qiError;
  }
  // Call the real Join method
  HSTRING returnValue;
  HRESULT hrCall = pConcat->Join(pCcwIterable, hstringSeparator, &returnValue);
  // If the WinRT method fails, convert that failure to an exception
  if (FAILED(hrCall))
  {
    Exception callError = GetExceptionForHR(hrCall);
    throw callError;
  }
  // Convert the output parameters from WinRT types to .NET types
  return HStringToString(returnValue);
}

In this example, the first step is to convert the managed parameters from their managed representation to their WinRT representation. In this case, the code creates a CCW for the list parameter and converts the System.String parameter to an HSTRING.

The next step is to find the WinRT interface that supplies the implementation of Join. This occurs by issuing a QueryInterface call to the WinRT object that’s wrapped by the RCW that the managed code called Join on. The most common reason an InvalidCastException gets thrown from a WinRT method call is if this QueryInterface call fails. One reason this can happen is that the WinRT object doesn’t implement all the interfaces that the caller expected it to.

Now the real action occurs—the interop stub makes the actual call to the WinRT Join method, providing a location for it to store the logical return value HSTRING. If the WinRT method fails, it indicates this with a failure HRESULT, which the interop stub converts into an Exception and throws. This means that if your managed code sees an exception being thrown from a WinRT method call, it’s likely that the WinRT method being called returned a failure HRESULT and the CLR threw an exception to indicate that failure state to your code.
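
The HRESULT-to-exception step can be tried out in ordinary managed code. The sketch below is only an illustration, not the CLR's stub: AbiToManagedDemo and JoinAbi are hypothetical names, and JoinAbi merely imitates the ABI shape (HRESULT return value, logical result in an out parameter). Marshal.GetExceptionForHR is the public .NET API that maps an HRESULT to an exception object.

```csharp
using System;
using System.Runtime.InteropServices;

public static class AbiToManagedDemo
{
    // Well-known COM HRESULT values.
    public const int S_OK = 0;
    public const int E_NOINTERFACE = unchecked((int)0x80004002);
    public const int E_INVALIDARG = unchecked((int)0x80070057);

    // Hypothetical ABI-shaped method: it reports failure through the
    // returned HRESULT and hands back its result in an out parameter.
    public static int JoinAbi(string[] items, string separator, out string retval)
    {
        retval = null;
        if (items == null || separator == null)
            return E_INVALIDARG;
        retval = string.Join(separator, items);
        return S_OK;
    }

    // Managed-shaped wrapper: failure HRESULTs become exceptions and the
    // out parameter becomes the return value.
    public static string Join(string[] items, string separator)
    {
        string result;
        int hr = JoinAbi(items, separator, out result);
        if (hr < 0) // same test as the FAILED macro
            throw Marshal.GetExceptionForHR(hr);
        return result;
    }
}
```

With this sketch, AbiToManagedDemo.Join(new[] { "a", "b" }, ", ") returns "a, b", while passing a null array surfaces E_INVALIDARG as an ArgumentException. Passing E_NOINTERFACE to Marshal.GetExceptionForHR yields an InvalidCastException, which matches the failed QueryInterface case described earlier.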

The final step is to convert the output parameters from their WinRT representation to their .NET form. In this example, the logical return value is an output parameter of the Join call and needs to be converted from an HSTRING to a .NET String. This value can then be returned as the result of the stub.

Calling from the Windows Runtime into Managed Code

Calls that originate from the Windows Runtime and target managed code work in a similar manner. The CLR responds to the QueryInterface calls that the Windows Runtime Component makes against it with an interface that has a virtual function table that’s filled out with interop stub methods. These stubs perform the same function as the one I showed previously, but in the reverse direction.

Let’s consider the case of the Join API again, except this time assume it’s implemented in managed code and is being called into from a Windows Runtime Component. Pseudo-code for a stub that allows this transition to occur might look like Figure 5.

Figure 5 Example of a Marshaling Stub for Making a Call from the Windows Runtime to the CLR

HRESULT Join(__in IIterable<HSTRING>* list, 
  HSTRING separator, __out HSTRING* retval)
{
  *retval = null;
  // The object that native code is calling is actually a CCW
  CCW ccw = GetCCWFromComPointer(this);
  // Convert the WinRT parameters to managed types
  RCW rcwList = GetRCW(list);
  IEnumerable<string> managedList = (IEnumerable<string>)rcwList;
  string managedSeparator = HStringToString(separator);
  string result;
  try
  {
    // Call the managed Join implementation
    result = ccw.Join(managedList, managedSeparator);
  }
  catch (Exception e)
  {
    // The managed implementation threw an exception -
    // return that as a failure HRESULT
    return GetHRForException(e);
  }
  // Convert the output value from a managed type to a WinRT type
  *retval = StringToHString(result);
  return S_OK;
}

First, this code converts the input parameters from their WinRT data types into managed types. Assuming the input list is a WinRT object, the stub must get an RCW to represent that object to allow managed code to use it. The string value is simply converted from an HSTRING to a System.String.

Next, the call is made into the managed implementation of the Join method on the CCW. If this method throws an exception, the interop stub catches it and converts it to a failure HRESULT that’s returned to the WinRT caller. This explains why some exceptions thrown from managed code called by Windows Runtime Components don’t crash the process. If the Windows Runtime Component handles the failure HRESULT, that’s effectively the same as catching and handling the thrown exception.
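
The exception-to-HRESULT direction can be sketched in plain managed code as well. This is an assumption-laden illustration rather than the CLR's actual stub: ManagedToAbiDemo and JoinStub are hypothetical names, and the sketch reads the public Exception.HResult property where the pseudo-code above calls GetHRForException.

```csharp
using System;
using System.Collections.Generic;

public static class ManagedToAbiDemo
{
    public const int S_OK = 0;

    // Managed implementation; string.Join throws ArgumentNullException
    // when the list is null.
    public static string Join(IEnumerable<string> list, string separator)
    {
        return string.Join(separator, list);
    }

    // Hypothetical ABI-shaped entry point: no exception escapes to the
    // caller; a failure is reported only through the returned HRESULT.
    public static int JoinStub(IEnumerable<string> list, string separator,
        out string retval)
    {
        retval = null;
        try
        {
            retval = Join(list, separator);
            return S_OK;
        }
        catch (Exception e)
        {
            // Every .NET exception carries an HRESULT value.
            return e.HResult;
        }
    }
}
```

A caller that checks the returned value for a negative (failure) HRESULT and carries on has effectively handled the exception, which is why the process keeps running in that case.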

The final step is to convert the output parameter from its .NET data type to the equivalent WinRT data type, in this case converting the System.String to an HSTRING. The return value is then placed in the output parameter and a success HRESULT is returned.

Projected Interfaces

Earlier, I mentioned that the CLR will project some WinRT interfaces into equivalent .NET interfaces. For instance, IMap<K,V> is projected to IDictionary<K,V>. This means any WinRT map is accessible as a .NET dictionary, and vice versa. To enable this projection to work, another set of stubs is needed to implement the WinRT interface in terms of the .NET interface it’s projected to, and vice versa. For example, IDictionary<K,V> has a TryGetValue method, but IMap<K,V> doesn’t contain this method. To allow managed callers to use TryGetValue, the CLR provides a stub that implements this method in terms of methods that IMap does have. This might look similar to Figure 6.

Figure 6 Conceptual Implementation of IDictionary in Terms of IMap

bool TryGetValue(K key, out V value)
{
  // "this" is the IMap RCW
  IMap<K,V> rcw = this;
  // IMap provides a HasKey and Lookup function, so you can
  // implement TryGetValue in terms of those functions
  if (!rcw.HasKey(key))
  {
    // An out parameter must be assigned on every return path
    value = default(V);
    return false;
  }
  value = rcw.Lookup(key);
  return true;
}

Notice that to do its work, this conversion stub makes several calls to the underlying IMap implementation. For instance, let’s say you wrote the following bit of managed code to see whether a Windows.Foundation.Collections.PropertySet object contains the key “NYJ”:

 

object value;
if (propertySet.TryGetValue("NYJ", out value))
{
  // ...
}

As the TryGetValue call determines whether the property set contains the key, the call stack might look like Figure 7.

Figure 7 Call Stack for TryGetValue Call

Stack | Description
PropertySet::HasKey | WinRT PropertySet implementation
HasKey_Stub | Marshaling stub converting the dictionary stub’s HasKey call into a WinRT call
TryGetValue_Stub | Stub implementing IDictionary in terms of IMap
Application | Managed application code calling PropertySet.TryGetValue
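
The layering in this call stack can be reproduced with a small, self-contained sketch. The IMap<K,V> interface below is a hypothetical stand-in that declares only the two members the stub needs, and TracingMap and MapProjection are invented for this example. None of them are the real WinRT or CLR types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical subset of a map interface: just HasKey and Lookup.
public interface IMap<K, V>
{
    bool HasKey(K key);
    V Lookup(K key);
}

// In-memory map that records which of its members get called.
public sealed class TracingMap<K, V> : IMap<K, V>
{
    private readonly Dictionary<K, V> _items = new Dictionary<K, V>();
    public readonly List<string> Calls = new List<string>();

    public void Insert(K key, V value) { _items[key] = value; }

    public bool HasKey(K key)
    {
        Calls.Add("HasKey");
        return _items.ContainsKey(key);
    }

    public V Lookup(K key)
    {
        Calls.Add("Lookup");
        return _items[key];
    }
}

public static class MapProjection
{
    // Dictionary-style TryGetValue built only from HasKey and Lookup.
    public static bool TryGetValue<K, V>(this IMap<K, V> map, K key, out V value)
    {
        if (!map.HasKey(key))
        {
            value = default(V);
            return false;
        }
        value = map.Lookup(key);
        return true;
    }
}
```

After inserting a value under "NYJ" and calling map.TryGetValue("NYJ", out value), the Calls list holds "HasKey" followed by "Lookup", which shows that a single dictionary-style call fans out into more than one call on the underlying map.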

Wrapping Up

The CLR support for the Windows Runtime allows managed developers to call WinRT APIs defined in WinMD files as easily as they can call managed APIs defined in a standard .NET assembly. Under the hood, the CLR uses a metadata adapter to perform projections that help to merge the WinRT type system with the .NET type system. It also uses a set of interop stubs to allow .NET code to call WinRT methods, and vice versa. Taken together, these techniques make it easy for managed developers to call WinRT APIs from their Windows Store applications.


Shawn Farkas has worked on the CLR for 10 years and is currently the development lead in charge of CLR Windows Runtime projection and .NET interop. Prior to the Microsoft .NET Framework 4.5, he worked on the CLR security model. His blog can be found at blogs.msdn.com/shawnfa.

Thanks to the following technical experts for reviewing this article: Ryan Byington, Layla Driscoll and Yi Zhang