February 2011

Volume 26 Number 02

Data Contract Inheritance - Known Types and the Generic Resolver

By Juval Lowy | February 2011

Ever since the first release of Windows Communication Foundation (WCF), developers have had to deal with the hassles of data contract inheritance, a problem called known types. In this article I first explain the origin of the problem, discuss the available mitigations in the Microsoft .NET Framework 3.0 and the .NET Framework 4, and then present my technique that can eliminate the problem altogether. You’ll also see some advanced WCF programming techniques.

By Value vs. by Reference

In traditional object-oriented languages such as C++ and C#, a derived class maintains an is-a relationship with its base class. This means that given this declaration, every B object is also an A object:

class A {...}
class B : A {...}

Graphically, this looks like the Venn diagram in Figure 1, in which every B instance is also an A instance (but not every A is necessarily a B).

Figure 1 Is-A Relationship

From a traditional object-orientation domain-modeling perspective, the is-a relationship enables you to design your code against the base class while interacting with a subclass. This means you can evolve the modeling of domain entities over time while minimizing the impact on the application.

For example, consider a business contacts management application with this modeling of a base type called Contact and a derived class called Customer that specializes the contact by adding to it the attributes of a customer:

class Contact {
  public string FirstName;
  public string LastName;
}
class Customer : Contact {
  public int OrderNumber;
}

Any method in the application that’s written initially against the Contact type can accept Customer objects as well, as shown in Figure 2.

Figure 2 Interchanging Base Class and Sub Class References

interface IContactManager {
  void AddContact(Contact contact);
  Contact[] GetContacts();
}
class AddressBook : IContactManager {
  public void AddContact(Contact contact)
  {...}
  ...
}
IContactManager contacts = new AddressBook();
Contact  contact1 = new Contact();
Contact  contact2 = new Customer();
Customer customer = new Customer();
contacts.AddContact(contact1);
contacts.AddContact(contact2);
contacts.AddContact(customer);

The reason the code in Figure 2 works at all has to do with the way the compiler represents the object state in memory. To support the is-a relationship between a subclass and its base class, when allocating a new subclass instance the compiler first allocates the base class portion of the state of the object, then appends directly after it the subclass portion, as shown in Figure 3.

Figure 3 Object State Hierarchy in Memory

When a method that expects a reference to a Contact is actually given a reference to a Customer, it still works because the Customer reference is a reference to a Contact as well. 

Unfortunately, this intricate setup breaks when it comes to WCF. Unlike traditional object orientation or the classic CLR programming model, WCF passes all operation parameters by value, not by reference. Even though the code looks like the parameters are passed by reference (as in regular C#), the WCF proxy actually serializes the parameters into the message. The parameters are packaged in the WCF message and transferred to the service, where they are then deserialized to local references for the service operation to work with.

This is also what happens when the service operation returns results to the client: The results (or outgoing parameters, or exceptions) are first serialized into a reply message and then deserialized back on the client side.
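To make the by-value behavior concrete, here is a minimal sketch that stands in for the proxy and the service with a plain DataContractSerializer round trip through a MemoryStream. ByValueDemo and RoundTrip are illustrative names, not part of WCF, and the [DataMember] attributes are added here so the fields actually take part in the data contract.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Contact {
  [DataMember] public string FirstName;
  [DataMember] public string LastName;
}

public static class ByValueDemo {
  // Serialize into a buffer and deserialize again, roughly what happens
  // to a parameter on its way from the proxy to the service.
  public static Contact RoundTrip(Contact original) {
    DataContractSerializer serializer =
      new DataContractSerializer(typeof(Contact));
    using(MemoryStream stream = new MemoryStream()) {
      serializer.WriteObject(stream,original);
      stream.Position = 0;
      return (Contact)serializer.ReadObject(stream);
    }
  }
}
```

The object returned by RoundTrip carries the same state as the original but is a different instance, which is the essence of passing by value.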

The exact form of the serialization that takes place is usually a product of the data contract the service contract is written against. For example, consider these data contracts:

[DataContract]
class Contact {...}
[DataContract]
class Customer : Contact {...}

Using these data contracts, you can define this service contract:

[ServiceContract]
interface IContactManager {
  [OperationContract]
  void AddContact(Contact contact);
  [OperationContract]
  Contact[] GetContacts();
}

With multitier applications, marshaling the parameters by value works better than by reference because any layer in the architecture is at liberty to provide its own interpretation to the behavior behind the data contract. Marshaling by value also enables remote calls, interoperability, queued calls and long-running workflows. 

But unlike traditional object orientation, the service operation written against the Contact class can’t by default work with the Customer subclass. The reason is simple: If you do pass a subclass reference to a service operation that expects a base class reference, how would WCF know to serialize into the message the derived class portion?

As a result, given the definitions so far, this WCF code will fail:

class ContactManagerClient : ClientBase<IContactManager>,
  IContactManager {
  ...
}
IContactManager proxy = new ContactManagerClient();
Contact contact = new Customer();
// This will fail: 
proxy.AddContact(contact);
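
The same failure can be reproduced without a service. The sketch below assumes that a DataContractSerializer created for Contact is a fair stand-in for the formatter behind the AddContact operation; UnknownTypeDemo and CustomerIsRejected are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Contact {
  [DataMember] public string FirstName;
  [DataMember] public string LastName;
}
[DataContract]
public class Customer : Contact {
  [DataMember] public int OrderNumber;
}

public static class UnknownTypeDemo {
  // Returns true when a Customer passed where a Contact is declared
  // is rejected by the serializer.
  public static bool CustomerIsRejected() {
    DataContractSerializer serializer =
      new DataContractSerializer(typeof(Contact));
    Contact contact = new Customer { FirstName = "Ada", OrderNumber = 7 };
    using(MemoryStream stream = new MemoryStream()) {
      try {
        serializer.WriteObject(stream,contact);
        return false;
      }
      catch(SerializationException) {
        // Customer is not a known type of Contact
        return true;
      }
    }
  }
}
```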

The Known Type Crutches

With the .NET Framework 3.0, WCF was able to address the problem of substituting a base class reference with a subclass using the KnownTypeAttribute, defined as:

[AttributeUsage(AttributeTargets.Struct|AttributeTargets.Class,
  AllowMultiple = true)]
public sealed class KnownTypeAttribute : Attribute {
  public KnownTypeAttribute(Type type);
  //More members
}

The KnownType attribute allows you to designate acceptable subclasses for the data contract:

[DataContract]
[KnownType(typeof(Customer))]
class Contact {...}
[DataContract]
class Customer : Contact {...}

When the client passes a data contract that uses a known type declaration, the WCF message formatter tests the type (akin to using the is operator) and sees if it’s the expected known type. If so, it serializes the parameter as the subclass rather than the base class.

The KnownType attribute affects all contracts and operations using the base class, across all services and endpoints, allowing it to accept subclasses instead of base classes. In addition, it includes the subclass in the metadata so that the client will have its own definition of the subclass and will be able to pass the subclass instead of the base class.
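
As a rough illustration of the effect, the following sketch round-trips a Customer through a serializer declared for Contact. It assumes a standalone DataContractSerializer in place of a WCF endpoint; KnownTypeDemo and RoundTrip are hypothetical names.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
[KnownType(typeof(Customer))]
public class Contact {
  [DataMember] public string FirstName;
  [DataMember] public string LastName;
}
[DataContract]
public class Customer : Contact {
  [DataMember] public int OrderNumber;
}

public static class KnownTypeDemo {
  // The serializer is declared for Contact, yet it accepts a Customer
  // because Contact lists Customer as a known type.
  public static Contact RoundTrip(Contact original) {
    DataContractSerializer serializer =
      new DataContractSerializer(typeof(Contact));
    using(MemoryStream stream = new MemoryStream()) {
      serializer.WriteObject(stream,original);
      stream.Position = 0;
      return (Contact)serializer.ReadObject(stream);
    }
  }
}
```

The returned object is a Customer, including its OrderNumber, even though both sides only declared Contact.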

When multiple subclasses are expected, the developer must list all of them:

[DataContract]
[KnownType(typeof(Customer))]
[KnownType(typeof(Person))]
class Contact {...}
[DataContract]
class Person : Contact {...}

The WCF formatter uses reflection to collect all the known types of the data contracts, then examines the provided parameter to see if it’s of any of the known types.

Note that you must explicitly add all levels in the data contract class hierarchy. Adding a subclass doesn’t add its base classes:

[DataContract]
[KnownType(typeof(Customer))]
[KnownType(typeof(Person))]
class Contact {...}
[DataContract]
class Customer : Contact {...}
[DataContract]
class Person : Customer {...}

Because the KnownType attribute may be too broad in scope, WCF also provides ServiceKnownTypeAttribute, which you can apply on a specific operation or on a specific contract.
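
The two scopes might look like the declarations below. This is a sketch that assumes a reference to the System.ServiceModel assembly; the IEmployeeManager contract and the BadgeNumber member are hypothetical and exist only to show the contract-level form.

```csharp
using System;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class Contact {
  [DataMember] public string FirstName;
  [DataMember] public string LastName;
}
[DataContract]
public class Customer : Contact {
  [DataMember] public int OrderNumber;
}
[DataContract]
public class Employee : Contact {
  [DataMember] public int BadgeNumber; // hypothetical member
}

// Operation-level scope: only AddContact accepts a Customer
[ServiceContract]
public interface IContactManager {
  [OperationContract]
  [ServiceKnownType(typeof(Customer))]
  void AddContact(Contact contact);

  [OperationContract]
  Contact[] GetContacts();
}

// Contract-level scope: every operation on this contract accepts an Employee
[ServiceContract]
[ServiceKnownType(typeof(Employee))]
public interface IEmployeeManager {
  [OperationContract]
  void AddContact(Contact contact);

  [OperationContract]
  Contact[] GetContacts();
}
```

Note that Contact itself carries no KnownType attribute here, so the base class stays decoupled from its subclasses; the coupling moves to the service contract instead.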

Finally, in the .NET Framework 3.0, WCF also allowed listing the expected known types in the application config file in the system.runtime.serialization section. 

While using known types technically works just fine, you should feel some unease about it. In traditional object-oriented modeling you never want to couple the base class to any specific subclasses. The hallmark of a good base class is precisely that: a good base class is a good base for any possible subclass, and yet the known types issue makes it adequate only for subclasses it happens to know about. If you do all your modeling up-front when designing the system, that may not be a hindrance. In reality, over time, as the application evolves its modeling, you’ll encounter as-yet-unknown types that will force you to, at the very least, redeploy your application and, more likely, also modify your base classes.

Data Contract Resolvers

To alleviate the problem, in the .NET Framework 4 WCF introduced a way of resolving the known types at run time. This programmatic technique, called data contract resolvers, is the most powerful option because you can extend it to completely automate dealing with the known type issues. In essence, you’re given a chance to intercept the operation’s attempt to serialize and deserialize parameters and resolve the known types at run time both on the client and service sides.

The first step in implementing a programmatic resolution is to derive from the abstract class DataContractResolver, defined as:

public abstract class DataContractResolver {
  protected DataContractResolver();
  
  public abstract bool TryResolveType(
    Type type,Type declaredType,
    DataContractResolver knownTypeResolver, 
    out XmlDictionaryString typeName,
    out XmlDictionaryString typeNamespace);
  public abstract Type ResolveName(
    string typeName,string typeNamespace, 
    Type declaredType,
    DataContractResolver knownTypeResolver);
}

Your implementation of TryResolveType is called when WCF tries to serialize a type into a message and the type provided (the type parameter) is different from the type declared in the operation contract (the declaredType parameter). If you want to serialize the type, you need to provide some unique identifiers to serve as keys into a dictionary that maps identifiers to types. WCF will provide those keys during deserialization so that you can bind against that type.

Note that the namespace key can’t be an empty string or a null. While virtually any unique string value will do for the identifiers, I recommend simply using the CLR type name and namespace. Set the type name and namespace into the typeName and typeNamespace out parameters.

If you return true from TryResolveType, the type is considered resolved, as if you had applied the KnownType attribute. If you return false, WCF fails the call. Note that TryResolveType must resolve all known types, even those types that are decorated with the KnownType attribute or are listed in the config file. This presents a potential risk: It requires the resolver to be coupled to all known types in the application and will fail the operation call with other types that may come over time. It’s therefore preferable as a fallback contingency to try to resolve the type using the default known types resolver that WCF would’ve used if your resolver was not in use. This is exactly what the knownTypeResolver parameter is for. If your implementation of TryResolveType can’t resolve the type, it should delegate to knownTypeResolver.

ResolveName is called when WCF tries to deserialize a type out of a message and the type named in the message (the typeName and typeNamespace parameters) is different from the type declared in the operation contract (the declaredType parameter). In this case, WCF provides the type name and namespace identifiers so that you can map them back to a known type.

As an example, consider again these two data contracts:

[DataContract]
class Contact {...}
[DataContract]
class Customer : Contact {...}

Figure 4 lists a simple resolver for the Customer type.

Figure 4 The CustomerResolver

class CustomerResolver : DataContractResolver {
  string Namespace {
    get {
      return typeof(Customer).Namespace ?? "global";
    }   
  }
  string Name {
    get {
      return typeof(Customer).Name;
    }   
  }
  public override Type ResolveName(
    string typeName,string typeNamespace,
    Type declaredType,
    DataContractResolver knownTypeResolver) {
    if(typeName == Name && typeNamespace == Namespace) {
      return typeof(Customer);
    }
    else {
      return knownTypeResolver.ResolveName(
        typeName,typeNamespace,declaredType,null);
    }
  }
  public override bool TryResolveType(
    Type type,Type declaredType,
    DataContractResolver knownTypeResolver,
    out XmlDictionaryString typeName,
    out XmlDictionaryString typeNamespace) {
    if(type == typeof(Customer)) {
      XmlDictionary dictionary = new XmlDictionary();
      typeName      = dictionary.Add(Name);
      typeNamespace = dictionary.Add(Namespace);
      return true;
    }
    else {
      return knownTypeResolver.TryResolveType(
        type,declaredType,null,out typeName,out typeNamespace);
    }
  }
}
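
One way to try a resolver like this without hosting a service is to hand it to DataContractSerializer directly. The sketch below assumes .NET Framework 4.5 or later, where DataContractSerializerSettings exposes a DataContractResolver property; it restates the resolver of Figure 4 in compact form so the block is self-contained, and ResolverDemo and RoundTrip are hypothetical names.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

[DataContract]
public class Contact {
  [DataMember] public string FirstName;
}
[DataContract]
public class Customer : Contact {
  [DataMember] public int OrderNumber;
}

// Compact restatement of the resolver in Figure 4
public class CustomerResolver : DataContractResolver {
  static readonly string Name = typeof(Customer).Name;
  static readonly string Namespace = typeof(Customer).Namespace ?? "global";

  public override Type ResolveName(
    string typeName,string typeNamespace,
    Type declaredType,DataContractResolver knownTypeResolver) {
    if(typeName == Name && typeNamespace == Namespace) {
      return typeof(Customer);
    }
    // Fall back to the default known types resolution
    return knownTypeResolver.ResolveName(
      typeName,typeNamespace,declaredType,null);
  }
  public override bool TryResolveType(
    Type type,Type declaredType,
    DataContractResolver knownTypeResolver,
    out XmlDictionaryString typeName,
    out XmlDictionaryString typeNamespace) {
    if(type == typeof(Customer)) {
      XmlDictionary dictionary = new XmlDictionary();
      typeName      = dictionary.Add(Name);
      typeNamespace = dictionary.Add(Namespace);
      return true;
    }
    return knownTypeResolver.TryResolveType(
      type,declaredType,null,out typeName,out typeNamespace);
  }
}

public static class ResolverDemo {
  // Contact has no KnownType attribute; the resolver alone makes
  // Customer acceptable where Contact is declared.
  public static Contact RoundTrip(Contact original) {
    DataContractSerializerSettings settings =
      new DataContractSerializerSettings {
        DataContractResolver = new CustomerResolver()
      };
    DataContractSerializer serializer =
      new DataContractSerializer(typeof(Contact),settings);
    using(MemoryStream stream = new MemoryStream()) {
      serializer.WriteObject(stream,original);
      stream.Position = 0;
      return (Contact)serializer.ReadObject(stream);
    }
  }
}
```

TryResolveType runs during WriteObject and ResolveName during ReadObject, mirroring what happens on the sending and receiving sides of a WCF call.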

The resolver must be attached as a behavior for each operation on the proxy or the service endpoint. The ServiceEndpoint class has a property called Contract of the type ContractDescription:

public class ServiceEndpoint {
  public ContractDescription Contract
  {get;set;}
  // More members
}

ContractDescription has a collection of operation descriptions, with an instance of OperationDescription for every operation on the contract:

public class ContractDescription {
  public OperationDescriptionCollection Operations
  {get;}
  // More members
}
public class OperationDescriptionCollection : 
  Collection<OperationDescription>
{...}

Each OperationDescription has a collection of operation behaviors of the type IOperationBehavior:

public class OperationDescription {
  public KeyedByTypeCollection<IOperationBehavior> Behaviors
  {get;}
  // More members
}

In its collection of behaviors, every operation always has a behavior called DataContractSerializerOperationBehavior with a DataContractResolver property:

public class DataContractSerializerOperationBehavior : 
  IOperationBehavior,... {
  public DataContractResolver DataContractResolver
  {get;set;}
  // More members
}

The DataContractResolver property defaults to null, but you can set it to your custom resolver. To install a resolver on the host side, you must iterate over the collection of endpoints in the service description maintained by the host:

public class ServiceHost : ServiceHostBase {...}
public abstract class ServiceHostBase : ... {
  public ServiceDescription Description
  {get;}
  // More members
}
public class ServiceDescription {   
  public ServiceEndpointCollection Endpoints
  {get;}
  // More members
}
public class ServiceEndpointCollection : 
  Collection<ServiceEndpoint> {...}

Suppose you have the following service definition and are using the resolver in Figure 4:

[ServiceContract]
interface IContactManager {
  [OperationContract]
  void AddContact(Contact contact);
  ...
}
class AddressBookService : IContactManager {...}

Figure 5 shows how to install the resolver on the host for the AddressBookService.

Figure 5 Installing a Resolver on the Host

ServiceHost host = 
  new ServiceHost(typeof(AddressBookService));
foreach(ServiceEndpoint endpoint in 
  host.Description.Endpoints) {
  foreach(OperationDescription operation in 
    endpoint.Contract.Operations) {
    DataContractSerializerOperationBehavior behavior = 
      operation.Behaviors.Find<
        DataContractSerializerOperationBehavior>();
    behavior.DataContractResolver = new CustomerResolver();
  }
}
host.Open();

On the client side, you follow similar steps, except you need to set the resolver on the single endpoint of the proxy or the channel factory. For example, given this proxy class definition:

class ContactManagerClient : ClientBase<IContactManager>,IContactManager
{...}

Figure 6 shows how to install the resolver on the proxy in order to call the service of Figure 5 with a known type.

Figure 6 Installing a Resolver on the Proxy

ContactManagerClient proxy = new ContactManagerClient();
foreach(OperationDescription operation in 
  proxy.Endpoint.Contract.Operations) {
  DataContractSerializerOperationBehavior behavior = 
    operation.Behaviors.Find<
    DataContractSerializerOperationBehavior>();
   
  behavior.DataContractResolver = new CustomerResolver();
}
Customer customer = new Customer();
...
proxy.AddContact(customer);

The Generic Resolver

Writing and installing a resolver for each type is obviously a lot of work, requiring you to meticulously track all known types—something that’s error-prone and can quickly get out of hand in an evolving system. To automate implementing a resolver, I wrote the class GenericResolver, defined as:

public class GenericResolver : DataContractResolver {
  public Type[] KnownTypes
  {get;}
  public GenericResolver();
  public GenericResolver(Type[] typesToResolve);
  public static GenericResolver Merge(
    GenericResolver resolver1,
    GenericResolver resolver2);
}

GenericResolver offers two constructors. One constructor can accept an array of known types to resolve. The parameterless constructor will automatically add as known types all classes and structs in the calling assembly and all public classes and structs in assemblies referenced by the calling assembly. The parameterless constructor won’t add types originating in a .NET Framework-referenced assembly.
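
The reflection step is not listed in this article, so the following is only a sketch of how such a collection could work, not the actual GenericResolver code. TypeReflector, IsFrameworkAssembly and IsClassOrStruct are hypothetical names, the root assembly is passed in explicitly rather than discovered from the caller, and the name-prefix test for framework assemblies is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class TypeReflector {
  // Assumption: framework assemblies are recognized by name prefix
  static bool IsFrameworkAssembly(AssemblyName name) {
    string simpleName = name.Name ?? "";
    return simpleName == "mscorlib" || simpleName == "netstandard" ||
           simpleName == "System" ||
           simpleName.StartsWith("System.",StringComparison.Ordinal) ||
           simpleName.StartsWith("Microsoft.",StringComparison.Ordinal);
  }
  // Classes and structs only; enums are value types, so exclude them.
  // Open generic definitions are skipped as well.
  static bool IsClassOrStruct(Type type) {
    if(type.IsGenericTypeDefinition) {
      return false;
    }
    return type.IsClass || (type.IsValueType && type.IsEnum == false);
  }
  public static Type[] ReflectTypes(Assembly root) {
    List<Type> types = new List<Type>();
    // All classes and structs of the root assembly, public or not
    types.AddRange(root.GetTypes().Where(IsClassOrStruct));
    // Only the public classes and structs of referenced assemblies
    foreach(AssemblyName name in root.GetReferencedAssemblies()) {
      if(IsFrameworkAssembly(name)) {
        continue;
      }
      Assembly referenced = Assembly.Load(name);
      types.AddRange(referenced.GetTypes().Where(
        type => type.IsPublic && IsClassOrStruct(type)));
    }
    return types.Distinct().ToArray();
  }
}
```

An array produced this way could be passed to the GenericResolver constructor that accepts the types to resolve.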

In addition, GenericResolver offers the Merge static method that you can use to merge the known types of two resolvers, returning a GenericResolver that resolves the union of the two resolvers provided. Figure 7 shows the pertinent portion of GenericResolver without reflecting the types in the assemblies, which has nothing to do with WCF.

Figure 7 Implementing GenericResolver (Partial)

public class GenericResolver : DataContractResolver {
  const string DefaultNamespace = "global";
   
  readonly Dictionary<Type,Tuple<string,string>> m_TypeToNames;
  readonly Dictionary<string,Dictionary<string,Type>> m_NamesToType;
  public Type[] KnownTypes {
    get {
      return m_TypeToNames.Keys.ToArray();
    }
  }
  // Get all types in calling assembly and referenced assemblies
  static Type[] ReflectTypes() {...}
  public GenericResolver() : this(ReflectTypes()) {}
  public GenericResolver(Type[] typesToResolve) {
    m_TypeToNames = new Dictionary<Type,Tuple<string,string>>();
    m_NamesToType = new Dictionary<string,Dictionary<string,Type>>();
    foreach(Type type in typesToResolve) {
      string typeNamespace = GetNamespace(type);
      string typeName = GetName(type);
      m_TypeToNames[type] = new Tuple<string,string>(typeNamespace,typeName);
      if(m_NamesToType.ContainsKey(typeNamespace) == false) {
        m_NamesToType[typeNamespace] = new Dictionary<string,Type>();
      }
      m_NamesToType[typeNamespace][typeName] = type;
    }
  }
  static string GetNamespace(Type type) {
    return type.Namespace ?? DefaultNamespace;
  }
  static string GetName(Type type) {
    return type.Name;
  }
  public static GenericResolver Merge(
    GenericResolver resolver1, GenericResolver resolver2) {
    if(resolver1 == null) {
      return resolver2;
    }
    if(resolver2 == null) {
      return resolver1;
    }
    List<Type> types = new List<Type>();
    types.AddRange(resolver1.KnownTypes);
    types.AddRange(resolver2.KnownTypes);
    return new GenericResolver(types.ToArray());
  }
  public override Type ResolveName(
    string typeName,string typeNamespace,
    Type declaredType,
    DataContractResolver knownTypeResolver) {
    if(m_NamesToType.ContainsKey(typeNamespace)) {
      if(m_NamesToType[typeNamespace].ContainsKey(typeName)) {
        return m_NamesToType[typeNamespace][typeName];
      }
    }
    return knownTypeResolver.ResolveName(
      typeName,typeNamespace,declaredType,null);
  }
  public override bool TryResolveType(
    Type type,Type declaredType,
    DataContractResolver knownTypeResolver,
    out XmlDictionaryString typeName,
    out XmlDictionaryString typeNamespace) {
    if(m_TypeToNames.ContainsKey(type)) {
      XmlDictionary dictionary = new XmlDictionary();
      typeNamespace = dictionary.Add(m_TypeToNames[type].Item1);
      typeName      = dictionary.Add(m_TypeToNames[type].Item2);
      return true;
    }
    else {
      return knownTypeResolver.TryResolveType(
      type,declaredType,null,out typeName,
      out typeNamespace);
    }
  }
}

The most important members of GenericResolver are the m_TypeToNames and the m_NamesToType dictionaries. m_TypeToNames maps a type to a tuple of its namespace and name. m_NamesToType maps a type namespace and name to the actual type. The constructor that takes the array of types initializes those two dictionaries. The TryResolveType method uses the provided type as a key into the m_TypeToNames dictionary to read the type’s namespace and name. The ResolveName method uses the provided namespace and name as keys into the m_NamesToType dictionary to return the resolved type.

While you could use tedious code similar to Figure 5 and Figure 6 to install GenericResolver, it’s best to streamline it with extension methods. To that end, use my AddGenericResolver methods of GenericResolverInstaller, defined as:

public static class GenericResolverInstaller {
  public static void AddGenericResolver(
    this ServiceHost host, params Type[] typesToResolve);
  public static void AddGenericResolver<T>(
    this ClientBase<T> proxy, 
    params Type[] typesToResolve) where T : class;
  public static void AddGenericResolver<T>(
    this ChannelFactory<T> factory,
    params Type[] typesToResolve) where T : class;
}

The AddGenericResolver method accepts a params array of types, which means an open-ended, comma-separated list of types. If you don’t specify types, that will make AddGenericResolver add as known types all classes and structs in the calling assembly plus the public classes and structs in referenced assemblies. For example, consider these known types:

[DataContract]
class Contact {...}
[DataContract]
class Customer : Contact {...}
[DataContract]
class Employee : Contact {...}

Figure 8 shows several examples of using the AddGenericResolver extension method for these types.

Figure 8 Installing GenericResolver

// Host side
ServiceHost host1 = new ServiceHost(typeof(AddressBookService));
// Resolve all types in this and referenced assemblies
host1.AddGenericResolver();
host1.Open();
ServiceHost host2 = new ServiceHost(typeof(AddressBookService));
// Resolve only Customer and Employee
host2.AddGenericResolver(typeof(Customer),typeof(Employee));
host2.Open();
ServiceHost host3 = new ServiceHost(typeof(AddressBookService));
// Can call AddGenericResolver() multiple times
host3.AddGenericResolver(typeof(Customer));
host3.AddGenericResolver(typeof(Employee));
host3.Open();
// Client side
ContactManagerClient proxy = new ContactManagerClient();
// Resolve all types in this and referenced assemblies
proxy.AddGenericResolver();
Customer customer = new Customer();
...
proxy.AddContact(customer);

GenericResolverInstaller not only installs the GenericResolver, it also tries to merge it with the old generic resolver (if present). This means you can call the AddGenericResolver method multiple times. This is handy when adding bounded generic types:

[DataContract]
class Customer<T> : Contact {...}
ServiceHost host = new ServiceHost(typeof(AddressBookService));
// Add all non-generic known types
host.AddGenericResolver();
// Add the generic types 
host.AddGenericResolver(typeof(Customer<int>),typeof(Customer<string>));
host.Open();

Figure 9 shows a partial implementation of GenericResolverInstaller.

Figure 9 Implementing GenericResolverInstaller

public static class GenericResolverInstaller {
  public static void AddGenericResolver(
    this ServiceHost host, params Type[] typesToResolve) {
    foreach(ServiceEndpoint endpoint in 
      host.Description.Endpoints) {
      AddGenericResolver(endpoint,typesToResolve);
    }
  }
  static void AddGenericResolver(
    ServiceEndpoint endpoint,Type[] typesToResolve) {
    foreach(OperationDescription operation in 
      endpoint.Contract.Operations) {
      DataContractSerializerOperationBehavior behavior = 
        operation.Behaviors.Find<
        DataContractSerializerOperationBehavior>();
      GenericResolver newResolver;
      if(typesToResolve == null || 
        typesToResolve.Any() == false) {
        newResolver = new GenericResolver();
      }
      else {
        newResolver = new GenericResolver(typesToResolve);
      }
      GenericResolver oldResolver = 
        behavior.DataContractResolver as GenericResolver;
      behavior.DataContractResolver = 
        GenericResolver.Merge(oldResolver,newResolver);
    }
  }
}

If no types are provided, AddGenericResolver will use the parameterless constructor of GenericResolver. Otherwise, it will use only the specified types by calling the other constructor. Note the merging with the old resolver if present.

The Generic Resolver Attribute

If your service relies on the generic resolver by design, it’s better not to be at the mercy of the host and to declare your need for the generic resolver at design time. To that end, I wrote the GenericResolverBehaviorAttribute:

[AttributeUsage(AttributeTargets.Class)]
public class GenericResolverBehaviorAttribute : 
  Attribute,IServiceBehavior {
  void IServiceBehavior.Validate(
    ServiceDescription serviceDescription,
    ServiceHostBase serviceHostBase) {
    ServiceHost host = serviceHostBase as ServiceHost;
    host.AddGenericResolver();
  }
  // More members
}

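This concise attribute makes the service independent of the host: once the attribute is applied to the service class, the generic resolver is installed by whichever ServiceHost loads the service.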

GenericResolverBehaviorAttribute implements IServiceBehavior, which is a special WCF interface and is the most commonly used extension in WCF. When the host loads the service, the host calls the IServiceBehavior methods, specifically the Validate method, which lets the attribute interact with the host. In the case of GenericResolverBehaviorAttribute, it adds the generic resolver to the host.

And there you have it: a relatively simple and flexible way to bypass the hassles of data contract inheritance. Put this technique to work in your next WCF project.


Juval Lowy is a software architect with IDesign providing .NET and architecture training and consulting. This article contains excerpts from his recent book, “Programming WCF Services, 3rd Edition” (O'Reilly, 2010). He’s also the Microsoft Regional Director for the Silicon Valley. Contact Lowy at idesign.net.

Thanks to the following technical experts for reviewing this article: Glenn Block and Amadeo Casas Cuadrado