Pure C++: CLR Generics Versus C++ Templates

Article
10/18/2019

Pure C++

CLR Generics Versus C++ Templates

Stanley B. Lippman

Contents

Parameter List Redux
Additional Template Functionality Within the Parameter List
Type Argument Constraints
Generic Constraints
Wrapping It Up

Visual Studio® 2005 brings the type parameter model of generic programming to the Microsoft® .NET Framework. C++/CLI supports two type parameter mechanisms—common language runtime (CLR) generics and C++ templates. In my last column, I looked at the characteristics that these two mechanisms have in common (see Pure C++: Generic Programming Under .NET). In this column, I'll look at some of the differences—in particular, differences in the parameter list and type constraint model.

Parameter List Redux

The parameter list works like the signature of a function: it identifies the number and kind of each parameter, and associates a unique identifier with each one so that each parameter can be uniquely referred to within the template definition.

The parameter serves as a placeholder within the definition of the template or generic type. The user creates object instances of the type by providing actual values to bind to the parameters. The instantiation of a parameterized type is not a simple text substitution such as that which a macro expansion mechanism might employ. Rather, it consists of binding the actual user values to the associated formal parameter within the definition.

Under generics, each parameter represents either an Object type or a type derived from Object. As you'll see later, this constrains the kinds of operations you can perform on or through objects declared through a type parameter. You can offset these constraints by providing more explicit (and therefore less restrictive for the generic author) constraints of your own. These explicit constraints refer to a base class or set of interfaces from which the actual type argument must derive.

In addition to supporting type parameters, templates support expression and template parameters. Plus, templates support default parameter values. These are resolved positionally rather than by name. Under both mechanisms, type parameters are introduced with either the class or typename keyword.

Additional Template Functionality Within the Parameter List

In addition to type parameters, templates permit two other kinds of parameters: non-type parameters and template parameters. Let's briefly look at each.

A non-type parameter is constrained to a constant expression. What should spring to mind immediately is a numeric or string literal constant. For example, if you chose to provide a fixed-size stack, you might decide to specify a non-type size parameter as well as the element type parameter, thus partitioning your categories of stack instances by both element type and size. For example, see the fixed-size stack with a non-type parameter in Figure 1.

Figure 1 Fixed-Size Stack with a Non-Type Parameter

// fixed-size stack with non-type parameter 
template <class elemType, int size>
public ref class tStack
{
    // no reason to use a vector now... 
    array<elemType> ^m_stack;
    int top;

public:
    tStack() : top( 0 )  
           { m_stack = gcnew array<elemType>( size ); }

};

In addition, it would be convenient if the template class designer could specify a default value for either or both parameters. For example, it may prove optimal to have a buffer default to 1KB in size. Under the template mechanism, a parameter can be provided with a default value, as you see here:

// template declaration with default value 
template <class elemType, int size = 1024>
public ref class FixedSizeStack {};

A user could then override the default size value by providing an explicit second value like so:

// A stack of at most 128 String instances 
FixedSizeState<String^, 128> ^tbs = gcnew FixedSizeStack<String^, 128>;

Otherwise, by not providing a second argument, he uses the associated default value, like this:

// A stack of at most 1024 String instances 
FixedSizeStack<String^> ^tbs = gcnew FixedSizeStack<String^>;

This use of default parameter values is an essential design aspect of the Standard Template Library (STL). For example, the following declarations are taken from the ISO-C++ standard:

// examples of default type parameter values in ISO-C++ namespace std 
{
     template <class T, class Container = deque<T> > 
     class queue;

     template <class T, class Allocator = allocator<T> > 
     class vector;
     // ...
}

You could also provide a default element type, like this:

// template declaration with default element type as well 
template <class elemType=String^, int size=1024>
public ref class tStack {};

But that would be more difficult to justify from a design standpoint since it is unlikely in general to converge on a single default type for a container.

A pointer can also serve as a non-type parameter since the address of an object or function is known at compile time and is therefore a constant expression. For example, you may want to provide a third parameter to the stack class, one that indicates a callback handler when some condition arises. A judicious use of a typedef can do wonders with the otherwise seemingly complex declaration, as shown here:

typedef void (*handler)( ... array<Object^>^ );
template <class elemType, int size, handler cback >
public ref class tStack {};

Of course, you can provide a default value for your handler—in this case, the address of an existing method. For example, this declaration of your buffer provides for a default size and handler:

void defaultHandler( ... array<Object^>^ ){ ... }

template < class elemType, 
           int size = 1024, 
           handler cback = &defaultHandler >
public ref class tStack {};

Because default values are positional rather than named, there is no way to provide an overriding handler without also providing an explicit size, even if that size were to duplicate the default value. Here are some ways in which your modified stack might be used:

void demonstration() 
{
    // default size and handler
    tStack<String^> ^ts1 = nullptr; 
          
    // default handler
    tStack<String^, 128> ^ts2 = gcnew tStack<String^, 128>;

    // override all three parameters
    tStack<String^, 512, &yourHandler> ^ts3;
}

The second additional kind of parameter supported by templates is that of a template template parameter—say that five times fast! That is, the template parameter itself represents a template. For example:

// template template parameter
template <template <class T> class arena, class arenaType>
class Editor { 
    arena<arenaType> m_arena; 
    // ...
};

The Editor template class lists two template parameters, arena and arenaType. arenaType is a template type parameter; you can pass it an int, String, yourType, myType, and so on. arena is a template template parameter. Any template class with a single template type parameter can be bound to arena. And m_arena is an instance of that template class bound to the arenaType template type parameter. For example:

// a template Buffer class
template <class elemType>
public ref class tBuffer {};

void f()
{
    Editor<tBuffer,String^> ^textEditor;
    Editor<tBuffer,char>    ^blitEditor;
    // ...
}

Type Argument Constraints

If you use your parameterized type simply as a container for the storage and retrieval of elements, as I did in my last column's stack class, then the whole topic of constraints can be ignored. It is only when you need to invoke an operation on a type parameter (such as when you compare two objects to see whether one object of that type is equal to or less than another, or when you invoke the name of a method or nested type through a type parameter) that the issue of constraints becomes a concern. For example, let's reconsider the typename declaration that was in my last column:

template <class T>
ref class Demonstration {
    int method() {
         typename T::A *aObj; 
         // ...
    }
};

While this successfully declares aObj, it also constrains the type arguments that can successfully bind to your class template. For example, if you write the following code, the declaration of aObj turns out to be illegal (in this particular case) and the compiler flags it as an error:

int demoMethod()
{
    Demonstration<int> ^demi = 
        gcnew Demonstration<int>( 1024 );
    return dm->method();
}

The specific constraint, of course, is that the type argument must contain a nested declaration of a type explicitly named A. It doesn't do you any good if it is called B or C or Z. A more general constraint is that the type argument must represent a class; otherwise, the use of the T:: scope operator is not permitted. My use of an int type argument violates both of these constraints. For example, the Visual C++® compiler generates the following error:

error C2825: 'T': must be a class or namespace when followed by '::'

One criticism of the C++ template mechanism is the absence of any formal syntax to delineate these sorts of type constraints. (Note that in his original design paper on parameterized types, Bjarne Stroustrup discusses having considered providing an explicit constraint syntax, but being dissatisfied with it and choosing not to provide the facility at that time.) That is, the user can only recognize the implicit constraints of a template either by reading the source code or the associated documentation or by compiling her code and reading the ensuing compiler error message.

What can you do if you need to provide a type argument that is unsuited to the template? On the one hand, very little. Any class you write has certain assumptions built into it, and those assumptions represent constraints on its use. It is rather difficult to design a class that is appropriate for every situation; it is even more difficult to design a template class that is appropriate for not just every situation but for every possible type argument as well.

On the other hand, there are a number of template facilities that allow the user some elbow room. For example, a class template member function is not bound to the type argument until there is a use of that function in your code. So, if you can use the template class without using the method that invalidates your type argument, you're okay.

If that is not feasible, a second alternative is to provide a specialized version of that method that is specifically associated with your type argument. In this case, you would provide a specialized instance of Demonstration<int>::method, or, more generally, you could provide a specialized implementation of the entire template class when provided with an integer type argument.

In general, when you say that a parameterized type can support an infinite variety of types, you are primarily speaking of passive uses of parameterization—that is, the storage and retrieval of a type rather than actively manipulating it.

As the designer of a template, you must always be conscious of the implicit constraints your implementation places on the type arguments and work to ensure that they are not superfluous. For example, it is reasonable to demand that a type argument provide support for the equality and less-than operators; however, it is somewhat less reasonable to demand that it provide support for the less-than-or-equal or XOR bitwise operators. (You can loosen the dependence on operators either by factoring these operations into an interface or by requiring an additional parameter that represents a function, delegate, or function object.) For example, Figure 2 shows a likely implementation of a search method by a native C++ programmer making use of the built-in equality operator.

Figure 2 Template-Unfriendly Search Implementation

template <class elemType, int size=1024>
ref class Container 
{
    array<elemType> ^m_buf;
    int next;

public:
    bool search( elemType et )
    {
        for each ( elemType e in m_buf )
            if ( et == e )
                 return true;

        return false;
    }

    Container() 
    { 
        m_buf = gcnew array<elemType>(size); 
        next = 0; 
    }

    void add( elemType et )
    { 
         if ( next >= size ) 
              throw gcnew Exception; 
         m_buf[ next++ ] = et; 
    }

    elemType get( int ix )
    {
        if ( ix < next ) 
             return m_buf[ ix ]; 
        throw gcnew Exception;  
    }
    
    // ...
};

There is nothing wrong with this implementation of a search function. However, it is somewhat template-unfriendly because the type argument is tightly coupled to the equality operator. A more flexible alternative is to provide a second search method that allows the user to pass in an object to use for comparison operations. You can do this with a function member template. The function member template provides for an additional type parameter. Take a look at Figure 3.

Figure 3 Using a Template

template <class elemType, int size=1024>
ref class Container 
{
    // everything else the same ...

    // this is a function member template ...
    // it can reference both of its containing class parameters, and
    // its own parameters ...

    template <class Comparer>
    bool search( elemType et, Comparer comp )
    {
        for each ( elemType e in m_buf )
            if ( comp( et, e ) )
                 return true;

        return false;
    }
    // ...
};

Now the user has a choice of methods by which to search the container: the tightly coupled equality operator search that is more efficient but not appropriate for all types, and the more flexible member template search that requires a comparison type to be passed in.

What sort of type might serve this comparison purpose? The function object is a common C++ design pattern for this purpose. For example, here is a quick and dirty function object to compare two string objects for equality:

class EqualGuy {
public: 
    bool operator()( String^ s1, String^ s2 )
    {
         return s1->CompareTo( s2 ) == 0;
    }
};

The code in Figure 4 shows how you might invoke both the member function template version of search together with the more traditional version.

Figure 4 Two Search Functions

int main()
{
    Container<String^> ^sxc = gcnew Container<String^>;
    sxc->add( "Pooh" );
    sxc->add( "Piglet" );
    // ... and so on ...

    // the member template search ...
    if ( sxc->search( "Pooh", EqualGuy() ) )
         Console::WriteLine( "found" );
    else Console::WriteLine( "not found" );

    // the traditional equality operator search ...
    if ( sxc->search( "Pooh" ) )
         Console::WriteLine( "found" );
    else Console::WriteLine( "not found" );
}

Once you get your head wrapped around templates, you discover that there is hardly anything you can't accomplish using them. Or at least it feels that way.

Generic Constraints

Unlike templates, a generic definition supports a formal constraint syntax for describing prerequisite conditions on the type parameters it can legally bind to. Before I drill down into the constraint facility, let's briefly consider the question why generics choose to provide a constraint facility while templates chose not to. The primary answer, I believe, is the different binding times between the two mechanisms.

Templates are bound during compilation, and therefore an invalid type prevents the program from building. The user must either immediately address the problem or retreat back to a non-template programming solution. The integrity of the executing program is not at risk.

Generics, on the other hand, are bound by the runtime, and that is a bit too late to discover that a user-specified type is invalid. And so the Common Language Infrastructure (CLI) needs some static—that is, compile-time—mechanism to insure that the runtime only binds valid types. The constraint list associated with a generic type is a compile-time filter that, when violated, prevents the program from building.

Let's look at an example. Figure 5 shows my container class reimplemented as a generic type. Its search method presumes that the type parameter is derived from IComparable and that it therefore implements an instance of its CompareTo method. (Notice that the container's size is provided by the user during construction rather than as a second, non-type parameter. Generics, you'll recall, do not support non-type parameters.)

Figure 5 Container Implemented as a Generic Type

generic <class elemType>
public ref class Container 
{
    array<elemType> ^m_buf;
    int next;
    int size;

public:

    // implicit constraint that elemType
    // implement the IComparable interface

    bool search( elemType et )
    {
        for each ( elemType e in m_buf )
            if ( et->CompareTo( e ))
                 return true;

        return false;
    }

    Container( int sz ) 
    { 
        m_buf = gcnew array<elemType>(size = sz); 
        next = 0; 
    }

    // add() and get() are the same ... 

};

This generic class implementation fails to compile with the following fatal compile diagnostic:

error C2039: 'CompareTo' : is not a member of 'System::Object'

You might mumble to yourself, okay, so what? Nobody is claiming that it is a member of System::Object. However, in this case, you are wrong. By default, a generic type parameter makes the strongest possible constraint: it constrains all its types to be of type Object. That constraint is guaranteed to be true since only CLI types are permitted to bind to a generic type, and, of course, all CLI types are derived more or less directly from Object. So by default as an author of a generic type you have a very safe but limited set of operations that you can apply.

You might think, well, the heck with being flexible, I just want to get the darn thing to compile, and so you substitute an equality operator for the CompareTo method, but that causes the compiler to bark even louder:

error C2676: binary '==' : 'elemType' does not define this operator
or a conversion to a type acceptable to the predefined operator

Again, what is going on is that each type parameter begins its life metaphorically walled in by the four public methods of Object: ToString, GetType, GetHashCode, and Equals. In effect, this work in listing constraints on the individual type parameters represents a progressive loosening of this initial strong constraint. Stated another way, your task as author of the generic type is to expand the permitted operations on a type parameter in a verifiable way using the conventions of the generic constraint list. Let's take a look at how to do that.

The list of constraints are referred to as the constraint clause, and it is introduced with the non-reserved word "where". It is placed between the parameter list and the type declaration. The actual constraints consist of the names of one or more interface types and/or a single class type. These constraints represent base types that the associated type parameter is expected to implement or be derived from. The public set of operations of each type is added to the operations available to be exercised by the type parameter. So, in order to allow your elemType parameter to invoke CompareTo, you must add a constraint clause associating it with the IComparable interface, like so:

generic <class elemType>
  where elemType : IComparable
public ref class Container 
{
    // the body of the class is unchanged ... 
};

The constraint clause, which expands the set of operations permitted to be invoked by an elemType instance, is the union of the public operations of the implicit Object constraint and the explicit IComparable constraint. The generic definition now compiles and is available for actual use. When you specify an actual type argument, as you see in the following code, it is the compiler that verifies that the actual type argument meets the constraints associated with the type parameter to which it will bind:

int main()
{
    // ok: String and int implements IComparable
    Container<String^> ^sc; 
    Container<int> ^ic; 

    // error: StringBuilder does not implement IComparable    
    Container<StringBuilder^> ^sbc; 
}

The compiler flags any violations, such as the definition of sbc. But the actual binding and construction of the generic type is done by the runtime.

A generic type, then, is verified for not violating its constraints both at its definition point (when the compiler processes your implementation) and at each construction point (when the compiler checks the type argument against the associated constraints). A failure at either point results in a compile-time error.

The constraint clause can contain one entry per type parameter. The entries are not required to be specified in the same order as that of the parameter list. Multiple constraints for a parameter are separated by a comma. The constraints must be unique within the list associated with each parameter, but of course can occur in more than a single constraint list. For example:

generic <class T1, class T2, class T3>
  where T1 : IComparable, ICloneable, Image
  where T2 : IComparable, ICloneable, Image
  where T3 : ISerializable, CompositeImage
public ref class Compositor 
{
    // ...
};

In this case, you have three constraint clauses specifying both interface types and a single class type (listed at the end of each list). The constraints are additive—meaning that a type argument must meet all of the listed constraints and not merely a subset. As my colleague Jon Wray points out, as you expand the set of operations in your role as author of the generic type, thereby loosening your constraints, you in turn increase the constraints you place on users of the generic type in their choice of a type argument.

The T1, T2, and T3 clauses could have been specified in any order. It is not permitted, however, to specify the constraint list associated with a type parameter across two or more where clauses. For example, the following is flagged as a syntax violation:

generic <class T1, class T2, class T3>
  // error: two entries for same parameter not permitted
  where T1 : IComparable, ICloneable
  where T1 : Image
public ref class Compositor 
{
    // ...
};

A class constraint type must be an unsealed reference class. (Neither value classes nor sealed classes are permitted since they do not allow inheritance.) Four specific System namespace classes are prohibited from appearing within a constraint clause: System::Array, System::Delegate, System::Enum, and System::ValueType. Because the CLI only supports single inheritance, a constraint clause only supports the inclusion of one class type. The constraint types must be at least as accessible as the generic type or function. For example, you cannot declare a public generic type and list one or more constraints with internal visibility.

Any of the type parameters can be bound to a constraint type. Here is a simple example:

generic <class T1, class T2>
  where T1 : IComparable<T1>
  where T2 : IComparable<T2>
public ref class Compositor 
{
    // ... 
};

Constraints are not inherited. For example, if I derive the following class from Compositor, the IComparable constraints on Compositor's T1 and T2 are not applied to the same-named parameters of the BlackWhite_Compositor class:

generic <class T1, class T2>
public ref class BlackWhite_Compositor : Compositor
{
    // ... 
};

This is something of a design inconvenience when those parameters are used in conjunction with the base class. In order to guarantee the integrity of Compositor, BlackWhite_Compositor must propagate the Compositor constraints on all parameters that are passed to the Compositor subobject. For example, the correct declaration looks like this:

generic <class T1, class T2>
  where T1 : IComparable<T1>
  where T2 : IComparable<T2>
public ref class BlackWhite_Compositor : Compositor
{
    // ...
};

Wrapping It Up

As you've seen, under C++/CLI, you can choose either CLR generics or C++ templates.Now you should have the information to make an informed choice based on your particular needs. Under both mechanisms, a parameterized type that goes beyond the storage and retrieval of elements contains assumptions as to the operations that each type parameter must support.

With templates, these assumptions are implicit. This benefits the author of a template who has a greater degree of freedom in what she can implement. However, this does not benefit the consumer of a template who is often faced with an undocumented set of constraints on the permissible type arguments. While violations of this constraint set result in compile-time errors and is therefore not a threat to the integrity of the run-time execution, use of a template class can prove frustrating. The design bias of the mechanism leans in favor of the implementor.

With generics, these assumptions are made explicitly and are associated with a set of base types listed within the constraint clause. This benefits the consumer of the generic and verifies that any generic passed to the runtime for type construction is correct. This provides some constraints on the design freedom of the author of a generic, in my opinion, and makes certain template design idioms a little more difficult to support. Violations of the formal constraints, either at the point of definition or at the point of a user-specified type argument, result in compile-time errors. The design bias of the mechanism leans in favor of the consumer.

Send your questions and comments for Stanley to purecpp@microsoft.com.

Stanley B. Lippman is an Architect with the Visual C++ team at Microsoft. He began working on C++ with its inventor, Bjarne Stroustrup, in 1984 at Bell Laboratories. In between, he worked in feature animation at Disney and DreamWorks, was a Distinguished Consultant with JPL, and was a Software Technical Director on Fantasia 2000.

Additional resources