From the December 2001 issue of MSDN Magazine.

Generative Programming: Modern Techniques to Automate Repetitive Programming Tasks

Chris Sells
This article assumes you're familiar with C++ and object-oriented design
Download the code for this article: GenProg.exe (136KB)
Browse the code for this article at Code Center: Code Generator
SUMMARY Even when developers have recurring computer-readable metadata to process and a clear idea of how code should be structured, they can still find themselves in need of a technique to automatically generate the code to avoid the drudge work of repeatedly writing and tweaking it.
      Generative programming is a technique that addresses this problem. Using generative programming techniques, you can solve software engineering problems in families, rather than individually, saving time and coding effort. This article describes these techniques, and builds a sample template-driven code generator. The article also lists existing utilities that have been built with generative programming techniques, as well as actual code generators.
Late last year an interesting question related to ATL event sinks was posted on the Microsoft® ATL mailing list. The author of the question wanted to know where he could find an event sink code generator for dispinterface-based events because he was "sick of writing ATL event sinks by hand, with all those _ATL_FUNC_INFOs."
      This programmer wanted to generate C++ ATL code for handling events fired on a COM event interface. Microsoft Visual C++® 6.0 provides a wizard for doing so if the events originate from a COM control hosted on a dialog. Otherwise, you're on your own. Generally speaking, that means you're manually converting COM event interface method signatures as defined by dispinterfaces in IDL like the one shown in Figure 1 into a class of the form shown in Figure 2.
      If you only have to do this once, manual translation using the old "101-key TextWizard" is not a big deal. If you have to do it more often than that, it starts to become a chore, which is probably what motivated the question.

The Big Picture

      What this question addresses is a specific instance of a more general problem. To broaden the scope, I would paraphrase the question like so: "I have computer-readable metadata and a clear pattern of the code I'd like generated. Why can't the computer do the generation for me?" In this case, the metadata is COM type information stored in a .tlb file on his computer's hard drive. Windows® provides the parser for this particular metadata format, and the pattern of code that should be generated is fairly obvious. The problem is going from the metadata and the pattern code to the code itself. The solution is something that is called generative programming.
      The term was coined in the book Generative Programming by Krzysztof Czarnecki and Ulrich Eisenecker (Addison-Wesley, 2000). It refers to a system of programming that concentrates on solving software engineering problems in families instead of one at a time. As a simple example, an array of integers is a specific solution to a specific problem, but the Standard Template Library (STL) vector template is meant to address the whole family of problems solved by arrays. Specifically, the vector template is a generator because it describes the pattern of code for the compiler to generate. In fact, generic programming, as provided in C++ via templates, is a primitive form of generative programming. You'll encounter other forms later in this article.
      Once a family member has been defined, applying generative programming produces implementation code from three things. According to Czarnecki and Eisenecker's book, these are "a means of specifying family members, the implementation components from which members can be assembled, and the configuration knowledge mapping between a specification of a member and a finished member." In other words, generative programming builds implementations from the code patterns and lower-level components, and the metadata that brings the two together.
      As another example of generative programming, think of the last wizard you ran. If it was a wizard in Visual C++, it was probably the MFC AppWizard, the ATL COM AppWizard, or one of the ATL ObjectWizards. Pick a wizard, any wizard. Now, show it to your friends, but don't show me. OK. Now, was the wizard-generated code structured just exactly the same as it always is? That represents the pattern of code that your wizard generates. Did it generate code to target the MFC or ATL class library? Those are the lower-level components being used to implement your system. Did you pick options that governed exactly what code was generated? That's the metadata that glues the code patterns to the components.
      Generative programming is as old as the hills and you use it all day, every day. You're using it when you employ C-style macros to produce C code, when you feed IDL to the MIDL compiler to produce proxy-stub code, when you run a wizard or instantiate a template, and when you use #import to generate those wacky C++ wrappers around your COM interfaces. Unfortunately, as useful as those means of generative programming are, they're not as useful as having the tools to build generative programs yourself. I'll take a crack at solving the original problem and see what I can do.

Generating the Code

      In this case, the family of problems to be solved is how to generate a COM event sink. Three things are needed for a solution:
  • A generator to produce the code. For that, I'll use a simple JScript®-based program.
  • The low-level components to code against. The question asked for ATL's IDispEventImpl, so that's what I'll use.
  • The metadata to drive the generator. That comes from the Microsoft Type Library Information object library, which ships with Microsoft Visual Studio® 6.0 and is documented in the December 2000 MSDN® Magazine article "Visual Basic®: Inspect COM Components Using the TypeLib Information Object Library" and the TypeLibraryInformation object help file (see Knowledge Base article Q224331).
      Bringing these three parts together helps solve the problem presented by the programmer I told you about earlier, as you can see in Figure 3. Unfortunately, for as little as this script does, it's pretty difficult to read. C/C++ developers know this style of generative programming as "printf" code generation because it involves numerous print statements to generate the desired output. Web pages are often generated by a Common Gateway Interface (CGI) program using the same technique, as you can see in the following code:
   // hi.js
   function echo(s) { WScript.Echo(s); }

   echo("content-type: text/html");
   echo("");
   for( var i = 3; i != 8; ++i ) {
     echo("<font size=" + i + ">Hi</font><br>");
   }
      Like the handlers.js script in Figure 3, CGI programs can quickly get complicated, making it very easy to lose the structure of the output in the sea of logic statements. This is why more than a million CGI developers have turned to the inside-out style of ASP, which emphasizes the structure of the output over the logic. Here's an example:
   <%@ language=jscript %>
   <%// hi.asp %>
   <% for( var i = 3; i != 8; ++i ) { %>
   <font size=<%= i %>>Hi</font><br>
   <% } %>
Surfing to this ASP page on your Web server will yield the following output in your browser:
   <font size=3>Hi</font><br>
   <font size=4>Hi</font><br>
   <font size=5>Hi</font><br>
   <font size=6>Hi</font><br>
   <font size=7>Hi</font><br>
      For those of you unfamiliar with the syntax of ASP, the sections inside the angle brackets are commands to the ASP engine or script engine. The <%@ language %> block lets ASP know the language preference—JScript or VBScript. The <% %> blocks are JScript statements; the first is a comment and the second is a for statement, wrapping the text to be output. The <%= %> block outputs the value of the expression inside the block—in this case, the loop variable. Everything else is just text to be output. Typical ASP syntax highlighting makes it even simpler to differentiate the text to be output from the logic (see Figure 4).

Figure 4 Highlighted ASP Code

      This formatting makes it easy to start with your target output and drop in expressions (through <%= %> blocks) and statements (through <% %> blocks), while allowing you to understand the structure of the output in a way that's much more friendly than the printf style of code generation.
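      To make the block types concrete, here is a minimal sketch (not part of the article's download) that splits a template into the kinds of segment described above: directive, statement, expression, and plain text. The function name classifySegments and the shape of the segment objects are hypothetical, and the code assumes a modern JavaScript engine rather than the ASP engine itself.

```javascript
// Hypothetical helper: split an ASP-style template into typed segments.
// "<%@ %>" is a directive, "<%= %>" an expression, "<% %>" a statement,
// and anything outside a block is text to be output.
function classifySegments(template) {
  var segments = [];
  var pattern = /<%([@=]?)([\s\S]*?)%>/g;
  var last = 0;
  var match;
  while ((match = pattern.exec(template)) !== null) {
    if (match.index > last) {
      segments.push({ kind: "text", body: template.substring(last, match.index) });
    }
    var kind = match[1] === "@" ? "directive"
             : match[1] === "=" ? "expression"
             : "statement";
    // Trim the whitespace that pads the inside of the block.
    segments.push({ kind: kind, body: match[2].replace(/^\s+|\s+$/g, "") });
    last = pattern.lastIndex;
  }
  if (last < template.length) {
    segments.push({ kind: "text", body: template.substring(last) });
  }
  return segments;
}

// The body of hi.asp, written on one line.
var segments = classifySegments(
  "<% for( var i = 3; i != 8; ++i ) { %><font size=<%= i %>>Hi</font><br><% } %>");
```

      Run against the body of hi.asp, this yields a statement, text, an expression, more text, and a closing statement, which is the same structure that syntax highlighting makes visible.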

ASP for Your Code

      Let's use the same ASP formatting to convert handlers.js to this ASP style. The results are shown in Figure 5. Immediately, it's easier to see the raw text to be output—it's anything not inside a <% %> block. In this generative programming scheme, the generator becomes the ASP engine, driven by handlers.asp.

Figure 5 Handlers.js in ASP-style Code

      A drawback of using ASP as your template syntax is that you'll need to upload the template file onto a Web server and generate your code using an HTTP request. However, if you were to reuse the ASP syntax and build your own parser for it, you'd have a template-driven generator for your custom generative programming tasks. If you built it as an Active Scripting host like ASP is built, you could easily pull in metadata of any kind that was modeled as a COM object model (which includes text files, XML files, SQL Schemas, COM type information, and UML files, among other things). Of course, since what you're generating is just text, you could use it to target any components in any language that you choose. Such a tool would give you a leg up on your general-purpose generative programming needs.

A Template-driven Generator

      Some years ago, I was so attracted to the template syntax of ASP that I decided to replicate it for my own generative programming needs. This work yielded a simple tool called Textbox, which I've included in the download linked at the top of this article. Textbox is an Active Scripting host that I use as my own simple ASP interpreter. It's merely a prototype for future work, but it's freely available for your use and will allow you to interpret handlers.asp.
      Textbox can be used to parse handlers.asp from the command line, like so:
  C:\> textbox.exe handlers.asp \
  typeLibName=mshtml.tlb \
  ifaceName=HTMLWindowEvents \
  className=CMyEventSink
Running textbox.exe on the ASP template I created yields code just like what the programmer was after.

Producing the Templates

      If you're going to be using generative programming, keep in mind that now, instead of writing the code, you'll be writing the templates that produce the code. Since Textbox is based on Active Scripting engines, it's flexible enough to do just about anything: branching, looping, subroutines, pulling in COM objects, and so on. Also, via the magic of Microsoft script debugging, you get a full debugging environment with breakpoints, an immediate window, data tips, and more.
      However, as flexible and powerful as this is, you don't want to start building templates from scratch. You want to start with an instance of the code. For example, when building the handler template, I started with an example of what the output code was supposed to look like. By starting with code that I knew worked, it was easy to insert template blocks for the variable code and leave the invariant code alone.
      Of course, it's possible to write bad code templates just like any other kind of code, so I'll recommend a few tricks that I think ASP programmers could benefit from as well.

Writing Picture Functions

      In handlers.asp, notice how I have the line that produces sink map entries wrapped in a for loop (see Figure 5). This is a very common thing to do, but the logic surrounding that line of code can also obscure it. It's always cleaner to separate the logic from the code being output as much as possible. One way to do that is to use the response.writeLn intrinsic, which will output a string into the stream, as you can see here:
function outputSinkEntry(m) {
  xcode.writeLn("SINK_ENTRY_EX(" + sinkID + ", " + iid +
                ", 0x" + hex(m.memberID, 8) + ", " + m.name + ")");
}
This allows you to write the sink map like so:
BEGIN_SINK_MAP(<%= className %>)
<%
for( var i = 1; i != members.count + 1; ++i ) {
   outputSinkEntry(members.item(i));
}
%>
END_SINK_MAP()
      So the code to generate the sink map as a whole is cleaner, but the outputSinkEntry function is just as ugly as the previous printf-style code generation. However, if I write outputSinkEntry as a "picture function," things are nice again:
<% function outputSinkEntry(m) { %>
  SINK_ENTRY_EX(<%=sinkID%>, <%=iid%>, 0x<%=hex(m.memberID, 8)%>,
                <%=m.name%>)
<% } %>
      A picture function is standard ASP code packaged as a function that can be called from anywhere. Picture functions take advantage of how Textbox (and ASP) are implemented. Essentially, Textbox is a preprocessor, which translates ASP syntax into script. When Textbox sees a <% %> block, it feeds it directly to the scripting engine. When Textbox sees anything outside of a block or inside of a <%= %> block, it wraps the whole thing in a response.write statement. Because of that, the outputSinkEntry function translates internally to the following:
function outputSinkEntry(m) {
xcode.write("  SINK_ENTRY_EX(");
xcode.write(sinkID);
xcode.write(", ");
xcode.write(iid);
xcode.write(", 0x");
xcode.write(hex(m.memberID, 8));
xcode.write(", ");
xcode.write(m.name);
xcode.write(")");
xcode.writeLn("");
}
The previous code looks remarkably like the first response.writeLn version, but because I get to write it as a picture function, none of the ugliness applies.
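      The translation step is simple enough to sketch in a few lines. What follows illustrates the idea and is not Textbox's actual source. The names translateTemplate, runTemplate, and out are hypothetical (the listings above write to xcode). Directives are simply dropped, and line endings are written exactly as they appear in the template, which may differ from what Textbox does.

```javascript
// Translate an ASP-style template into script source. Text outside blocks
// and the contents of <%= %> blocks become out.write(...) calls; <% %>
// blocks are passed through to the script engine unchanged.
function translateTemplate(template) {
  var source = [];
  var pattern = /<%([@=]?)([\s\S]*?)%>/g;
  var last = 0;
  var match;

  function emitText(text) {
    if (text.length > 0) {
      // JSON.stringify yields a correctly escaped string literal.
      source.push("out.write(" + JSON.stringify(text) + ");");
    }
  }

  while ((match = pattern.exec(template)) !== null) {
    emitText(template.substring(last, match.index));
    if (match[1] === "=") {
      source.push("out.write(" + match[2] + ");");
    } else if (match[1] === "") {
      source.push(match[2]);
    }
    // "<%@ ... %>" directives are dropped in this sketch.
    last = pattern.lastIndex;
  }
  emitText(template.substring(last));
  return source.join("\n");
}

// Run the translated script against a simple in-memory output object
// and return everything it wrote.
function runTemplate(template) {
  var buffer = [];
  var out = { write: function (value) { buffer.push(String(value)); } };
  new Function("out", translateTemplate(template))(out);
  return buffer.join("");
}

// The body of hi.asp from earlier in the article.
var output = runTemplate(
  "<% for( var i = 3; i != 8; ++i ) { %>" +
  "<font size=<%= i %>>Hi</font><br>\n" +
  "<% } %>");
```

      Here output holds the same five font lines shown earlier for hi.asp. Because <% %> blocks pass through untouched, a function whose opening and closing braces sit in separate blocks, like outputSinkEntry, comes out as an ordinary function whose body is a run of write calls.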

Removing For Loops

      Another thing that programmers can do to clean up code is to remove the for loops. All these loops do is walk each item in the collection and call a function with the item from the collection. Because JScript allows passing functions as first-class objects, you can easily build an STL-style for_each algorithm like so:
function for_each(coll, pfn) {
  for( var i = 1; i != coll.count + 1; ++i ) pfn(coll(i));
}
Then you can use for_each from the Textbox template like this:
BEGIN_SINK_MAP(<%= className %>)
<% for_each(members, outputSinkEntry); %>
END_SINK_MAP()
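      To try for_each outside of a scripting host, a COM collection can be faked with a callable object. In the sketch below, makeCollection, the hex helper, the iid string, and the member names and IDs are all made up for illustration. Only for_each itself and the shape of outputSinkEntry come from the listings above.

```javascript
// Stand-in for a 1-based COM collection: callable as coll(i) and exposing
// a count property. Hypothetical; real collections would come from the
// type library objects.
function makeCollection(items) {
  var coll = function (i) { return items[i - 1]; };
  coll.count = items.length;
  return coll;
}

// Hypothetical hex helper: zero-padded, fixed-width, lowercase hexadecimal.
// The unsigned shift keeps negative IDs from printing with a minus sign.
function hex(value, width) {
  var digits = (value >>> 0).toString(16);
  while (digits.length < width) digits = "0" + digits;
  return digits;
}

// for_each as given in the article.
function for_each(coll, pfn) {
  for( var i = 1; i != coll.count + 1; ++i ) pfn(coll(i));
}

var lines = [];
var sinkID = 1;                       // placeholder value
var iid = "DIID_HTMLWindowEvents";    // placeholder value

// printf-style version of outputSinkEntry, collecting lines in an array.
function outputSinkEntry(m) {
  lines.push("  SINK_ENTRY_EX(" + sinkID + ", " + iid +
             ", 0x" + hex(m.memberID, 8) + ", " + m.name + ")");
}

// Made-up members standing in for the events of an interface.
var members = makeCollection([
  { name: "onload", memberID: 1003 },
  { name: "onunload", memberID: 1008 }
]);

lines.push("BEGIN_SINK_MAP(CMyEventSink)");
for_each(members, outputSinkEntry);
lines.push("END_SINK_MAP()");
var sinkMap = lines.join("\n");

// sinkMap is now:
// BEGIN_SINK_MAP(CMyEventSink)
//   SINK_ENTRY_EX(1, DIID_HTMLWindowEvents, 0x000003eb, onload)
//   SINK_ENTRY_EX(1, DIID_HTMLWindowEvents, 0x000003f0, onunload)
// END_SINK_MAP()
```

      Note that the loop bounds run from 1 to count inclusive, so an ordinary 0-based JavaScript array cannot be passed to for_each directly.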

Final Code Template

      Applying these stylistic techniques results in the code template shown in Figure 6. Because there is a real language underlying the code template parser, there are all kinds of opportunities to use the facilities of the language to promote a clean, readable style that makes maintenance and extensibility much easier in the future.

Figure 6 New Template

Generative Programming Uses

      Template-driven generative programming is fairly primitive compared to what Czarnecki and Eisenecker outlined in their book, but it still changed the way I do my job day to day. I find myself using it every time I just can't stand to invoke clipboard inheritance again. I can't list all the ways programmers use generative programming at DevelopMentor, but some of the utilities that have been built include:
  • XSD to C++ via Gen<X>=Happiness. This template turns an XML Schema into a hierarchy of C++ types that know how to serialize themselves to and from XML instance documents, validating along the way.
  • Automating the Build Process with Gen<X>. This is a family of automated build scripts generated from an XML manifest.
  • Gen<X> DocGen. This is an accurate, complete COM component reference document driven from IDL and XML reference comments (see Figure 7).
Figure 7 Gen<X> DocGen
  • Gen<X> DispEvent 1.5. This serves as the backend for wizards. In fact, I collaborated with Tim Tabor to build a wizard to solve the programmer's specific problem (see Figure 8).
Figure 8 Gen<X> DispEvent
  • A major part of the diagnosis application that DevelopMentor ships with their products checks to make sure that all of the COM components are installed and creatable. The data structure that drives this process is generated with a template that digs through the product's IDL files looking for coclass statements that aren't marked [noncreatable].
      The tools mentioned here are available from http://www.develop.com/genx/genxchange.asp.
      Generative programming has become an invaluable tool in my programmer's toolbox. It lets me spend creative cycles solving a family of problems and lets the computer do the grunt work.
For background information see:
International Symposium on Generative Programming and Component-Based Software Engineering
Generative Programming by Krzysztof Czarnecki and Ulrich Eisenecker (Addison-Wesley, 2000).

Chris Sells is an independent consultant and DevelopMentor instructor, specializing in .NET and COM. He is the inventor of Gen<X>, DevelopMentor's generative programming tool for Windows. Reach Chris at http://www.sellsbrothers.com.
