Graphics: Manipulate Digital Images in Internet Explorer with the DirectX Transform SDK

Alex Lerner

This article assumes you're familiar with COM, ATL, C++, HTML, Scripting

Level of Difficulty     1   2   3 

Download the code for this article: Transform.exe (212KB)

Browse the code for this article at Code Center: NEGATIVETRANS

SUMMARY: The Microsoft DirectX Transform is a Microsoft DirectX media API that can be used to create animated effects as well as to create and edit digital images for Windows-based applications. Scripting and HTML can be used to display an existing transform on a Web page, and improved transform support in Microsoft Internet Explorer 5.5 makes it easy to use transforms.
      This article provides step-by-step instructions for writing a transform as an ATL project and shows an example of an image transform. C++ is used to instantiate, configure, and display transforms in this project.

I am totally fascinated with digital imagery. I spend countless hours tinkering with programs like Adobe Photoshop and Microsoft® Image Composer. Given that I am artistically disabled, various ready-made image-manipulation programs are just perfect for the purpose of expressing my artistic abilities. Examining the effects of a particular Adobe digital filter gives me ideas about how I might design a similar filter myself. Once in a while, I even come up with my own image-manipulation algorithm.
      In the first part of this article I'll explain how to use a transform that comes with Microsoft Internet Explorer or one purchased from a third party. I'll provide HTML code to demonstrate the eight required steps. Then I'll show how to do the same from C++ and how the DXETool in the DirectX® Transform SDK (part of the DirectX Media SDK) can save you some effort. In the second part of the article I'll explain how to write your own transform.
      Due to a DLL conflict, DirectX transforms will not work on machines with DirectX 8.0 and either Windows 95 or Windows 98 Gold installed.

Design Decisions and Tools

      Like any other programming artist, I want to expose my image-manipulation functionality to the world. So how do I package my code? How do I reach the largest number of potential users? Well, a COM component immediately comes to mind. I can simply wrap my imaging code inside an ActiveX® control, add a couple of custom properties, and it's done. The ActiveX control would work in all major programming environments as well as in Web pages viewed with Internet Explorer.
      The downside to this do-it-myself approach is that in addition to implementing the image-manipulation code, I also have to worry about the coding baggage involved in the support of various image file formats, correctly supporting image transparency during the display, and possibly dithering the image on the 256-color displays. Luckily, the DirectX Transform SDK can help with these tasks.
      The Transform API is a plug-in model that enables you to create animated effects for Microsoft DirectAnimation and Windows-based applications. It is also a standalone set of two-dimensional graphics tools for writing graphics applications and procedural surfaces. In a nutshell, it is a framework for developing image filters. The framework takes care of such things as file formats and image display, while the programmer concentrates on the image-manipulation code itself.

Displaying Transforms on the Web

      Figure 1 shows how to use a transform on a Web page. The code can be written in JScript® or VBScript; I have written this example in JScript.
      The following steps are for instantiating, configuring, and using a transform on a Web page:

  1. Create a DirectAnimation control object
  2. Create a SCRIPT block with a DirectAnimation library
  3. Create the transform
  4. Define the transform's inputs
  5. Define a behavior for the transform
  6. Set the transform custom properties
  7. Apply the transform
  8. Assign the output to the DirectAnimation control

The steps are straightforward and directly correspond to the code in Figure 1. However, a few steps require explanation.
      In step one, the DirectAnimation control is placed on the page:

  <OBJECT ID="DAControl"
          STYLE="width:320;height:300"
          CLASSID="CLSID:69AD90EF-1C20-11d1-8801-00C04FC29D46">
  </OBJECT>

 

The CLSID (69AD90EF-1C20-11d1-8801-00C04FC29D46) points to danim.dll, the home of the DirectAnimation control. The DirectAnimation DLL contains two libraries: MeterLibrary and PixelLibrary. Both return the DAStatics object, which contains all the interesting functions such as ImportImage and ApplyDXTransform. For a complete list of the functions contained in DAStatics, take a look at the DirectAnimation documentation. MeterLibrary and PixelLibrary set the unit of measurement to meters or pixels, respectively. They also treat the direction of the positive y-axis differently: MeterLibrary considers the positive y-axis to be going up, while PixelLibrary considers it to be going down. For the purpose of this article I could have picked either library.
      In step four, the transform inputs are passed to the transform using the ApplyDXTransform API, which takes an array of DAImage objects as one of its arguments. The file formats supported by ImportImage are .png, .jpg, .bmp, and .gif.
      If the transform is an animation, you need to define a DABehavior in step five to produce the animation. This behavior specifies how a DANumber, which represents the progress of the transform, will change its value over time. In my code, the variable forward is a DANumber that changes from 0 to 1 in five seconds, while the variable back changes from 1 to 0 in five seconds. Both use the DirectAnimation Interpolate function to create these behaviors. The variable progress is created as a sequenced behavior of the forward and back behaviors, repeated forever. When going over the transform implementation, you will see that the transform code, in turn, makes calls to the GetEffectProgress function to access the m_Progress variable set by the caller.
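
The forward-then-back progress behavior described above can be sketched in plain C++. This is only an illustration of the timing arithmetic, not DirectAnimation code: the function name PingPongProgress and its parameters are hypothetical, and the sketch assumes both ramps are linear and have the same length (five seconds in the article's example).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of DirectAnimation): returns the progress
// value at time tSeconds for a forward ramp (0 to 1) followed by a backward
// ramp (1 to 0), each lasting rampSeconds, with the pair repeated forever.
double PingPongProgress(double tSeconds, double rampSeconds)
{
    assert(tSeconds >= 0.0 && rampSeconds > 0.0);
    const double period = 2.0 * rampSeconds;            // forward + back
    const double phase  = std::fmod(tSeconds, period);  // position in the cycle
    if (phase <= rampSeconds)
        return phase / rampSeconds;                     // forward: 0 -> 1
    return (period - phase) / rampSeconds;              // back: 1 -> 0
}
```

A caller would evaluate this once per frame and hand the result to the transform as its progress value, which is the role the sequenced DANumber plays on the Web page.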

Displaying Transforms in Internet Explorer 5.5

      Cascading Style Sheets (CSS) were first introduced with Internet Explorer 4.0. CSS supports many features involving the presentation and layout of HTML elements. One feature that directly concerns this article is filter support.
      Visual filters are extensions to Internet Explorer 4.0 behaviors that create on-screen effects. The syntax for a filter property in a STYLE attribute is:

  filter:filtername(custom properties)
  

 

Here's an example of how filters are written as STYLE attributes:

  <IMG ID=sample SRC="SomeImage.jpg"
       STYLE="filter:blur(strength=50) flipv()">

 

The previous example applies two filters in a row—first the image is blurred, then it is flipped vertically. What is interesting about that single line of code is the contrast between its simplicity and the programming effort it would take to achieve the same result using transforms. The filter code does not rely on the DirectAnimation control and it does not use scripting. The good news is that the custom transforms have been seamlessly integrated into Internet Explorer 5.5 (see Figure 2).
      To use the transform on a Web page, you must identify the individual transform by its progid. The general syntax for specifying (or naming) a transform is:

  filter:progid:SomeCustomProgId(custom properties)
  

 

And the following is a specific example:

  <IMG ID=oImg SRC="SomeImage.jpg"
       STYLE="filter:progid:NegaTransform.Negative(NegativeThreshold=0, Duration=5)">

 

As you can see, the transform support in Internet Explorer 5.5 makes the syntax shorter, simpler, and easier to understand than the scripting approach required before version 5.5.

Displaying Transforms using C++

      To display transforms using C++, the steps are the same as I've just shown, but let's dive in a bit deeper and do everything by hand. I have created a simple MFC project called TestHarness to demonstrate all the interfaces that are used when displaying the transforms. The test harness will also show what is expected from you as the transform developer.
      The starting point of all transform-related activities is the transform factory. Through the transform factory you instantiate individual transforms using CoCreateInstance:

  CoCreateInstance
  (
    CLSID_DXTransformFactory, NULL, CLSCTX_INPROC,
    IID_IDXTransformFactory, (void **)&m_pTransFact
  );

 

Once you have the factory, you can instantiate the transform itself:

  m_pTransFact->CreateTransform
  (
    NULL, 0, NULL, 0, NULL, NULL,
    __uuidof(Negative), IID_IDXTransform,
    (void **)&m_pNegativeTrans
  );

 

      The images produced by the transforms are displayed in the same way as regular bitmap images: the hDC of the image surface is BitBlt'ed to the hDC of the window. At this point you have instantiated the transform, but you still don't have any surfaces to work with. The transform factory interface derives from the IServiceProvider interface, which contains only one method: QueryService. (The methods supported by the IDXTransformFactory interface are listed in Figure 3.) By calling QueryService you can get access to the surface factory, like so:

  m_pTransFact->QueryService
  (
    SID_SDXSurfaceFactory, IID_IDXSurfaceFactory,
    (void **)&m_pSurfFact
  );

 

      By using the surface factory, you can create the surfaces required. The output surface is produced using the CreateSurface method:

  m_pSurfFact->CreateSurface
  (
    NULL, NULL, &DDPF_PMARGB32, &bnds, 0, NULL,
    IID_IDXSurface, (void**)&m_pOutputSurface
  );

 

And the input surface is created by loading an image file:

  m_pSurfFact->LoadImage
  (
    pwcsInputFile, NULL, NULL,
    &DDPF_PMARGB32, IID_IDXSurface, (void**)&m_pInputSurface
  );

 

Figure 4 shows the methods from the IDXSurfaceFactory and IDXSurface interfaces.
      At this point I have done all the preliminary work and introduced all the players: the transform itself is represented by the IDXTransform interface pointer, and the input and output surfaces are represented by the IDXSurface interface pointers. All that is left is to connect it all together and execute the transform. The IDXTransform method that connects it all is Setup.

  HRESULT Setup
  (
    IUnknown * const *punkInputs,
    ULONG ulNumInputs,
    IUnknown * const *punkOutputs,
    ULONG ulNumOutputs,
    DWORD dwFlags
  );

 

      The Setup method takes an array of inputs and an array of outputs as its arguments; the dwFlags argument is not used. In my sample TestHarness code, the call to Setup looks like the following:

  IUnknown* In[1];
  IUnknown* Out[1];
  In[0] = m_pInputSurface;
  Out[0] = m_pOutputSurface;
  m_pNegativeTrans->Setup( In, 1, Out, 1, 0 );

 

The IDXTransform method responsible for executing the transform is Execute. Figure 5 describes the methods for the IDXTransform interface.

Transform Custom Methods

      The image filters you will implement will most likely contain custom methods specific to the filter. For example, the transform I have prepared for this article contains a custom method called put_NegativeThreshold. The user of the transform can exercise the custom properties of the transform programmatically or interactively through a custom property page. To invoke the custom methods programmatically, you have to get to the transform's custom interface first. To do that, use QueryInterface on the IDXTransform pointer that you retrieved when you instantiated the transform by calling m_pTransFact->CreateTransform.

  m_pNegativeTrans->QueryInterface
  (
    __uuidof(INegative), (void **)&m_pNegative
  );
  m_pNegative->put_NegativeThreshold(255);

 

      I will get into the specifics of the custom transform interface later on, but at this point you should know that the custom transform interface is derived from the IDXEffect interface, as shown in the following code:

  interface IDXEffect : IDispatch
  {
    HRESULT get_Capabilities([out, retval] long *pVal);
    HRESULT get_Progress([out, retval] float *pVal);
    HRESULT put_Progress([in] float newVal);
    HRESULT get_StepResolution([out, retval] float *pVal);
    HRESULT get_Duration([out, retval] float *pVal);
    HRESULT put_Duration([in] float newVal);
  };

 

The animation inside the transform is accomplished through the put_Progress method. As the calling program repeatedly calls the put_Progress method, passing in values between 0 and 1, the transform regenerates the image, taking the current progress state into account.

Invoking Transform Property Page

      For step six you need to invoke a custom property page if you want to set or modify the transform's settings. I am not going to dwell on the instantiation of the property pages—you can examine the OnPropertyPage method inside the TestHarness app yourself (you can find all the code for this article at the link at the top of this article). What I do want to mention is the somewhat odd relationship between the property page Apply button, the transform, and the final image result.
      Let's walk through the following scenario: the user brings up the property page and modifies some custom values in it. The user then hits the Apply button, expecting the changes he made to the property page to be reflected in the image, but nothing happens. What's the problem? The answer is that when the user clicked the Apply button, the image was modified inside the transform object, but the calling program's window hadn't been repainted yet, so the new changes were not visible. There are two reasons it hadn't been repainted. First, it didn't know that the image had changed. Second, the property page dialog box is modal and the calling app is blocking on it, so the calling program can't do anything.
      A solution to this problem is to have a timer that continuously polls the transform, asking if it's dirty. If the answer is yes, then repaint the window. Fortunately, image transforms already have such a mechanism built in, called the generation ID. The generation ID is a DWORD representing a version number of the object. When you modify your image filter's output, you should increment the output's generation ID. Look at the put_NegativeThreshold implementation in the sample code: the call to SetDirty does nothing more than bump up the generation ID counter. The IDXTransform interface derives from the IDXBaseObject interface (see Figure 6), which supports the GetGenerationId method. My TestHarness program keeps track of the last generation ID. On WM_TIMER, the following code is executed:

  DWORD dwGenId;
  // did the image change?
  m_pNegativeTrans->GetGenerationId(&dwGenId);
  if (m_dwLastExecGenId != dwGenId)
  {
    InvalidateRect(NULL);
    // record this for next time
    m_dwLastExecGenId = dwGenId;
  }

 

So when the user invokes the transform property page and presses the Apply button, the transform generation ID is changed, the timer code eventually detects it, and the display updates.
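
The generation ID handshake can be modeled without any of the DirectX interfaces. The sketch below is a simplified stand-in, not the TestHarness code: FakeTransform and FakeViewer are hypothetical classes that mirror only the counter and the timer check described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a transform: only the generation counter is modeled.
class FakeTransform
{
public:
    void SetDirty() { ++m_generationId; }   // called whenever the output changes
    std::uint32_t GetGenerationId() const { return m_generationId; }
private:
    std::uint32_t m_generationId = 0;
};

// Hypothetical stand-in for the harness window that polls on a timer.
class FakeViewer
{
public:
    explicit FakeViewer(const FakeTransform* transform)
        : m_transform(transform), m_lastGenId(0), m_repaints(0)
    {
        assert(transform != nullptr);
        m_lastGenId = transform->GetGenerationId();
    }

    // Called on every timer tick; returns true when a repaint is needed.
    bool OnTimer()
    {
        const std::uint32_t genId = m_transform->GetGenerationId();
        if (genId == m_lastGenId)
            return false;                    // nothing changed since last tick
        m_lastGenId = genId;                 // record this for next time
        ++m_repaints;                        // a real window would invalidate here
        return true;
    }

    int Repaints() const { return m_repaints; }

private:
    const FakeTransform* m_transform;
    std::uint32_t m_lastGenId;
    int m_repaints;
};
```

Note that several changes between two ticks collapse into a single repaint, because the viewer only compares the latest ID against the one it saw last.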

DXETool

      The discussion of test harnesses wouldn't be complete without mentioning DXETool. This really cool tool ships with the DirectX SDK and allows you to test your transform without writing a test harness of your own. If I had mentioned this fact up front, you probably wouldn't have read the sections on how to do it yourself in C++. What is unusual about the DXETool is that it is not hardcoded to test a particular transform. Instead, it makes use of the fact that all transforms belong to the CATID_DXImageTransform category. By examining the registry, DXETool makes a list of all the transforms installed on your system by enumerating through the CATID_DXImageTransform category. DXETool is then able to exercise all those transforms by using the interfaces discussed so far. The code dealing with the transforms is very generic and custom properties are tweaked interactively through property pages. A simplified excerpt from the DXETool code is shown in Figure 7.
      Before compiling DXETool, open stdafx.h and set the EMPTY_START variable to 1. Otherwise, DXETool will try to reference hardcoded paths and load image files that are no longer there.

Writing Your Own Transform

      Now that I have discussed how to instantiate, configure, and use a transform, I'll cover the development of the transform itself. In following these steps to build your own transform, replace the names CNegative and INegative with the names of your custom transform class and interface.

  1. Start a new ATL project and specify Dynamic Link Library as the server type.

  2. Open stdafx.h and add:

    #include <Dxtrans.h>
    #include <Dtbase.h>
    #include <Dxatlpb.h>

    and then open stdafx.cpp and add:

    #include <Dtbase.cpp>
    #include <Dxtguid.c>
    #include <Atlctl.cpp>

     

  3. Open the IDL file and add:

    import "Dxtrans.idl";
    

     

    Then insert a new ATL object into your project. The attributes of the new object are as follows: ThreadingModel is set to Both, Interface is set to Dual, Aggregation is set to Yes, and the Free Threaded Marshaler support is On. Now try compiling your project to make sure the DirectX files you have included in the project are found and compile successfully. If the files are not found, you probably need to install the DirectX SDK.

  4. Revisit your IDL file and change the derivation of your custom interface to IDXEffect rather than the default IDispatch. Then add a custom Property Page to your transform. Similar to the transform itself, set the Attributes tab Threading Model to Both, Interface to Dual, and Aggregation to Yes. Also check the Free Threaded Marshaler box.

  5. Examine the header file of your transform class and modify the class declaration to look like the following:

    class ATL_NO_VTABLE CNegative :
      //public CComObjectRootEx<CComMultiThreadModel>,
      public CDXBaseNTo1,
      public CComCoClass<CNegative, &CLSID_Negative>,
      public IDispatchImpl<INegative, &IID_INegative,
                           &LIBID_NEGATRANSFORMLib>,
      public IObjectSafetyImpl2<CNegative>,
      public IPersistStorageImpl<CNegative>,
      public ISpecifyPropertyPagesImpl<CNegative>,
      public IPersistPropertyBagImpl<CNegative>,
      public IOleObjectDXImpl<CNegative>

     

    The CDXBaseNTo1 base class contains a lot of useful functionality; I've already mentioned GetEffectProgress and SetDirty. See Figure 8 for a complete list of functions contained in the class. Most of the code and the work your transform does is going to reside in WorkProc, a virtual function declared in CDXBaseNTo1. Figure 9 shows the COM map.

  6. In the header file (negative.h in my example), replace

    DECLARE_REGISTRY_RESOURCEID(IDR_NEGATIVE)
    

     

    with:

    DECLARE_REGISTER_DX_TRANSFORM(IDR_NEGATIVE,
                                  CATID_DXImageTransform)

     

    As I have mentioned before, all transforms belong to a well-known category, allowing various tools to enumerate through a particular category to determine which transforms are installed on the system. For a 3D transform, the category name is CATID_DX3DTransform.

  7. Declare the object as aggregatable by adding this line:

    DECLARE_POLY_AGGREGATABLE(CNegative)
    

     

    Then add the following to the class declaration:

    DECLARE_IDXEFFECT_METHODS(0)
    

     

    The DECLARE_IDXEFFECT_METHODS macro argument specifies the value you want to have your transform return from the IDXEffect::get_Capabilities method. This can be a combination of the flags in the DXEFFECTTYPE enumeration or zero.

  8. Add a property map for the object by inserting the following within the class definition body:

    BEGIN_PROP_MAP( CNegative )
      PROP_ENTRY("NegativeThreshold", 1, CLSID_NegativePP)
      PROP_PAGE( CLSID_NegativePP )
    END_PROP_MAP()

     

    Then modify the constructor of your class to include:

    m_dwOptionFlags = DXBOF_CENTER_INPUTS;
    

     

    This instructs the transform container to center the transform output relative to the center of the display area. If you want the transform output to appear in the upper-left corner of the display surface, leave this line out.

  9. Now provide empty stubs for the OnSetup and WorkProc functions declared in the CDXBaseNTo1 base class. Here's what the stubs look like:

    HRESULT CNegative::OnSetup(DWORD dwFlags)
    {
      return S_OK;
    }

    HRESULT CNegative::WorkProc(const CDXTWorkInfoNTo1 &WI,
                                BOOL *pbContinue)
    {
      return S_OK;
    }

     

    You're finished! Compile and link your project.

      Well, you're almost finished. So far you've provided a functional template that can be used as a basis for your transform development. All that is missing at this point are the specifics of your transform and its custom properties. Try running the DXETool—if you have done everything correctly, your transform should be among those listed.

OnSetup Function

      The argument flags accepted by the OnSetup method are passed in by the calling application through IDXTransform::Setup. By the time the OnSetup function is called inside your transform, all the input and output surfaces have been set. At that point you can perform various one-time initializations specific to your transform. You can also cache input and output surface characteristics by accessing the surfaces first through InputSurface and OutputSurface function calls. Since the negative transform I have implemented for this article is somewhat simplistic, I left the body of the OnSetup function empty.

WorkProc Function

      The WorkProc function is where all the pixel massaging takes place. WorkProc is invoked in response to the IDXTransform::Execute method called by the client. What's worth noting about WorkProc is that it locks the input and output surfaces before accessing them:

  HRESULT LockSurface(
    const DXBNDS *pBounds,
    ULONG ulTimeOut,
    DWORD dwFlags,
    REFIID riid,
    void **ppPointer,
    ULONG *pulGenerationId
  );

 

      The dwFlags parameter controls the type of pointer you get back. The DXLOCKF_READ flag results in an IDXARGBReadPtr pointer being returned, and the DXLOCKF_READWRITE flag results in an IDXARGBReadWritePtr pointer (see Figure 10 and Figure 11). The surfaces have to be locked because they can be accessed and modified from multiple threads. Any number of threads can use LockSurface to obtain read pointers to the same surface or regions within that surface. When requesting a read/write pointer, however, the method checks to see whether the requested bounds overlap any regions that are currently read/write locked. If there is no overlap, the method locks the specified region of the surface and returns the read/write pointer.
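
To make the overlap rule concrete, here is a minimal single-threaded model of the read/write region check. It is an assumption-laden sketch, not the SDK's implementation: Rect, Overlaps, and RegionLocks are hypothetical names, rectangles use exclusive right and bottom edges, and a real surface would also need thread synchronization and per-lock release.

```cpp
#include <cassert>
#include <vector>

// Hypothetical rectangle type; right and bottom are exclusive.
struct Rect { int left, top, right, bottom; };

// True when the two rectangles share at least one pixel.
bool Overlaps(const Rect& a, const Rect& b)
{
    return a.left < b.right && b.left < a.right &&
           a.top  < b.bottom && b.top  < a.bottom;
}

// Hypothetical bookkeeping for read/write locked regions of one surface.
class RegionLocks
{
public:
    // Grants the lock only if the region overlaps no region that is
    // currently read/write locked, mirroring the rule described in the text.
    bool TryLockReadWrite(const Rect& region)
    {
        assert(region.left < region.right && region.top < region.bottom);
        for (const Rect& held : m_writeLocked)
            if (Overlaps(held, region))
                return false;                // overlap: request is refused
        m_writeLocked.push_back(region);
        return true;
    }

    // Releases every read/write lock (kept deliberately simple).
    void UnlockAll() { m_writeLocked.clear(); }

private:
    std::vector<Rect> m_writeLocked;
};
```

Because the check is per region, two threads can hold read/write locks on disjoint parts of the same surface at the same time.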
      Once the surfaces have been locked, I allocate space for one image row. This is where the new calculated pixel values are going to be placed before the entire row is transferred to the final destination. The pseudocode looks like this:

  // loop over the height of the image
  for each row y
  {
    // get the current input row
    pInImage->MoveToRow(y);
    pInImage->UnpackPremult(pRowBuff, width, FALSE);

    // massage the data in pRowBuff
    // .........

    // move to the current output row and
    // place the data there
    pOutImage->MoveToRow(OutY);
    pOutImage->PackPremultAndMove(pRowBuff, width);
  }
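
The same row-buffer pattern can be tried outside the SDK on an in-memory image. The sketch below is not the code from Negative.cpp: PMSample, NegateRow, and NegateImage are hypothetical names, the threshold property is ignored, and the per-pixel operation is a plain negative. It assumes premultiplied samples, where inverting a channel c with alpha a gives a - c, and it also shows a continue flag being checked once per row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical premultiplied 32-bit sample: each color channel is already
// multiplied by alpha, so channel <= alpha always holds.
struct PMSample { std::uint8_t blue, green, red, alpha; };

// Inverts one row in place. For a premultiplied channel c with alpha a,
// the negative is a - c (alpha itself is left unchanged).
void NegateRow(std::vector<PMSample>& row)
{
    for (PMSample& s : row)
    {
        assert(s.red <= s.alpha && s.green <= s.alpha && s.blue <= s.alpha);
        s.red   = static_cast<std::uint8_t>(s.alpha - s.red);
        s.green = static_cast<std::uint8_t>(s.alpha - s.green);
        s.blue  = static_cast<std::uint8_t>(s.alpha - s.blue);
    }
}

// Processes the image one row at a time through a scratch buffer.
// Returns the number of rows processed; stops early when *pContinue is false.
std::size_t NegateImage(const std::vector<PMSample>& in,
                        std::vector<PMSample>& out,
                        std::size_t width, std::size_t height,
                        const bool* pContinue)
{
    assert(pContinue != nullptr && in.size() == width * height);
    out.resize(in.size());
    std::vector<PMSample> rowBuff(width);            // space for one image row
    std::size_t y = 0;
    for (; y < height && *pContinue; ++y)
    {
        const std::ptrdiff_t first = static_cast<std::ptrdiff_t>(y * width);
        const std::ptrdiff_t last  = static_cast<std::ptrdiff_t>((y + 1) * width);
        // get the current input row
        std::copy(in.begin() + first, in.begin() + last, rowBuff.begin());
        // massage the data in the row buffer
        NegateRow(rowBuff);
        // place the data in the current output row
        std::copy(rowBuff.begin(), rowBuff.end(), out.begin() + first);
    }
    return y;
}
```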

 

      If you look at the actual code in Negative.cpp you will notice that the output portion of the code looks like this:

  pOutImage->MoveToRow(OutY);
  if (DoOver())
    pOutImage->OverArrayAndMove
    (
      pScratchBuff, pRowBuff, width
    );
  else
    pOutImage->PackPremultAndMove(pRowBuff, width);

 

The DoOver function checks for the type of output that should be performed by the transform—with or without the transparency support. It's up to you to pay attention to the flags set by the caller and act accordingly. These transform hints are set by the caller using the SetMiscFlags function:

  m_cpDXTransform->GetMiscFlags(&dwFlags);
  dwFlags &= (~DXTMF_BLEND_WITH_OUTPUT);
  if (bBlend)
    dwFlags |= DXTMF_BLEND_WITH_OUTPUT;
  m_cpDXTransform->SetMiscFlags(dwFlags);

 

      So if the user requests that transparency is turned on, the OverArrayAndMove function alpha blends an array of samples over the output.
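
For readers unfamiliar with the "over" operation, the following sketch shows the standard premultiplied-alpha formula, result = source + destination * (1 - source alpha), on 8-bit channels. It illustrates the general technique only; how OverArrayAndMove rounds or optimizes internally is not described here, and PMColor, OverChannel, and Over are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical premultiplied color with 8-bit channels.
struct PMColor { std::uint8_t blue, green, red, alpha; };

// One channel of the premultiplied "over" operator:
// result = src + dst * (255 - srcAlpha) / 255, rounded and clamped to 255.
static std::uint8_t OverChannel(std::uint8_t src, std::uint8_t dst,
                                std::uint8_t srcAlpha)
{
    const unsigned scaled =
        (static_cast<unsigned>(dst) * (255u - srcAlpha) + 127u) / 255u;
    const unsigned sum = src + scaled;
    return static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
}

// Blends a premultiplied source sample over a premultiplied destination.
PMColor Over(const PMColor& src, const PMColor& dst)
{
    assert(src.red <= src.alpha && src.green <= src.alpha &&
           src.blue <= src.alpha);
    return PMColor{ OverChannel(src.blue,  dst.blue,  src.alpha),
                    OverChannel(src.green, dst.green, src.alpha),
                    OverChannel(src.red,   dst.red,   src.alpha),
                    OverChannel(src.alpha, dst.alpha, src.alpha) };
}
```

An opaque source replaces the destination entirely, while a fully transparent source leaves it untouched; everything in between is a weighted mix.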
      While the loop is iterating over the height of the image, I am continuously checking the value of the pbContinue variable (the second argument of the WorkProc function) because the transform algorithm can be stopped at any time by the transform Task Manager. The Task Manager is a generic service used to schedule and execute caller-defined tasks. See Figure 12 for a list of methods that make up the IDXTaskManager interface.

Going Out with a Bang

      In writing this article, I really wanted to demonstrate a cool image transform. At the same time, I did not want to make the transform so complex that it obscured the purpose of the article, which is to demonstrate the simplicity and elegance behind the transform model. So to dazzle you I have used a really simple transform I came across about 12 years ago. It comes from a book by Gerard Holzmann called Beyond Photography: The Digital Darkroom (Prentice Hall, 1988). The transform image-manipulation formula looks like this

  newImage[r,a] = originalImage[sqrt(r*R), a]
  

 

where the variables r and a are the polar coordinates of a particular pixel: r is the radius and a is the angle. The capital R is the maximum radius, defined as:

  R = min(imageHeight, imageWidth) / 2
  

 

      The transform has the effect of pulling or pinching the image towards the center and fanning out the edges of the image. To save time (and out of laziness) I have reused the body of the Negative transform that I developed earlier and simply wrote another WorkProc function. You can control the choice of WorkProcs by changing the Visual C++ Project settings. To view and experiment with the Pinching transform, add POLAR=1 to the project's preprocessor definitions and recompile. Figure 13 shows before and after results of the transform.
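
The polar formula translates into a small coordinate-mapping routine. The sketch below is one possible reading of the formula, not the WorkProc from the sample: Point and PinchSource are hypothetical names, the image center is assumed to be the origin of the polar coordinates, and no bounds checking or pixel interpolation is done. Because the angle is unchanged, the mapping reduces to scaling the offset from the center by sqrt(r*R)/r, so no trigonometry is needed.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2D point in pixel coordinates.
struct Point { double x, y; };

// For a destination pixel, returns the source position to sample from,
// following newImage[r,a] = originalImage[sqrt(r*R), a].
Point PinchSource(Point dest, Point center, double maxRadius)
{
    assert(maxRadius > 0.0);
    const double dx = dest.x - center.x;
    const double dy = dest.y - center.y;
    const double r  = std::sqrt(dx * dx + dy * dy);   // destination radius
    if (r == 0.0)
        return center;                                // the center maps to itself
    // Same angle, new radius sqrt(r*R): scale the offset vector accordingly.
    const double scale = std::sqrt(r * maxRadius) / r;
    return Point{ center.x + dx * scale, center.y + dy * scale };
}
```

With R = 100, a destination pixel at radius 25 samples the source at radius 50, which is why image content appears pulled toward the center, while pixels at radius R stay where they are.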

Figure 13 My Transform

Conclusion

      The DirectX transform framework provides many benefits to developers and users alike. Among them are COM component programming model support, integration with CSS in Internet Explorer 5.5, and rich graphics file format support. If you are in the business of developing image-manipulation software and want to capture a wide customer base, you should seriously consider packaging your code as DirectX transforms.

For related articles see:
Introduction to DirectX Transform

Alex Lerner is a developer for the Microsoft Consulting Services in New York. In his spare time he likes to take perfectly normal photographs and digitally manipulate them until they are no longer recognizable.

From the March 2001 issue of MSDN Magazine