Improving the Performance of DES Preview

by Nick Vicars-Harris and Olivier Colle

Microsoft Corporation

July 2003

Applies to:

Microsoft® DirectShow® Editing Services

Microsoft® DirectX® version 9.0

Summary

Microsoft® DirectX® version 9.0 or later allows C++ developers to change the resizing behavior of Microsoft® DirectShow® Editing Services (DES) by implementing a custom video resizer filter. This article describes how to implement an optimized video resizer in C++ that enables the DES Render Engine to run significantly faster than it can with a non-optimized resizer.

Introduction

This article is intended to help developers who are using DES. The article and accompanying sample code show how to optimize the DES pipeline to achieve greater performance when rendering to a timeline whose frame rate is less than the source frame rate. The optimization technique involves implementing a custom video resizer filter that defers video-resize operations until late in the DES pipeline, after unnecessary data has been removed from the video sample.

This article covers the following topics:

  • A Brief Overview of DES. Describes the main features of DES.
  • Providing a Custom Video Resizer. Describes how to change the resizing behavior of DES by implementing a custom resizer as a DirectShow transform filter.
  • DES Source Pipeline. Discusses the components that make up the DES source pipeline.
  • Modified DES Source Pipeline. Describes how to implement a custom video resizer that defers resize operations until late in the DES source pipeline.
  • Frequently Asked Questions. Provides answers to common questions about deferring video resize operations.
  • For More Information. Provides additional resources.
  • Sample Application. The accompanying sample code (in C++) includes a video resizer that defers resize operations, and an application that creates a timeline using the resizer.

A Brief Overview of DES

Microsoft DirectShow Editing Services (DES) is an application programming interface (API) that greatly simplifies the tasks involved in video editing. DES is built on top of the core DirectShow architecture. It abstracts much of the complexity of DirectShow and provides a set of interfaces designed specifically for manipulating video editing projects. Because DES handles a wide variety of media types, you no longer need to write separate code paths for each kind of content that your editing software manipulates.

DES brings the following features to DirectShow:

  • A timeline model that organizes video and audio tracks into nested layers, making it easy to manipulate the final production
  • The ability to preview a video project on the fly
  • Project persistence through an XML-based format
  • Support for video and audio effects, as well as transitions between video tracks (such as fades and wipes)
  • Over 100 standard wipes, as defined by the Society of Motion Picture and Television Engineers (SMPTE)
  • Keying based on hue, luminance, RGB value, or alpha value
  • Automatic conversion of frame rates and audio sampling rates, enabling a production to use heterogeneous sources
  • Resizing or cropping of video

Providing a Custom Video Resizer

By default, DES uses a StretchBlt operation to resize a video source clip. A StretchBlt operation is fast, but does not provide anti-aliasing for the resized video clip. You can change the resizing behavior by implementing a custom resizer as a DirectShow transform filter. The filter must expose the IResize interface, which enables DES to specify the input and output video size. For information about writing a transform filter, see the section of the DirectShow documentation entitled "Writing Transform Filters." The CTransformFilter base class is recommended as the starting point. When you implement the filter, note the following:

  • You must support the IResize interface on the filter (not the pins).
  • The filter should accept only VIDEOINFOHEADER formats (FORMAT_VideoInfo). Reject other format types.
  • The video format from DES may be any uncompressed RGB type, including 32-bit RGB with an alpha channel (MEDIASUBTYPE_ARGB32). Your filter can safely reject formats that use a biHeight of less than 0.
  • Before the Render Engine connects the filter's output pin, it calls IResize::put_MediaType to set the output type. It may also call IResize::put_Size to adjust the output size. The Render Engine can call these two methods in any order, any number of times, before it connects the output pin.
  • After the Render Engine connects the output pin, it might call put_Size again. The resizer filter should reconnect its output pin with the new size.
  • Inside the filter's CTransformFilter::Transform method, you must stretch the input video to the output size.
  • Your filter should never set the discontinuity flag on the output sample, or attach a media type to the output sample.
  • To save the filter's state in a GraphEdit (.grf) file, implement the IPersistStream interface. (This is optional, but useful for testing.)
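
Taken together, these requirements suggest a filter declaration along the following lines. This is a sketch only: the class name is illustrative, and the IResize method signatures should be verified against the qedit.h header in your SDK.

```cpp
// Sketch only. Assumes the DirectShow base classes and qedit.h from the
// DirectX 9 SDK; "CMyResizer" is an illustrative name.
class CMyResizer : public CTransformFilter, public IResize
{
public:
    DECLARE_IUNKNOWN;

    // Expose IResize on the filter itself, not on the pins.
    STDMETHODIMP NonDelegatingQueryInterface(REFIID riid, void **ppv)
    {
        if (riid == IID_IResize)
            return GetInterface(static_cast<IResize*>(this), ppv);
        return CTransformFilter::NonDelegatingQueryInterface(riid, ppv);
    }

    // IResize methods. The Render Engine may call put_MediaType and
    // put_Size in any order, any number of times, before connecting the
    // output pin, and may call put_Size again afterward. Verify the exact
    // parameter order against your qedit.h.
    STDMETHODIMP get_Size(int *piHeight, int *piWidth, long *pFlag);
    STDMETHODIMP get_MediaType(AM_MEDIA_TYPE *pmt);
    STDMETHODIMP put_MediaType(const AM_MEDIA_TYPE *pmt);
    STDMETHODIMP put_Size(int Height, int Width, long Flag);

    // CTransformFilter overrides. CheckInputType should accept only
    // FORMAT_VideoInfo types, and may reject formats with biHeight < 0.
    HRESULT CheckInputType(const CMediaType *mtIn);
    HRESULT Transform(IMediaSample *pIn, IMediaSample *pOut);
};
```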

To use the resizer filter, register it as a COM object on the user's system. Before the application renders the video project, query the Render Engine for the IRenderEngine2 interface and call IRenderEngine2::SetResizerGUID, passing the CLSID of the resizer filter.
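
Assuming the resizer is registered under a placeholder class ID, CLSID_MyResizer, the call might look like the following sketch (error handling abbreviated):

```cpp
// pRenderEngine is an existing IRenderEngine pointer for the project.
// CLSID_MyResizer is a placeholder for your resizer's CLSID.
CComQIPtr<IRenderEngine2> pRender2(pRenderEngine);
if (pRender2)
{
    HRESULT hr = pRender2->SetResizerGUID(CLSID_MyResizer);
    // Check hr before connecting the front end and rendering as usual.
}
```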

DES Source Pipeline

The following diagram shows the components that make up the DES pipeline for both single- and multiple-source projects. Optional components are enclosed in square brackets.

DES Pipeline

The media source that feeds into the decoder is one of many optional sources. The video resizer and the frame rate converter (FRC) change the media into the type that the switch expects. The switch then hands off the video stream in accordance with the timeline requirements for further optional processing. Finally, the stream is sent to the output queue to be rendered to the screen or other device.

Modified DES Source Pipeline

The problem with the DES pipeline structure is that the resizer is asked to perform its work before the frame-rate conversion has occurred. In scenarios where frames will be dropped from the timeline, this wastes CPU cycles resizing samples that will never be used. For example, suppose that the DES timeline is running in preview mode at 15 frames per second (fps) and at a resolution of 320 by 240 pixels, and that the source is running at 30 fps. To reach the 15-fps playback rate, half of the source's content must be discarded before it reaches the output queue, so the video resizer does twice as much work as it needs to. By using a custom video resizer, the resize operation can be deferred until after the extra preview content has been discarded, reducing the processing requirements and significantly improving performance.

Deferring Video Resize Operations

The key to deferring video resize operations is to replace the standard video resizer with one that indicates to the filter graph that the resize operation has already been done. The custom resizer can then carry out the actual resize operation later in the pipeline, after the extra content has been discarded.

To defer the video resize operations, the custom video resizer implements its own memory allocator on its output pin. A custom allocator is needed so that the filter can allocate custom samples based on the CDeferredSample class, a thin wrapper on top of the CMediaSample class. Although the wrapper does little more than override the GetPointer function, that override is the heart of the technique.

In a DirectShow media sample, the GetPointer function is called when a filter needs to access the internal buffer of the media sample in order to work with the pixels. What makes deferred resizing possible is the fact that neither the DES frame rate converter nor the switch call GetPointer on the media sample. They only change the sample presentation times without ever accessing the actual pixel bits. It is only later that a downstream filter such as a transition, an effect, or the video renderer itself will access the pixels. Implementing the resizing code within the GetPointer function ensures that the resizing operation will not be invoked until after all unneeded samples have been dropped.
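
The lazy-evaluation pattern at the heart of this technique can be illustrated in portable C++, stripped of all DirectShow machinery. DeferredSample below is a hypothetical stand-in for the sample's CDeferredSample class, and a one-dimensional nearest-neighbor copy stands in for the real resize; the point is that the expensive work runs only if, and when, the buffer is first requested:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for CDeferredSample: the "resize" is deferred
// until the first call to GetPointer, and skipped entirely for samples
// whose buffer is never requested (i.e., samples the FRC drops).
class DeferredSample {
public:
    DeferredSample(std::vector<std::uint8_t> src, std::size_t outSize)
        : m_src(std::move(src)), m_out(outSize, 0), m_resized(false) {}

    // Stand-in for IMediaSample::GetPointer: do the deferred work on the
    // first access only, then hand out the resized buffer.
    const std::uint8_t* GetPointer() {
        if (!m_resized) {
            Resize();
            m_resized = true;
        }
        return m_out.data();
    }

    bool WasResized() const { return m_resized; }

private:
    void Resize() {
        // Trivial 1-D nearest-neighbor copy, standing in for the real
        // StretchBlt-style video resize.
        for (std::size_t i = 0; i < m_out.size(); ++i)
            m_out[i] = m_src[i * m_src.size() / m_out.size()];
    }

    std::vector<std::uint8_t> m_src;   // full-size original data
    std::vector<std::uint8_t> m_out;   // resized data, filled lazily
    bool m_resized;                    // true once the work has been done
};
```

A sample that is dropped before any downstream filter touches its pixels never pays for the resize, which is exactly the saving the deferred resizer achieves in the DES graph.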

Note   Deferring the resize operations requires DirectX version 9.0 or later because only those versions allow you to insert a custom video resizer into the pipeline.

Responding to Transform and GetPointer

The attached sample code defines a new class called CDeferredResizerFilter that is derived from CTransformFilter and IResize. Besides supporting the standard methods, the CDeferredResizerFilter class overrides the following two methods:

  • Transform – The filter graph calls this method once for every sample the filter receives. Instead of resizing the video content, the method simply passes a pointer from the input sample to the output sample. The filter, however, behaves as if the content has been resized.
  • GetPointer – Any downstream filter that needs to examine or modify the actual pixel bits calls this method. As stated earlier, neither the frame rate converter nor the DES switch object calls it. So instead of simply returning a pointer, as the base class implementation does, the sample uses this overridden method to resize the video frame, and then returns a pointer to the resized frame by calling the base class method. The following code snippet shows the implementation of the CDeferredSample::GetPointer method.

HRESULT CDeferredSample::GetPointer(BYTE **ppBuffer)
{
    if (m_pOrgSample)
    {
        // Take a local reference and release the member so that the
        // resize runs only on the first call to GetPointer.
        CComPtr<IMediaSample> pSample = m_pOrgSample;
        m_pOrgSample.Release();

        RTN_HR_IF_BADPTR(m_pDeferredResizerFilter);

        // Resize the frame here.
        RTN_HR_IF_FAILED(m_pDeferredResizerFilter->DoTransform(pSample, this));
    }
    return CMediaSample::GetPointer(ppBuffer);
}

Note that a downstream filter may call GetPointer any number of times. To ensure that the resize occurs only on the first call, the method checks whether m_pOrgSample is non-NULL. A non-NULL pointer indicates that GetPointer has not yet been called on this sample, and therefore that the sample has not yet been resized.

The following steps and accompanying diagrams describe the progress of a 720 by 480 video sample as it passes through a DES filter graph that uses a custom video resizer to defer resize operations.

  1. The source reads the sample1.wmv file and generates new samples. Each sample is passed to the decoder, which outputs an uncompressed frame buffer of 720 by 480 pixels.
  2. The deferred resizer's Transform method is called one time for each sample that is received. This method does nothing more than cast the sample to a CDeferredSample and store the pointer in that object's m_pOrgSample member variable. The inner sample is still 720 by 480 pixels and has not been resized.
  3. The FRC receives the sample, which it thinks is 320 by 240 pixels because that was the media type that was indicated by the deferred resizer. However, the sample has not actually been resized yet and is still 720 by 480 pixels. The FRC adjusts the sample time to convert from 30 fps to 15 fps. Because the FRC modifies the sample time without accessing the pixel bits, the GetPointer method is not called.
  4. The DES switch receives the samples at a rate of 15 samples per second. Each sample is still 720 by 480 pixels. The switch sends the samples to the appropriate downstream filters, such as transition filters, effects filters, or the preview filter. The samples are still 720 by 480 pixels, even though each downstream filter thinks they are 320 by 240 pixels.
  5. At some point, a downstream filter will need to access the pixels of the media sample by calling the GetPointer method on the media sample. The CDeferredSample class overrides GetPointer, intercepts the calls, performs the resize operation, and returns the new 320 by 240 pixel sample. To the caller, it seems that the sample has always been 320 by 240 pixels.
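
As a sketch of step 2, the Transform override might look like the following (SetOriginalSample is a hypothetical helper name; the actual sample stores the pointer in the m_pOrgSample member):

```cpp
HRESULT CDeferredResizerFilter::Transform(IMediaSample *pIn, IMediaSample *pOut)
{
    // Do not touch the pixels here. The output sample was created by our
    // custom allocator as a CDeferredSample; just remember the full-size
    // input sample so that GetPointer can resize it later, after the FRC
    // and switch have dropped any unneeded samples.
    CDeferredSample *pDeferred = static_cast<CDeferredSample*>(pOut);
    pDeferred->SetOriginalSample(pIn);   // AddRefs pIn into m_pOrgSample
    return S_OK;
}
```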

The following diagram shows where the deferred resizing operation occurs.

Deferred resizing operation

Frequently Asked Questions

This topic provides answers to common questions about deferring video resize operations.

Can we be sure that the resize operation will happen?

Yes, the resize operation will happen whether you have a transition, an effect, or a simple preview pipeline. In the first two cases, it is obvious that the transform filter has to call the GetPointer method on the sample because it is the only way that the filter can access the pixels. The preview case is similar. To display the pixels on the screen, the preview filter must also access the pixels by calling GetPointer.

What is gained by deferring the resize operation?

Every operation performed before the FRC runs at the frame rate of the source (for NTSC sources, approximately 30 fps), and every subsequent operation runs at the frame rate set by IAMTimelineGroup::SetOutputFPS. By delaying the resize operation until after the FRC, the resizer in the 30-fps-to-15-fps example runs only half as often.

Is there another operation I can defer?

Yes. In DES, color conversion is one of the most CPU-intensive operations that must be performed when dealing with WMV or other compressed formats. DES only supports RGB, but the internal format of WMV is YV12. By modifying the video resizer to accept the YV12 media type and to output RGB24, you can also defer the color conversion operation. Even better, you can resize first in YV12, and then do the color conversion afterward.
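
As a sketch, the input-type check for such a combined resizer and color converter might accept both formats (this deliberately relaxes the RGB-only guidance given earlier for the basic resizer; the deferred GetPointer code must then both resize and convert):

```cpp
HRESULT CDeferredResizerFilter::CheckInputType(const CMediaType *mtIn)
{
    // Sketch only: accept YV12 as well as 24-bit RGB so that the
    // YV12-to-RGB conversion can be deferred along with the resize.
    if (*mtIn->FormatType() != FORMAT_VideoInfo)
        return VFW_E_TYPE_NOT_ACCEPTED;
    if (*mtIn->Subtype() == MEDIASUBTYPE_YV12 ||
        *mtIn->Subtype() == MEDIASUBTYPE_RGB24)
        return S_OK;
    return VFW_E_TYPE_NOT_ACCEPTED;
}
```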

Sample Application

The sample application consists of a simple filter graph that shows how to turn deferred video-resize operations on and off. The main function defines a constant called IMPROVE_PERFORMANCE that, when set, uses the IRenderEngine2::SetResizerGUID method (new in DirectX 9.0) to insert the deferred resizer into the filter graph.

For More Information

For information about DirectShow programming, see the DirectShow SDK documentation.

For more information about Microsoft DirectX version 9.0, see DirectX 9.0 for C++.