Migrating from DirectShow to Media Foundation

 

Microsoft Corporation

July 2006

 

Applies to:

   Microsoft DirectShow

   Microsoft Media Foundation

 

Summary: Media Foundation is a new application programming interface (API) for digital media in Windows Vista. This paper explains how applications that are now built using DirectShow can begin using Media Foundation.

 

Contents

Introduction

Comparing Media Foundation with DirectShow

Migration Paths

Application Scenarios

Conclusion

Appendix: Feature Comparisons

For More Information

Introduction

This paper explains how applications that now use Microsoft DirectShow can begin adopting Media Foundation, a digital media API in Microsoft Windows Vista. The extent to which you leverage Media Foundation will depend on the particular needs of your application. The purpose of this paper is to help you weigh the benefits of incorporating Media Foundation into your application and to decide on the migration strategy that best fits your product.

This paper assumes that you are familiar with DirectShow. Also, this paper does not describe the design of Media Foundation except in passing. For a high-level overview of Media Foundation, see the Windows SDK documentation for Windows Vista.

Comparing Media Foundation with DirectShow

The era of high-definition digital media is here. Today's consumer can view high-definition (HD) digital television; the next generation of HD DVD will be here soon; and every day, millions of people download digital music. In short, digital media has become ubiquitous. But high-value content requires robust protection. The current digital media platform does not offer a sufficient level of protection for premium content. Media Foundation has been designed to meet this challenge.

To fully embrace the new wave of high-definition content, the platform must also be resilient to glitches and must offer unparalleled audio and video quality. Media Foundation addresses this requirement with audio and video quality enhancements throughout the platform, which make it possible to deliver a great experience for next-generation high-definition content. For example:

  • DirectX Video Acceleration (DXVA) 2.0 offers more efficient video acceleration than DXVA 1.0, with more robust, streamlined video decoding and extended use of hardware in video processing. With DXVA 2.0, Windows can handle some of the most demanding high-definition content with high quality and improved glitch resilience.
  • Color-space information is preserved throughout the video pipeline. Users can enjoy video content with full fidelity. Preserving color-space information also reduces unnecessary color-space conversions, which frees more cycles to process demanding HD content.
  • The Enhanced Video Renderer (EVR) offers better timing support, enhanced video processing, and improved glitch resilience.

Another challenge is the wide variety of content protection technologies and standards that are associated with different types of premium content. A consumer might purchase a song online, download it to her personal computer, stream it over the home network to a digital audio receiver, and then transfer it to a portable media player. At each step, different protection technologies might be used. It is not enough just to support these technologies—they must interoperate as seamlessly as possible.

Finally, Media Foundation is designed to address limitations in the current digital media platform. Although DirectShow is a versatile API for writing digital media applications, its core architecture has been in place for about a decade and is starting to show its age. For example:

  • The pipeline in DirectShow tends to be static. Implementing dynamic graphs and major format changes is a complex task.
  • The threading model for DirectShow filters is complex and requires a thorough understanding to implement correctly.
  • Filters cannot be used easily outside of a DirectShow graph, which ties them to the DirectShow pipeline.
  • DirectShow does not readily support protected content.

For all of these reasons, Media Foundation has been designed from the beginning as a completely new API. In the long term, Media Foundation is intended to replace DirectShow. The version of Media Foundation that is being made available in Windows Vista represents the first step toward this goal.

That said, Media Foundation does not yet encompass all of the capabilities of the existing Windows digital media platform, of which DirectShow is a key component. Instead, this release of Media Foundation focuses on protected media processing, which includes not only playback of protected content but also transcoding and transcription between different formats and content protection technologies.

Protected media processing has two goals:

  • Customers can make legitimate use of media content they purchase without difficulty. Moreover, by making the personal computer a trusted part of the digital entertainment ecosystem, Media Foundation will enable more premium content to flow through the computer, which will benefit consumers. 
  • Content providers can deliver high-quality, premium audio and video through the personal computer, in a robust environment, while granting specific rights for the use of that content.

The key feature of Media Foundation that supports protected content is the Protected Media Path (PMP), which provides a protected environment for running audio and video processing pipelines.

Migration Paths

The basic infrastructure is in place for Media Foundation to be expanded into a complete digital media platform. In the meantime, existing digital media APIs, including DirectShow, will continue to be used in applications written for Windows Vista. Your migration path depends on the type of application you are writing, whether you are writing a new application or maintaining an existing application, and whether you want to provide features that are specific to Windows Vista.

It is not expected that every digital media application will adopt Media Foundation immediately. The typical migration path will be to use existing SDKs, such as DirectShow and the Windows Media Format SDK, and incorporate Media Foundation as needed. However, the sooner you begin building with Media Foundation, the better positioned you will be to capitalize on the wave of next-generation premium content.

The following new features in DirectShow and Media Foundation will simplify the migration path to Media Foundation:

  • Media Foundation uses a new model for video and audio transforms. Media Foundation Transforms (MFTs) are an evolution from DirectX Media Objects (DMOs), which were introduced in the DirectX 8.0 SDK. Compared with DMOs, the required behaviors of MFTs are more clearly specified, which makes it easier to write a correct implementation. In addition, MFTs can support hardware-accelerated video transforms.
  • Media Foundation provides a new video renderer, called the Enhanced Video Renderer (EVR). The EVR uses the next version of DirectX Video Acceleration (DXVA 2.0) for more efficient video rendering, and it has a simpler API for creating custom video presenters. To make EVR adoption easier, DirectShow provides an EVR filter in Windows Vista. Internally, the DirectShow EVR and the Media Foundation EVR use the same mixer and presenter objects. If you write a custom presenter, it can be used with either DirectShow or Media Foundation. DirectShow applications that use the EVR filter for advanced video rendering will be well placed to convert to a Media Foundation implementation in the future.
  • With DXVA 2.0, video acceleration is now available directly to user-mode components without needing to communicate with the DirectShow video renderer. Previously, DXVA was accessible only through the video renderer. Decoders can now take advantage of DXVA 2.0 to provide fast video decoding without any dependency on DirectShow. Applications can also use DXVA 2.0 to perform video processing operations, such as contrast and gamma adjustment.

Application Scenarios

This section describes some typical scenarios for digital media applications, and provides recommendations for which technologies to use.

Simple Playback

The most basic type of digital media application is one that provides audio or video playback with transport controls (play, pause, fast forward, rewind, and so forth). For this type of application, you should consider using the Windows Media Player OCX. On Windows Vista, Windows Media Player supports protected as well as unprotected content. The OCX exposes the playback functionality of Windows Media Player through an object model that is easier to program than lower-level APIs such as DirectShow or Media Foundation. Also, you can write a managed .NET application that uses the OCX through COM interoperability.

Customized Playback

If you need or want to go beyond the playback features provided by the Windows Media Player OCX, you should use the Media Foundation pipeline to play protected content. For unprotected Windows Media or MP3 content, you can also use Media Foundation and get the benefits of enhanced video fidelity and glitch resiliency. Use DirectShow for other unprotected content.

For example, suppose you are creating a Windows Media–based music or video service with a customized end-to-end experience. This type of application would use the Media Foundation control layer for Windows Media content and MP3 files, and would use DirectShow for other unprotected content, such as AVI and WAV files. You should continue to use DirectShow for traditional DVD playback as well.

Similarly, if your existing application already uses DirectShow for playback, and you want to support protected content in Windows Vista, you should add a code path that uses the Media Foundation pipeline for protected content playback, while continuing to use DirectShow for the existing functionality. If you use any advanced video rendering features, consider using the DirectShow EVR filter.
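
The recommendations in this section can be condensed into a small routing function. The following C++ sketch is one reading of that guidance, not code from any SDK. The Pipeline, Format, and Content types and the ChoosePlaybackPipeline function are hypothetical names introduced for illustration, and a real application would have to determine the format and protection status by inspecting the file or stream.

```cpp
// Hypothetical types for illustration only; none of these come from an SDK.
enum class Pipeline { MediaFoundation, DirectShow };
enum class Format { WindowsMedia, Mp3, Avi, Wav, Dvd, Other };

struct Content {
    Format format;      // source type, as detected by the application
    bool is_protected;  // true if the content requires content protection
};

// Picks a playback pipeline following the scenario guidance in this section.
Pipeline ChoosePlaybackPipeline(const Content& content) {
    // Traditional DVD playback stays on DirectShow.
    if (content.format == Format::Dvd) {
        return Pipeline::DirectShow;
    }
    // Protected content goes through the Media Foundation pipeline.
    if (content.is_protected) {
        return Pipeline::MediaFoundation;
    }
    // Unprotected Windows Media and MP3 content can also use Media Foundation.
    if (content.format == Format::WindowsMedia || content.format == Format::Mp3) {
        return Pipeline::MediaFoundation;
    }
    // All other unprotected content (AVI, WAV, and so forth) stays on DirectShow.
    return Pipeline::DirectShow;
}
```

With this function, a protected Windows Media file is routed to Media Foundation, and an unprotected AVI file stays on the existing DirectShow code path.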

Third-Party Formats or Content Protection Technologies

In addition to providing built-in support for Windows Media Format and Windows Media Digital Rights Management, Media Foundation is designed to meet the content protection robustness requirements of next-generation premium content. If you provide premium content in a third-party format or use a custom or industry-standard content-protection technology, you should build custom components using Media Foundation. The types of components that you might create include:

  • Media source. A media source introduces media data into the pipeline, similar to a source filter in DirectShow.
  • Input trust authority (ITA). An ITA encapsulates the content protection technology for a media source. An ITA might define rights, enable license acquisition, and decrypt the content.
  • Media Foundation Transform (MFT). An MFT is a component that processes a media stream, similar to a transform filter. Write MFTs for your encoders, stream parsers, demultiplexers, and so forth—anything that manipulates the bits in the stream.
  • Media sink. A media sink receives data, processes it, and delivers it to a destination, similar to a renderer filter in DirectShow.
  • Output trust authority (OTA). An OTA enables protected content to reach its intended destination securely. The OTA enforces any output protection mechanisms that are needed.

With these objects in place, you can plug into the Media Foundation Protected Media Path so that your content works seamlessly within the Media Foundation pipeline.
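
As a rough illustration of how these component types relate, the following C++ sketch models the data flow in miniature. It is a toy under stated assumptions, not Media Foundation code: all type and function names are hypothetical, an XOR with a fixed key stands in for real decryption, and a pair of Boolean flags stands in for output protection policy.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

using Sample = std::vector<std::uint8_t>;

// Hypothetical input trust authority: checks for a license and decrypts
// samples. XOR with a fixed key is a placeholder for a real cipher.
struct ToyInputTrustAuthority {
    std::uint8_t key;
    bool license_acquired;

    bool Decrypt(Sample& sample) const {
        if (!license_acquired) return false;  // no rights, no clear data
        for (std::uint8_t& b : sample) b = static_cast<std::uint8_t>(b ^ key);
        return true;
    }
};

// Hypothetical output trust authority: allows delivery only if the output
// is protected or the content does not require output protection.
struct ToyOutputTrustAuthority {
    bool output_is_protected;
    bool protection_required;

    bool AllowOutput() const { return output_is_protected || !protection_required; }
};

// Runs source -> ITA -> transform -> OTA -> sink and returns the samples
// that reached the sink. The source is modeled as a list of encrypted samples.
std::vector<Sample> RunToyPipeline(const std::vector<Sample>& source_samples,
                                   const ToyInputTrustAuthority& ita,
                                   const std::function<Sample(const Sample&)>& transform,
                                   const ToyOutputTrustAuthority& ota) {
    std::vector<Sample> sink;
    if (!ota.AllowOutput()) return sink;      // output policy not satisfied
    for (Sample sample : source_samples) {    // copy: decrypt works in place
        if (!ita.Decrypt(sample)) break;      // stop if no license is available
        sink.push_back(transform(sample));
    }
    return sink;
}
```

In this model, nothing reaches the sink unless both trust authorities agree, which is the property the real components are meant to provide in the Protected Media Path.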

Video Capture

For video capture, continue to use DirectShow.

Custom Plug-in Components

If you create custom filters for DirectShow, such as encoders or decoders, you should consider writing an MFT instead. Writing an MFT gives you the inherent advantages of the MFT model over filters or DMOs, lets you take advantage of DXVA 2.0, and positions your product to work within the Media Foundation pipeline as well as DirectShow. The choice of whether to write an MFT depends on several factors:

  • If you have an existing DMO, converting it to an MFT is typically a straightforward process, because the basic design of the two APIs is similar.
  • If you have already written a custom DirectShow filter, and the filter is meant to be used only within your own DirectShow application, there is probably no benefit to rewriting it as an MFT.
  • Source and sink filters should generally not be written as MFTs.
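
Part of the similarity between DMOs and MFTs is the synchronous processing model, in which the caller hands the transform an input sample and then pulls output from it. The following C++ sketch imitates that pattern without using any Windows headers. The ToyInvertTransform class, the Status enumeration, and the RunAll helper are hypothetical. The status names only echo the MF_E_NOTACCEPTING and MF_E_TRANSFORM_NEED_MORE_INPUT error codes that IMFTransform::ProcessInput and IMFTransform::ProcessOutput can return; a real MFT implements many more methods than are shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-in for transform status codes (not real HRESULT values).
enum class Status { Ok, NotAccepting, NeedMoreInput };

using Sample = std::vector<std::uint8_t>;

// Hypothetical toy transform that inverts every byte of a sample.
// It holds at most one pending input sample at a time.
class ToyInvertTransform {
public:
    // The caller pushes one input sample; refused while one is still pending.
    Status ProcessInput(Sample sample) {
        if (has_pending_) return Status::NotAccepting;
        pending_ = std::move(sample);
        has_pending_ = true;
        return Status::Ok;
    }

    // The caller pulls one output sample; fails if no input is pending.
    Status ProcessOutput(Sample& out) {
        if (!has_pending_) return Status::NeedMoreInput;
        out.clear();
        for (std::uint8_t b : pending_) {
            out.push_back(static_cast<std::uint8_t>(255 - b));
        }
        has_pending_ = false;
        return Status::Ok;
    }

private:
    Sample pending_;
    bool has_pending_ = false;
};

// Drives the transform synchronously, as a pipeline or a standalone caller might.
std::vector<Sample> RunAll(ToyInvertTransform& transform, const std::vector<Sample>& inputs) {
    std::vector<Sample> outputs;
    for (const Sample& input : inputs) {
        Status status = transform.ProcessInput(input);
        assert(status == Status::Ok);  // holds because output is drained each iteration
        (void)status;
        Sample out;
        while (transform.ProcessOutput(out) == Status::Ok) {
            outputs.push_back(out);
        }
    }
    return outputs;
}
```

Because the caller drives both calls, a component written this way has no dependency on a particular graph or pipeline, which is what allows the same transform to be used standalone.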

Advanced Playback

Some applications need to exert a great deal of low-level control over the pipeline. For protected content, you can build your own media pipeline with your own Media Foundation plug-ins, and you can run that custom pipeline inside the Media Foundation Protected Media Path. For unprotected content, continue to use DirectShow with the DirectShow EVR filter.

Video Editing

Video editing is not the primary focus of this release of Media Foundation. If your application is written using DirectShow or DirectShow Editing Services, you should continue to use those.

Conclusion

Media Foundation is the digital multimedia platform for Windows Vista and beyond. On Windows Vista, the primary focus of Media Foundation is premium content playback. Media Foundation does not yet completely replace DirectShow; for many applications, the best approach will be to use a blend of technologies. For premium content, Media Foundation today gives you the advantages of content protection and enhanced audio-video quality in the pipeline. Applications that incorporate Media Foundation are well positioned to take advantage of the next generation of digital media content.

Appendix: Feature Comparisons

The following table compares the features of Media Foundation with those of DirectShow.

Feature group | Feature | Media Foundation | DirectShow
Basic functionality | Audio and video rendering | Yes | Yes
  | Event notification | Yes | Yes
  | Device enumeration | No | Yes
  | Component enumeration | Yes | Yes
  | Synchronization to reference clock | Yes | Yes
  | Seeking | Yes | Yes
  | Improved stress resilience | Yes | No
Content protection | Component validation | Yes | No
  | Content protection policy negotiation | Yes | No
  | Interoperability between content protection technologies | Yes | No
  | Protection against kernel-mode and user-mode threats | Yes | No
  | Component revocation and renewal | Yes | No
  | Video output protection management | Yes | Yes
Media tasks | Audio capture | No | Yes
  | Video capture | No | Yes
  | Video editing | No | Yes
  | DVD playback and navigation | No | Yes
  | MPEG-2 support | No | Yes
  | ASF support | No | Yes
  | TV technologies | No | Yes
  | Stream buffer engine | No | Yes
  | Encoder API | No | Yes
Video renderer | Substream mixing using per-pixel or planar alpha blending | Yes | Yes
  | Customizable video composition | No | Yes
  | Support for custom presenters | Yes | Yes
  | Windowless rendering | Yes | Yes
  | Multimonitor support | Yes | Yes
  | DXVA | Yes | Yes
  | DirectDraw exclusive mode | Yes | Yes
  | Backward compatibility with existing applications | Yes | Yes
  | Accurate frame stepping | Yes | Yes
  | Alpha blending of image data | Yes | Yes
  | Glitch resilience | Yes | No
  | Enhanced video fidelity | Yes | No
  | Enhanced content protection robustness | Yes | No
  | Standalone use | Yes | No
  | Standalone mixing component | Yes | No
Transforms (MFT or DMO) | Synchronous data processing | Yes | Yes
  | Simple programming model | Yes | Yes
  | Standalone use | Yes | Yes
  | Multiple inputs and multiple outputs | Yes | Yes
  | Dynamic number of streams | Yes | No
  | Access to sample-level metadata | Yes | No
  | In-place processing | Yes | Yes
  | Dynamic format changes | Yes | No
  | Quality adjustment | Yes | No
  | Rate change | Yes | No

The following table compares the features of Media Foundation with those of the Windows Media Format SDK.

Feature group | Feature | Media Foundation | Format SDK
ASF file features | Audio and video streams | Yes | Yes
  | Image streams | No | Yes
  | Arbitrary streams (text, file, Web, custom data) | No | Yes
  | Script commands | No | Yes
  | Data unit extensions | Yes | Yes
  | SMPTE time code support | No | Yes
  | Mutual exclusion | Yes | Yes
  | Stream prioritization | Yes | Yes
  | Bandwidth sharing | No | Yes
  | Indexes | Yes | Yes
  | Markers | Yes | Yes
  | Multiple bit rate stream | Yes | Yes
  | Multiple language support | Yes | Yes
Codec features | CBR encoding | Yes | Yes
  | VBR encoding | Yes | Yes
  | Two-pass encoding | Yes | Yes
  | High-resolution audio support | Yes | Yes
  | Low delay audio | Yes | Yes
  | S/PDIF audio output | Yes | Yes
  | Video image | Yes | Yes
  | Device conformance template | Yes | Yes
  | Video complexity settings | Yes | Yes
  | Frame interpolation | Yes | Yes
  | DirectX Video Acceleration | Yes | Yes
File writing | Video resizing | Yes | Yes
  | Color space conversion | Yes | Yes
  | Audio resampling | Yes | Yes
  | ASF file sink | Yes | Yes
  | Network sinks | No | Yes
  | Push sinks | No | Yes
  | Watermarking support | No | Yes
  | Input formats, input settings, and data unit extensions | Yes | Yes
  | WMA smart recompression | No | Yes
  | Multichannel audio | Yes | Yes
File reading | User-allocated sample support | No | Yes
  | Synchronous reading | No | Yes
  | Output format enumeration | Yes | Yes
  | Multichannel audio | Yes | Yes
  | MP3 support | Yes | Yes
  | Network sources | Yes | Yes
Metadata | ID3 support | No | Yes
  | Custom metadata | Yes | Yes
Digital rights management | Live DRM | No | Yes
  | DRM Individualization | Yes | Yes
  | Back up and restore DRM licenses | Yes | Yes
  | View DRM attributes in the Metadata Editor | Yes | Yes
  | Output protection levels | Yes | Yes
  | License revocation | Yes | Yes
  | Windows Media DRM for Network Devices | Yes | Yes
  | Secure Audio Path | No | Yes
  | Playlist burning | Yes | Yes
  | Third-party transcription support | Yes | No
  | Local license issuance | Yes | No
  | Enhanced Windows Media DRM renewability | Yes | No

For More Information

Web addresses can change, so you might be unable to connect to the Web site or sites mentioned here.