Migrating from DirectShow to Media Foundation
Microsoft Media Foundation
Summary: Media Foundation is a new application programming interface (API) for digital media in Windows Vista. This paper explains how applications that are now built using DirectShow can begin using Media Foundation.
This paper explains how applications that now use Microsoft DirectShow can begin adopting Media Foundation, a digital media API in Microsoft Windows Vista. The extent to which you leverage Media Foundation will depend on the particular needs of your application. The purpose of this paper is to help you weigh the benefits of incorporating Media Foundation into your application and to decide on the migration strategy that best fits your product.
This paper assumes that you are familiar with DirectShow. Also, this paper does not describe the design of Media Foundation except in passing. For a high-level overview of Media Foundation, see the Windows SDK documentation for Windows Vista.
The era of high-definition digital media is here. Today's consumer can view high-definition (HD) digital television; the next generation of HD DVD will be here soon; and every day, millions of people download digital music. In short, digital media has become ubiquitous. But high-value content requires robust protection. The current digital media platform does not offer a sufficient level of protection for premium content. Media Foundation has been designed to meet this challenge.
To fully embrace the new wave of high-definition content, the platform must be resilient to glitches and must offer unparalleled audio and video quality. Media Foundation has been designed with this challenge in mind. Several audio and video quality enhancements throughout the platform now make it possible to deliver a great experience for next-generation high-definition content. For example:
- DirectX Video Acceleration (DXVA) 2.0 offers more efficient video acceleration than DXVA 1.0, with more robust and streamlined video decoding and extended use of hardware in video processing. With DXVA 2.0, Windows can handle some of the most demanding high-definition content with high quality and improved glitch resilience.
- Color-space information is preserved throughout the video pipeline. Users can enjoy video content with full fidelity. Preserving color-space information also reduces unnecessary color-space conversions, which frees more cycles to process demanding HD content.
- The enhanced video renderer (EVR) offers better timing support, enhanced video processing, and improved glitch resilience.
Another challenge is the wide variety of content protection technologies and standards that are associated with different types of premium content. A consumer might purchase a song online, download it to her personal computer, stream it over the home network to a digital audio receiver, and then transfer it to a portable media player. At each step, different protection technologies might be used. It is not enough just to support these technologies—they must interoperate as seamlessly as possible.
Finally, Media Foundation is designed to address limitations in the current digital media platform. Although DirectShow is a versatile API for writing digital media applications, its core architecture has been in place for about a decade and is starting to show its age. For example:
- The pipeline in DirectShow tends to be static. Implementing dynamic graphs and major format changes is a complex task.
- The threading model for DirectShow filters is complex and requires a thorough understanding to implement correctly.
- Filters cannot be used easily outside of a DirectShow graph, which ties them to the DirectShow pipeline.
- DirectShow does not readily support protected content.
For all of these reasons, Media Foundation has been designed from the beginning as a completely new API. In the long term, Media Foundation is intended to replace DirectShow. The version of Media Foundation that is being made available in Windows Vista represents the first step toward this goal.
That said, Media Foundation does not yet encompass all of the capabilities of the existing Windows digital media platform, of which DirectShow is a key component. Instead, this release of Media Foundation focuses on protected media processing. This includes not only playback of protected content, but also transcoding and transcription between different formats and content protection technologies.
Protected media processing has two goals:
- Customers can make legitimate use of media content they purchase without difficulty. Moreover, by making the personal computer a trusted part of the digital entertainment ecosystem, Media Foundation will enable more premium content to flow through the computer, which will benefit consumers.
- Content providers can deliver high-quality, premium audio and video through the personal computer, in a robust environment, while granting specific rights for the use of that content.
The key feature of Media Foundation that supports protected content is the Protected Media Path (PMP), which provides a protected environment for running audio and video processing pipelines.
The basic infrastructure is in place for Media Foundation to be expanded into a complete digital media platform. In the meantime, existing digital media APIs, including DirectShow, will continue to be used in applications written for Windows Vista. Your migration path depends on the type of application you are writing, whether you are writing a new application or maintaining an existing application, and whether you want to provide features that are specific to Windows Vista.
It is not expected that every digital media application will adopt Media Foundation immediately. The typical migration path will be to use existing SDKs, such as DirectShow and the Windows Media Format SDK, and incorporate Media Foundation as needed. However, the sooner you begin building with Media Foundation, the better positioned you will be to capitalize on the wave of next-generation premium content.
The following new features in DirectShow and Media Foundation will simplify the migration path to Media Foundation:
- Media Foundation uses a new model for video and audio transforms. Media Foundation Transforms (MFTs) are an evolution from DirectX Media Objects (DMOs), which were introduced in the DirectX 8.0 SDK. Compared with DMOs, the required behaviors of MFTs are more clearly specified, which makes it easier to write a correct implementation. In addition, MFTs can support hardware-accelerated video transforms.
- Media Foundation provides a new video renderer, called the Enhanced Video Renderer (EVR). The EVR uses the next version of DirectX Video Acceleration (DXVA 2.0) for more efficient video rendering, and it has a simpler API for creating custom video presenters. To make EVR adoption easier, DirectShow provides an EVR filter in Windows Vista. Internally, the DirectShow EVR and the Media Foundation EVR use the same mixer and presenter objects. If you write a custom presenter, it can be used with either DirectShow or Media Foundation. DirectShow applications that use the EVR filter for advanced video rendering will be well placed to convert to a Media Foundation implementation in the future.
- With DXVA 2.0, video acceleration is now available directly to user-mode components without needing to communicate with the DirectShow video renderer. Previously, DXVA was accessible only through the video renderer. Decoders can now take advantage of DXVA 2.0 to provide fast video decoding without any dependency on DirectShow. Applications can also use DXVA 2.0 to perform video processing operations, such as contrast and gamma adjustment.
This section describes some typical scenarios for digital media applications, and provides recommendations for which technologies to use.
The most basic type of digital media application is one that provides audio or video playback with transport controls (play, pause, fast forward, rewind, and so forth). For this type of application, you should consider using the Windows Media Player OCX. On Windows Vista, Windows Media Player supports protected as well as unprotected content. The OCX gives you its full set of playback features through an object model that is easier to program against than lower-level APIs such as DirectShow or Media Foundation. You can also write a managed .NET application that uses the OCX through COM interoperability.
If you need to go beyond the playback features provided by the Windows Media Player OCX, you should use the Media Foundation pipeline to play protected content. For unprotected Windows Media or MP3 content, you can also use Media Foundation and get the benefits of enhanced video fidelity and glitch resilience. Use DirectShow for other unprotected content.
For example, suppose you are creating a Windows Media–based music or video service with a customized end-to-end experience. This type of application would use the Media Foundation control layer for Windows Media content and MP3 files, and would use DirectShow for other unprotected content such as AVI and WAV. You should continue to use DirectShow for traditional DVD playback as well.
Similarly, if your existing application already uses DirectShow for playback, and you want to support protected content in Windows Vista, you should add a code path that uses the Media Foundation pipeline for protected content playback, while continuing to use DirectShow for the existing functionality. If you use any advanced video rendering features, consider using the DirectShow EVR filter.
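The playback recommendations above amount to a small routing decision. The following Python sketch only illustrates that logic. The names `ContentInfo` and `choose_playback_api`, the format strings, and the returned labels are hypothetical and are not part of any Windows API.

```python
from dataclasses import dataclass

# Unprotected formats for which the paper suggests Media Foundation.
# The exact strings are illustrative assumptions (Windows Media and MP3).
MF_UNPROTECTED_FORMATS = {"wma", "wmv", "asf", "mp3"}


@dataclass(frozen=True)
class ContentInfo:
    """Hypothetical description of a piece of media content."""
    fmt: str                 # e.g. "wmv", "mp3", "avi", "wav", "dvd"
    protected: bool = False  # True if the content is protected


def choose_playback_api(content: ContentInfo,
                        needs_custom_playback: bool = True) -> str:
    """Return the recommended playback technology as a plain label."""
    # Basic playback with transport controls: the OCX is sufficient.
    if not needs_custom_playback:
        return "Windows Media Player OCX"
    fmt = content.fmt.lower()
    # Traditional DVD playback stays on DirectShow.
    if fmt == "dvd":
        return "DirectShow"
    # Protected content goes through the Media Foundation pipeline.
    if content.protected:
        return "Media Foundation"
    # Unprotected Windows Media or MP3 can also use Media Foundation.
    if fmt in MF_UNPROTECTED_FORMATS:
        return "Media Foundation"
    # All other unprotected content (AVI, WAV, ...) stays on DirectShow.
    return "DirectShow"
```

An existing DirectShow player could use a check like this to decide, per file, whether to take its current DirectShow code path or a new Media Foundation code path.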
Third-Party Formats or Content Protection Technologies
In addition to providing built-in support for Windows Media Format and Windows Media Digital Rights Management, Media Foundation is designed to meet the content protection robustness requirements of next-generation premium content. If you provide premium content in a third-party format or use a custom or industry-standard content-protection technology, you should build custom components using Media Foundation. The types of components that you might create include:
- Media source. A media source introduces media data into the pipeline, similar to a source filter in DirectShow.
- Input trust authority (ITA). An ITA encapsulates the content protection technology for a media source. An ITA might define rights, enable license acquisition, and decrypt the content.
- Media Foundation Transform (MFT). An MFT is a component that processes a media stream, similar to a transform filter in DirectShow. Write MFTs for your encoders, stream parsers, demultiplexers, and any other component that manipulates the bits in the stream.
- Media sink. A media sink receives data, processes it, and delivers it to a destination, similar to a renderer filter in DirectShow.
- Output trust authority (OTA). An OTA enables protected content to reach its intended destination securely. The OTA enforces any output protection mechanisms that are needed.
With these objects in place, you can plug into the Media Foundation Protected Media Path so that your content works seamlessly within the Media Foundation pipeline.
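As a reading aid for DirectShow developers, the component roles listed above can be written down as a small lookup table. This is a sketch only: the identifiers `MF_COMPONENTS` and `directshow_analog` are hypothetical, and the trust authorities map to `None` because the text names no DirectShow counterpart for them.

```python
from typing import Optional

# Media Foundation component -> (role, closest DirectShow analog named in the text).
MF_COMPONENTS = {
    "media source": ("introduces media data into the pipeline", "source filter"),
    "input trust authority": ("encapsulates content protection for a media source", None),
    "media foundation transform": ("processes a media stream", "transform filter"),
    "media sink": ("receives data and delivers it to a destination", "renderer filter"),
    "output trust authority": ("enforces output protection mechanisms", None),
}


def directshow_analog(component: str) -> Optional[str]:
    """Return the DirectShow counterpart of a component, or None if the text names none."""
    _role, analog = MF_COMPONENTS[component.lower()]
    return analog
```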
For video capture, continue to use DirectShow.
Custom Plug-in Components
If you create custom filters for DirectShow, such as encoders or decoders, you should consider writing an MFT instead. Writing an MFT gives you the inherent advantages of the MFT model over filters or DMOs, lets you take advantage of DXVA 2.0, and positions your product to work within the Media Foundation pipeline as well as DirectShow. The choice of whether to write an MFT depends on several factors:
- If you have an existing DMO, converting it to an MFT is typically a straightforward process, because the basic design of the two APIs is similar.
- If you have already written a custom DirectShow filter, and the filter is meant to be used only within your own DirectShow application, there is probably no benefit to rewriting it as an MFT.
- Source and sink filters should generally not be written as MFTs.
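The factors above can be read as a simple decision procedure. The following Python sketch is one possible encoding of it. The function name, parameter names, and string values are hypothetical, and the final default reflects the general advice to consider an MFT rather than a hard rule.

```python
def should_write_mft(component_kind: str,
                     existing: str = "none",
                     private_to_directshow_app: bool = False) -> bool:
    """Suggest whether a plug-in component should be written as an MFT.

    component_kind: "transform", "source", or "sink" (illustrative labels).
    existing: "none", "dmo", or "directshow_filter" (illustrative labels).
    private_to_directshow_app: True if an existing filter is used only
        inside your own DirectShow application.
    """
    kind = component_kind.lower()
    # Source and sink filters should generally not be written as MFTs.
    if kind in ("source", "sink"):
        return False
    # Converting an existing DMO is typically straightforward.
    if existing == "dmo":
        return True
    # A filter used only within your own DirectShow application
    # probably gains nothing from a rewrite.
    if existing == "directshow_filter" and private_to_directshow_app:
        return False
    # Otherwise, an MFT is worth considering (for example, a new encoder or decoder).
    return True
```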
Some applications need to exert a great deal of low-level control over the pipeline. For protected content, you can build your own media pipeline with your own Media Foundation plug-ins, and you can run that custom pipeline inside the Media Foundation Protected Media Path. For unprotected content, continue to use DirectShow with the DirectShow EVR filter.
Video editing is not the primary focus of this release of Media Foundation. If your application is written using DirectShow or DirectShow Editing Services, you should continue to use those.
Media Foundation is the digital multimedia platform for Windows Vista and beyond. On Windows Vista, the primary focus of Media Foundation is premium content playback. Media Foundation does not yet completely replace DirectShow; for many applications, the best approach will be to use a blend of technologies. For premium content, Media Foundation today gives you the advantages of content protection and enhanced audio-video quality in the pipeline. Applications that incorporate Media Foundation are well positioned to take advantage of the next generation of digital media content.
The following table compares the features of Media Foundation with those of DirectShow.
| Category | Feature | Media Foundation | DirectShow |
|---|---|---|---|
| Basic functionality | Audio and video rendering | Yes | Yes |
| Basic functionality | Synchronization to reference clock | Yes | Yes |
| Basic functionality | Improved stress resilience | Yes | No |
| Content protection | Component validation | Yes | No |
| Content protection | Content protection policy negotiation | Yes | No |
| Content protection | Interoperability between content protection technologies | Yes | No |
| Content protection | Protection against kernel-mode and user-mode threats | Yes | No |
| Content protection | Component revocation and renewal | Yes | No |
| Content protection | Video output protection management | Yes | Yes |
| Media tasks | Audio capture | No | Yes |
| Media tasks | DVD playback and navigation | No | Yes |
| Media tasks | Stream buffer engine | No | Yes |
| Video renderer | Substream mixing using per-pixel or planar alpha blending | Yes | Yes |
| Video renderer | Customizable video composition | No | Yes |
| Video renderer | Support for custom presenters | Yes | Yes |
| Video renderer | DirectDraw exclusive mode | Yes | Yes |
| Video renderer | Backward compatibility with existing applications | Yes | Yes |
| Video renderer | Accurate frame stepping | Yes | Yes |
| Video renderer | Alpha blending of image data | Yes | Yes |
| Video renderer | Enhanced video fidelity | Yes | No |
| Video renderer | Enhanced content protection robustness | Yes | No |
| Video renderer | Standalone mixing component | Yes | No |
| Transforms (MFT or DMO) | Synchronous data processing | Yes | Yes |
| Transforms (MFT or DMO) | Simple programming model | Yes | Yes |
| Transforms (MFT or DMO) | Multiple inputs and multiple outputs | Yes | Yes |
| Transforms (MFT or DMO) | Dynamic number of streams | Yes | No |
| Transforms (MFT or DMO) | Access to sample-level metadata | Yes | No |
| Transforms (MFT or DMO) | Dynamic format changes | Yes | No |
The following table compares the features of Media Foundation with those of the Windows Media Format SDK.
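| Category | Feature | Media Foundation | Windows Media Format SDK |
|---|---|---|---|
| ASF file features | Audio and video streams | Yes | Yes |
| ASF file features | Arbitrary streams (text, file, Web, custom data) | No | Yes |
| ASF file features | Data unit extensions | Yes | Yes |
| ASF file features | SMPTE time code support | No | Yes |
| ASF file features | Multiple bit rate stream | Yes | Yes |
| ASF file features | Multiple language support | Yes | Yes |
| Codec features | CBR encoding | Yes | Yes |
| Codec features | High-resolution audio support | Yes | Yes |
| Codec features | Low delay audio | Yes | Yes |
| Codec features | S/PDIF audio output | Yes | Yes |
| Codec features | Device conformance template | Yes | Yes |
| Codec features | Video complexity settings | Yes | Yes |
| Codec features | DirectX Video Acceleration | Yes | Yes |
| File writing | Video resizing | Yes | Yes |
| File writing | Color space conversion | Yes | Yes |
| File writing | ASF file sink | Yes | Yes |
| File writing | Input formats, input settings, and data unit extensions | Yes | Yes |
| File writing | WMA smart recompression | No | Yes |
| File reading | User-allocated sample support | No | Yes |
| File reading | Output format enumeration | Yes | Yes |
| Digital rights management | Live DRM | No | Yes |
| Digital rights management | Back up and restore DRM licenses | Yes | Yes |
| Digital rights management | View DRM attributes in the Metadata Editor | Yes | Yes |
| Digital rights management | Output protection levels | Yes | Yes |
| Digital rights management | Windows Media DRM for Network Devices | Yes | Yes |
| Digital rights management | Secure Audio Path | No | Yes |
| Digital rights management | Third-party transcription support | Yes | No |
| Digital rights management | Local license issuance | Yes | No |
| Digital rights management | Enhanced Windows Media DRM renewability | Yes | No |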
- To learn more about Media Foundation, see the Media Foundation SDK documentation.
- For information about DirectShow programming, see the DirectShow SDK documentation.