Supporting DXVA 2.0 in Media Foundation
This topic describes how to support DirectX Video Acceleration (DXVA) 2.0 in a Media Foundation transform (MFT) using Microsoft Direct3D 9. Specifically, it describes the communication between the decoder and the video renderer, which is mediated by the topology loader. This topic does not describe how to implement DXVA decoding.
In the remainder of this topic, the term decoder refers to the decoder MFT, which receives compressed video and outputs uncompressed video. The term decoder device refers to a hardware video accelerator implemented by the graphics driver.
Tip For information about Microsoft Direct3D 11 video decoding, see Supporting Direct3D 11 Video Decoding in Media Foundation.
Note Windows Store apps must use Direct3D 11.
Here are the basic steps that a decoder must perform to support DXVA 2.0 in Media Foundation:
- Open a handle to the Direct3D 9 device.
- Find a DXVA decoder configuration.
- Allocate uncompressed buffers.
- Decode frames.
These steps are described in more detail in the remainder of this topic.
The MFT uses the Microsoft Direct3D device manager to get a handle to the Direct3D 9 device. To open the device handle, perform the following steps:
- Expose the MF_SA_D3D_AWARE attribute with the value TRUE. The topology loader queries this attribute by calling IMFTransform::GetAttributes. Setting the attribute to TRUE notifies the topology loader that the MFT supports DXVA.
- When format negotiation begins, the topology loader calls IMFTransform::ProcessMessage with the MFT_MESSAGE_SET_D3D_MANAGER message. The ulParam parameter is an IUnknown pointer to the video renderer's Direct3D device manager. Query this pointer for the IDirect3DDeviceManager9 interface.
- Call IDirect3DDeviceManager9::OpenDeviceHandle to get a handle to the renderer's Direct3D device.
- Call IDirect3DDeviceManager9::GetVideoService and pass in the device handle. This method returns a pointer to the IDirectXVideoDecoderService interface.
- Cache the pointers and the device handle.
The MFT must find a compatible configuration for the DXVA decoder device. Perform the following steps inside the IMFTransform::SetInputType method, after validating the input type:
- Call IDirectXVideoDecoderService::GetDecoderDeviceGuids. This method returns an array of decoder device GUIDs.
- Loop through the array of decoder GUIDs to find the ones that the decoder supports. For example, for an MPEG-2 decoder, you would look for DXVA2_ModeMPEG2_MOCOMP, DXVA2_ModeMPEG2_IDCT, or DXVA2_ModeMPEG2_VLD.
- When you find a candidate decoder device GUID, pass the GUID to the IDirectXVideoDecoderService::GetDecoderRenderTargets method. This method returns an array of render target formats, specified as D3DFORMAT values.
- Loop through the render target formats and look for a format supported by the decoder.
- Call IDirectXVideoDecoderService::GetDecoderConfigurations. Pass in the same decoder device GUID, along with a DXVA2_VideoDesc structure that describes the proposed output format. The method returns an array of DXVA2_ConfigPictureDecode structures. Each structure describes one possible configuration for the decoder device. Look for a configuration that the decoder supports.
- Store the render target format and configuration.
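The search described above is a three-level nested loop: device GUIDs, then render target formats, then configurations. The following is a minimal sketch of that loop, not a DXVA implementation. It models the driver's answers with plain C++ containers so that it compiles without the Windows SDK. The names DeviceEntry, DecoderSupport, Selection, Contains, and FindConfiguration are hypothetical, and strings and integers stand in for the GUID, D3DFORMAT, and DXVA2_ConfigPictureDecode values that the real methods return.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Stand-ins for real DXVA values: a device GUID, a D3DFORMAT, and a
// DXVA2_ConfigPictureDecode are reduced to a string, a string, and an int.
struct DeviceEntry {
    std::string guid;                        // one GUID from GetDecoderDeviceGuids
    std::vector<std::string> renderTargets;  // what GetDecoderRenderTargets reports
    std::vector<int> configs;                // what GetDecoderConfigurations reports
};

// What this (hypothetical) decoder implementation can work with.
struct DecoderSupport {
    std::vector<std::string> guids;
    std::vector<std::string> formats;
    std::vector<int> configs;
};

// The result that the MFT would store for later use.
struct Selection {
    std::string guid;
    std::string format;
    int config;
};

template <typename T>
bool Contains(const std::vector<T>& items, const T& value) {
    return std::find(items.begin(), items.end(), value) != items.end();
}

// Walks device GUIDs, then render target formats, then configurations, and
// stores the first combination that the decoder supports. Returns false if
// no combination exists.
bool FindConfiguration(const std::vector<DeviceEntry>& driver,
                       const DecoderSupport& decoder, Selection* out) {
    for (const DeviceEntry& device : driver) {
        if (!Contains(decoder.guids, device.guid)) continue;
        for (const std::string& format : device.renderTargets) {
            if (!Contains(decoder.formats, format)) continue;
            for (int config : device.configs) {
                if (!Contains(decoder.configs, config)) continue;
                out->guid = device.guid;
                out->format = format;
                out->config = config;
                return true;
            }
        }
    }
    return false;
}
```

In a real MFT, each container would be filled by the corresponding IDirectXVideoDecoderService call, and the stored Selection would hold the actual GUID, D3DFORMAT, and configuration structure.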
In the IMFTransform::GetOutputAvailableType method, return an uncompressed video format, based on the proposed render target format.
In the IMFTransform::SetOutputType method, check the media type against the render target format.
If the MFT cannot find a DXVA configuration (for example, if the graphics driver does not have the right capabilities), it should return the error code MF_E_UNSUPPORTED_D3D_TYPE from the SetInputType and SetOutputType methods. The topology loader will respond by sending the MFT_MESSAGE_SET_D3D_MANAGER message with the value NULL for the ulParam parameter. The MFT should release its pointer to the IDirect3DDeviceManager9 interface. The topology loader will then renegotiate the media type, and the MFT can use software decoding.
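The fallback exchange can be easier to follow as a small state model. The sketch below is only an illustration under stated assumptions: DecoderModel, Result, DecodeMode, and Negotiate are hypothetical names, Result::UnsupportedD3DType stands in for MF_E_UNSUPPORTED_D3D_TYPE, and a Boolean stands in for the device manager pointer carried in ulParam. The Negotiate function plays the topology loader's part.

```cpp
// Hypothetical result codes; UnsupportedD3DType stands in for
// MF_E_UNSUPPORTED_D3D_TYPE.
enum class Result { Ok, UnsupportedD3DType };
enum class DecodeMode { None, Dxva, Software };

class DecoderModel {
public:
    explicit DecoderModel(bool driverHasUsableConfig)
        : driverHasUsableConfig_(driverHasUsableConfig) {}

    // Models MFT_MESSAGE_SET_D3D_MANAGER. Passing false models a NULL
    // ulParam, in which case the decoder drops its device manager pointer.
    void SetD3DManager(bool managerPresent) { hasManager_ = managerPresent; }

    // Models SetInputType after the input type itself has been validated.
    Result SetInputType() {
        if (!hasManager_) {
            mode_ = DecodeMode::Software;  // no device manager: decode in software
            return Result::Ok;
        }
        if (!driverHasUsableConfig_) {
            return Result::UnsupportedD3DType;  // no DXVA configuration found
        }
        mode_ = DecodeMode::Dxva;
        return Result::Ok;
    }

    DecodeMode mode() const { return mode_; }

private:
    bool driverHasUsableConfig_;
    bool hasManager_ = false;
    DecodeMode mode_ = DecodeMode::None;
};

// Plays the role of the topology loader: offer the device manager first,
// and if the decoder rejects DXVA, withdraw it and renegotiate.
DecodeMode Negotiate(DecoderModel& decoder) {
    decoder.SetD3DManager(true);
    if (decoder.SetInputType() == Result::UnsupportedD3DType) {
        decoder.SetD3DManager(false);
        decoder.SetInputType();
    }
    return decoder.mode();
}
```

With a driver that offers a usable configuration, Negotiate ends in DecodeMode::Dxva. Without one, the decoder reports the error once and ends in DecodeMode::Software after the renegotiation.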
In DXVA 2.0, the decoder is responsible for allocating Direct3D surfaces to use as uncompressed video buffers. The decoder should allocate three surfaces for the enhanced video renderer (EVR) to use for deinterlacing. This number is fixed, because Media Foundation does not provide a way for the EVR to specify how many surfaces the graphics driver requires for deinterlacing. Three surfaces should be sufficient for any driver.
In the IMFTransform::GetOutputStreamInfo method, set the MFT_OUTPUT_STREAM_PROVIDES_SAMPLES flag in the MFT_OUTPUT_STREAM_INFO structure. This flag notifies the Media Session that the MFT allocates its own output samples.
To create the surfaces, call IDirectXVideoAccelerationService::CreateSurface. (The IDirectXVideoDecoderService interface inherits this method from IDirectXVideoAccelerationService.) You can do this in SetInputType, after finding the render target format.
Decoding should occur inside the IMFTransform::ProcessOutput method. On each frame, call IDirect3DDeviceManager9::TestDevice to test the device handle. If the device has changed, the method returns DXVA2_E_NEW_VIDEO_DEVICE. If this occurs, do the following:
- Close the device handle by calling IDirect3DDeviceManager9::CloseDeviceHandle.
- Release the IDirectXVideoDecoderService and IDirectXVideoDecoder pointers.
- Open a new device handle.
- Negotiate a new decoder configuration, using the same search that the decoder performs in SetInputType, as described earlier in this topic.
- Create a new decoder device.
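The per-frame device check and the recovery sequence can be sketched as follows. This is a portable model rather than real Direct3D code: FakeDeviceManager is a hypothetical stand-in for the device manager, SimulateDeviceChange is an invented helper that plays the part of a device reset, TestResult::NewVideoDevice stands in for DXVA2_E_NEW_VIDEO_DEVICE, and a log of strings stands in for the COM calls that a real decoder would make at each step.

```cpp
#include <string>
#include <vector>

// NewVideoDevice stands in for DXVA2_E_NEW_VIDEO_DEVICE.
enum class TestResult { Ok, NewVideoDevice };

// Hypothetical stand-in for the Direct3D device manager. A handle is modeled
// as the device generation at the time it was opened, so a device change
// invalidates every handle that was opened earlier.
class FakeDeviceManager {
public:
    int OpenDeviceHandle() const { return generation_; }
    void CloseDeviceHandle(int /*handle*/) const {}
    TestResult TestDevice(int handle) const {
        return handle == generation_ ? TestResult::Ok
                                     : TestResult::NewVideoDevice;
    }
    void SimulateDeviceChange() { ++generation_; }  // invented for this model

private:
    int generation_ = 1;
};

struct DecoderState {
    int deviceHandle = 0;
    std::vector<std::string> log;  // records the recovery steps, in order
};

// Call once per frame before decoding. Returns true if the device changed
// and the decoder had to rebuild its DXVA state.
bool EnsureDevice(FakeDeviceManager& manager, DecoderState& state) {
    if (manager.TestDevice(state.deviceHandle) == TestResult::Ok) {
        return false;  // handle still valid, nothing to do
    }
    manager.CloseDeviceHandle(state.deviceHandle);
    state.log.push_back("close device handle");
    state.log.push_back("release decoder service and decoder device");
    state.deviceHandle = manager.OpenDeviceHandle();
    state.log.push_back("open new device handle");
    state.log.push_back("negotiate decoder configuration");
    state.log.push_back("create decoder device");
    return true;
}
```

The point of the model is the ordering: the stale handle and the interfaces that depend on it are released before the new handle is opened, and the configuration is negotiated again before a decoder device is created.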
Assuming that the device handle is valid, the decoding process works as follows:
- Get an available surface that is not currently in use. (Initially all of the surfaces are available.)
- Query the media sample for the IMFTrackedSample interface.
- Call IMFTrackedSample::SetAllocator and provide a pointer to the IMFAsyncCallback interface, implemented by the decoder. When the video renderer releases the sample, the decoder's callback will be invoked.
- Call IDirectXVideoDecoder::BeginFrame.
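- Do the following one or more times:
  - Call IDirectXVideoDecoder::GetBuffer to get a DXVA decoder buffer.
  - Fill the buffer.
  - Call IDirectXVideoDecoder::ReleaseBuffer.
- Call IDirectXVideoDecoder::Execute to perform the decoding operations on the frame.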
DXVA 2.0 uses the same data structures as DXVA 1.0 for decoding operations. For the original set of DXVA profiles (for H.261, H.263, and MPEG-2), these data structures are described in the DXVA 1.0 specification.
Within each pair of BeginFrame/Execute calls, you may call GetBuffer multiple times, but only once for each type of DXVA buffer. If you call it twice with the same buffer type, you will overwrite the data.
Use the callback that you registered through IMFTrackedSample::SetAllocator to keep track of which samples are currently available and which are in use.
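One possible way to do this bookkeeping is a small pool that marks a surface as busy when it is handed out and as free when the allocator callback fires. The sketch below is an assumption-laden illustration: SurfacePool, Acquire, and OnSampleReleased are hypothetical names, surfaces are represented only by their index, and the mutex reflects an assumption that the callback may arrive on a different thread than the one that calls ProcessOutput.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Tracks which of the decoder's surfaces are free and which are still held
// by the renderer. Surfaces are identified by index only; a real decoder
// would keep the surface and sample pointers alongside each flag.
class SurfacePool {
public:
    explicit SurfacePool(std::size_t count) : inUse_(count, false) {}

    // Returns the index of a free surface and marks it in use,
    // or -1 if every surface is currently in use.
    int Acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (std::size_t i = 0; i < inUse_.size(); ++i) {
            if (!inUse_[i]) {
                inUse_[i] = true;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Intended to be called from the decoder's allocator callback when the
    // renderer releases the sample that wraps this surface.
    void OnSampleReleased(int index) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (index >= 0 && static_cast<std::size_t>(index) < inUse_.size()) {
            inUse_[static_cast<std::size_t>(index)] = false;
        }
    }

private:
    std::mutex mutex_;
    std::vector<bool> inUse_;
};
```

When Acquire returns -1, the decoder has no surface to decode into and has to wait until a sample comes back before it can produce another frame.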