MPEG-2 Audio Encoder

08/19/2021

The Microsoft Media Foundation MPEG-2 audio encoder is a Media Foundation transform that encodes mono or stereo audio to MPEG-1 audio (ISO/IEC 11172-3) or MPEG-2 audio (ISO/IEC 13818-3).

The encoder supports Layer 1 and Layer 2 audio. It does not support MPEG Layer 3 (MP3) audio. For MPEG-2, the encoder supports the Low Sampling Frequencies (LSF) portion of MPEG-2 audio. It does not support the multichannel extensions. The MFT outputs an MPEG elementary stream. It cannot generate packetized elementary streams, program streams, or transport streams.

Class Identifier

The class identifier (CLSID) of the MPEG-2 audio encoder is CLSID_CMPEG2AudioEncoderMFT, defined in the header file wmcodecdsp.h.

Output Types

The output type must be set first, before the input type. The following table lists the required and optional attributes for the output media type.

Attribute	Description	Remarks
MF_MT_MAJOR_TYPE	Major type.	Required. Must be MFMediaType_Audio.
MF_MT_SUBTYPE	Audio subtype.	Required. Must be MFAudioFormat_MPEG. This subtype is used for both MPEG-1 and MPEG-2 audio.
MF_MT_AUDIO_SAMPLES_PER_SECOND	Samples per second.	Required. The following values are supported for both MPEG-1 and MPEG-2: 32000, 44100, 48000. In addition, the following values are supported for MPEG-2 LSF: 16000, 22050, 24000.
MF_MT_AUDIO_NUM_CHANNELS	Number of channels.	Required. Must be either 1 (mono) or 2 (stereo).
MF_MT_AUDIO_CHANNEL_MASK	Specifies the assignment of audio channels to speaker positions.	Optional. If set, the value must be 0x3 for stereo (front left and right channels) or 0x4 for mono (front center channel).
MF_MT_AUDIO_AVG_BYTES_PER_SECOND	Bit rate of the encoded MPEG stream, in bytes per second.	Optional. The ISO/IEC 11172-3 and ISO/IEC 13818-3 (LSF) specifications define several bit rates, depending on the sampling rate, the number of channels, and the audio layer (1 or 2). The encoder defaults to Layer 2 audio. If the MF_MT_AUDIO_AVG_BYTES_PER_SECOND attribute is not set, the encoder uses the following default bit rates: MPEG-1 stereo: 224,000 bits per second (bps) = 28,000 bytes per second. MPEG-1 mono: 192,000 bps = 24,000 bytes per second. MPEG-2 LSF, mono or stereo: 160,000 bps = 20,000 bytes per second. This attribute can be set to other values. If the value is not valid according to MPEG specifications, the MFT will reject the media type. You can also set the bit rate by using the ICodecAPI interface. See Remarks for more information.

If the optional attributes are not set, the encoder adds them to the media type after the type is set.
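
The default bit rates in the table are quoted in bits per second, whereas MF_MT_AUDIO_AVG_BYTES_PER_SECOND is expressed in bytes per second. The following portable C++ sketch reproduces that arithmetic for the default Layer 2 case. The helper names (IsMpeg1SampleRate, IsLsfSampleRate, DefaultAvgBytesPerSecond) are hypothetical and are not part of Media Foundation; the sketch only mirrors the values stated in the table and does not call the encoder.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Media Foundation API): true for the sample
// rates that the table lists for both MPEG-1 and MPEG-2.
constexpr bool IsMpeg1SampleRate(std::uint32_t samplesPerSecond)
{
    return samplesPerSecond == 32000 || samplesPerSecond == 44100 ||
           samplesPerSecond == 48000;
}

// Hypothetical helper: true for the additional MPEG-2 LSF sample rates.
constexpr bool IsLsfSampleRate(std::uint32_t samplesPerSecond)
{
    return samplesPerSecond == 16000 || samplesPerSecond == 22050 ||
           samplesPerSecond == 24000;
}

// Hypothetical helper: the value the table gives as the default for
// MF_MT_AUDIO_AVG_BYTES_PER_SECOND, assuming the default Layer 2 audio.
inline std::uint32_t DefaultAvgBytesPerSecond(std::uint32_t samplesPerSecond,
                                              std::uint32_t channels)
{
    assert(channels == 1 || channels == 2);
    assert(IsMpeg1SampleRate(samplesPerSecond) || IsLsfSampleRate(samplesPerSecond));

    std::uint32_t bitsPerSecond = 0;
    if (IsLsfSampleRate(samplesPerSecond)) {
        bitsPerSecond = 160000;                           // MPEG-2 LSF, mono or stereo
    } else {
        bitsPerSecond = (channels == 2) ? 224000 : 192000; // MPEG-1 stereo / mono
    }
    return bitsPerSecond / 8;                             // bits -> bytes
}
```

For example, DefaultAvgBytesPerSecond(48000, 2) evaluates to 28000, the MPEG-1 stereo default.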

Input Types

The following table lists the required and optional attributes for the input media type.

Attribute	Description	Remarks
MF_MT_MAJOR_TYPE	Major type.	Required. Must be MFMediaType_Audio.
MF_MT_SUBTYPE	Audio subtype.	Required. Must be MFAudioFormat_PCM or MFAudioFormat_Float.
MF_MT_AUDIO_BITS_PER_SAMPLE	Number of bits per audio sample.	Required. The value must be 16 if the subtype is MFAudioFormat_PCM, or 32 if the subtype is MFAudioFormat_Float.
MF_MT_AUDIO_SAMPLES_PER_SECOND	Samples per second.	Required. Must match the output type.
MF_MT_AUDIO_NUM_CHANNELS	Number of channels.	Required. Must match the output type.
MF_MT_AUDIO_BLOCK_ALIGNMENT	Block alignment, in bytes.	Required. Calculate the value as follows: MFAudioFormat_PCM: Number of channels × 2. MFAudioFormat_Float: Number of channels × 4.
MF_MT_AUDIO_AVG_BYTES_PER_SECOND	Average data rate of the uncompressed input audio, in bytes per second.	Required. Must equal block alignment × samples per second.
MF_MT_AUDIO_CHANNEL_MASK	Specifies the assignment of audio channels to speaker positions.	Optional. If set, the value must match the output type.
MF_MT_AUDIO_VALID_BITS_PER_SAMPLE	Number of valid bits of audio data in each audio sample.	Optional. If set, the value must be identical to MF_MT_AUDIO_BITS_PER_SAMPLE.

The encoder does not support sample-rate conversion or stereo/mono conversion. If the optional attributes are not set, the encoder adds them to the media type after the type is set.
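
The derived input attributes follow directly from the subtype, the channel count, and the sample rate. The sketch below works through that calculation in portable C++. The type and function names (InputSampleFormat, InputAudioFormat, DescribeInput) are hypothetical illustrations, not Media Foundation APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for MFAudioFormat_PCM and MFAudioFormat_Float.
enum class InputSampleFormat { Pcm16, Float32 };

// Values for the three derived attributes of the input media type.
struct InputAudioFormat {
    std::uint32_t bitsPerSample;      // MF_MT_AUDIO_BITS_PER_SAMPLE
    std::uint32_t blockAlignment;     // MF_MT_AUDIO_BLOCK_ALIGNMENT
    std::uint32_t avgBytesPerSecond;  // MF_MT_AUDIO_AVG_BYTES_PER_SECOND
};

// Hypothetical helper: applies the rules from the input type table.
inline InputAudioFormat DescribeInput(InputSampleFormat format,
                                      std::uint32_t samplesPerSecond,
                                      std::uint32_t channels)
{
    assert(channels == 1 || channels == 2);

    InputAudioFormat result{};
    // 16 bits for integer PCM, 32 bits for floating-point PCM.
    result.bitsPerSample = (format == InputSampleFormat::Pcm16) ? 16u : 32u;
    // Number of channels x bytes per sample (2 or 4).
    result.blockAlignment = channels * (result.bitsPerSample / 8);
    // Block alignment x samples per second.
    result.avgBytesPerSecond = result.blockAlignment * samplesPerSecond;
    return result;
}
```

For 16-bit stereo PCM at 48 kHz, this gives a block alignment of 4 and 192,000 bytes per second, which matches the example input media type later in this article.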

Codec Properties

The encoder supports the following properties through the ICodecAPI interface.

Property	Description	Default value
CODECAPI_AVEncCommonMeanBitRate	Specifies the average encoded bit rate, in bits per second.	As described for the MF_MT_AUDIO_AVG_BYTES_PER_SECOND attribute in the output media type.
CODECAPI_AVEncMPACodingMode	Specifies the MPEG audio encoding mode.	Stereo for 2-channel audio, or single channel for 1-channel audio. For 2-channel audio, the encoder also supports dual channel and joint stereo.
CODECAPI_AVEncMPACopyright	Specifies whether to set the copyright bit in the MPEG audio stream.	No copyright.
CODECAPI_AVEncMPAEmphasisType	Specifies the type of de-emphasis filter that should be used when the encoded stream is decoded.	No emphasis specified.
CODECAPI_AVEncMPAEnableRedundancyProtection	Specifies whether to add a cyclic redundancy check (CRC) to the frame header.	A CRC checksum is written to the bit stream.
CODECAPI_AVEncMPALayer	Specifies the MPEG audio layer.	Layer 2 audio.
CODECAPI_AVEncMPAOriginalBitstream	Specifies whether to set the original bit in the MPEG audio stream.	"Original" bit is off.
CODECAPI_AVEncMPAPrivateUserBit	Specifies whether to set the private user bit in the MPEG audio stream.	Private user bit is off.

To get a pointer to the ICodecAPI interface, call QueryInterface on the MFT.

The MFT implements only a subset of the ICodecAPI methods. All other ICodecAPI methods return E_NOTIMPL.

Remarks

Each MPEG audio frame contains either 384 (Layer 1) or 1152 (Layer 2) audio samples per channel. However, each input buffer to the encoder may contain any number of PCM samples. The size of each input buffer must be a multiple of the block alignment. The encoder caches input samples until it has enough for an MPEG audio frame.

Each output buffer contains one raw MPEG frame. The size of each output buffer depends on the bit rate and the sample rate.
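
To make the relationship between input and output sizes concrete, here is a rough C++ sketch. The samples-per-frame values come from the text above. The frame-length formulas are the standard ones from ISO/IEC 11172-3, not something this article states about the encoder, so treat the output sizes as an estimate of the raw frame length. All names (MpegAudioLayer, SamplesPerFrame, InputBytesPerFrame, UnpaddedFrameBytes) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enumeration of the two layers the encoder supports.
enum class MpegAudioLayer { Layer1 = 1, Layer2 = 2 };

// Audio samples per channel in one MPEG audio frame.
constexpr std::uint32_t SamplesPerFrame(MpegAudioLayer layer)
{
    return layer == MpegAudioLayer::Layer1 ? 384u : 1152u;
}

// Bytes of input audio the encoder must accumulate to produce one frame.
constexpr std::uint32_t InputBytesPerFrame(MpegAudioLayer layer,
                                           std::uint32_t blockAlignment)
{
    return SamplesPerFrame(layer) * blockAlignment;
}

// Assumption: frame length per ISO/IEC 11172-3, ignoring the optional
// padding slot (4 bytes for Layer 1, 1 byte for Layer 2).
inline std::uint32_t UnpaddedFrameBytes(MpegAudioLayer layer,
                                        std::uint32_t bitsPerSecond,
                                        std::uint32_t samplesPerSecond)
{
    assert(samplesPerSecond > 0);
    const std::uint64_t rate = bitsPerSecond;
    if (layer == MpegAudioLayer::Layer1) {
        // Layer 1 frames are measured in 4-byte slots.
        return static_cast<std::uint32_t>((12 * rate / samplesPerSecond) * 4);
    }
    // Layer 2 frames are measured in 1-byte slots.
    return static_cast<std::uint32_t>(144 * rate / samplesPerSecond);
}
```

Under these assumptions, a Layer 2 frame at 224,000 bps and 48 kHz is 672 bytes, and it consumes 4,608 bytes of 16-bit stereo input (1152 samples × block alignment 4).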

Configuring the Encoder

To change any of the default settings on the encoder, perform the following steps:

1. Create an instance of the encoder MFT.
2. Call IMFTransform::GetOutputAvailableType to get the list of the preferred output types. The encoder enumerates all sample rates for both mono and stereo. Select one of these media types, based on the sample rate and number of channels. The MF_MT_AUDIO_AVG_BYTES_PER_SECOND attribute indicates the default bit rate, in bytes per second.
3. Optional: You can override the default bit rate by setting a new value for MF_MT_AUDIO_AVG_BYTES_PER_SECOND on the output media type. Valid bit rates depend on the sample rate, number of channels, and audio layer.

Note

At this point in the configuration process, the encoder defaults to Layer 2 audio and will only accept Layer 2 bit rates. You will be able to switch the encoder to Layer 1 in a later step (see step 7). In that case, leave the default bit rate for now; you can change it again in step 8.

4. Call IMFTransform::SetOutputType to set the output media type. If you set your own value for MF_MT_AUDIO_AVG_BYTES_PER_SECOND and the MFT rejects the output media type, it is likely because you specified an invalid bit rate.
5. Call IMFTransform::GetInputAvailableType to enumerate the input media types. Because the sample rate and number of channels must be identical to the output type, only two options are enumerated: 32-bit floating-point PCM input and 16-bit integer PCM input. Select one of these.
6. Call IMFTransform::SetInputType to set the input media type.
7. Optional: To encode Layer 1 audio, set the CODECAPI_AVEncMPALayer property to eAVEncMPALayer_1.
8. Optional: To change the bit rate, set the CODECAPI_AVEncCommonMeanBitRate property. The bit rate must be one of the valid bit rates listed in the MPEG-1 or MPEG-2 LSF specifications. Alternatively, you can call ICodecAPI::GetParameterValues to get a list of valid bit rates, based on the current settings.
9. Optional: With 2-channel audio, you can set the CODECAPI_AVEncMPACodingMode property to change the coding mode to dual channel or joint stereo. You can call ICodecAPI::GetParameterRange to get the valid options. (For 1-channel audio, the only option is mono.)
10. Optional: Set any of the other ICodecAPI properties listed previously.

It is important to follow the order of these steps. In particular, set the output media type before changing any ICodecAPI properties. Also, you must set ICodecAPI properties before the MFT receives the first input sample. After the MFT receives input, the codec properties are read-only, and ICodecAPI::SetValue returns the value S_FALSE.

Supported Bit Rates

The encoder supports the following bit rates, in kilobits per second (kbps).

MPEG-1 Layer 1	MPEG-1 Layer 2	MPEG-2 Layer 1	MPEG-2 Layer 2
32	32*	32	8
64	48*	48	16
96	56*	56	24
128	64	64	32
160	80*	80	40
192	96	96	48
224	112	112	56
256	128	128	64
288	160	144	80
320	192	160	96
352	224**	176	112
384	256**	192	128
416	320**	224	144
448	384**	256	160

* Mono only
** Stereo only
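
A bit rate can be checked against these constraints before it is passed to the encoder. The sketch below is one possible lookup in portable C++. The table values are the bit rates defined by ISO/IEC 11172-3 and ISO/IEC 13818-3 (LSF), in kbps, and the mono-only and stereo-only restrictions apply to MPEG-1 Layer 2. The names (MpegVersion, Layer, IsSupportedBitRate) are hypothetical; on a real encoder instance, ICodecAPI::GetParameterValues remains the authoritative source. Note that CODECAPI_AVEncCommonMeanBitRate takes bits per second, so multiply by 1000 before setting it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical enumerations for this sketch only.
enum class MpegVersion { Mpeg1, Mpeg2Lsf };
enum class Layer { Layer1, Layer2 };

using BitRateTable = std::array<std::uint32_t, 14>;

// Hypothetical helper: returns true if `kbps` is a valid bit rate for the
// given MPEG version, layer, and channel count (1 = mono, 2 = stereo).
inline bool IsSupportedBitRate(MpegVersion version, Layer layer,
                               std::uint32_t channels, std::uint32_t kbps)
{
    assert(channels == 1 || channels == 2);

    static const BitRateTable mpeg1Layer1 = {{32, 64, 96, 128, 160, 192, 224,
                                              256, 288, 320, 352, 384, 416, 448}};
    static const BitRateTable mpeg1Layer2 = {{32, 48, 56, 64, 80, 96, 112,
                                              128, 160, 192, 224, 256, 320, 384}};
    static const BitRateTable mpeg2Layer1 = {{32, 48, 56, 64, 80, 96, 112,
                                              128, 144, 160, 176, 192, 224, 256}};
    static const BitRateTable mpeg2Layer2 = {{8, 16, 24, 32, 40, 48, 56,
                                              64, 80, 96, 112, 128, 144, 160}};

    const BitRateTable& table =
        (version == MpegVersion::Mpeg1)
            ? (layer == Layer::Layer1 ? mpeg1Layer1 : mpeg1Layer2)
            : (layer == Layer::Layer1 ? mpeg2Layer1 : mpeg2Layer2);

    if (std::find(table.begin(), table.end(), kbps) == table.end()) {
        return false;  // not a defined bit rate for this version and layer
    }

    // Channel restrictions apply only to MPEG-1 Layer 2.
    if (version == MpegVersion::Mpeg1 && layer == Layer::Layer2) {
        const bool monoOnly = kbps == 32 || kbps == 48 || kbps == 56 || kbps == 80;
        const bool stereoOnly = kbps == 224 || kbps == 256 || kbps == 320 || kbps == 384;
        if (monoOnly && channels != 1) return false;
        if (stereoOnly && channels != 2) return false;
    }
    return true;
}
```

For instance, 224 kbps is accepted for MPEG-1 Layer 2 stereo but rejected for MPEG-1 Layer 2 mono.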

Example Media Types

Here is an example of the media types that are needed to encode 16-bit integer PCM, 48-kHz stereo audio at the default bit rate.

Output media type:

Attribute	Value
MF_MT_MAJOR_TYPE	MFMediaType_Audio
MF_MT_SUBTYPE	MFAudioFormat_MPEG
MF_MT_AUDIO_SAMPLES_PER_SECOND	48000
MF_MT_AUDIO_NUM_CHANNELS	2

Input media type:

Attribute	Value
MF_MT_MAJOR_TYPE	MFMediaType_Audio
MF_MT_SUBTYPE	MFAudioFormat_PCM
MF_MT_AUDIO_BITS_PER_SAMPLE	16
MF_MT_AUDIO_SAMPLES_PER_SECOND	48000
MF_MT_AUDIO_NUM_CHANNELS	2
MF_MT_AUDIO_BLOCK_ALIGNMENT	4
MF_MT_AUDIO_AVG_BYTES_PER_SECOND	192000

Requirements

Requirement	Value
Minimum supported client	Windows 8 [desktop apps only]
Minimum supported server	None supported
DLL	Msmpeg2enc.dll