June 2013

Volume 28 Number 6

DirectX Factor - An Introduction to Audio Processing Objects

By Charles Petzold

The XAudio2 component of DirectX is much more than just a way to play sounds and music in a Windows 8 application. I’ve come to view it instead as a versatile construction set for sound processing. Through the use of multiple IXAudio2SourceVoice and IXAudio2SubmixVoice instances, programmers can split sounds into separate pipelines for customized processing, and then combine them again in the final IXAudio2MasteringVoice.

As I demonstrated in the previous installment of this column (msdn.microsoft.com/magazine/dn198248), XAudio2 allows standard audio filters to be applied to source and submix voices. These filters attenuate frequency ranges and consequently alter the harmonic content and timbre of the sounds.

But much more powerful is a generalized facility that provides access to the actual audio streams passing through the voices. You can simply analyze this audio stream, or modify it.

This facility is known informally as an “audio effect.” More formally it involves the creation of an Audio Processing Object (APO), also known as a cross-platform APO, or XAPO, when it can be used with Xbox 360 applications as well as Windows.

XAudio2 includes two predefined APOs for common tasks. The XAudio2CreateVolumeMeter function creates an APO that allows a program to dynamically obtain the peak amplitude of an audio stream at intervals convenient to the application. The XAudio2CreateReverb function creates an APO that applies echo or reverberation to a voice based on 23 runtime parameters—and by “runtime” in this context I mean parameters that can be dynamically changed while the APO is actively processing audio. In addition, a library known as XAPOFX provides echo and reverberation effects, as well as a volume limiter and a four-band equalizer.

An APO implements the IXAPO interface, and an APO with runtime parameters implements the IXAPOParameters interface. But an easier approach to creating your own APOs involves deriving from the CXAPOBase and CXAPOParametersBase classes, which implement these interfaces and handle much of the overhead.

Deriving from these two classes is the strategy I’ll be using in this article. In addition to the other header files and important libraries I’ve discussed in previous columns, projects that implement APOs need a reference to the xapobase.h header file and the xapobase.lib import library.

Using Audio Processing Objects

Before discussing the internals of an APO class, let me show you how to apply the effects to XAudio2 voices. The SimpleEffectDemo project in the downloadable code for this column allows you to load a file from your Windows 8 music library and play it. It’s similar to code I’ve shown in previous columns: The file is loaded and decoded using Media Foundation classes, and played with XAudio2. SimpleEffectDemo creates only two XAudio2 voices: a source voice for generating the audio and the required mastering voice that funnels the audio to the sound hardware.

SimpleEffectDemo also contains two non-parametered CXAPOBase derivatives called OneSecondTremoloEffect (which applies a tremolo, or wavering of volume, based on simple amplitude modulation) and OneSecondEchoEffect. Figure 1 shows the program running with a loaded music file. Each of the two effects is enabled or disabled with a ToggleButton; the screenshot shows the echo effect enabled and the tremolo effect disabled.

Figure 1 The SimpleEffectDemo Program with Two Audio Effects

APOs can be applied to any type of XAudio2 voice. When applied to source voices or submix voices, the audio processing occurs after the built-in filters you set with SetFilterParameters, but before the filters applied to audio sent to other voices using SetOutputFilterParameters.

I’ve chosen to apply these two effects to the mastering voice. The code that instantiates the effects and attaches them to the mastering voice is shown in Figure 2. Each effect is referenced with an XAUDIO2_EFFECT_DESCRIPTOR. If there’s more than one effect (as is the case here), you use an array of such structures. That structure (or array) is then referenced by an XAUDIO2_EFFECT_CHAIN structure, which is passed to the SetEffectChain method supported by all XAudio2 voices. The order of the effects matters: In this case the echo effect will get an audio stream that already has the tremolo effect applied.

Figure 2 Applying Two Audio Effects to a Mastering Voice

// Create tremolo effect
ComPtr<OneSecondTremoloEffect> pTremoloEffect = new OneSecondTremoloEffect();
// Create echo effect
ComPtr<OneSecondEchoEffect> pEchoEffect = new OneSecondEchoEffect();
// Reference those effects with an effect descriptor array
std::array<XAUDIO2_EFFECT_DESCRIPTOR, 2> effectDescriptors;
effectDescriptors[0].pEffect = pTremoloEffect.Get();
effectDescriptors[0].InitialState = tremoloToggle->IsChecked->Value;
effectDescriptors[0].OutputChannels = 2;
effectDescriptors[1].pEffect = pEchoEffect.Get();
effectDescriptors[1].InitialState = echoToggle->IsChecked->Value;
effectDescriptors[1].OutputChannels = 2;
// Reference that array with an effect chain
XAUDIO2_EFFECT_CHAIN effectChain;
effectChain.EffectCount = effectDescriptors.size();
effectChain.pEffectDescriptors = effectDescriptors.data();
hresult = pMasteringVoice->SetEffectChain(&effectChain);
if (FAILED(hresult))
  throw ref new COMException(hresult, "pMasteringVoice->SetEffectChain failure");

After the SetEffectChain call, the effect instances should not be further referenced by the program. XAudio2 has already added a reference to these instances, and the program can release its own copies, or ComPtr can do that for you. From here on, the effects are identified by indices—in this case 0 for the tremolo effect and 1 for the echo effect. You might want to use an enumeration for those constants.

For both of the effects, I’ve set the InitialState field of the XAUDIO2_EFFECT_DESCRIPTOR to a ToggleButton checked status. This governs whether the effect is initially enabled or disabled. The effects are later enabled and disabled by the Checked and Unchecked handlers for the two ToggleButton controls, as shown in Figure 3.

Figure 3 Enabling and Disabling Audio Effects

void MainPage::OnTremoloToggleChecked(Object^ sender, 
  RoutedEventArgs^ args)
{
  EnableDisableEffect(safe_cast<ToggleButton^>(sender), 0);
}
void MainPage::OnEchoToggleChecked(Object^ sender, 
  RoutedEventArgs^ args)
{
  EnableDisableEffect(safe_cast<ToggleButton^>(sender), 1);
}
void MainPage::EnableDisableEffect(ToggleButton^ toggle, int index)
{
  HRESULT hresult = toggle->IsChecked->Value ?
    pMasteringVoice->EnableEffect(index) :
    pMasteringVoice->DisableEffect(index);
  if (FAILED(hresult))
    throw ref new COMException(hresult, "pMasteringVoice->Enable/DisableEffect " +
       index.ToString());
}

Instantiation and Initialization

Both OneSecondTremoloEffect and OneSecondEchoEffect derive from CXAPOBase. Perhaps the first puzzlement you’ll encounter when deriving from this class is dealing with the CXAPOBase constructor. This constructor requires a pointer to an initialized XAPO_REGISTRATION_PROPERTIES structure, but how does this structure get initialized? C++ requires that a base class constructor complete before any code in the derived class is executed. 

This is a bit of a quandary, which you can solve by defining and initializing the structure as a global variable, or a static field, or within a static method. I prefer the static field approach in this case, as you can see in the OneSecondTremoloEffect.h header file in Figure 4.

Figure 4 The OneSecondTremoloEffect.h Header File

#pragma once
class OneSecondTremoloEffect sealed : public CXAPOBase
{
private:
  static const XAPO_REGISTRATION_PROPERTIES RegistrationProps;
  WAVEFORMATEX waveFormat;
  int tremoloIndex;
public:
  OneSecondTremoloEffect() : CXAPOBase(&RegistrationProps),
                             tremoloIndex(0)
  {
  }
protected:
  virtual HRESULT __stdcall LockForProcess(
    UINT32 inpParamCount,
    const XAPO_LOCKFORPROCESS_BUFFER_PARAMETERS  *pInpParams,
    UINT32 outParamCount,
    const XAPO_LOCKFORPROCESS_BUFFER_PARAMETERS  *pOutParam) override;
  virtual void __stdcall Process(
    UINT32 inpParameterCount,
    const XAPO_PROCESS_BUFFER_PARAMETERS *pInpParams,
    UINT32 outParamCount,
    XAPO_PROCESS_BUFFER_PARAMETERS *pOutParams,
    BOOL isEnabled) override;
};
class __declspec(uuid("6FB2EBA3-7DCB-4ADF-9335-686782C49911"))
                       OneSecondTremoloEffect;

The RegistrationProps field is initialized in the code file (coming up shortly). A pointer to it is passed to the CXAPOBase constructor. Very often a CXAPOBase derivative will also define a field of type WAVEFORMATEX (as this one does) or WAVEFORMATEXTENSIBLE (in the general case) for saving the waveform format of the audio stream passing through the effect.

Notice also the __declspec (“declaration specifier”) at the bottom of the file that associates the OneSecondTremoloEffect class with a GUID. You can generate a GUID for your own effects classes from the Create GUID option on the Tools menu in Visual Studio.

A CXAPOBase derivative must override the Process method and usually overrides the LockForProcess method as well. The LockForProcess method allows the APO to perform initialization based on a particular audio format, which includes the sampling rate, the number of channels and the sample data type. The Process method actually performs the analysis or modification of the audio data.

Figure 5 shows these two methods as well as the initialization of the RegistrationProps field. Notice that the first field of XAPO_REGISTRATION_PROPERTIES is the GUID identified with the class.

Figure 5 The OneSecondTremoloEffect.cpp File

#include "pch.h"
#include "OneSecondTremoloEffect.h"
const XAPO_REGISTRATION_PROPERTIES OneSecondTremoloEffect::RegistrationProps =
{
  __uuidof(OneSecondTremoloEffect),
  L"One-Second Tremolo Effect",
  L"Coded by Charles Petzold",
  1,      // Major version number
  0,      // Minor version number
  XAPOBASE_DEFAULT_FLAG | XAPO_FLAG_INPLACE_REQUIRED,
  1,      // Min input buffer count
  1,      // Max input buffer count
  1,      // Min output buffer count
  1       // Max output buffer count
};
HRESULT OneSecondTremoloEffect::LockForProcess(
  UINT32 inpParamCount,
  const XAPO_LOCKFORPROCESS_BUFFER_PARAMETERS  *pInpParams,
  UINT32 outParamCount,
  const XAPO_LOCKFORPROCESS_BUFFER_PARAMETERS  *pOutParams)
{
  waveFormat = *pInpParams[0].pFormat;
  return CXAPOBase::LockForProcess(inpParamCount, pInpParams,
                                   outParamCount, pOutParams);
}
void OneSecondTremoloEffect::Process(UINT32 inpParamCount,
  const XAPO_PROCESS_BUFFER_PARAMETERS *pInpParams,
  UINT32 outParamCount,
  XAPO_PROCESS_BUFFER_PARAMETERS *pOutParams,
  BOOL isEnabled)
{
  XAPO_BUFFER_FLAGS flags = pInpParams[0].BufferFlags;
  int frameCount = pInpParams[0].ValidFrameCount;
  const float * pSrc = static_cast<float *>(pInpParams[0].pBuffer);
  float * pDst = static_cast<float *>(pOutParams[0].pBuffer);
  int numChannels = waveFormat.nChannels;
  switch(flags)
  {
  case XAPO_BUFFER_VALID:
    for (int frame = 0; frame < frameCount; frame++)
    {
      float sin = 1;
      if (isEnabled)
      {
        sin = fabs(DirectX::XMScalarSin(DirectX::XM_PI * tremoloIndex /
                                        waveFormat.nSamplesPerSec));
        tremoloIndex = (tremoloIndex + 1) % waveFormat.nSamplesPerSec;
      }
      for (int channel = 0; channel < numChannels; channel++)
      {
        int index = numChannels * frame + channel;
        pDst[index] = sin * pSrc[index];
      }
    }
    break;
  case XAPO_BUFFER_SILENT:
    break;
  }
  pOutParams[0].ValidFrameCount = pInpParams[0].ValidFrameCount;
  pOutParams[0].BufferFlags = pInpParams[0].BufferFlags;
}

In theory, APOs can deal with multiple input buffers and multiple output buffers. Currently, however, APOs are restricted to one input buffer and one output buffer. This restriction affects the last four fields of the XAPO_REGISTRATION_PROPERTIES structure, as well as the parameters to the LockForProcess and Process methods: inpParamCount and outParamCount are always equal to 1, and the pointer arguments always point to just one instance of the indicated structure.

The Process method of an APO is called roughly 100 times per second; each call receives an input buffer of audio data and prepares an output buffer. It’s possible for APOs to perform format conversions—for example, to change the sampling rate between the input and output buffer, or the number of channels, or the data type of the samples.

These format conversions can be difficult, so you can indicate in the sixth field of the XAPO_REGISTRATION_PROPERTIES structure which conversions you’re not prepared to implement. The XAPOBASE_DEFAULT_FLAG indicates that you don’t wish to perform conversions of the sampling rate, the number of channels, the sample bit sizes or the frame sizes (the number of samples in each Process call).

The format of the audio data passing through the APO is available from the parameters to the LockForProcess override in the form of a standard WAVEFORMATEX structure. Commonly, LockForProcess is only called once. Most APOs need to know the sampling rate and number of channels, and it’s best to generalize your APO for any possible values.

Also crucial is the data type of the samples themselves. Most often when working with XAudio2, you’re dealing with samples that are 16-bit integers or 32-bit floating-point values. Internally, however, XAudio2 prefers using floating-point data (the C++ float type), and that’s what you’ll see in your APOs. If you’d like, you can verify the sample data type in the LockForProcess method. However, it’s also my experience that the wFormatTag field of the WAVEFORMATEX structure does not equal WAVE_FORMAT_IEEE_FLOAT as might be expected. Instead, it’s WAVE_FORMAT_EXTENSIBLE (the value 65534), which means that you’re really dealing with a WAVEFORMATEXTENSIBLE structure, in which case the SubFormat field indicates the data type KSDATAFORMAT_SUBTYPE_IEEE_FLOAT.

If the LockForProcess method encounters an audio format it can’t deal with, it should return an HRESULT indicating an error, perhaps E_NOTIMPL to indicate “not implemented.”

Processing the Audio Data

The LockForProcess method can spend whatever time it needs for initialization, but the Process method runs on the audio-processing thread, and it must not dawdle. You’ll discover that for a sampling rate of 44,100 Hz, the ValidFrameCount field of the buffer parameters equals 441, indicating that Process is called 100 times per second, each time with 10 ms of audio data. For two-channel stereo, the buffer contains 882 float values with the channels interleaved: left channel followed by right channel.

The BufferFlags field is either XAPO_BUFFER_VALID or XAPO_BUFFER_SILENT. This flag allows you to skip processing if there’s no actual audio data coming through. In addition, the isEnabled parameter indicates if this effect has been enabled via the EnableEffect and DisableEffect methods that you’ve already seen.

If the buffer is valid, the OneSecondTremoloEffect APO loops through the frames and the channels, calculates an index for the buffer, and transfers float values from the source buffer (pSrc) to the destination buffer (pDst). If the effect is disabled, a factor of 1 is applied to the source values. If it’s enabled, a sine value is applied, calculated using the zippy XMScalarSin function from the DirectX Math library.

At the end of the Process method, the ValidFrameCount and BufferFlags are set on the output parameters structure to the corresponding values of the input parameters structure.

Although the code treats the input and output buffers as separate objects, this is not actually the case. Among the flags you can set in the XAPO_REGISTRATION_PROPERTIES structure are XAPO_FLAG_INPLACE_SUPPORTED (which is included in the XAPOBASE_DEFAULT_FLAG) and XAPO_FLAG_INPLACE_REQUIRED. The word “inplace” means that the pointers to the input and output buffers—called pSrc and pDst in my code—are actually equal. There’s only one buffer used for both input and output. You should definitely be aware of that fact when writing your code.

But watch out: It’s my experience that if those flags are removed, separate buffers are indeed present, but only the input buffer is valid for both input and output.

Saving Past Samples

The tremolo effect merely needs to alter samples. An echo effect needs to save previous samples because the output of a one-second echo effect is the current audio plus audio from one second ago.

This means that the OneSecondEchoEffect class needs to maintain its own buffer of audio data, which it defines as an std::vector of type float and sizes during the LockForProcess method:

delayLength = waveFormat.nSamplesPerSec;
int numDelaySamples = waveFormat.nChannels *
                      waveFormat.nSamplesPerSec;
delayBuffer.resize(numDelaySamples);

This delayBuffer vector is sufficient to hold one second of audio data, and it’s treated as a revolving buffer. The LockForProcess method initializes the buffer to 0 values, and initializes an index to this buffer:

delayIndex = 0;

Figure 6 shows the Process method in OneSecondEchoEffect. Because the echo effect must continue after the source audio has completed, you can no longer skip processing when the XAPO_BUFFER_SILENT flag indicates no input audio. Instead, after the sound file is finished, the output audio must continue to play the tail end of the echo. The variable named source is therefore either the input audio or the value 0, depending on the existence of the XAPO_BUFFER_SILENT flag. Half of this source value is combined with half the value stored in the delay buffer, and the result is saved back into the delay buffer. At any time, you’re hearing half the current audio, plus one-quarter of the audio from one second ago, plus one-eighth of the audio from two seconds ago and so forth. You can adjust the balance for different effects, including an echo that gets louder with each repetition.

Figure 6 The Process Method in OneSecondEchoEffect

void OneSecondEchoEffect::Process(UINT32 inpParamCount,
  const XAPO_PROCESS_BUFFER_PARAMETERS *pInpParams,
  UINT32 outParamCount,
  XAPO_PROCESS_BUFFER_PARAMETERS *pOutParams,
  BOOL isEnabled)
{
  const float * pSrc = static_cast<float *>(pInpParams[0].pBuffer);
  float * pDst = static_cast<float *>(pOutParams[0].pBuffer);
  int frameCount = pInpParams[0].ValidFrameCount;
  int numChannels = waveFormat.nChannels;
  bool isSourceValid = pInpParams[0].BufferFlags == XAPO_BUFFER_VALID;
  for (int frame = 0; frame < frameCount; frame++)
  {
    for (int channel = 0; channel < numChannels; channel++)
    {
      // Get sample based on XAPO_BUFFER_VALID flag
      int index = numChannels * frame + channel;
      float source = isSourceValid ? pSrc[index] : 0.0f;
      // Combine sample with contents of delay buffer and save back
      int delayBufferIndex = numChannels * delayIndex + channel;
      float echo = 0.5f * source + 0.5f * delayBuffer[delayBufferIndex];
      delayBuffer[delayBufferIndex] = echo;
      // Transfer to destination buffer
      pDst[index] = isEnabled ? echo : source;
    }
    delayIndex = (delayIndex + 1) % delayLength;
  }
  pOutParams[0].BufferFlags = XAPO_BUFFER_VALID;
  pOutParams[0].ValidFrameCount = pInpParams[0].ValidFrameCount;
}

Try setting the length of the delay buffer to one-tenth of a second:

delayLength = waveFormat.nSamplesPerSec / 10;

Now you get more of a reverb effect than a distinct echo. Of course, in a real APO, you’ll want programmatic control over these various parameters (and others as well), which is why the real echo/reverb APO is controlled by an XAUDIO2FX_REVERB_PARAMETERS structure with 23 fields.

An APO with Parameters

Most APOs allow their behavior to be altered with runtime parameters that can be set programmatically. The SetEffectParameters method is defined for all the voice classes and references a particular APO with an index. A parametered APO is a little trickier to implement, but not much.

In the previous installment of this column, I demonstrated how to use the built-in bandpass filter implemented in the XAudio2 source and submix voices to create a 26-band graphic equalizer, in which each band affects one-third octave of the total audio spectrum. That GraphicEqualizer program effectively split the sound into 26 parts for the application of these filters, and then recombined those audio streams. This technique might have seemed somewhat inefficient.

It’s possible to implement an entire graphic equalizer algorithm in a single APO, and to get the same effect as the previous program with just one source voice and one mastering voice. This is what I’ve done in the GraphicEqualizer2 program. The new program looks the same and sounds the same as the earlier program, but internally it’s quite different.

One of the issues in passing parameters to an APO is thread synchronization. The Process method runs in the audio-processing thread, and parameters are likely being set from the UI thread. Fortunately, the CXAPOParametersBase class performs this synchronization for you.

You first need to define a structure for the parameters. For the 26-band equalizer effect, the structure contains just one field that’s an array of 26 amplitude levels:

struct OneThirdOctaveEqualizerParameters
{
  std::array<float, 26> Amplitude;
};

Within the program, the members of this array are calculated from the decibel values of the sliders.

To initialize CXAPOParametersBase, you need to pass an array of three of the parameter structures to its constructor. CXAPOParametersBase uses this block of memory to perform the thread synchronization.

We’ve again encountered the problem of passing initialized data to a base class constructor from a derived class. The solution I chose this time was to define the derived class constructor as protected and instantiate the class from a public static method named Create, which is shown in Figure 7.

Figure 7 The Static Create Method for OneThirdOctaveEqualizerEffect

OneThirdOctaveEqualizerEffect * OneThirdOctaveEqualizerEffect::Create()
{
  // Create and initialize three effect parameters
  OneThirdOctaveEqualizerParameters * pParameterBlocks =
    new OneThirdOctaveEqualizerParameters[3];
  for (int i = 0; i < 3; i++)
    for (int band = 0; band < 26; band++)
      pParameterBlocks[i].Amplitude[band] = 1.0f;
  // Create the effect
  return new OneThirdOctaveEqualizerEffect(
    &RegistrationProps,
    (byte *) pParameterBlocks,
    sizeof(OneThirdOctaveEqualizerParameters),
    false);
}

The digital biquad filters implemented in XAudio2 (which are emulated in this APO) involve the following formula:

y = (b0·x + b1·x’ + b2·x’’ – a1·y’ – a2·y’’) / a0

In this formula, x is the input sample, x’ is the previous input sample, and x’’ is the sample before that. The output is y, y’ is the previous output and y’’ is the output before that.

An equalizer effect thus needs to save two previous input values for each channel, and two previous output values for each channel and each band.

The six constants in this formula depend on the type of filter; the cutoff frequency (or in the case of a bandpass filter, the center frequency) relative to the sampling rate; and Q, the filter quality. For a one-third-octave graphic equalizer, each filter has a Q corresponding to a bandwidth of one-third octave, or 4.318. Each band has a unique set of constants that are calculated in the LockForProcess method with the code shown in Figure 8.

Figure 8 The Calculation of Equalizer Filter Constants

Q = 4.318f;       // One-third octave
static float frequencies[26] =
{
  20.0f, 25.0f, 31.5f, 40.0f, 50.0f, 63.0f, 80.0f, 100.0f, 125.0f,
  160.0f, 200.0f, 250.0f, 315.0f, 400.0f, 500.0f, 630.0f, 800.0f, 1000.0f,
  1250.0f, 1600.0f, 2000.0f, 2500.0f, 3150.0f, 4000.0f, 5000.0f, 6300.0f
};
for (int band = 0; band < 26; band++)
{
  float frequency = frequencies[band];
  float omega = 2 * 3.14159f * frequency / waveFormat.nSamplesPerSec;
  float alpha = sin(omega) / (2 * Q);
  a0[band] = 1 + alpha;
  a1[band] = -2 * cos(omega);
  a2[band] = 1 - alpha;
  b0[band] = Q * alpha;       // == sin(omega) / 2;
  b1[band] = 0;
  b2[band] = -Q * alpha;      // == -sin(omega) / 2;
}

During the Process method, the APO obtains a pointer to the current parameters structure with a call to CXAPOParametersBase::BeginProcess, in this case casting the return value to a pointer of type OneThirdOctaveEqualizerParameters. At the end of the Process method, a call to CXAPOParametersBase::EndProcess releases the method’s hold on the parameters structure. The complete Process method is shown in Figure 9.

Figure 9 The Process Method in OneThirdOctaveEqualizerEffect

void OneThirdOctaveEqualizerEffect::Process(UINT32 inpParamCount,
  const XAPO_PROCESS_BUFFER_PARAMETERS *pInpParam,
  UINT32 outParamCount,
  XAPO_PROCESS_BUFFER_PARAMETERS *pOutParam,
  BOOL isEnabled)
{
  // Get effect parameters
  OneThirdOctaveEqualizerParameters * pEqualizerParams =
    (OneThirdOctaveEqualizerParameters *) CXAPOParametersBase::BeginProcess();
  // Get buffer pointers and other information
  const float * pSrc = static_cast<float *>(pInpParam[0].pBuffer);
  float * pDst = static_cast<float *>(pOutParam[0].pBuffer);
  int frameCount = pInpParam[0].ValidFrameCount;
  int numChannels = waveFormat.nChannels;
  switch(pInpParam[0].BufferFlags)
  {
  case XAPO_BUFFER_VALID:
    for (int frame = 0; frame < frameCount; frame++)
    {
      for (int channel = 0; channel < numChannels; channel++)
      {
        int index = numChannels * frame + channel;
        // Do very little if filter is disabled
        if (!isEnabled)
        {
          pDst[index] = pSrc[index];
          continue;
        }
        // Get previous inputs
        float x = pSrc[index];
        float xp = pxp[channel];
        float xpp = pxpp[channel];
        // Initialize accumulated value
        float accum = 0;
        for (int band = 0; band < 26; band++)
        {
          int bandIndex = numChannels * band + channel;
          // Get previous outputs
          float yp = pyp[bandIndex];
          float ypp = pypp[bandIndex];
          // Calculate filter output
          float y = (b0[band] * x + b1[band] * xp + b2[band] * xpp
                                  - a1[band] * yp - a2[band] * ypp) / a0[band];
          // Accumulate amplitude-adjusted filter output
          accum += y * pEqualizerParams->Amplitude[band];
          // Save previous output values
          pypp[bandIndex] = yp;
          pyp[bandIndex] = y;
        }
        // Save previous input values
        pxpp[channel] = xp;
        pxp[channel] = x;
        // Save final value adjusted for filter gain
        pDst[index] = accum / Q;
      }
    }
    break;
  case XAPO_BUFFER_SILENT:
    break;
  }
  // Set output parameters
  pOutParam[0].ValidFrameCount = pInpParam[0].ValidFrameCount;
  pOutParam[0].BufferFlags = pInpParam[0].BufferFlags;
  CXAPOParametersBase::EndProcess();
}

One characteristic of programming that I’ve always liked is that problems often have multiple solutions. Sometimes a different solution is more efficient in some way, and sometimes not. Certainly replacing 26 IXAudio2SubmixVoice instances with a single APO is a radical change. But if you think this change is reflected in vastly improved performance, you’re wrong. The Windows 8 Task Manager reveals that the two GraphicEqualizer programs are approximately equivalent, suggesting that splitting an audio stream into 26 submix voices isn’t so crazy after all.


Charles Petzold is a longtime contributor to MSDN Magazine and the author of “Programming Windows, 6th Edition” (O’Reilly Media, 2012), a book about writing applications for Windows 8. His Web site is charlespetzold.com.

Thanks to the following technical experts for reviewing this article: Duncan McKay (Microsoft) and James McNellis (Microsoft)