Capturing Audio Data in C++

Kinect for Windows 1.5, 1.6, 1.7, 1.8

Overview

This How-To topic walks through the tasks performed by the AudioBasics-D2D C++ sample, which visualizes sound wave intensity, audio beam direction, sound source direction, and sound source direction confidence. The sample uses the KinectAudio DMO (DirectX Media Object). The code has been lightly edited to remove most error handling.

Code It

Enumerate Sensors and Connect to a Sensor

INuiSensor * pNuiSensor = NULL;
HRESULT hr;

int iSensorCount = 0;
hr = NuiGetSensorCount(&iSensorCount);

// Look at each Kinect sensor
for (int i = 0; i < iSensorCount; ++i)
{
   // Create the sensor so we can check status; if we can't create it, move on to the next
   hr = NuiCreateSensorByIndex(i, &pNuiSensor);
   if (FAILED(hr))
   {
      continue;
   }

   // Get the status of the sensor, and if connected, then we can initialize it
   hr = pNuiSensor->NuiStatus();
   if (S_OK == hr)
   {
      m_pNuiSensor = pNuiSensor;
      break;
   }

   // This sensor is not usable, so release it before trying the next one
   pNuiSensor->Release();
}

if (NULL != m_pNuiSensor)
{
   // Initialize the Kinect and specify that we'll be using audio signal
   hr = m_pNuiSensor->NuiInitialize(NUI_INITIALIZE_FLAG_USES_AUDIO);
}
      

Enumerate the attached sensors, connect to the first one whose status is S_OK, and initialize it for audio.

Get the DMO and the DMO's PropertyStore

// Get the audio source
HRESULT hr = m_pNuiSensor->NuiGetAudioSource(&m_pNuiAudioSource);
hr = m_pNuiAudioSource->QueryInterface(IID_IMediaObject, (void**)&m_pDMO );
hr = m_pNuiAudioSource->QueryInterface(IID_IPropertyStore, (void**)&m_pPropertyStore); 
      

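Get the audio source from the sensor, then query it for the DMO interface (IMediaObject) and the DMO's property store (IPropertyStore).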

Configure the DMO

// Set AEC-MicArray DMO system mode. This must be set for the DMO to work properly.
// Possible values are:
//   SINGLE_CHANNEL_AEC = 0
//   OPTIBEAM_ARRAY_ONLY = 2
//   OPTIBEAM_ARRAY_AND_AEC = 4
//   SINGLE_CHANNEL_NSAGC = 5
PROPVARIANT pvSysMode;
PropVariantInit(&pvSysMode);
pvSysMode.vt = VT_I4;
pvSysMode.lVal = (LONG)(2); // OPTIBEAM_ARRAY_ONLY
m_pPropertyStore->SetValue(MFPKEY_WMAAECMA_SYSTEM_MODE, pvSysMode);
PropVariantClear(&pvSysMode);
      

Configure the DMO to use automatic beamforming without acoustic echo cancellation (AEC). To use automatic beamforming with AEC, select the OPTIBEAM_ARRAY_AND_AEC mode by setting pvSysMode.lVal to (LONG)(4).
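The raw values 2 and 4 are easy to mix up, so it can help to give them names. The following is a minimal sketch, not part of the Kinect SDK: the SystemMode enum and the helper functions are hypothetical, and only the numeric values come from the comment in the code above.

```cpp
#include <cassert>

// Hypothetical wrapper enum (not an SDK type). The numeric values are the
// ones listed in the comment of the "Configure the DMO" code.
enum class SystemMode : long {
    SingleChannelAec    = 0,
    OptibeamArrayOnly   = 2,
    OptibeamArrayAndAec = 4,
    SingleChannelNsagc  = 5
};

// Returns true if the value is one of the four modes listed above.
inline bool IsKnownSystemMode(long value) {
    return value == 0 || value == 2 || value == 4 || value == 5;
}

// Returns the number to store in pvSysMode.lVal for the chosen mode.
inline long ToPropertyValue(SystemMode mode) {
    const long value = static_cast<long>(mode);
    assert(IsKnownSystemMode(value));
    return value;
}

// Picks a beamforming mode, with or without echo cancellation.
inline SystemMode ChooseBeamformingMode(bool withEchoCancellation) {
    return withEchoCancellation ? SystemMode::OptibeamArrayAndAec
                                : SystemMode::OptibeamArrayOnly;
}
```

With these helpers, the assignment above could read pvSysMode.lVal = ToPropertyValue(ChooseBeamformingMode(false)); which stores 2.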

Configure the DMO Output Format

WAVEFORMATEX wfxOut = {AudioFormat, AudioChannels, AudioSamplesPerSecond, AudioAverageBytesPerSecond, AudioBlockAlign, AudioBitsPerSample, 0};
DMO_MEDIA_TYPE mt = {0};
MoInitMediaType(&mt, sizeof(WAVEFORMATEX));

mt.majortype = MEDIATYPE_Audio;
mt.subtype = MEDIASUBTYPE_PCM;
mt.lSampleSize = 0;
mt.bFixedSizeSamples = TRUE;
mt.bTemporalCompression = FALSE;
mt.formattype = FORMAT_WaveFormatEx;
memcpy(mt.pbFormat, &wfxOut, sizeof(WAVEFORMATEX));

hr = m_pDMO->SetOutputType(0, &mt, 0);

// Free the format block allocated by MoInitMediaType
MoFreeMediaType(&mt);
      

Describe the output as PCM audio in a DMO_MEDIA_TYPE structure and set it as the output type for the DMO's output stream 0.
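The Audio* constants used to fill the WAVEFORMATEX structure are not shown in this topic. For PCM data, the block alignment and the average byte rate follow from the channel count, sample rate, and bit depth, so they can be derived rather than typed by hand. The sketch below illustrates this with a hypothetical PcmFormat struct that stands in for the relevant WAVEFORMATEX fields. The 16 kHz, mono, 16-bit values in the usage note are an assumption about the sample's constants, not something stated here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PCM-related fields of WAVEFORMATEX.
struct PcmFormat {
    std::uint16_t channels;
    std::uint32_t samplesPerSecond;
    std::uint16_t bitsPerSample;
    std::uint16_t blockAlign;            // bytes per sample frame
    std::uint32_t averageBytesPerSecond; // bytes per second of audio
};

// Fills in the derived fields for uncompressed PCM:
//   blockAlign            = channels * bitsPerSample / 8
//   averageBytesPerSecond = samplesPerSecond * blockAlign
inline PcmFormat MakePcmFormat(std::uint16_t channels,
                               std::uint32_t samplesPerSecond,
                               std::uint16_t bitsPerSample) {
    assert(channels > 0 && bitsPerSample > 0 && bitsPerSample % 8 == 0);
    PcmFormat format{};
    format.channels = channels;
    format.samplesPerSecond = samplesPerSecond;
    format.bitsPerSample = bitsPerSample;
    format.blockAlign =
        static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    format.averageBytesPerSecond = samplesPerSecond * format.blockAlign;
    return format;
}
```

For example, MakePcmFormat(1, 16000, 16) yields a block alignment of 2 bytes and an average rate of 32000 bytes per second.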

Start Processing Audio

hr = m_pDMO->ProcessOutput(0, 1, &outputBuffer, &dwStatus);
      

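Call ProcessOutput to start retrieving processed audio from the sensor. Here outputBuffer is a DMO_OUTPUT_DATA_BUFFER structure that receives the audio data, and dwStatus receives a status value for the call.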
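Once audio bytes have been retrieved, a visualization of sound wave intensity needs some measure of loudness. The following sketch shows one common choice, the root mean square (RMS) of the samples. It is not necessarily the method the sample uses. It assumes the buffer holds 16-bit signed PCM samples in native byte order, and RmsLevel is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Computes the RMS level of a buffer of 16-bit signed PCM samples.
// Each sample is scaled by 1/32768, so the result lies in [0.0, 1.0].
// A trailing odd byte, if any, is ignored.
inline double RmsLevel(const unsigned char* bytes, std::size_t byteCount) {
    assert(bytes != nullptr || byteCount == 0);
    const std::size_t sampleCount = byteCount / sizeof(std::int16_t);
    if (sampleCount == 0) {
        return 0.0;
    }

    double sumOfSquares = 0.0;
    for (std::size_t i = 0; i < sampleCount; ++i) {
        // memcpy avoids unaligned reads from the byte buffer.
        std::int16_t sample = 0;
        std::memcpy(&sample, bytes + i * sizeof(sample), sizeof(sample));
        const double normalized = sample / 32768.0;
        sumOfSquares += normalized * normalized;
    }
    return std::sqrt(sumOfSquares / static_cast<double>(sampleCount));
}
```

Silence produces a level of 0.0, and a buffer that alternates between +16384 and -16384 produces 0.5.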

Retrieve the Current Beam Angle and Sound Location

m_pNuiAudioSource->GetBeam(&beamAngle);
m_pNuiAudioSource->GetPosition(&sourceAngle, &sourceConfidence);
      

This code retrieves the current beam angle, the sound source angle, and the confidence in the sound source angle. Both angles are reported in radians.
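A display usually wants degrees rather than radians, and may want to ignore the source angle when confidence is low. The helpers below are a minimal sketch of both steps. The function names are hypothetical, and the 0.5 default threshold is an arbitrary illustration that assumes the confidence value lies between 0.0 and 1.0.

```cpp
#include <cassert>

// Converts an angle from radians to degrees.
inline double RadiansToDegrees(double radians) {
    const double kPi = 3.14159265358979323846;
    return radians * 180.0 / kPi;
}

// Hypothetical helper: treats the source angle as usable only when the
// reported confidence reaches the threshold (assumed range 0.0 to 1.0).
inline bool IsSourceAngleReliable(double confidence, double threshold = 0.5) {
    assert(threshold >= 0.0 && threshold <= 1.0);
    return confidence >= threshold;
}
```

For example, the beam angle could be shown as RadiansToDegrees(beamAngle), and the source direction drawn only when IsSourceAngleReliable(sourceConfidence) returns true.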

Shut Down the Audio

m_pNuiSensor->NuiShutdown();
      
