Note

Please see the Azure Cognitive Services Speech documentation for the latest supported speech solutions.

Microsoft Speech Platform

Microsoft Speech Platform Native Code API Documentation

API Overview

The Microsoft Speech Platform native-code API contains Component Object Model (COM) interfaces that you can program using C++ code to manage the Speech Platform Runtime. The Speech Platform API implements all the low-level details needed to control and manage the real-time operations of various speech engines.

You can manage two types of speech engines using the Speech Platform API: speech synthesis engines (text-to-speech, or TTS) and speech recognition engines (speech recognizers). TTS engines synthesize text strings and files into spoken audio using synthetic voices. Speech recognizers convert spoken audio into text strings and files.

Speech Recognition

Speech recognition allows users to interact with and control your applications by speaking. Using the Speech Platform native-code API, you can acquire and monitor speech input, create speech recognition grammars that produce both literal and semantic recognition results, capture information from events generated by the speech recognizer, and configure and manage speech recognition engines. See Speech Recognition API for more information.
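The workflow above (acquire audio input, load a grammar, handle recognition events) can be sketched with the SAPI-style COM interfaces that the Speech Platform SDK exposes. This is a minimal sketch, not a complete sample: error handling is abbreviated, and `commands.grxml` is a hypothetical SRGS grammar file you would author yourself.

```cpp
// Sketch: in-process recognizer with an SRGS grammar (error handling omitted).
#include <atlbase.h>   // CComPtr
#include <sapi.h>
#include <sphelper.h>  // SpCreateDefaultObjectFromCategoryId, CSpEvent
#include <stdio.h>

int wmain()
{
    ::CoInitialize(NULL);
    {
        CComPtr<ISpRecognizer>  cpRecognizer;
        CComPtr<ISpRecoContext> cpContext;
        CComPtr<ISpRecoGrammar> cpGrammar;

        // Create the in-process recognizer and bind the default audio input.
        cpRecognizer.CoCreateInstance(CLSID_SpInprocRecognizer);
        CComPtr<ISpAudio> cpAudio;
        SpCreateDefaultObjectFromCategoryId(SPCAT_AUDIOIN, &cpAudio);
        cpRecognizer->SetInput(cpAudio, TRUE);

        // Create a recognition context and ask for recognition events only.
        cpRecognizer->CreateRecoContext(&cpContext);
        cpContext->SetNotifyWin32Event();
        cpContext->SetInterest(SPFEI(SPEI_RECOGNITION), SPFEI(SPEI_RECOGNITION));

        // Load an SRGS grammar (hypothetical file) and activate its rules.
        cpContext->CreateGrammar(0, &cpGrammar);
        cpGrammar->LoadCmdFromFile(L"commands.grxml", SPLO_STATIC);
        cpGrammar->SetRuleState(NULL, NULL, SPRS_ACTIVE);

        // Wait for one recognition event, then read the recognized text.
        cpContext->WaitForNotifyEvent(INFINITE);
        CSpEvent event;
        if (event.GetFrom(cpContext) == S_OK && event.eEventId == SPEI_RECOGNITION)
        {
            CSpDynamicString dstrText;
            event.RecoResult()->GetText(SP_GETWHOLEPHRASE, SP_GETWHOLEPHRASE,
                                        TRUE, &dstrText, NULL);
            wprintf(L"Recognized: %s\n", (LPCWSTR)dstrText);
        }
    }
    ::CoUninitialize();
    return 0;
}
```

The in-process recognizer (`CLSID_SpInprocRecognizer`) runs the engine inside your application; the Speech Platform also supports other activation patterns, such as notification callbacks instead of the Win32-event wait shown here.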

Speech Synthesis

Speech synthesis gives a voice to your application, allowing it to present spoken information to users. Used in combination with speech recognition, speech synthesis lets users engage in hands-free verbal dialog with your application to accomplish their tasks. The Speech Platform API gives you control over many aspects of synthetic speech generation, including voice selection and speech output characteristics such as pronunciation, volume, pitch, and speaking rate. You can author TTS prompts programmatically or using XML that conforms to the Speech Synthesis Markup Language (SSML) Version 1.0. See Speech Synthesis API for more information.
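Both authoring styles mentioned above can be sketched with the `ISpVoice` COM interface: speaking a plain string programmatically, and speaking an SSML fragment that adjusts prosody. This is a minimal sketch with error handling omitted; the greeting text is illustrative.

```cpp
// Sketch: speaking plain text and an SSML prompt with ISpVoice.
#include <atlbase.h>  // CComPtr
#include <sapi.h>

int wmain()
{
    ::CoInitialize(NULL);
    {
        CComPtr<ISpVoice> cpVoice;
        cpVoice.CoCreateInstance(CLSID_SpVoice);

        // Speak a plain text string synchronously.
        cpVoice->Speak(L"Welcome to the Speech Platform.", SPF_DEFAULT, NULL);

        // Speak SSML; SPF_IS_XML tells the engine to parse the markup,
        // so <prosody> can control speaking rate and volume.
        cpVoice->Speak(
            L"<speak version='1.0' "
            L"xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>"
            L"<prosody rate='slow' volume='loud'>Thank you for calling.</prosody>"
            L"</speak>",
            SPF_IS_XML, NULL);
    }
    ::CoUninitialize();
    return 0;
}
```

Passing `SPF_ASYNC` instead of `SPF_DEFAULT` would return immediately and render the audio in the background, which is the usual choice when synthesis runs alongside recognition in a dialog.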

In This Section