Microsoft Speech SDK Overview

Platform support:

  • Windows Mobile: Not supported
  • Windows Embedded CE: Supported

This overview introduces the Microsoft Speech SDK (SDK) to first-time users and explains its contents and functionality.

The Microsoft Speech SDK is a software development kit for building speech engines and applications for Microsoft Windows. Designed primarily for the desktop speech developer, the SDK contains the Microsoft® Win32®-compatible speech application programming interface (SAPI), sample applications that demonstrate the use of speech with other engine technologies, and documentation on the most important SDK functionality.

You can use the SDK and SAPI/engine run-time to build applications that incorporate speech recognition and speech synthesis.

The Speech API architecture includes a collection of speech components that give you low-level control and greater flexibility by letting you directly manage audio, the training wizard, events, the grammar compiler, resources, the speech recognition manager, and the text-to-speech (TTS) manager. The Speech API also supports and manages shared recognition events, so that multiple speech-enabled applications can run at the same time.

The Microsoft Speech SDK includes samples that can be used as a reference for creating speech-enabled applications. The samples are available in the %_WINCEROOT%\Public\Speech\Sdk\Samples directory. For information about the SDK samples, see SAPI Samples.

Microsoft Speech API 5.0 for Windows Embedded CE has been designed to coexist on the same device with prior versions of the Microsoft Speech API (versions 3.0, 4.0, and 4.0a). Microsoft is also working with many of the top speech recognition engine vendors to provide SAPI 5.0 support. Visit this Microsoft Web site for the latest list of SAPI 5.0-compatible engines.

The Microsoft Speech SDK documentation provides information for both the experienced speech developer and the beginner.

The Programmer's Guide provides information on the following Microsoft Speech API topics:

  • Application level interfaces
  • Engine level interfaces
  • Structures
  • Enumerations
  • Helper functions

The Samples Reference describes all the SDK samples.

The Microsoft Speech SDK is not a GUI or voice-user interface (VUI) development environment with menus, buttons, toolbars, and so on. The Microsoft Speech SDK assumes that you know how to program in C or C++ and how to use COM, so that you can instantiate the Speech APIs directly.
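As a rough illustration of what instantiating the Speech APIs directly through COM can look like, the following minimal sketch creates the default SAPI voice and speaks one string. SpeakText is a hypothetical helper name, not part of the SDK. The sketch assumes that the sapi.h header and the ISpVoice interface are available in your build environment, and that the COM and SAPI libraries are linked. The non-Windows branch exists only so the sketch compiles elsewhere.

```cpp
#include <cstdio>

#ifdef _WIN32
#include <windows.h>
#include <sapi.h>
#endif

// Hypothetical helper (not part of the SDK): speaks `text` through the
// default SAPI voice and reports whether the call succeeded.
bool SpeakText(const wchar_t* text)
{
    if (text == NULL) {
        return false;  // nothing to speak
    }
#ifdef _WIN32
    // COM must be initialized on this thread before any SAPI object is created.
    if (FAILED(::CoInitializeEx(NULL, COINIT_MULTITHREADED))) {
        return false;
    }

    ISpVoice* voice = NULL;
    HRESULT hr = ::CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL,
                                    IID_ISpVoice,
                                    reinterpret_cast<void**>(&voice));
    if (SUCCEEDED(hr)) {
        // Flags of 0 request a synchronous call; the last argument
        // (an optional stream number) is not needed here.
        hr = voice->Speak(text, 0, NULL);
        voice->Release();
    }

    ::CoUninitialize();
    return SUCCEEDED(hr);
#else
    // No SAPI outside Windows; report failure so the sketch still compiles.
    std::fprintf(stderr, "SAPI is only available on Windows.\n");
    return false;
#endif
}
```

A caller would use it as SpeakText(L"Hello world") and check the return value. A real application would typically initialize COM once at startup and keep the voice object alive across calls instead of creating it for every utterance.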

There is no Automation interface support in this release of the Speech API. Therefore, support for Microsoft Visual Basic or Web script development is limited unless you write an Automation wrapper or VXML interpreter. For more information on third parties who provide additional SAPI 5.0-compatible Web controls, visit this Microsoft Web site.

No Microsoft Visual Basic, script, or telephony-specific controls are included in this release of the Speech SDK. For more information on third parties who provide additional SAPI 5.0-compatible Web or telephony controls, visit this Microsoft Web site.

The organization of the Speech SDK documentation is similar to that of other Microsoft SDKs. The Finding Information section of the Microsoft Speech SDK documentation explains how to use the documentation's Help Viewer, including how to use the toolbar buttons and full-text search and how to find a Help topic.

Visit the Microsoft® Speech.NET Technologies home page frequently for the latest news and updates to the SDK and the Microsoft speech engines.