Note that all automation interface names begin with "ISpeech" and all automation object names begin with "Sp". Applications can explicitly create object variables that instantiate automation objects, using the "CreateObject" function or the "New" keyword in a "Dim" or "Set" statement. Object variables that refer to automation interfaces, on the other hand, cannot be created directly; they are returned only by the methods, properties and events of automation objects.
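The difference between the two kinds of object variables can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes a Visual Basic 6 project with a reference to the Microsoft Speech Object Library, and the procedure and variable names are hypothetical.

```vb
' Assumes a Visual Basic 6 project that references the
' Microsoft Speech Object Library (SAPI 5.x).
Option Explicit

' Hypothetical procedure name, used only for illustration.
Public Sub CreateSpeechObjects()
    ' Automation objects ("Sp" prefix) can be created explicitly.
    Dim voiceA As New SpVoice                   ' New in a Dim statement

    Dim voiceB As SpVoice
    Set voiceB = New SpVoice                    ' New in a Set statement

    Dim voiceC As Object
    Set voiceC = CreateObject("SAPI.SpVoice")   ' late-bound creation by ProgID

    ' Automation interfaces ("ISpeech" prefix) are not created with New;
    ' they are returned by members of an automation object.
    Dim status As ISpeechVoiceStatus
    Set status = voiceB.Status                  ' returned by a property

    Dim voices As ISpeechObjectTokens
    Set voices = voiceB.GetVoices()             ' returned by a method

    Debug.Print "Installed voices: " & voices.Count
    Debug.Print "Currently speaking: " & (status.RunningState = SRSEIsSpeaking)
End Sub
```

The early-bound forms (voiceA, voiceB) need the type library reference at design time; the CreateObject form needs only that SAPI is installed on the machine at run time.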
Additionally, some automation interfaces are implemented by automation objects, and the properties and methods of those interfaces are inherited by the objects. For example, the ISpeechBaseStream interface defines a set of properties and methods for storing and manipulating audio data in memory. The SpFileStream, SpMemoryStream and SpCustomStream objects implement the ISpeechBaseStream interface; as a result, the methods and properties of the ISpeechBaseStream interface are available in all three objects.
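As a hedged sketch of what this means in practice, a procedure can be written once against ISpeechBaseStream and then called with any object that implements it. The example assumes a Visual Basic 6 project with a reference to the Microsoft Speech Object Library; the procedure names are hypothetical, and only SpMemoryStream is exercised because it needs no file or external IStream to initialize.

```vb
' Assumes a Visual Basic 6 project that references the
' Microsoft Speech Object Library (SAPI 5.x).
Option Explicit

' Hypothetical helper: uses only members defined by ISpeechBaseStream,
' so it accepts any object that implements that interface.
Public Sub DescribeStream(ByVal stream As ISpeechBaseStream)
    ' Format and Seek are both defined by ISpeechBaseStream.
    Debug.Print "Audio format type: " & stream.Format.Type
    stream.Seek 0, SSSPTRelativeToStart         ' rewind to the start
End Sub

' Hypothetical caller, used only for illustration.
Public Sub DescribeStreamDemo()
    Dim memStream As New SpMemoryStream

    ' SpMemoryStream implements ISpeechBaseStream, so it can be
    ' passed wherever that interface is expected.
    DescribeStream memStream

    ' An opened SpFileStream or an initialized SpCustomStream could be
    ' passed to DescribeStream in the same way.
End Sub
```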
| Object | Description |
| --- | --- |
| SpAudioFormat | Defines an audio format. |
| SpCustomStream | Supports the use of existing IStream objects in SAPI. |
| SpFileStream | Provides the ability to open files as audio streams and save audio streams as files. |
| SpInProcRecoContext | Defines a recognition context, or a collection of settings, that requests a specific type of recognition as determined by the needs of an application. |
| SpInProcRecoContext (Events) | Defines the types of events that a recognition context can receive. |
| SpInProcRecognizer | Represents a speech recognition engine. |
| SpLexicon | Provides access to lexicons, which contain information about words that can be recognized or spoken. |
| SpMemoryStream | Supports audio stream operations in memory. |
| SpMMAudioIn | Represents the audio implementation for the standard Windows wave-in multimedia layer. |
| SpMMAudioOut | Represents the audio implementation for the standard Windows wave-out multimedia layer. |
| SpObjectToken | Supports object token entries. |
| SpObjectTokenCategory | Represents a class of object tokens. |
| SpPhoneConverter | Supports conversion from the SAPI character phoneset to the Id phoneset. |
| SpPhraseInfoBuilder | Provides the ability to rebuild phrase information from audio data saved to memory. |
| SpSharedRecoContext | Defines a recognition context, or a collection of settings, that requests a specific type of recognition as determined by the needs of an application. |
| SpSharedRecoContext (Events) | Defines the types of events that a recognition context can receive. |
| SpSharedRecognizer | Represents a speech recognition engine. |
| SpTextSelectionInformation | Provides access to the text selection information pertaining to a word sequence buffer. |
| SpUnCompressedLexicon | Provides access to lexicons, which contain information about words that can be recognized or spoken. |
| SpVoice | Enables an application to perform text-to-speech (TTS) synthesis operations. |
| SpVoice (Events) | Defines the types of events that can be received by an SpVoice object. |
| SpWaveFormatEx | Defines the format of waveform-audio data. |