Microsoft Speech Platform


ISpVoice::Speak speaks the contents of a text string or file.

HRESULT Speak(
   LPCWSTR        pwcs,
   DWORD          dwFlags,
   ULONG         *pulStreamNumber
);


Parameters

pwcs
[in, string] Pointer to the null-terminated text string (possibly containing XML markup) to be synthesized. This value can be NULL when dwFlags is set to SPF_PURGEBEFORESPEAK, indicating that any remaining data to be synthesized should be discarded. If dwFlags is set to SPF_IS_FILENAME, this value should point to a null-terminated, fully qualified path to a file.

dwFlags
[in] Flags used to control the rendering process for this call. The flag values are contained in the SPEAKFLAGS enumeration.

pulStreamNumber
[out] Pointer to a ULONG that receives the current input stream number associated with this Speak request. Each time a string is spoken, an associated stream number is returned. Events queued back to the application that relate to this string contain this number. If this pointer is NULL, no value is passed back.

Return Values

S_OK: Function completed successfully.
E_INVALIDARG: One or more parameters are invalid.
E_POINTER: Invalid pointer.
E_OUTOFMEMORY: Exceeded available memory.
SPERR_INVALID_FLAGS: Invalid flags specified for this operation.
SPERR_DEVICE_BUSY: Timeout occurred on synchronous call.


Remarks

Normally, the value returned through pulStreamNumber is 1. If, however, several asynchronous Speak (or SpeakStream) calls are received and must be queued, the stream number is incremented for each call.

If you call the Speak method with SSML markup that omits the closing tag for an element that requires one, such as <prosody> or <emphasis>, the Speech Platform does not return an error code. For example, the following code snippet is missing the closing </prosody> tag, but the call still returns S_OK.

// Speak a string directly. The closing </prosody> tag is intentionally omitted.
if (SUCCEEDED(hr))
  hr = cpVoice->Speak(L"<prosody volume=\"x-loud\">Do it now", SPF_IS_XML, NULL);
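Because Speak reports success for markup like this, an application that wants to catch an unclosed element may need to check the string itself before passing it in. The following is a minimal sketch of one way to do that, using only the C++ standard library. HasBalancedTags is a hypothetical helper name, not part of the Speech Platform API, and it performs only a tag-balance check; it is not a full XML or SSML validator.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration, not a Speech Platform API).
// Returns true when every element opened in `markup` is closed in the correct
// order. Comments, processing instructions, and declarations are skipped.
bool HasBalancedTags(const std::wstring& markup) {
    std::vector<std::wstring> open;  // names of elements not yet closed
    const std::size_t n = markup.size();
    std::size_t i = 0;

    while (i < n) {
        if (markup[i] != L'<') {
            ++i;
            continue;
        }

        // Skip comments: <!-- ... -->
        if (markup.compare(i, 4, L"<!--") == 0) {
            const std::size_t end = markup.find(L"-->", i + 4);
            if (end == std::wstring::npos) return false;  // unterminated comment
            i = end + 3;
            continue;
        }

        // Find the '>' that ends this tag, ignoring any '>' inside quoted values.
        std::size_t j = i + 1;
        wchar_t quote = 0;
        for (; j < n; ++j) {
            const wchar_t c = markup[j];
            if (quote != 0) {
                if (c == quote) quote = 0;
            } else if (c == L'"' || c == L'\'') {
                quote = c;
            } else if (c == L'>') {
                break;
            }
        }
        if (j >= n) return false;  // tag never terminated

        const std::wstring body = markup.substr(i + 1, j - i - 1);
        i = j + 1;
        if (body.empty()) return false;  // "<>"

        // Skip processing instructions (<?xml ...?>) and declarations (<!DOCTYPE ...>).
        if (body[0] == L'?' || body[0] == L'!') continue;

        const bool closing = (body[0] == L'/');
        const bool selfClosing = !closing && body.back() == L'/';

        // The element name runs up to the first whitespace or '/'.
        const std::size_t start = closing ? 1 : 0;
        const std::size_t stop = body.find_first_of(L" \t\r\n/", start);
        const std::wstring name = (stop == std::wstring::npos)
            ? body.substr(start)
            : body.substr(start, stop - start);
        if (name.empty()) return false;

        if (closing) {
            // A closing tag must match the most recently opened element.
            if (open.empty() || open.back() != name) return false;
            open.pop_back();
        } else if (!selfClosing) {
            open.push_back(name);
        }
    }
    return open.empty();  // anything left open means a missing closing tag
}
```

An application could call HasBalancedTags on the string before calling Speak with SPF_IS_XML, and treat a false result as an error of its own instead of relying on the returned HRESULT. For the string in the snippet above, the helper returns false because <prosody> is never closed.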