SpeechSynthesizer Class

Reference

Definition

Namespace: Windows.Media.SpeechSynthesis

Important

Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.

Provides access to the functionality of an installed speech synthesis engine (voice) for Text-to-speech (TTS) services.

public ref class SpeechSynthesizer sealed : IClosable

/// [Windows.Foundation.Metadata.Activatable(65536, Windows.Foundation.UniversalApiContract)]
/// [Windows.Foundation.Metadata.ContractVersion(Windows.Foundation.UniversalApiContract, 65536)]
/// [Windows.Foundation.Metadata.MarshalingBehavior(Windows.Foundation.Metadata.MarshalingType.Agile)]
class SpeechSynthesizer final : IClosable

/// [Windows.Foundation.Metadata.ContractVersion(Windows.Foundation.UniversalApiContract, 65536)]
/// [Windows.Foundation.Metadata.MarshalingBehavior(Windows.Foundation.Metadata.MarshalingType.Agile)]
/// [Windows.Foundation.Metadata.Activatable(65536, "Windows.Foundation.UniversalApiContract")]
class SpeechSynthesizer final : IClosable

[Windows.Foundation.Metadata.Activatable(65536, typeof(Windows.Foundation.UniversalApiContract))]
[Windows.Foundation.Metadata.ContractVersion(typeof(Windows.Foundation.UniversalApiContract), 65536)]
[Windows.Foundation.Metadata.MarshalingBehavior(Windows.Foundation.Metadata.MarshalingType.Agile)]
public sealed class SpeechSynthesizer : System.IDisposable

[Windows.Foundation.Metadata.ContractVersion(typeof(Windows.Foundation.UniversalApiContract), 65536)]
[Windows.Foundation.Metadata.MarshalingBehavior(Windows.Foundation.Metadata.MarshalingType.Agile)]
[Windows.Foundation.Metadata.Activatable(65536, "Windows.Foundation.UniversalApiContract")]
public sealed class SpeechSynthesizer : System.IDisposable

function SpeechSynthesizer()

Public NotInheritable Class SpeechSynthesizer
Implements IDisposable

Inheritance: Object Platform::Object IInspectable SpeechSynthesizer

Attributes: ActivatableAttribute ContractVersionAttribute MarshalingBehaviorAttribute

Implements: IClosable IDisposable

Windows requirements

Device family	Windows 10 (introduced in 10.0.10240.0)
API contract	Windows.Foundation.UniversalApiContract (introduced in v1.0)

Examples

The following example shows how to generate a speech audio stream from a basic text string.

// The media object for controlling and playing audio.
MediaElement mediaElement = this.media;

// The object for controlling the speech synthesis engine (voice).
var synth = new Windows.Media.SpeechSynthesis.SpeechSynthesizer();

// Generate the audio stream from plain text.
SpeechSynthesisStream stream = await synth.SynthesizeTextToStreamAsync("Hello World");

// Send the stream to the media object.
mediaElement.SetSource(stream, stream.ContentType);
mediaElement.Play();

// The object for controlling the speech synthesis engine (voice).
synth = ref new SpeechSynthesizer();
// The media object for controlling and playing audio.
media = ref new MediaElement();
// The string to speak.
String^ text = "Hello World";

// Generate the audio stream from plain text.
task<SpeechSynthesisStream ^> speakTask = create_task(synth->SynthesizeTextToStreamAsync(text));
speakTask.then([this, text](SpeechSynthesisStream ^speechStream)
{
    // Send the stream to the media object.
    // media is the MediaElement XAML object.
    media->SetSource(speechStream, speechStream->ContentType);
    media->AutoPlay = true;
    media->Play();
});

This example shows how to generate a speech audio stream from an SSML string, which includes some modulation elements that control the pitch, speaking rate, and volume of the speech output.

// The string to speak with SSML customizations.
string Ssml =
    @"<speak version='1.0' " +
    "xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>" +
    "Hello <prosody contour='(0%,+80Hz) (10%,+80%) (40%,+80Hz)'>World</prosody> " + 
    "<break time='500ms'/>" +
    "Goodbye <prosody rate='slow' contour='(0%,+20Hz) (10%,+30%) (40%,+10Hz)'>World</prosody>" +
    "</speak>";

// The media object for controlling and playing audio.
MediaElement mediaElement = this.media;

// The object for controlling the speech synthesis engine (voice).
var synth = new Windows.Media.SpeechSynthesis.SpeechSynthesizer();

// Generate the audio stream from SSML.
SpeechSynthesisStream stream = await synth.SynthesizeSsmlToStreamAsync(Ssml);

// Send the stream to the media object.
mediaElement.SetSource(stream, stream.ContentType);
mediaElement.Play();

// The object for controlling the speech synthesis engine (voice).
synth = ref new SpeechSynthesizer();
// The media object for controlling and playing audio.
media = ref new MediaElement();
// The string to speak.
String^ ssml =
    "<speak version='1.0' "
    "xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>"
    "Hello <prosody contour='(0%,+80Hz) (10%,+80%) (40%,+80Hz)'>World</prosody>"
    "<break time='500ms' /> "
    "Goodbye <prosody rate='slow' contour='(0%,+20Hz) (10%,+30%) (40%,+10Hz)'>World</prosody>"
    "</speak>";

// Generate the audio stream from SSML.
task<SpeechSynthesisStream ^> speakTask = create_task(synth->SynthesizeSsmlToStreamAsync(ssml));
speakTask.then([this, ssml](SpeechSynthesisStream ^speechStream)
{
    // Send the stream to the media object.
    // media is the MediaElement XAML object.
    media->SetSource(speechStream, speechStream->ContentType);
    media->AutoPlay = true;
    media->Play();
});
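
When the spoken text comes from user input, it needs to be XML-escaped before it is embedded in an SSML document, because characters such as & or < would otherwise leave the markup malformed. The following is a minimal sketch in standard C++ (no Windows Runtime types). The helper names EscapeXml and BuildSsml are hypothetical and are not part of the SpeechSynthesizer API; the string they produce is the kind of value that would be passed to SynthesizeSsmlToStreamAsync.

```cpp
#include <string>

// Hypothetical helper: escape the five XML special characters.
std::string EscapeXml(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text)
    {
        switch (c)
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '\'': out += "&apos;"; break;
        case '"':  out += "&quot;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}

// Hypothetical helper: wrap escaped text in a <speak> root element
// with a single <prosody> element that sets the speaking rate.
std::string BuildSsml(const std::string& text,
                      const std::string& rate,
                      const std::string& lang = "en-US")
{
    return "<speak version='1.0' "
           "xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='" + EscapeXml(lang) + "'>"
           "<prosody rate='" + EscapeXml(rate) + "'>" + EscapeXml(text) + "</prosody>"
           "</speak>";
}
```

For example, BuildSsml("Tom & Jerry", "slow") yields a document whose text content reads Tom &amp;amp; Jerry, which an XML parser decodes back to the original string.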

Remarks

Only Microsoft-signed voices installed on the system can be used to generate speech.

Windows includes various Microsoft-signed voices that can be used for a number of languages. Each voice generates synthesized speech in a single language, as spoken in a specific country/region.

By default, a new SpeechSynthesizer object uses the current system voice (read the DefaultVoice property to find out which voice that is).

To use any of the other speech synthesis (text-to-speech) voices installed on the user's system, set the Voice property (to find out which voices are installed, read the AllVoices property).
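
One common pattern is to search the installed voices for one that matches a desired language and to fall back to the default voice when none matches. The sketch below illustrates only that selection logic in standard C++. VoiceEntry, SameTag, and PickVoice are hypothetical stand-ins, not Windows Runtime types. In an app, the list would come from AllVoices, the fallback from DefaultVoice, and the result would be assigned to the Voice property.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Stand-in for the voice fields of interest; NOT the real VoiceInformation type.
struct VoiceEntry
{
    std::string displayName;
    std::string language; // BCP-47 tag, for example "en-US"
};

// Case-insensitive comparison of two language tags.
bool SameTag(const std::string& a, const std::string& b)
{
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}

// Returns the first installed voice whose language tag matches `wanted`,
// or `fallback` (for example, the default voice) when none matches.
VoiceEntry PickVoice(const std::vector<VoiceEntry>& installed,
                     const std::string& wanted,
                     const VoiceEntry& fallback)
{
    auto it = std::find_if(installed.begin(), installed.end(),
                           [&](const VoiceEntry& v) { return SameTag(v.language, wanted); });
    return it != installed.end() ? *it : fallback;
}
```

The sketch matches whole language tags only; it does not attempt the closest-match behavior that the system applies when no language is specified.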

If you don't specify a language, the voice that most closely corresponds to the language selected in the Language control panel is loaded.

Use a SpeechSynthesizer object to:

Generate speech from plain text using SynthesizeTextToStreamAsync, or from Speech Synthesis Markup Language (SSML) Version 1.1 using SynthesizeSsmlToStreamAsync. The generated audio stream is played through a MediaElement object, which lets you manage all media playback.
Control the speech output with the various SpeechSynthesizerOptions settings exposed through SpeechSynthesizer.Options.

Version history

Windows version	SDK version	Value added
1703	15063	Options
1709	16299	TrySetDefaultVoiceAsync

Constructors

SpeechSynthesizer()

Initializes a new instance of a SpeechSynthesizer object.

Properties

AllVoices	Gets a collection of all installed speech synthesis engines (voices).
DefaultVoice	Gets the default speech synthesis engine (voice).
Options	Gets a reference to the collection of options that can be set on the SpeechSynthesizer object.
Voice	Gets or sets the speech synthesis engine (voice).

Methods

Close()	Closes the SpeechSynthesizer and releases system resources.
Dispose()	Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.
SynthesizeSsmlToStreamAsync(String)	Asynchronously generates and controls speech output from a Speech Synthesis Markup Language (SSML) Version 1.1 string.
SynthesizeTextToStreamAsync(String)	Asynchronously generates speech output from a string.
TrySetDefaultVoiceAsync(VoiceInformation)	Asynchronously attempts to set the voice used for speech synthesis on an IoT device. Note: This method is available only in Embedded mode.