The Text-To-Speech sample service shows you how to write a service that interacts with .NET's System.Speech API.
This sample is provided in the C# language. You can find the project files for this sample at the following location under the Microsoft Robotics Developer Studio installation folder:
To run this sample you will need speakers or another type of audio output device, such as a headset, attached to a sound card in your computer.
This software requires .NET version 3.0 or higher. Note that this supersedes the older SAPI software.
You will also need Microsoft Internet Explorer or another conventional web browser.
The Text to Speech service is designed to provide your applications with a verbal interface. It can be used in conjunction with the Speech Recognizer service for two-way communication with the computer.
Running the Sample
Perform the following steps to begin the Text-to-Speech sample:
- From the Start menu, choose All Programs > Microsoft Robotics Developer Studio > Run DSS node.
- When the web browser appears, under System Services (at the left-hand side of the window) choose Control Panel.
- Find the Text to Speech service and click the Create button.
The service appears in the Service Directory once it loads and begins running.
Inspecting the Service
To inspect the service, type the following Uniform Resource Identifier (URI) in the browser's address bar:
This displays the TextToSpeech page. It will show you the state of the speech object (which includes the last text spoken), as well as various other speech parameters.
You can make changes using the web page, and also test out the TextToSpeech service by entering some text and pressing the Say button.
You can also perform text-to-speech by entering this URI into the web browser:
Note that the plus sign (+) as part of the query string in a URL represents a space.
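The plus-sign convention can be checked with a short script. The following Python sketch uses only the standard library to encode text for a query string. The parameter name `text` is a placeholder chosen for illustration; it is not a documented parameter of the TextToSpeech service.

```python
from urllib.parse import parse_qs, urlencode


def build_query(params):
    """Encode a dict of parameters as a URL query string.

    urlencode applies quote_plus by default, so spaces become plus signs
    and a literal plus sign is escaped as %2B.
    """
    return urlencode(params)


# "text" is a hypothetical parameter name used only for this example.
query = build_query({"text": "Hello world"})
print(query)                        # text=Hello+world

# Decoding turns the plus sign back into a space.
print(parse_qs(query)["text"][0])   # Hello world
```

Because a plus sign stands for a space, text that contains a real plus sign has to be escaped (as `%2B`) before it is placed in the query string; `urlencode` does this automatically.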
You can change the speech parameters by using the following URI:
You can perform more than one operation by combining parameters, as in the following sample URI:
The available operations for the TextToSpeech service include the following:
- SayText - Says the specified text
- SayTextSynch - Says the specified text but does not send a response until the speaking has finished. If you are using the service from VPL you might find this operation useful because program execution can be suspended until the response is received from the TextToSpeech service.
- SetRate - Sets the speed at which text is spoken
- SetVolume - Sets the sound volume
- SetVoice - Selects a particular voice
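The difference between SayText and SayTextSynch is when the caller gets control back. The Python sketch below is a toy model of that difference only, not the service's implementation: the class `ToySpeaker` and its methods are invented for illustration, and it records text in a list instead of producing audio.

```python
import threading
import time


class ToySpeaker:
    """Toy stand-in for a speech service; it records text instead of speaking."""

    def __init__(self, seconds_per_word=0.01):
        self.seconds_per_word = seconds_per_word
        self.spoken = []
        self._lock = threading.Lock()

    def _speak(self, text):
        # Simulate the time needed to speak, then record the utterance.
        time.sleep(self.seconds_per_word * len(text.split()))
        with self._lock:
            self.spoken.append(text)

    def say_text(self, text):
        """Start speaking and return at once (modeled on SayText)."""
        worker = threading.Thread(target=self._speak, args=(text,))
        worker.start()
        return worker

    def say_text_synch(self, text):
        """Return only after speaking has finished (modeled on SayTextSynch)."""
        self.say_text(text).join()


speaker = ToySpeaker()
speaker.say_text_synch("first sentence")
print(speaker.spoken)   # ['first sentence']
```

With `say_text`, the caller continues immediately and must wait on the returned thread if it needs to know when speaking is done. With `say_text_synch`, the utterance is guaranteed to be complete by the time the call returns.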
The TextToSpeech service allows other services to subscribe to it. Subscribers will be notified of state updates as well as of viseme states as speech is synthesized. Whereas phonemes are the basic acoustic unit of speech, visemes are the basic visual unit of speech. Thus, a viseme describes a particular facial and oral expression that occurs alongside the voicing of phonemes.
The VisemeNotify messages can be used to assist with displaying a facial representation during speech. The service does not provide a simulated face, so you will have to write your own code for this.
As with most services, the speech parameters can be set in a configuration file. Alternatively, the speech parameters can be modified through messages. The parameters include:
Volume: Changes the volume of the voice. Valid range is 0 through 100.
Rate: Changes the speed at which the text is read. Valid range is from -10 to 10.
Voice: Changes the speaker's voice. Possible voices may include:
- LH Michael
- LH Michelle
- Microsoft Anna
- Microsoft Mary
- Microsoft Mike
- Microsoft Sam
- Sample TTS Voice
Short names may be used. For example, Mike may be used to select Microsoft Mike.
Note that you might only have one of these voices on your system.
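A client can check parameter values against these ranges before sending them to the service. The following Python sketch shows one way to do that. The function names are hypothetical, and the case-insensitive substring match used to resolve a short voice name is an assumption; the exact matching rule used by the service is not described here.

```python
VOLUME_RANGE = (0, 100)   # valid volume values, inclusive
RATE_RANGE = (-10, 10)    # valid rate values, inclusive


def check_range(name, value, bounds):
    """Return value unchanged if it lies within bounds, else raise ValueError."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


def resolve_voice(short_name, installed):
    """Return the first installed voice whose name contains short_name.

    The comparison ignores case. Returns None when nothing matches.
    This matching rule is an assumption made for the example.
    """
    wanted = short_name.lower()
    for voice in installed:
        if wanted in voice.lower():
            return voice
    return None


installed = ["Microsoft Anna", "Microsoft Mike"]
print(check_range("volume", 50, VOLUME_RANGE))   # 50
print(resolve_voice("Mike", installed))          # Microsoft Mike
print(resolve_voice("Sam", installed))           # None
```

Returning `None` for a missing voice lets the caller fall back to a default, which matters because a given system might have only one of the listed voices installed.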
These, as well as other speech parameters such as pitch and emphasis, can be modified through XML markup of the text to be spoken. To use XML markup, the first characters of the text should be: <XML>. Here are some sample parameters:
<XML>This is the normal voice.<VOICE REQUIRED="NAME=Microsoft Mike">This is the Microsoft Mike voice.</VOICE></XML>
<XML><RATE SPEED="-5">This is slow speech.</RATE></XML>
<XML><PITCH MIDDLE="5">This is high pitched speech.</PITCH></XML>
<XML>This is a pause.<SILENCE MSEC="500"/></XML>
<XML><VOLUME LEVEL="50">This is quiet speech.</VOLUME></XML>
<XML><EMPH>This</EMPH> is an emphasis.</XML>
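Markup strings like the samples above can be assembled programmatically rather than typed by hand. The following Python sketch builds them with standard-library escaping helpers. The helper names `plain`, `element`, and `markup` are hypothetical, and the sketch assumes the speech engine parses the text as ordinary XML, so that characters such as `&` need to be escaped.

```python
from xml.sax.saxutils import escape, quoteattr


def plain(text):
    """Escape plain text so it is safe to place inside the markup."""
    return escape(text)


def element(tag, text, **attrs):
    """Wrap escaped text in a tag, for example RATE with SPEED=-5."""
    attr_text = "".join(
        f" {name}={quoteattr(str(value))}" for name, value in attrs.items()
    )
    return f"<{tag}{attr_text}>{escape(text)}</{tag}>"


def markup(*parts):
    """Join already-built parts and add the <XML> wrapper the service expects."""
    return "<XML>" + "".join(parts) + "</XML>"


print(markup(element("RATE", "This is slow speech.", SPEED=-5)))
# <XML><RATE SPEED="-5">This is slow speech.</RATE></XML>

print(markup(plain("Tom & Jerry. "), element("VOLUME", "This is quiet speech.", LEVEL=50)))
# <XML>Tom &amp; Jerry. <VOLUME LEVEL="50">This is quiet speech.</VOLUME></XML>
```

Building the string through helpers keeps the `<XML>` prefix in the required first position and avoids malformed markup when the spoken text itself contains characters such as `<` or `&`.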
In addition to sending notifications, the Text-To-Speech service calls LogInfo() and LogError(). By default, these methods write to the Debug output, which can be viewed with the appropriate development tools.
DSS Interop Samples: WPF Text To Speech UI
Technology Samples: Speech Recognition Sample
User Interface: Microsoft Speech API
© 2012 Microsoft Corporation. All Rights Reserved.