Note

Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

voice Element (Microsoft.Speech)

With the attributes of the voice element, you can change the voice used in speech synthesis and specify characteristics of that voice, such as its culture, gender, and age.

Syntax

<voice name"string" gender="string" age="integer" xml:lang="string" variant="integer"> </voice>

Attributes

name
Optional. Specifies the name of the installed voice that will speak the contained text.

gender
Optional. Specifies the preferred gender of the voice that will speak the contained text. The allowed values are male, female, and neutral.

age
Optional. Specifies the preferred age, in years, of the voice that will speak the contained text. The allowed values are 10 (child), 15 (teen), 30 (adult), and 65 (senior).

xml:lang
Optional. Specifies the language that the voice must support. The value may contain only a lower-case, two-letter language code (such as "en" for English or "it" for Italian), or it may add an upper-case country/region or other variation to the language code. Examples with a country/region code include "es-US" for Spanish as spoken in the US and "fr-CA" for French as spoken in Canada. See the Remarks section for additional information.

variant
Optional. An integer that specifies a preferred voice when more than one voice matches the values specified in the xml:lang, gender, or age attributes.
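
For example, rather than naming an installed voice, an application can describe the voice it prefers and let the synthesizer choose a match. The following fragment is a sketch that assumes at least one adult female en-US voice is installed; if no installed voice matches every attribute, the voice that is selected depends on the voices available on the system.

<voice xml:lang="en-US" gender="female" age="30" variant="1">
  This text is spoken by an installed voice that matches these attributes.
</voice>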

Remarks

Although each of its attributes is individually optional, a voice element must have at least one attribute specified.

The Microsoft Speech Platform SDK 11 accepts all valid language-country codes as values for the xml:lang attribute. For a given language code specified in the xml:lang attribute, a speech synthesis engine that supports that language code must be installed to correctly pronounce words in the specified language.

If the xml:lang attribute specifies only a language code (such as "en" for English or "es" for Spanish), and not a country/region code, then any installed speech synthesis voice that expresses support for that generic, region-independent language may produce acceptable pronunciations for words in the specified language. See Language Identifier Constants and Strings for a comprehensive list of language codes.
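
For instance, the following fragment requests any voice that supports French, without requiring a specific regional variant. This is a sketch that assumes at least one French Runtime Language (such as fr-FR or fr-CA) is installed.

<voice xml:lang="fr">
  Bonjour tout le monde.
</voice>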

Note

The Microsoft Speech Platform Runtime 11 and Speech Platform SDK 11 do not include any engines for speech synthesis in a specific language. You must download a Runtime Language (an engine for speech synthesis in a specific language) for each language in which you want to generate synthesized speech. See InstalledVoice for more information.

The voice element may declare a different language in its xml:lang attribute than the language declared in the speak element. The Speech Platform SDK 11 supports multiple languages in SSML documents.

Example

The following example speaks a phrase in English. It then switches to a French speech synthesis voice and speaks the French translation of the same phrase.

<?xml version="1.0" encoding="ISO-8859-1"?>
<speak version="1.0"
 xmlns="http://www.w3.org/2001/10/synthesis"
 xml:lang="en-US">

  This is the text that the application will speak.

  <voice name="Microsoft Server Speech Text to Speech Voice (fr-FR, Hortense)">
  Ceci est le texte qui sera prononcé par l'application.
  </voice>

</speak>