Note

Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

phoneme Element (Microsoft.Speech)

Specifies the phonetic pronunciation for the contained text using phones from a supported phonetic alphabet.

Syntax

<phoneme alphabet="string" ph="string"> </phoneme>

Attributes

Attribute

Description

alphabet

Optional. Specifies the phonetic alphabet to use when synthesizing the pronunciation of the ph string. The alphabet name is case-sensitive and must be entered in lowercase letters. The only acceptable values are ipa, x-microsoft-sapi, and x-microsoft-ups. The specified alphabet applies only to the phoneme element on which it appears.

ph

Required. A string containing phones that specify the pronunciation of the word contained by the phoneme element. If the specified string contains unrecognized phones, the text-to-speech (TTS) engine rejects the entire SSML document and produces none of the speech output specified in the document.
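The attribute rules above can be checked before an SSML document is handed to the TTS engine. The following Python sketch is illustrative only: make_phoneme and SUPPORTED_ALPHABETS are hypothetical names, not part of Microsoft.Speech, and the helper checks only that ph is present and that the alphabet name is one of the three listed values. It does not verify that the phones themselves are recognized.

```python
import xml.etree.ElementTree as ET

# Alphabet names taken from the attribute description; matching is case-sensitive.
SUPPORTED_ALPHABETS = ("ipa", "x-microsoft-sapi", "x-microsoft-ups")


def make_phoneme(text, ph, alphabet=None):
    """Build a <phoneme> element (hypothetical helper, not a Microsoft.Speech API)."""
    # ph is required, so reject a missing or blank value.
    if not ph or not ph.strip():
        raise ValueError("ph is required and must contain at least one phone")
    attrib = {}
    # alphabet is optional; when given, it must be one of the lowercase names.
    if alphabet is not None:
        if alphabet not in SUPPORTED_ALPHABETS:
            raise ValueError("unsupported alphabet: %r" % alphabet)
        attrib["alphabet"] = alphabet
    attrib["ph"] = ph
    element = ET.Element("phoneme", attrib)
    # The text content is kept as a fallback for renderers without speech.
    element.text = text
    return element


element = make_phoneme("Zhou", "JH AU", alphabet="x-microsoft-ups")
print(ET.tostring(element, encoding="unicode"))
```

Because the comparison is an exact string match, a value such as "IPA" is rejected, which mirrors the lowercase requirement for the alphabet attribute.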

Remarks

Phonetic alphabets are composed of phones, which consist of letters, numbers, or other characters, sometimes in combination. Each phone describes a unique sound of speech. This is in contrast to the Latin alphabet, in which a single letter may represent multiple spoken sounds. Consider the different pronunciations of the letter “c” in the words “candy” and “cease”, or the different pronunciations of the letter combination “th” in the words “thing” and “those”. See Lexicons and Phonetic Alphabets (Microsoft.Speech) for more information.

Even though the TTS engine ignores the content of the phoneme element (it pronounces only the string specified in the ph attribute), place text content in the element so that devices without speech capability can still render something intelligible in place of speech.
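As a rough illustration of why the text content matters, the Python sketch below shows what a renderer without speech capability might display. It is an assumption about how such a device could behave, not a description of any particular product, and fallback_text is a hypothetical helper name. It collects the text nodes of the document, so the content of the phoneme element is included while the ph attribute value is not.

```python
import xml.etree.ElementTree as ET

SSML = """<speak version="1.0"
 xmlns="http://www.w3.org/2001/10/synthesis"
 xml:lang="en-US">
  <s>
    His name is Mike <phoneme alphabet="x-microsoft-ups" ph="JH AU"> Zhou </phoneme>
  </s>
</speak>"""


def fallback_text(ssml):
    """Return the text a non-speech renderer could display (hypothetical helper)."""
    root = ET.fromstring(ssml)
    # itertext() yields every text node, including the content of <phoneme>;
    # attribute values such as ph are not text nodes and are skipped.
    raw = "".join(root.itertext())
    # Collapse runs of whitespace left over from the XML indentation.
    return " ".join(raw.split())


print(fallback_text(SSML))  # prints: His name is Mike Zhou
```

If the phoneme element were left empty, the same helper would return only "His name is Mike", and the name would be lost on a text-only device.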

Example

<?xml version="1.0" encoding="ISO-8859-1"?>
<speak version="1.0"
 xmlns="http://www.w3.org/2001/10/synthesis"
 xml:lang="en-US">

  <s>
    His name is Mike <phoneme alphabet="x-microsoft-ups" ph="JH AU"> Zhou </phoneme>
  </s>

</speak>