Pronunciation Lexicon Reference
Pronunciation lexicons contain the mapping between the pronunciations and the written representations of words or short phrases. You can use lexicons to improve the accuracy of speech recognition or to customize the vocabulary and pronunciations of a synthesized voice.
The implementation of lexicons in System.Speech is based on the World Wide Web Consortium (W3C) Pronunciation Lexicon Specification (PLS) Version 1.0, which defines the structure and syntax for XML-based lexicon documents.
The following table lists and describes elements from the PLS specification that are implemented in System.Speech. Elements are listed in the order in which they occur in a PLS document.
| Element | Description | Attributes |
| --- | --- | --- |
| lexicon | The root element of a lexicon document; it contains all the other elements. | version, xml:base, xmlns, xml:lang, alphabet |
| meta | Specifies information about the document. | name, http-equiv, content |
| metadata | Contains information about the document in a metadata schema. | |
| lexeme | Must contain one or more grapheme elements and one or more pronunciations (each in a separate phoneme element); can optionally contain one or more example elements. | |
| grapheme | A child of the lexeme element that contains the written representation of a word or phrase. | |
| phoneme | A child of the lexeme element that contains the phonetic pronunciation of a word or phrase. | |
| example | A child of the lexeme element that contains an example sentence illustrating an occurrence of the lexeme. | |
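To make the element nesting concrete, here is a minimal sketch of a lexicon document with a single lexeme. The word, its IPA transcription, and the example sentence are illustrative assumptions and are not taken from the original reference. The namespace URI and the "ipa" alphabet value come from the W3C PLS 1.0 specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Root element: carries the version, namespace, language, and phonetic alphabet. -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         xml:lang="en-US"
         alphabet="ipa">
  <!-- One lexical entry: at least one grapheme and at least one phoneme. -->
  <lexeme>
    <!-- Written form of the word (illustrative choice). -->
    <grapheme>tomato</grapheme>
    <!-- Pronunciation in the alphabet declared on the root (illustrative transcription). -->
    <phoneme>təˈmeɪtoʊ</phoneme>
    <!-- Optional sentence showing the word in context. -->
    <example>I sliced a tomato for the salad.</example>
  </lexeme>
</lexicon>
```

A lexeme can list more than one grapheme (for spelling variants) or more than one phoneme (for alternative pronunciations), with each value in its own element.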
Although the PLS specification does not require that lexicon documents contain lexeme elements, a lexicon is not useful for pronunciation without them.
The meta and metadata elements must occur before the first lexeme element.
Currently, Microsoft does not support the alias element that is defined in the PLS specification.
PLS lexicons used in System.Speech are not required to contain an XML prolog, for example: <?xml version="1.0" encoding="UTF-8"?>
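The notes above can be illustrated with a sketch of a lexicon that omits the XML prolog, places the meta and metadata elements before the first lexeme, and does not use the alias element. The metadata content shown here (an RDF block with a Dublin Core title), the meta values, and the lexeme itself are assumptions chosen for illustration. The source does not specify which metadata schemas System.Speech accepts.

```xml
<!-- No XML prolog: the document starts directly with the root element. -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         xml:lang="en-US"
         alphabet="ipa">
  <!-- Document information comes first, before any lexeme. -->
  <meta name="description" content="Sample lexicon for illustration"/>
  <metadata>
    <!-- Illustrative metadata schema: RDF with a Dublin Core title. -->
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="">
        <dc:title>Sample lexicon</dc:title>
      </rdf:Description>
    </rdf:RDF>
  </metadata>
  <!-- Lexemes follow the document information. -->
  <lexeme>
    <grapheme>either</grapheme>
    <!-- Two alternative pronunciations, each in its own phoneme element. -->
    <phoneme>ˈiːðər</phoneme>
    <phoneme>ˈaɪðər</phoneme>
  </lexeme>
</lexicon>
```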