grammar Element

Specifies the top-level container for an XML grammar definition. Every valid grammar requires this element.

Syntax

<grammar
   mode = (voice | dtmf)
   root = "string"
   sapi:alphabet = (ipa | x-microsoft-ups | x-microsoft-sapi)
   tag-format = (semantics/1.0 | semantics-ms/1.0 | properties-ms/1.0)
   version = "1.0"
   xml:base = "grammarBaseUri"
   xml:lang = "language code-country/region code"
   xmlns = "http://www.w3.org/2001/06/grammar"
   xmlns:sapi = "http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions">
</grammar>

Attributes

Attribute

Description

mode

Optional. Specifies the mode of the grammar. The mode can be one of the following values:

  • voice for spoken input

  • dtmf for dual tone multi-frequency (DTMF) input

If omitted, the default value is voice.
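
As an illustration of the dtmf mode, the following minimal sketch declares a grammar for key-press input. The rule name Pin and the accepted digits are hypothetical, and xml:lang is left out because it is optional when mode is dtmf. Whether a given recognizer accepts DTMF input is not covered here.

<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" mode="dtmf" root="Pin"
   xmlns="http://www.w3.org/2001/06/grammar">
<!-- Hypothetical rule: matches exactly four key presses, each one of the digits 1, 2, or 3. -->
<rule id="Pin">
   <item repeat="4">
      <one-of>
         <item>1</item>
         <item>2</item>
         <item>3</item>
      </one-of>
   </item>
</rule>
</grammar>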

root

Optional, but recommended. Specifies the name of the grammar rule that will be active when the grammar is loaded by a speech recognition engine. If root is omitted, the grammar passes validation checks and compiles, but does not trigger recognition. The rule declared as the root rule must be defined within the scope of the grammar. The root rule can be scoped as either public or private.
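
The sketch below shows one way the root attribute might be used. The rule names Order and Drink and the phrases are hypothetical. The root attribute names Order, which is defined in the same grammar and refers to a second, private rule.

<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="Order"
   xmlns="http://www.w3.org/2001/06/grammar">
<!-- Root rule: named by the root attribute, so it is active when the grammar is loaded. -->
<rule id="Order" scope="public">
   <item>I would like</item>
   <ruleref uri="#Drink"/>
</rule>
<!-- Helper rule: reachable only through a reference from within this grammar. -->
<rule id="Drink" scope="private">
   <one-of>
      <item>coffee</item>
      <item>tea</item>
   </one-of>
</rule>
</grammar>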

sapi:alphabet

Required if using the sapi:pron attribute in the token Element. Specifies the phonetic alphabet to use for pronunciations defined in the sapi:pron attribute. Valid values are ipa, x-microsoft-ups, and x-microsoft-sapi. When using sapi:alphabet, the grammar Element must contain the following declaration: xmlns:sapi="http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions"
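
The following sketch shows how sapi:alphabet, the xmlns:sapi declaration, and sapi:pron might appear together. The rule name Greeting is hypothetical. The pronunciation string is an unverified placeholder rather than a checked transcription, and the namespace URI is written with the http scheme on the assumption that this is the registered form.

<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="Greeting"
   sapi:alphabet="x-microsoft-ups"
   xmlns="http://www.w3.org/2001/06/grammar"
   xmlns:sapi="http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions">
<rule id="Greeting">
   <!-- Placeholder pronunciation: replace with phones that are valid in the declared alphabet. -->
   <token sapi:pron="H EH L OU">hello</token>
</rule>
</grammar>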

tag-format

Required if a grammar contains tag elements. Specifies the content type of all tag elements contained within the grammar. This attribute takes one of three values:

  • semantics/1.0 declares that the content within tag elements is ECMAScript.

  • semantics-ms/1.0 declares that the content within tag elements is ECMAScript as implemented by Microsoft.

  • properties-ms/1.0 declares that the content within tag elements is a boolean, an integer, or a string. A string must be enclosed in double quotes.

version

Required. Specifies the version number of the Speech Recognition Grammar Specification used. The only accepted value is 1.0.

xml:base

Optional. Specifies the base Uniform Resource Identifier (URI) of a grammar document. The value of xml:base is used to resolve relative URIs in the grammar document. For example, suppose a grammar file declares:
xml:base="https://www.contoso.com/"
and contains a relative reference to another document:
<ruleref uri="ExternalGrammar.grxml"/>
The reference then resolves to the following absolute URI:
https://www.contoso.com/ExternalGrammar.grxml
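
Put together in a complete grammar, the declaration and the relative reference might look like the following sketch. The rule name Main and the phrase are hypothetical, and the referenced document is assumed to exist and to declare a root rule.

<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="Main"
   xml:base="https://www.contoso.com/"
   xmlns="http://www.w3.org/2001/06/grammar">
<rule id="Main">
   <item>call</item>
   <!-- Relative URI: resolved against xml:base to https://www.contoso.com/ExternalGrammar.grxml -->
   <ruleref uri="ExternalGrammar.grxml"/>
</rule>
</grammar>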

xml:lang

Required if the value of the mode attribute is voice; optional if the value of the mode attribute is dtmf. Declares the single language for the content of the containing grammar document. The value may be a lowercase, two-letter language code (such as "en" for English or "fr" for French), optionally followed by a hyphen and an uppercase country/region code or other variation. Examples with a country/region code include "zh-TW" for Chinese as spoken in Taiwan and "de-DE" for German as spoken in Germany. See the Remarks section for additional information.

xmlns

Required. Specifies the XML namespace for W3C speech recognition grammar. The XML namespace is http://www.w3.org/2001/06/grammar.

xmlns:sapi

Required if the grammar uses any Microsoft-proprietary extensions to the SRGS specification, such as the sapi:alphabet attribute of the grammar element or the sapi:pron attribute of the token Element.

The value must be http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions.

Remarks

The model and syntax indicated by the tag-format value semantics/1.0 are defined in the W3C Recommendation Semantic Interpretation for Speech Recognition (SISR) Version 1.0. The tag-format values semantics-ms/1.0 and properties-ms/1.0 indicate a model and syntax defined by Microsoft. See Support for Semantic Markup for more information.

The content of tag elements in a grammar must be of the type declared in the grammar element's tag-format attribute. Using the string literal syntax when the value of tag-format is semantics/1.0 or semantics-ms/1.0 will generally result in a runtime error. Using the ECMAScript syntax when the value of tag-format is properties-ms/1.0 will not produce a runtime error, but will erroneously populate Rule Variables with ECMAScript code. See tag Element for more information about the syntax for each of the values of tag-format.
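
To make the difference concrete, the following two sketches express the same hypothetical yes/no rule under two tag-format values. The rule name Answer and the returned strings are illustrative assumptions. In the first grammar, tag-format is properties-ms/1.0, so each tag holds a string literal in double quotes.

<grammar version="1.0" xml:lang="en-US" root="Answer"
   tag-format="properties-ms/1.0"
   xmlns="http://www.w3.org/2001/06/grammar">
<rule id="Answer">
   <one-of>
      <!-- Tag content is a literal value, not a script. -->
      <item>yes <tag>"yes"</tag></item>
      <item>no <tag>"no"</tag></item>
   </one-of>
</rule>
</grammar>

In the second grammar, tag-format is semantics/1.0, so each tag holds an ECMAScript statement. The variable out is the rule variable defined by SISR 1.0.

<grammar version="1.0" xml:lang="en-US" root="Answer"
   tag-format="semantics/1.0"
   xmlns="http://www.w3.org/2001/06/grammar">
<rule id="Answer">
   <one-of>
      <!-- Tag content is a script that assigns the rule's result. -->
      <item>yes <tag>out = "yes";</tag></item>
      <item>no <tag>out = "no";</tag></item>
   </one-of>
</rule>
</grammar>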

For a given language code declared in the xml:lang attribute, a speech recognition engine that supports that language code must be installed for the grammar to be loaded successfully.

If the grammar element specifies only a language code, and not a country/region code, for the xml:lang attribute (such as xml:lang="en"), then any installed recognizer that expresses support for that generic, region-independent language will be able to load the grammar. See Language Identifier Constants and Strings for a comprehensive list of language codes.
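
A grammar that declares only a language code might look like the following sketch. The rule name Confirm and its phrases are hypothetical.

<?xml version="1.0" encoding="utf-8"?>
<!-- xml:lang carries a language code only, with no country/region code. -->
<grammar version="1.0" xml:lang="en" root="Confirm"
   xmlns="http://www.w3.org/2001/06/grammar">
<rule id="Confirm">
   <one-of>
      <item>yes</item>
      <item>no</item>
   </one-of>
</rule>
</grammar>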

System.Speech does not currently support grammars that specify multiple languages. This is a departure from the Speech Recognition Grammar Specification (SRGS) Version 1.0, which allows for a grammar processor to optionally support multiple languages. For example, System.Speech does not permit a grammar such as the one shown in the following example.

<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-GB" xmlns="http://www.w3.org/2001/06/grammar" root="Digits">
  <rule id="Digits">
    <one-of>
      <item xml:lang="fr-FR"> deux </item>
    </one-of>
  </rule>
</grammar>

To support multiple languages for your applications, you can use multiple grammars in parallel, each with a separate single language.
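
As a sketch of that approach, the mixed-language grammar above could be split into two single-language grammar documents, which the application would load side by side. The file names DigitsEnglish.grxml and DigitsFrench.grxml are hypothetical, and loading the French grammar assumes that a recognizer supporting fr-FR is installed.

DigitsEnglish.grxml:

<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-GB" xmlns="http://www.w3.org/2001/06/grammar" root="Digits">
  <rule id="Digits">
    <one-of>
      <item> two </item>
    </one-of>
  </rule>
</grammar>

DigitsFrench.grxml:

<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="fr-FR" xmlns="http://www.w3.org/2001/06/grammar" root="Digits">
  <rule id="Digits">
    <one-of>
      <item> deux </item>
    </one-of>
  </rule>
</grammar>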

Note

The System.Speech namespace does support multiple languages in Speech Synthesis Markup Language (SSML) documents used to create prompts for synthesized speech. See speak Element for more information.

Language Support in Windows 7

Microsoft Windows and the System.Speech API accept all valid language-country codes, but only a limited number of speech recognition engines are provided with Windows 7. The speech recognition engines that are shipped with Microsoft Windows 7 work with the following language codes. Two-letter language codes such as "en", "fr", or "es" are also permitted.

  • en-GB. English (United Kingdom)

  • en-US. English (United States)

  • de-DE. German (Germany)

  • es-ES. Spanish (Spain)

  • fr-FR. French (France)

  • ja-JP. Japanese (Japan)

  • zh-CN. Chinese (China)

  • zh-TW. Chinese (Taiwan)

Example

<?xml version="1.0" encoding="utf-8"?>
<grammar 
   version="1.0" mode="voice" root="Welcome"
   tag-format="semantics/1.0" xml:lang="en-US"
   xml:base="https://www.contoso.com/"
   xmlns="http://www.w3.org/2001/06/grammar"
   xmlns:sapi="http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions">

<rule id="Welcome">
   <item>
      Welcome to the managed code API for speech on the desktop.
   </item>
</rule>

</grammar>