Using Grammar Elements to Recognize Speech

Grammar rules should account for all possible user speech patterns in a particular speech application. Use the elements in Speech Grammar Editor to design grammar rules that define user speech patterns.

  • Recognizing Words and Phrases
  • Customizing Pronunciation
  • Recognizing Choices From a List
  • Referencing Other Rules
  • Ignoring Irrelevant Words
  • Disabling Recognition
  • Ignoring Recognition
  • Adding Semantic Interpretation Information
  • Grouping Related Information

Note  The grammar rules that create the Semantic Markup Language (SML) results in this topic do not contain semantic interpretation information (semantic information). Use semantic information to define and extract meaningful data from a user's speech that an application can use.

Recognizing Words and Phrases

Use a Phrase element to specify the text of a single word, phrase, or complete sentence that the speech recognition engine recognizes. Phrase elements are the building blocks of grammar rules.

The Phrase element represents an item XML markup element in the XML text of a .grxml file.

The following illustration shows one Phrase element in Speech Grammar Editor that represents the phrase "good morning."

Single Phrase element

The following code example illustrates the corresponding XML markup.

  <rule id="PhraseElement">
    <item>good morning</item>
</rule>

The following illustration shows two Phrase elements in Speech Grammar Editor, one for each word, that represent the phrase "good morning."

Two Phrase elements

The following code example illustrates the corresponding XML markup.

  <rule id="PhraseElement" scope="private">
    <item>good</item>
    <item>morning</item>
</rule>

Note  When entering text values for Phrase elements, enter text without using quotes. Text enclosed by quotes is not recognized by the speech recognition engine.

Note  No element in Speech Grammar Editor represents a rule XML markup element. Each window in Rule Editor represents a rule XML markup element.

When the speech recognition engine uses either of the previous rules to recognize the phrase "good morning," it sends the following SML results to an application.

  <SML text="good morning" utteranceConfidence="1.000">
    good morning
</SML>
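
In a .grxml file, a rule such as PhraseElement appears inside a grammar root element. The following sketch is based on the W3C SRGS XML form rather than on output captured from Speech Grammar Editor, and shows one way a complete file might look. The xml:lang value and the choice of PhraseElement as the root rule are assumptions made for illustration.

  <?xml version="1.0" encoding="utf-8"?>
  <!-- Sketch only: attribute values are assumed, not generated by the tool -->
  <grammar version="1.0" xml:lang="en-US" root="PhraseElement"
           xmlns="http://www.w3.org/2001/06/grammar">
      <rule id="PhraseElement" scope="private">
          <item>good morning</item>
      </rule>
  </grammar>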

Customizing Pronunciation

Speech Grammar Editor provides Pronunciation Editor, which enables grammar authors to define custom pronunciations of single-word Phrase elements.

Speech API (SAPI) provides default pronunciations for all words. These default pronunciations are composed of phonemes, which are abstract categories of speech sounds (vowels and consonants) grouped together to create words.

For example, SAPI provides two default pronunciations of the word "hello." Each group of sounds in the following code example, separated by spaces, represents a phoneme.

  h ax l ow
  h eh l ow

Use Pronunciation Editor to create custom pronunciations of single words.

Note  Individual phonemes in Pronunciation Editor are disabled until a pronunciation in the Custom pronunciations pane is editable.

To define a custom pronunciation

  1. Right-click a Phrase element on the Rule Editor window that contains a single word, and on the shortcut menu, click Properties.
  2. On the Properties window, click the Pronunciation field, and then click the ellipsis (...) button to open Pronunciation Editor.
  3. On the Word to lookup text box, click Lookup the pronunciations to display SAPI default pronunciations for the word in the Default pronunciations pane, and then do one of the following:
    • On the Default pronunciations pane, click a default pronunciation, and then click Play selected pronunciation to hear a text-to-speech (TTS) representation of the selected pronunciation.
    • On the Default pronunciations pane, click a default pronunciation, and then click Insert selected pronunciation to add the selected default pronunciation to the Custom pronunciations pane, where authors can modify its pronunciation.
    • On the Default pronunciations pane, click Copy all lookup pronunciations to add all of the default pronunciations to the Custom pronunciations pane, where authors can modify all of their pronunciations.
  4. Depending on the action taken in the previous step, on the Custom pronunciations pane, do one of the following.
    • Click a custom pronunciation to enable the Phoneme Palette.
    • Click Add new custom pronunciation to enable the Phoneme Palette.
  5. Click phonemes under the Vowels, Consonants, or Misc headings to create custom pronunciations.
  6. To hear a TTS representation of the selected pronunciation at any time, on the Custom pronunciations pane, click a custom pronunciation, and then click Play selected pronunciation.
  7. Click OK to save changes and close Pronunciation Editor.

When grammar authors add a custom pronunciation to a Phrase element, the Phrase element represents a token XML markup element with a pron attribute in the XML text of a .grxml file instead of the default item XML markup element.

  <token sapi:pron="h eh l ow 1">hello</token>

Custom phonemes appear in the Phrase element's Properties window on the Pronunciation field.

Recognizing Choices From a List

Use List elements to specify a set of alternative words or phrases that a user might speak to an application. When a user says one of the words or phrases in the list, the recognition engine recognizes that word or phrase.

When a grammar author drags a List element onto the Rule Editor window, a child Phrase element appears underneath the List element. Although this is the default behavior, List elements may contain any type of child elements, including Group elements and other List elements.

The List element represents a one-of XML markup element in the XML text of a .grxml file.

The following illustration shows a List element in Speech Grammar Editor that encloses a list of flavor choices: chocolate, raspberry, and vanilla. Three Phrase elements specify the flavor choices, using one Phrase element per choice.

List element

The following code example illustrates the corresponding XML markup.

  <rule id="ListExample" scope="private">
    <item>I'd like a</item>
    <one-of>
        <item>chocolate</item>
        <item>raspberry</item>
        <item>vanilla</item>
    </one-of>
    <item>please</item>
</rule>

When the recognition engine uses this rule to recognize the phrase "I'd like a chocolate please," it sends the following SML results to an application.

  <SML text="I'd like a chocolate please" utteranceConfidence="1.000">
    I'd like a chocolate please
</SML>

Note  All child elements of the containing List element inherit the properties of the parent List element.
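
Because List elements may contain Group elements and other List elements, alternatives can be nested. The following sketch is a hypothetical variation of the ListExample rule. The rule name NestedListExample and the added words are illustrative assumptions.

  <rule id="NestedListExample" scope="private">
      <item>I'd like a</item>
      <one-of>
          <item>chocolate</item>
          <item>vanilla</item>
          <!-- One alternative that is itself a choice followed by a word -->
          <item>
              <one-of>
                  <item>raspberry</item>
                  <item>lemon</item>
              </one-of>
              <item>sorbet</item>
          </item>
      </one-of>
      <item>please</item>
  </rule>

Under standard SRGS semantics, this rule would accept phrases such as "I'd like a chocolate please" and "I'd like a lemon sorbet please."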

Referencing Other Rules

Use the RuleRef element to reference another rule, either from the same grammar file or from an external grammar file. Use RuleRef elements to reuse preexisting component grammars. For example, the Library.grxml file included with every new project contains preexisting rules that return semantic results for common items such as dates, credit card numbers, and telephone numbers.

If the user selects a rule in Library.grxml in Grammar Explorer, a description of the rule appears in the Properties pane. Clicking the ellipsis button on the Description field displays a Description window with the attributes and values associated with that rule.

Only rules with Scope set to Public in Rule Properties are available for reference in other grammars. Rules with Scope set to Private are available for reference only within the grammar containing them. The two types of rules are marked in the display in Grammar Explorer, as shown in the following illustration.

Public and Private Rules

Note  A compiled version of the default Library.grxml file, named cmnrules.cfg, is installed with the Microsoft Speech Application SDK Version 1.1 (SASDK). If the Microsoft SASDK is installed to the default location, cmnrules.cfg can be found at:

%SystemDrive%\Inetpub\wwwroot\aspnet_speech\%Version%\client_script\1033

To specify the target rule of a RuleRef

  • Click a RuleRef element on the Rule Editor window, and enter a value for the URI property in the Properties window.
  • - or -
  • Right-click a RuleRef element, on the context menu point to Set Target Rule, and then on the shortcut sub-menu click a rule name, or click Other, to reference a rule within the same grammar file. If no rule names appear on the shortcut sub-menu, no other rules exist within the grammar file. Click Browse to find a public rule in an external grammar.

The RuleRef element represents a ruleref XML markup element in the XML text of a .grxml file.

The following illustration shows two RuleRef elements in Speech Grammar Editor that reference two rules, City and Day, in the same grammar file.

RuleRef elements

The following code example illustrates the corresponding XML markup.

  <rule id="RuleRefExample" scope="private">
    <item>I'm traveling to</item>
    <ruleref uri="#City" type="application/srgs+xml"/>
    <item>on</item>
    <ruleref uri="#Day" type="application/srgs+xml"/>
</rule>

<rule id="City" scope="private">
    <one-of>
        <item>Chicago</item>
        <item>Denver</item>
        <item>Seattle</item>
    </one-of>
</rule>

<rule id="Day" scope="private">
    <one-of>
        <item>Saturday</item>
        <item>Sunday</item>
        <item>Monday</item>
    </one-of>
</rule>

When the recognition engine uses these rules to recognize the phrase "I'm traveling to Seattle on Monday," it sends the following SML results to an application.

  <SML text="I'm traveling to Seattle on Monday" utteranceConfidence="1.000">
    I'm traveling to Seattle on Monday
</SML>
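
To reference a public rule in an external grammar file, the uri attribute combines the file location with the rule name after the # character. In the following sketch, the file name Travel.grxml and its City rule are hypothetical. The form of the reference follows the Library.grxml#digit reference used in the Group element example in this topic.

  <rule id="ExternalRuleRefExample" scope="private">
      <item>I'm traveling to</item>
      <!-- Travel.grxml is a hypothetical file; its City rule must have scope="public" -->
      <ruleref uri="Travel.grxml#City" type="application/srgs+xml"/>
  </rule>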

To edit the rule specified by a RuleRef

  • Right-click the RuleRef referring to the target rule to edit, and select Edit Target Rule from the context menu.
  • - or -
  • Double-click the RuleRef.

Ignoring Irrelevant Words

Use the Wildcard element when the recognition engine should recognize a phrase while ignoring irrelevant words spoken around it. For example, it may be necessary to recognize the phrase "open message." Recognition should not fail if a user says open my message, open the message, or open the message please. Use the Wildcard element to discard the unnecessary words.

The Wildcard element represents a ruleref XML markup element with its special attribute set to GARBAGE in the XML text of a .grxml file.

The following illustration shows two Wildcard elements in Speech Grammar Editor. These elements enable the recognition engine to recognize speech if a user says open my message, open the message, open the message please, or any number of other responses.

Wildcard element

The following code example illustrates the corresponding XML markup.

  <rule id="WildcardExample" scope="private">
    <item>open</item>
    <ruleref special="GARBAGE"/>
    <item>message</item>
    <ruleref special="GARBAGE"/>
</rule>

When the recognition engine uses this rule to recognize the phrase "open my message please," it sends the following SML results to an application.

  <SML text="open ... message ..." utteranceConfidence="1.000">
    open ... message ...
</SML>

Note  The ellipses are actually part of the SML results. They represent the unnecessary spoken words.

Disabling Recognition

Use the Halt element to disable recognition of any recognition path that contains this element. This element enables grammar authors to isolate particular recognition paths and rules for testing purposes during development. If Speech Grammar Editor activates a recognition path or rule containing the Halt element, recognition fails.

The Halt element represents a ruleref XML markup element with its special attribute set to VOID in the XML text of a .grxml file.

The following illustration shows a Halt element in Speech Grammar Editor.

Halt element

The following code example illustrates the corresponding XML markup.

  <rule id="HaltExample" scope="private">
    <ruleref special="VOID"/>
</rule>
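
A Halt element can also disable one path among several while leaving the others active. The following sketch assumes a hypothetical list of cities in which the Denver alternative is temporarily disabled. Under standard SRGS semantics, a sequence that contains a VOID reference cannot be spoken, so only Chicago and Seattle would remain recognizable. The rule name HaltInListExample is an illustrative assumption.

  <rule id="HaltInListExample" scope="private">
      <one-of>
          <item>Chicago</item>
          <!-- VOID blocks this alternative; remove it to re-enable Denver -->
          <item>Denver <ruleref special="VOID"/></item>
          <item>Seattle</item>
      </one-of>
  </rule>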

Ignoring Recognition

Use the Skip element to treat a recognition path as optional. Using the Skip element on a recognition path has the same effect as setting the Min Repeat property to 0 and the Max Repeat property to 1 on that path.

The Skip element represents a ruleref XML markup element with its special attribute set to NULL in the XML text of a .grxml file.

The following code example illustrates the XML markup for a Skip element within a List element. The Skip element makes the List optional, so the recognition engine recognizes hello, hello Francisco, or hello Prasanna.

  <rule id="SkipExample" scope="public">
    <item>hello</item>
    <one-of>
        <item>
            <ruleref special="NULL"/>
        </item>
        <item>Francisco</item>
        <item>Prasanna</item>
    </one-of>
</rule>
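
Under standard SRGS semantics, the same optional behavior can be written with a repeat attribute instead of a NULL alternative. The following sketch is an assumed equivalent of SkipExample, not markup generated by Speech Grammar Editor. The rule name RepeatOptionalExample is hypothetical.

  <rule id="RepeatOptionalExample" scope="public">
      <item>hello</item>
      <!-- repeat="0-1" lets the enclosed list be spoken zero times or one time -->
      <item repeat="0-1">
          <one-of>
              <item>Francisco</item>
              <item>Prasanna</item>
          </one-of>
      </item>
  </rule>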

Adding Semantic Interpretation Information

Use the Script Tag element to associate other rule elements with property values and scripts that manipulate the Semantic Markup Language (SML) text results returned by the recognition engine and sent to an application.

The Script Tag element represents a tag XML markup element in the XML text of a .grxml file.

The following illustration shows a Script Tag element in Speech Grammar Editor.

Script Tag element

The following code example illustrates the corresponding XML markup.

  <rule id="Tickets" scope="private">
    <item>one</item>
    <tag>$._value = "1"</tag>
</rule>
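
Script Tag elements can also follow each choice in a list so that different spoken words return different values. The following sketch extends the Tickets rule on the assumption that the $._value assignment behaves the same way inside each list item. The rule name TicketCount and the second and third items are illustrative additions.

  <rule id="TicketCount" scope="private">
      <one-of>
          <!-- Each tag sets the value returned when its item is recognized -->
          <item>one <tag>$._value = "1"</tag></item>
          <item>two <tag>$._value = "2"</tag></item>
          <item>three <tag>$._value = "3"</tag></item>
      </one-of>
  </rule>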

For detailed information about using the Script Tag element to associate semantic information with grammar rule elements, see Adding Semantic Interpretation Information.

Grouping Related Information

Use the Group element to specify that a group of elements are related, and also to assign properties to a group of elements. For example, if a grammar author sets the Max Repeat property on a Group element, the value of that property also applies to the elements contained in the group. The recognition engine recognizes each item in the group.

When a grammar author drags a Group element onto the Rule Editor window, a child Phrase element appears underneath the Group element. Although this action is the default behavior, Group elements may contain any type of child elements, including List elements and other Group elements.

The Group element represents an item XML markup element in the XML text of a .grxml file. In contrast to the Phrase element, which encloses individual words or phrases in separate item XML markup elements, the Group element encloses all elements in the group with a single item XML markup element.

The following code example illustrates the XML markup for a Group element that groups a reference to a rule that recognizes a digit with script tags that increment the count of digits and capture the digits. The Group element has Min Repeat set to 0 and Max Repeat set to 3. Consequently, each element in the group inherits those properties. The rule recognizes from zero to three spoken digits and returns in its semantic information the number of digits spoken, the digits, and the spoken representations of the digits.

  <rule id="GroupExample" scope="public">
      <tag>$.count = 0;</tag>
      <tag>$._value = ""</tag>
      <item repeat="0-3">
          <ruleref uri="Library.grxml#digit" type="application/srgs+xml"/>
          <tag>$._value = $._value + $$._value</tag>
          <tag>$.count += 1</tag>
      </item>
      <tag>$._attributes.text = $recognized.text;</tag>
  </rule>

When the recognition engine uses this rule to recognize the phrase "one two," it sends the following SML results to an application.

  <SML confidence="1.000" text="one two" utteranceConfidence="1.000">
      12
      <count>2</count>
  </SML>

Note  Although there is no specific World Wide Web Consortium Speech Recognition Grammar Specification Version 1.0 (W3C SRGS) element corresponding to a Group element, using this element does not break W3C SRGS compliance.

See Also

Enabling Speech Recognition | Creating Grammars | Grammar Design