Semantic Interpretation

Semantic interpretation tags provide the mechanism for returning grammar match data to the VoiceXML application.

Grammars for the Tellme Platform must conform to the W3C’s Semantic Interpretation for Speech Recognition (SISR) 1.0 standard.

The SISR specification determines how the <tag> elements are used to convert the result generated by an SRGS speech grammar processor into an ECMAScript (JavaScript) object that can be processed by the VoiceXML application.

This chapter includes the following topics:

  • <tag> syntax
  • Using the <tag> element
  • How rules can return values
    • Rule variables
  • Retrieving values from referenced rules
    • The rules object
  • Using concatenation with repeats

<tag> syntax

According to the SISR specification, the semantic interpretation <tag> elements can have one of two syntaxes:

  • The Script tag syntax, enabled by setting the <grammar> element’s tag-format attribute to "semantics/1.0", defines the content of the <tag> elements to be ECMAScript (Compact Profile) code.
  • The String Literal tag syntax, enabled by setting the <grammar> element’s tag-format attribute to "semantics/1.0-literals", defines the content in the <tag> elements to be string literals.
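
As a rough illustration of the String Literal tag syntax, the sketch below maps several spoken forms to one of two literal strings. The rule name answer and the phrases are assumptions chosen for this example; they are not taken from the Tellme Platform.

```xml
<grammar mode="voice"
         root="answer"
         tag-format="semantics/1.0-literals"
         version="1.0"
         xml:lang="en-US">
 <rule id="answer" scope="public">
  <one-of>
   <!-- With the String Literal syntax, the tag content is the returned string itself -->
   <item>yes<tag>yes</tag></item>
   <item>yeah<tag>yes</tag></item>
   <item>no<tag>no</tag></item>
   <item>nope<tag>no</tag></item>
  </one-of>
 </rule>
</grammar>
```

Because the tag content is not evaluated as a script, this form cannot combine values from several rules, which is one reason it suits only very simple grammars.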

Note

The Script tag syntax is recommended for all but very simple grammars.

To use the Script tag syntax, your <grammar> element header must include the tag-format attribute, like this:

<grammar tag-format="semantics/1.0">.......</grammar>

Note

When using the Script tag syntax, an entire ECMAScript program can be placed inside a <tag> element, in which case each statement must end with a semi-colon. However, the best practice is to keep the embedded ECMAScript very simple.

Using the <tag> element

As noted above, <tag> elements are used to convert the result generated by an SRGS speech grammar processor into an ECMAScript object that can be processed by the VoiceXML application. For example:

<item> yeah<tag>out="yes";</tag></item>

out="yes"; is an ECMAScript statement that assigns the string "yes" to the rule's output variable, out, when the speaker says "yeah".

Note

The semi-colon ending the ECMAScript statement is not necessary, but it is good practice. If additional ECMAScript statements are included in the tag, however, then the semi-colon is necessary to delimit the individual statements.

A script in a <tag> element is executed only if the <rule> or <item> that contains it is part of the match.
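
The following sketch places the item above in a complete grammar to show that a tag runs only when its containing item is matched. The rule name confirm and the optional word "please" are assumptions made for illustration.

```xml
<grammar mode="voice"
         root="confirm"
         tag-format="semantics/1.0"
         version="1.0"
         xml:lang="en-US">
 <rule id="confirm" scope="public">
  <one-of>
   <!-- Only the tag of the alternative that was spoken is executed -->
   <item>yes<tag>out="yes";</tag></item>
   <item>yeah<tag>out="yes";</tag></item>
   <item>no<tag>out="no";</tag></item>
  </one-of>
  <item repeat="0-1">
   please
   <!-- Executed only when the caller actually says "please" -->
   <tag>out += " please";</tag>
  </item>
 </rule>
</grammar>
```

With this sketch, "yeah" yields the string "yes", while "yeah please" yields "yes please", because the second tag is skipped when the optional item is not spoken.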

How rules can return values

Rule variables

Every <rule> element has a Rule Variable. When tag-format="semantics/1.0", the Rule Variable is named out. The variable out is implicitly declared as an empty object before the first tag in the rule is executed. The script in a <tag> element (which is compact ECMAScript) can either:

  • assign a primitive value such as a number or string (for example, out="george";), which replaces the empty object so that out holds that primitive value
  • add properties to the object, for example, out.firstName="george";
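
The two options can be sketched side by side as follows. The rule names firstName and fullName, the property lastName, and the phrases are hypothetical and serve only to contrast the two styles.

```xml
<grammar mode="voice"
         root="fullName"
         tag-format="semantics/1.0"
         version="1.0"
         xml:lang="en-US">
 <!-- Option 1: out is replaced by a primitive string -->
 <rule id="firstName" scope="public">
  george
  <tag>out="george";</tag>
 </rule>

 <!-- Option 2: out stays an object and gains properties -->
 <rule id="fullName" scope="public">
  george washington
  <tag>out.firstName="george"; out.lastName="washington";</tag>
 </rule>
</grammar>
```

In this sketch, a rule that returns an object lets the application read each property separately, whereas a rule that returns a primitive hands back a single value.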

Warning

If no <tag> element is used, the rule simply returns the text of words that were recognized.

Retrieving values from referenced rules

The rules object

When using tag-format="semantics/1.0", there is a global rules object whose properties hold the Rule Variable for every visible rule. The Rule Variable for any visible rule is contained in rules.rulename, where rulename is the name of the rule. Therefore, in complex grammars, the Rule Variable (out) of every referenced rule is available. Because the Rule Variable can be an object, you can access its properties, for example, rules.foodChoice.hot_dog, where foodChoice is the name of a rule.
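
A minimal sketch of reading a referenced rule's Rule Variable through the rules object is shown below. The rule name order, the phrases, and the properties item, name, and hot are assumptions introduced for this example.

```xml
<grammar mode="voice"
         root="order"
         tag-format="semantics/1.0"
         version="1.0"
         xml:lang="en-US">
 <rule id="order" scope="public">
  i would like a
  <ruleref uri="#foodChoice"/>
  <!-- rules.foodChoice is the Rule Variable (out) of the foodChoice rule -->
  <tag>out.item = rules.foodChoice.name; out.hot = rules.foodChoice.hot;</tag>
 </rule>

 <rule id="foodChoice" scope="private">
  <one-of>
   <item>hot dog<tag>out.name="hot_dog"; out.hot=true;</tag></item>
   <item>salad<tag>out.name="salad"; out.hot=false;</tag></item>
  </one-of>
 </rule>
</grammar>
```

Here the top-level rule copies selected properties from the referenced rule into its own out object, so the application receives them in the root rule's result.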

rules.latest()

The rules object has a method, rules.latest(), that returns the Rule Variable of the rule reference matched most recently at that point in the grammar. This can be used to collect more than one match from the same rule. The example below demonstrates this. The example also shows how to return the grammar match as a string in the form "src=sfo^dst=lax". Tellme grammars typically return matches in strings like this, using the caret (^) as a delimiter.

<grammar mode="voice"
         root="top"
         tag-format="semantics/1.0"
         version="1.0"
         xml:lang="en-US">
 <rule id="top" scope="public">
  <item>
   <item repeat="0-1">
    i want to go
   </item>
   from
   <item>
    <ruleref uri="#CityName"/>
    <tag> out = "src=" + rules.latest(); </tag>
   </item>
   to
   <item>
    <ruleref uri="#CityName"/>
    <tag> out += "^dst=" + rules.latest(); </tag>
   </item>
  </item>
 </rule>

 <rule id="CityName" scope="private">
  <one-of>
   <item>
    san francisco
    <tag>out = "sfo";</tag>
   </item>
   <item>
    los angeles
    <tag>out = "lax";</tag>
   </item>
   <item>
    new orleans
    <tag>out = "msy";</tag>
   </item>
  </one-of>
 </rule>

</grammar>

The speech recognition engine finds a match between the speaker's utterance and this grammar only if the speaker says "from" (optionally preceded by "I want to go"), followed by one of the three cities, followed by "to," and finally followed by one of the three cities. When a match occurs, the out variable contains a string of the form "src=city1^dst=city2", for example "src=sfo^dst=lax".

Using concatenation with repeats

When a grammar rule is referenced and a match is found, the referenced rule's out variable holds the match, and rules.latest() returns it. If the same rule is referenced again, that value is replaced.

When the same grammar is repeatedly invoked, for example to obtain a string of digits, you must concatenate each new digit with the cumulative sequence of digits, as follows:

<field name="phoneNumber">
   <prompt>
      what is your phone number with area code first
   </prompt>
   <grammar mode="voice" xml:lang="en-US"
            tag-format="semantics/1.0"
            version="1.0" root="phoneNum">
      <rule id="phoneNum">
         <tag>out="";</tag>
         <item repeat="10">
            <ruleref uri="http://www.ourgrammars.com/digits.grxml"/>
            <tag>out += rules.latest();</tag>
         </item>
      </rule>
   </grammar>
   <filled>........</filled>
</field>

Note

The line <tag>out += rules.latest();</tag> could also have been written as <tag>out += rules.digits;</tag>, where digits is the name of the rule referenced in digits.grxml.
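
Because the contents of digits.grxml are not shown here, the sketch below uses a local rule instead so that the rules.digits form can be seen end to end. The local digits rule and its word list are assumptions; the external grammar may be defined differently.

```xml
<grammar mode="voice"
         root="phoneNum"
         tag-format="semantics/1.0"
         version="1.0"
         xml:lang="en-US">
 <rule id="phoneNum" scope="public">
  <!-- Start from an empty string so that += concatenates text -->
  <tag>out="";</tag>
  <item repeat="10">
   <ruleref uri="#digits"/>
   <!-- rules.digits is replaced on every repeat, so append it each time -->
   <tag>out += rules.digits;</tag>
  </item>
 </rule>

 <rule id="digits" scope="private">
  <one-of>
   <item>zero<tag>out="0";</tag></item>
   <item>oh<tag>out="0";</tag></item>
   <item>one<tag>out="1";</tag></item>
   <item>two<tag>out="2";</tag></item>
   <item>three<tag>out="3";</tag></item>
   <item>four<tag>out="4";</tag></item>
   <item>five<tag>out="5";</tag></item>
   <item>six<tag>out="6";</tag></item>
   <item>seven<tag>out="7";</tag></item>
   <item>eight<tag>out="8";</tag></item>
   <item>nine<tag>out="9";</tag></item>
  </one-of>
 </rule>
</grammar>
```

Returning each digit as a string rather than a number keeps += as string concatenation, so ten spoken digits produce a ten-character string instead of a sum.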