Using VoiceXML in a UCMA 3.0 Application: VoiceXML Document (Part 2 of 4)

Summary:   The Microsoft Unified Communications Managed API (UCMA) 3.0 Core SDK can be used to write interactive voice response (IVR) applications that work with VoiceXML documents. This article describes the VoiceXML document that the UCMA 3.0 application interprets.

Applies to:   Microsoft Unified Communications Managed API (UCMA) 3.0 Core SDK

Published:   June 2011 | Provided by:   Mark Parker, Microsoft

This is the second in a series of four articles about how to use VoiceXML in a UCMA 3.0 application.

The VoiceXML document defines a form element whose id is get_city. This form contains a single field element whose name is CityOffset.

The following example shows the VoiceXML document.

<?xml version="1.0" encoding="utf-8" ?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/vxml http://www.w3.org/TR/voicexml21/vxml.xsd" >

  <script>
  <![CDATA[
    var DST = true; // Set this to true if Daylight Savings Time is in effect.
    var isPM = false;
    var hourAtRequestedCity = 0;
    var timeAtRequestedCity = 0;

    var now = new Date();
    var localTimezoneOffset = now.getTimezoneOffset();
    var localHour = Number(now.getHours());
    var localMins = Number(now.getMinutes());
    if (0 <= localMins && localMins <= 9)
    {
       localMins = "0" + localMins.toString();
    }
  ]]>
  </script>
  <form id="get_city">
    <field name="CityOffset">
      <prompt bargein="true" bargeintype="speech" timeout="10s">
         Say the name of a city to find the present time there.
      </prompt>
      <grammar type="application/srgs+xml" src="http://localhost/VoiceXmlTime/CityTimeOffsets.grxml" />
      
      <catch event="nomatch">
        <prompt bargein="true" bargeintype="speech" timeout="4s">
          Sorry, I don't know that city.
        </prompt>
        <reprompt/>
      </catch>    
      
      <catch event="noinput">
        <prompt bargein="true" bargeintype="speech" timeout="4s">
          I didn't hear you.
        </prompt>
        <reprompt/>
      </catch>

      <filled>
        <script>
        <![CDATA[
          var offset = Number(CityOffset) + Number(localTimezoneOffset/60);
          hourAtRequestedCity = localHour.valueOf() + offset.valueOf();
          if (DST) 
          {
            // Increment the time if Daylight Savings Time is in effect.
            hourAtRequestedCity++;
          }
          if (hourAtRequestedCity > 24)
          {
            hourAtRequestedCity = hourAtRequestedCity - 24;
            isPM = false;
          }
          else if (hourAtRequestedCity > 12)
          {
            hourAtRequestedCity = hourAtRequestedCity - 12;
            isPM = true;
          }
          
          timeAtRequestedCity = hourAtRequestedCity.toString() + ":" + localMins.toString();
          if (isPM)
          {
            timeAtRequestedCity = timeAtRequestedCity + " PM.";
          }
          else
          {
            timeAtRequestedCity = timeAtRequestedCity + " AM.";
          }
          
        ]]> 
        </script>
        
        The time in <value expr="CityOffset$.utterance"/> is <value expr="timeAtRequestedCity"/> 
        <exit namelist="CityOffset CityOffset$.utterance CityOffset$.confidence timeAtRequestedCity"/>  
      </filled> 
    </field>
  </form>
</vxml>

A significant part of the VoiceXML document is embedded JavaScript code, which is discussed in the following section. The prompt element that is associated with the CityOffset field asks the user to say the name of a city. The grammar element specifies the grammar that is used to match the user’s utterance. The bargein attribute on the prompt element indicates that the user can interrupt the prompt by speaking, without listening to the whole prompt.

If the user says the name of a city that is not in the grammar, the nomatch event is raised, and the computer responds by saying, “Sorry, I don’t know that city.” The computer then repeats the original prompt.

If the user does not respond within 10 seconds, the noinput event is raised, and the computer responds by saying, “I didn’t hear you.” The computer then repeats the original prompt.

When the user’s utterance matches the grammar, control flows to the filled element that is located in the CityOffset field. The JavaScript block in the filled element calculates the current time in the city whose name was spoken by the user. Following the JavaScript block, the computer speaks a sentence similar to, “The time in Atlanta is 5:43 PM.”

Finally, the exit element ends the session, and returns a namelist array, which is a list of name-value pairs. Each name in the namelist array can be used to retrieve the value associated with that name. In this VoiceXML document, the values returned can be retrieved by using the following names:

Namelist entry            Description
CityOffset                Provides access to the semantic value returned by the Speech Recognition Grammar Specification (SRGS) grammar.
CityOffset$.utterance     Provides access to the user utterance that matched an item in the SRGS grammar.
CityOffset$.confidence    Provides access to the confidence of the recognition. The confidence is a number in the range from 0.0 through 1.0.
timeAtRequestedCity       Provides access to the value produced by the JavaScript code.
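
As a rough mental model only, the name-value pairs can be pictured as a plain JavaScript object keyed by the namelist names. The object name exitResult, the helper getNamelistValue, and the sample values (including the confidence figure) are hypothetical and are not part of the article's code. How the UCMA application actually reads these values is covered in Part 3.

```javascript
// Hypothetical model of the values returned by <exit namelist="...">.
// The sample values are illustrative, not captured from a real session.
const exitResult = {
  "CityOffset": "1",
  "CityOffset$.utterance": "Rome",
  "CityOffset$.confidence": 0.87,
  "timeAtRequestedCity": "10:14 PM."
};

// Look up a value by its namelist name; returns undefined if the name is absent.
function getNamelistValue(result, name) {
  return Object.prototype.hasOwnProperty.call(result, name) ? result[name] : undefined;
}

console.log(getNamelistValue(exitResult, "CityOffset$.utterance")); // Rome
```

Note that names such as CityOffset$.utterance contain characters that are not valid in a dotted property access, which is why the sketch uses quoted keys.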

For more information about how these namelist entries are used in the UCMA 3.0 application, see Using VoiceXML in a UCMA 3.0 Application: UCMA Application (Part 3 of 4).

The VoiceXML document contains two script elements. The two code examples appearing in this section obtain the time at the user’s location and calculate the time at the requested city.

The code in the first block initializes four variables that are used in the second block of JavaScript, after the SRGS grammar has returned the Coordinated Universal Time (UTC) time offset for the requested city. The first statement in this block sets the DST variable to true (at the time of writing, Daylight Savings is in effect). This variable should be set to false if Daylight Savings is not in effect.

After the initialization statements, a JavaScript Date object named now is created and initialized with the current date and time for the user’s locale. In the next statement, the localTimezoneOffset variable is initialized with the value returned by the getTimezoneOffset method on the Date object. The next two statements initialize the localHour and localMins variables with the values returned, respectively, by the getHours and getMinutes methods on the Date object. The last statement prepends the string “0” to the localMins value if the minutes value is a single digit. This prevents a time such as 10:03 from being rendered as 10:3.

Important

The getTimezoneOffset method returns the difference, in minutes, between UTC and local time. The value is positive for time zones that are behind UTC (generally west of the Prime Meridian, also known as the Greenwich Meridian) and negative for time zones that are ahead of UTC (generally east of the Prime Meridian). For example, the value is 420 for Pacific Daylight Time, which is seven hours behind UTC.

The localHour and localMins variables are initialized with the hour and minutes part of the local time.
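
The sign convention of getTimezoneOffset is easy to get backwards, so the following minimal sketch restates it alongside the minutes padding. The helper names utcOffsetHours and padMinutes are hypothetical and do not appear in the VoiceXML document. The last line depends on the time zone of the machine that runs it, so no output is shown for it.

```javascript
// Convert a getTimezoneOffset() result (minutes, positive when local time
// is behind UTC) into the conventional signed UTC offset in hours.
function utcOffsetHours(timezoneOffsetMinutes) {
  return -timezoneOffsetMinutes / 60;
}

// Left-pad a minutes value to two digits, as the first script block does.
function padMinutes(minutes) {
  return minutes < 10 ? "0" + minutes : String(minutes);
}

console.log(utcOffsetHours(420)); // -7 (Pacific Daylight Time)
console.log(padMinutes(3));       // 03
console.log(utcOffsetHours(new Date().getTimezoneOffset())); // host dependent
```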

The following example is the first block of JavaScript code in the VoiceXML document.

<script>
<![CDATA[
  var DST = true; // Set this to true if Daylight Savings Time is in effect.
  var isPM = false;
  var hourAtRequestedCity = 0;
  var timeAtRequestedCity = 0;

  var now = new Date();
  var localTimezoneOffset = now.getTimezoneOffset();
  var localHour = Number(now.getHours());
  var localMins = Number(now.getMinutes());
  if (0 <= localMins && localMins <= 9)
  {
     localMins = "0" + localMins.toString();
  }
]]>
</script>

In the second block of JavaScript code, the offset variable is initialized with the value returned from the CityOffset field in the VoiceXML document, plus the local offset from UTC in hours (localTimezoneOffset divided by 60). The value in offset is then added to the localHour value to produce the hourAtRequestedCity value, which is incremented by one if DST is true. If the resulting value is larger than 24, 24 is subtracted from it and isPM is set to false. Otherwise, if the resulting value is larger than 12, 12 is subtracted from it and isPM is set to true. This Boolean value determines whether “PM” or “AM” should be appended to the string that contains the time at the requested location. Finally, timeAtRequestedCity is set by concatenating a string with the correct hours, a colon (:), and the minutes. If isPM is true, the string “ PM.” is appended. Otherwise, the string “ AM.” is appended.
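
Because both comparisons are strict, a few boundary values are not adjusted: an hour of exactly 12 keeps isPM false, an hour of exactly 24 becomes 12 with isPM true, and a negative hour (possible when the requested city is far behind the caller) passes through unchanged. The following is a minimal sketch of one alternative that wraps the hour with modulo arithmetic. The function to12Hour is hypothetical and is not the article's code, and it assumes whole-hour offsets.

```javascript
// Convert an hour count that may fall outside 0-23 into a 12-hour clock
// value plus an AM/PM flag. Alternative sketch; assumes integer hours.
function to12Hour(rawHour) {
  // Wrap into 0..23; the double modulo also handles negative input.
  const hour24 = ((rawHour % 24) + 24) % 24;
  const isPM = hour24 >= 12;
  // 0 and 12 are both spoken as "12" on a 12-hour clock.
  const hour12 = hour24 % 12 === 0 ? 12 : hour24 % 12;
  return { hour12: hour12, isPM: isPM };
}

console.log(to12Hour(22)); // hour12 is 10, isPM is true
console.log(to12Hour(24)); // hour12 is 12, isPM is false (midnight)
console.log(to12Hour(-6)); // hour12 is 6, isPM is true (18:00 the previous day)
```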

The following example is the second block of JavaScript code in the VoiceXML document.

<script>
<![CDATA[
  var offset = Number(CityOffset) + Number(localTimezoneOffset/60);
  hourAtRequestedCity = localHour.valueOf() + offset.valueOf();
  if (DST) 
  {
    // Adjust for Daylight Savings Time.
    hourAtRequestedCity++;
  }
  if (hourAtRequestedCity > 24)
  {
    hourAtRequestedCity = hourAtRequestedCity - 24;
    isPM = false;
  }
  else if (hourAtRequestedCity > 12)
  {
    hourAtRequestedCity = hourAtRequestedCity - 12;
    isPM = true;
  }
  
  timeAtRequestedCity = hourAtRequestedCity.toString() + ":" + localMins.toString();
  if (isPM)
  {
    timeAtRequestedCity = timeAtRequestedCity + " PM.";
  }
  else
  {
    timeAtRequestedCity = timeAtRequestedCity + " AM.";
  }
  
]]> 
</script>

To help clarify the previous example, suppose that the local time for a user in Los Angeles, California is 1:14 PM, Pacific Daylight Time. The user wishes to find the time in Rome, Italy. From the SRGS grammar that is shown in the next section, the offset for Rome is 1.

  1. The now object is created and initialized with the user’s current date, with the time portion set to 13:14 (1:14 PM).

  2. localTimezoneOffset is set to 420.

  3. localHour is set to 13.

  4. localMins is set to 14.

  5. DST is set to true.

  6. offset is set to 8. This value is the result of CityOffset (1) plus localTimezoneOffset (420/60).

  7. hourAtRequestedCity is set to 21. This value is the result of localHour (13) plus offset (8).

  8. Because DST is true, hourAtRequestedCity is incremented to 22.

  9. Because hourAtRequestedCity is not larger than 24, control flows to the next if statement.

  10. Because hourAtRequestedCity is larger than 12, hourAtRequestedCity is reset to 10, and isPM is set to true.

  11. timeAtRequestedCity is set to the string value, “10:14”.

  12. Because isPM is true, the string value “ PM.” is appended to timeAtRequestedCity to form “10:14 PM.”
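
The steps above can be replayed with fixed inputs by folding the logic of the two script blocks into a single function. This is only a sketch for checking the arithmetic: the function name timeAtCity and its parameters are hypothetical, and the real document reads these values from the Date object and the recognition result instead.

```javascript
// Mirrors the logic of the article's two script blocks as one pure function,
// so the Los Angeles to Rome walkthrough can be replayed with fixed inputs.
function timeAtCity(localHour, localMins, localTimezoneOffset, cityOffset, dst) {
  var isPM = false;
  // Pad single-digit minutes, as in the first script block.
  var mins = (0 <= localMins && localMins <= 9) ? "0" + localMins : String(localMins);
  // City offset from the grammar plus the caller's own offset from UTC in hours.
  var offset = Number(cityOffset) + Number(localTimezoneOffset / 60);
  var hour = localHour + offset;
  if (dst) { hour++; }
  if (hour > 24) { hour = hour - 24; isPM = false; }
  else if (hour > 12) { hour = hour - 12; isPM = true; }
  return hour.toString() + ":" + mins + (isPM ? " PM." : " AM.");
}

console.log(timeAtCity(13, 14, 420, "1", true)); // 10:14 PM.
```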

An external SRGS XML grammar is referred to in the VoiceXML document, in the following grammar element: <grammar type="application/srgs+xml" src="http://localhost/VoiceXmlTime/CityTimeOffsets.grxml" />. This grammar contains 18 items that cover 17 distinct cities, because Phoenix is listed twice. If the city name that is spoken by the user matches a city name in the grammar, the returned semantic value is the offset from Coordinated Universal Time (UTC) for that city. For example, the semantic value for Atlanta, Georgia is -5, which means that standard time in Atlanta is five hours behind UTC.

Note

Alternatively, the SRGS grammar could have been put inline in the VoiceXML document. If that had been done, the src attribute would not appear, and the grammar element would resemble the following: <grammar version="1.0" root="Main" tag-format="semantics/1.0">.

The following example is the SRGS grammar.

<?xml version="1.0" encoding="UTF-8" ?>
<grammar version="1.0" xml:lang="en-US" mode="voice" root="Main"
xmlns="http://www.w3.org/2001/06/grammar" tag-format="semantics/1.0">

<rule id="Main" scope="public">
  <one-of>
    <item> Anchorage <tag>out="-9";</tag> </item>
    <item> Atlanta <tag>out="-5";</tag> </item>
    <item> Baltimore <tag>out="-5";</tag> </item>
    <item> Boston <tag>out="-5";</tag> </item>
    <item> Dallas <tag>out="-6";</tag> </item>
    <item> Denver <tag>out="-7";</tag> </item>
    <item> Detroit <tag>out="-5";</tag> </item>
    <item> Honolulu <tag>out="-10";</tag> </item>
    <item> London <tag>out="0";</tag> </item>
    <item> Moscow <tag>out="3";</tag> </item>
    <item> Phoenix <tag>out="-7";</tag> </item>
    <item> New York <tag>out="-5";</tag> </item>
    <item> Philadelphia <tag>out="-5";</tag> </item>
    <item> Phoenix <tag>out="-7";</tag> </item>
    <item> Rome <tag>out="1";</tag> </item>
    <item> San Francisco <tag>out="-8";</tag> </item>
    <item> Seattle <tag>out="-8";</tag> </item>
    <item> Vancouver <tag>out="-8";</tag> </item>
  </one-of>
</rule>
</grammar>
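
Conceptually, the grammar behaves like a lookup from a spoken city name to a string that holds the UTC offset. The following sketch models that mapping for a few of the items above. It is an illustration only: the names cityOffsets and semanticValueFor are hypothetical, and a real speech recognizer matches audio against the grammar instead of comparing strings.

```javascript
// A few city-to-offset pairs copied from the grammar's <tag> values.
// The offsets are strings, as they are in the grammar.
const cityOffsets = {
  "Atlanta": "-5",
  "Honolulu": "-10",
  "London": "0",
  "Rome": "1"
};

// Return the semantic value for a city, or null to model a nomatch event.
function semanticValueFor(city) {
  return Object.prototype.hasOwnProperty.call(cityOffsets, city) ? cityOffsets[city] : null;
}

console.log(Number(semanticValueFor("Rome"))); // 1
console.log(semanticValueFor("Paris"));        // null (not in the grammar)
```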

Mark Parker is a programming writer at Microsoft whose current responsibility is the UCMA SDK documentation. Mark previously worked on the Microsoft Speech Server 2007 documentation.
