Adding, loading, and preloading grammars for Windows Phone 8

[ This article is for Windows Phone 8 developers. If you’re developing for Windows 10, see the latest documentation. ]

A speech recognizer needs one or more grammars to perform recognition.

Providing the speech recognizer with a grammar

To provide the speech recognizer with a grammar, you add a grammar to the speech recognizer’s grammar set (SpeechGrammarSet). The Windows.Phone.Speech.Recognition API gives you three methods for adding a grammar to a speech recognizer's grammar set: AddGrammarFromList, AddGrammarFromPredefinedType, and AddGrammarFromUri. Each method creates a SpeechGrammar instance from a different type of grammar (a programmatic list, a predefined dictation or web search grammar, or an SRGS file, respectively), and then adds it to the speech recognizer's grammar set.

An app adds one or more grammars to a speech recognizer's grammar set before the speech recognizer begins the recognition process. Each instance of a speech recognizer has only one grammar set. The speech recognizer loads its grammar set at the beginning of a recognition operation, unless it has been preloaded. If its grammar set is empty, by default the speech recognizer loads the predefined dictation grammar at the start of recognition.

You can add only specific combinations of grammars to a grammar set, indicated as follows:

  • A grammar set can contain only one dictation grammar or web search grammar. In addition, if a grammar set contains a dictation or web search grammar, it cannot contain any list grammars or SRGS grammars.

  • If a grammar set contains more than one grammar, each grammar must be either a list grammar or an SRGS grammar.

This means that an instance of a speech recognizer can load its grammar set if it contains one of the following:

  • A single dictation grammar OR a single web search grammar.

  • One or more list grammars or SRGS grammars.
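
The combination rules above can be sketched as a small check. This is only an illustration: the GrammarKind enumeration and the GrammarSetRules helper are hypothetical names and are not part of the Windows.Phone.Speech.Recognition API, which enforces these rules itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only; not part of the speech API.
enum GrammarKind { Dictation, WebSearch, List, Srgs }

static class GrammarSetRules
{
  // Returns true if the given combination of grammars follows the rules above.
  public static bool IsLoadable(IEnumerable<GrammarKind> kinds)
  {
    List<GrammarKind> all = kinds.ToList();

    // Count the grammars that perform recognition remotely.
    int remote = all.Count(k => k == GrammarKind.Dictation || k == GrammarKind.WebSearch);

    // Only list and SRGS grammars (or an empty set, which falls back
    // to the predefined dictation grammar): always allowed.
    if (remote == 0) return true;

    // A dictation or web search grammar must be the only grammar in the set.
    return remote == 1 && all.Count == 1;
  }
}

static class Program
{
  static void Main()
  {
    Console.WriteLine(GrammarSetRules.IsLoadable(new[] { GrammarKind.List, GrammarKind.Srgs }));      // True
    Console.WriteLine(GrammarSetRules.IsLoadable(new[] { GrammarKind.WebSearch, GrammarKind.List })); // False
  }
}
```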

If the grammar set contains many grammars or large grammars, the time it takes the speech recognizer to load its grammar set may delay the start of recognition. To prevent a delay, you can preload a grammar set using the SpeechRecognizer.PreloadGrammarsAsync() method before initiating a recognition operation. This loads all the grammars in a grammar set as a separate operation, and not as part of beginning recognition.

The following example illustrates how to use SpeechRecognizer.PreloadGrammarsAsync() to load a grammar set in advance of starting the recognition operation.

    // Declare a SpeechRecognizerUI object global to the page or app.
    SpeechRecognizerUI recoWithUI;

    public async void InitializeRecognition()
    {
      // Initialize objects ahead of time to avoid delays when starting recognition.
      recoWithUI = new SpeechRecognizerUI();

      // Initialize a URI with a path to the SRGS-compliant XML file.
      Uri orderPizza = new Uri("ms-appx:///OrderPizza.grxml", UriKind.Absolute);

      // Add an SRGS-compliant XML grammar to the grammar set.
      recoWithUI.Recognizer.Grammars.AddGrammarFromUri("PizzaGrammar", orderPizza);

      // Preload the grammar set.
      await recoWithUI.Recognizer.PreloadGrammarsAsync();

      // Display text to prompt the user's input.
      recoWithUI.Settings.ListenText = "What kind of pizza do you want?";

      // Display an example of ideal expected input.
      recoWithUI.Settings.ExampleText = "Large combination with Italian sausage";
    }

    // Start recognition when the user clicks this button.
    private async void ButtonSR_Click(object sender, RoutedEventArgs e)
    {
      SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();
    }

Working with short message dictation grammars

The following example creates a SpeechRecognizerUI instance, and then calls the SpeechRecognizerUI.RecognizeWithUIAsync() method to start speech recognition. Because no grammars have been added to the speech recognizer's grammar set, by default the speech recognizer loads the predefined dictation grammar before it begins the recognition operation.

private async void ButtonSR_Click(object sender, RoutedEventArgs e)
{
  SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

  // Load the pre-defined dictation grammar by default
  // and start recognition.
  SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();

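  // Do something with the recognition result, but only if recognition succeeded.
  if (recoResult.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
  {
    MessageBox.Show(string.Format("You said {0}.", recoResult.RecognitionResult.Text));
  }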
}

You can also explicitly load a dictation grammar using the SpeechGrammarSet.AddGrammarFromPredefinedType(String, SpeechPredefinedGrammar) method.
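
As a minimal sketch of explicit loading, the handler below adds the predefined dictation grammar to the grammar set before starting recognition. The page class MainPage, the handler name ButtonDictate_Click, and the grammar name "dictation" are placeholders chosen for illustration, and the code assumes it runs inside a Windows Phone 8 app page.

```csharp
using System.Windows;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Recognition;

// Hypothetical page class; only the handler body matters here.
public partial class MainPage : PhoneApplicationPage
{
  private async void ButtonDictate_Click(object sender, RoutedEventArgs e)
  {
    SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

    // Explicitly add the predefined dictation grammar to the grammar set.
    recoWithUI.Recognizer.Grammars.AddGrammarFromPredefinedType(
      "dictation", SpeechPredefinedGrammar.Dictation);

    // Load the grammar set and start recognition.
    SpeechRecognitionUIResult recoResult = await recoWithUI.RecognizeWithUIAsync();

    // Show the recognized text only if recognition succeeded.
    if (recoResult.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
    {
      MessageBox.Show(string.Format("You said {0}.", recoResult.RecognitionResult.Text));
    }
  }
}
```

Adding the grammar explicitly gives it a name, which the app can later use to look it up in the grammar set.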

Working with web search grammars

Unlike a dictation grammar, which may be loaded by default, you must add a web search grammar to a speech recognizer's grammar set before starting recognition. The following example shows how to add a web search grammar to an instance of a speech recognizer using the AddGrammarFromPredefinedType method. The call passes in a name for the grammar and the WebSearch value from the SpeechPredefinedGrammar enumeration. On the call to SpeechRecognizerUI.RecognizeWithUIAsync(), the speech recognizer loads the web search grammar from its grammar set and begins recognition.

private async void ButtonWeatherSearch_Click(object sender, RoutedEventArgs e)
{
  SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

  // Add the pre-defined web search grammar to the grammar set.
  recoWithUI.Recognizer.Grammars.AddGrammarFromPredefinedType("weatherSearch",
    SpeechPredefinedGrammar.WebSearch);

  // Display text to prompt the user's input.
  recoWithUI.Settings.ListenText = "Say what you want to search for";

  // Display an example of ideal expected input.
  recoWithUI.Settings.ExampleText = @"Ex. 'weather for London'";

  // Load the grammar set and start recognition.
  SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();
}

Warning

A grammar set can contain only one of either a dictation grammar or a web search grammar. Because the dictation and web search grammars perform recognition remotely, you can’t include them in a grammar set that contains grammars that perform recognition locally (as in a list grammar or an SRGS grammar) on the phone.

Working with programmatic list grammars

The following example creates a programmatic list grammar from a string array, and then adds the created grammar to the speech recognizer's grammar set. The SpeechGrammarSet.AddGrammarFromList method takes a name for the grammar and the string array as arguments. On the call to SpeechRecognizerUI.RecognizeWithUIAsync(), the speech recognizer loads the list grammar from its grammar set and begins recognition.

private async void ButtonSR_Click(object sender, RoutedEventArgs e)
{
    SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

    // You can create this string array dynamically, for example from a movie queue.
    string[] movies = { "Play The Cleveland Story", "Play The Office",
      "Play Psych", "Play Breaking Bad", "Play Valley of the Sad", "Play Shaking Mad" };

    // Create a grammar from the string array and add it to the grammar set.
    recoWithUI.Recognizer.Grammars.AddGrammarFromList("myMovieList", movies);

    // Display an example of ideal expected input.
    recoWithUI.Settings.ExampleText = @"ex. 'Play New Mockumentaries'";

    // Load the grammar set and start recognition.
    SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();

    // Play the movie given in result.RecognitionResult.Text.
}
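
The comment about creating the string array dynamically could look roughly like the following sketch. The movieQueue field, its contents, the page class MainPage, the handler name ButtonQueue_Click, and the grammar name "myMovieQueue" are all hypothetical; a real app would fill the queue from its own data.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Windows;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.Recognition;

// Hypothetical page class; only the handler body matters here.
public partial class MainPage : PhoneApplicationPage
{
  // Hypothetical movie queue, standing in for data the app would load elsewhere.
  private readonly List<string> movieQueue =
    new List<string> { "The Office", "Psych", "Breaking Bad" };

  private async void ButtonQueue_Click(object sender, RoutedEventArgs e)
  {
    SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

    // Prefix each queued title with "Play" to form the phrases to recognize.
    string[] phrases = movieQueue.Select(title => "Play " + title).ToArray();

    // Create a list grammar from the generated phrases and add it to the grammar set.
    recoWithUI.Recognizer.Grammars.AddGrammarFromList("myMovieQueue", phrases);

    // Load the grammar set and start recognition.
    SpeechRecognitionUIResult result = await recoWithUI.RecognizeWithUIAsync();

    // Use the recognized phrase only if recognition succeeded.
    if (result.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
    {
      MessageBox.Show(result.RecognitionResult.Text);
    }
  }
}
```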

Working with SRGS grammars

To add an SRGS grammar to an instance of a speech recognizer, use the SpeechGrammarSet.AddGrammarFromUri(String, Uri) method. Pass in a name for the grammar and a URI that creates an absolute reference to an SRGS grammar file.

The following example creates a new SpeechRecognizerUI instance and adds two SRGS grammars to the speech recognizer's grammar set. The CitiesList grammar is used to recognize city names, and the YesNo grammar is used to recognize "yes" or "no" to confirm the user’s city choice. Because the CitiesList grammar contains a large list of cities, the example preloads the grammars to avoid possible delays when starting recognition. It then displays text to prompt the user for input and gives an example of the expected input. The calls to SpeechRecognizerUI.RecognizeWithUIAsync() begin recognition.

private async void CityPicker_Click(object sender, RoutedEventArgs e)
{

  // Initialize a SpeechRecognizerUI object.
  SpeechRecognizerUI recoWithUI = new SpeechRecognizerUI();

  // Initialize URIs with paths to the SRGS-compliant XML files.
  Uri citiesGrammar = new Uri("ms-appx:///CitiesList.grxml", UriKind.Absolute);
  Uri yesNoGrammar = new Uri("ms-appx:///YesNo.grxml", UriKind.Absolute);

  // Add the SRGS grammars to the grammar set.
  recoWithUI.Recognizer.Grammars.AddGrammarFromUri("cities", citiesGrammar);
  recoWithUI.Recognizer.Grammars.AddGrammarFromUri("yesNo", yesNoGrammar);

  // Preload the SRGS grammars to avoid possible delays while loading.
  await recoWithUI.Recognizer.PreloadGrammarsAsync();

  // Disable the yesNo grammar; it won't be needed for the first recognition.
  recoWithUI.Recognizer.Grammars["yesNo"].Enabled = false;

  // Display text to prompt the user's input.
  recoWithUI.Settings.ListenText = "Fly from which city?";

  // Display an example of ideal expected input.
  recoWithUI.Settings.ExampleText = @"Ex. 'Rome', 'Sao Paulo', 'Tokyo'";

  // Start recognition for the name of the origin city.
  SpeechRecognitionUIResult cityName = await recoWithUI.RecognizeWithUIAsync();

  // Display text to prompt the user to confirm the origin city.
  recoWithUI.Settings.ListenText = string.Format("Fly from {0}?", cityName.RecognitionResult.Semantics["fromCity"].Value);

  // Display an example of ideal expected input.
  recoWithUI.Settings.ExampleText = @"Ex. 'Yes', 'No'";
            
  // Disable the cities grammar, enable the yesNo grammar.
  recoWithUI.Recognizer.Grammars["cities"].Enabled = false;
  recoWithUI.Recognizer.Grammars["yesNo"].Enabled = true;

  // Start recognition for the origin city confirmation.
  SpeechRecognitionUIResult confirm = await recoWithUI.RecognizeWithUIAsync();

  // Do something with the confirmation result.
}

Also keep the following points in mind:

  • You can create multiple SRGS grammars and add them to a speech recognizer's grammar set with successive calls to the SpeechGrammarSet.AddGrammarFromUri(String, Uri) method.

  • To make sure that an SRGS grammar is correctly deployed, add it to your solution. In Solution Explorer, select your Windows Phone app project. On the Project menu, click Add Existing Item, and add the SRGS grammar. Set the Build Action property for the SRGS grammar to Content, and set the Copy To Output Directory property to Copy if newer. After adding the grammar, use the path syntax given in this example to reference the grammar.

  • You can’t add an SRGS grammar to a grammar set that contains a dictation grammar or a web search grammar.

  • The grammar document specified in the argument to the SpeechGrammarSet.AddGrammarFromUri(String, Uri) method must be local. You cannot reference a grammar that’s located on the internet.

  • An accepted convention is to use the .grxml file extension for XML-based grammar documents that conform to SRGS rules.

With multiple grammars loaded into a speech recognizer's grammar set, your app can selectively enable and disable the grammars as users navigate through your app. This ensures that your app listens only for what is pertinent to the current app context. For more info, see Managing loaded grammars to optimize recognition for Windows Phone 8.