SpeechRecognizer class

The SpeechRecognizer class provides the means to start, stop, and monitor speech recognition in an application. You can bind it to a SpeechRecognizerUx control or create a custom UI to expose its methods and events.

public sealed class SpeechRecognizer : IDisposable, ISpeechRecognizerStateControl

The SpeechRecognizer class has the following members.

Constructors

SpeechRecognizer(string, SpeechAuthorizationParameters): Initializes a new instance of the SpeechRecognizer class.

Methods

RecognizeSpeechToTextAsync(): Starts a speech recognition session, which captures and interprets user speech, and then returns the results as a SpeechRecognitionResult object.

StopListeningAndProcessAudio(): Interrupts the current audio capture and starts analysis on the captured audio data.

RequestCancelOperation(): Interrupts speech recognition and returns control to the caller. This method can be called at any point in the speech recognition process.

Dispose(): Removes the current SpeechRecognizer instance and all speech data artifacts from memory.

Events

AudioCaptureStateChanged: Raised when the current speech recognition session moves from one state to another.

AudioLevelChanged: Raised when the user changes their speaking volume. Use the SpeechRecognitionAudioLevelChangedEventArgs object associated with this event to get the current audio level.

RecognizerResultReceived: Raised when the SpeechRecognizer identifies a possible interpretation of user speech.
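
As an illustration of what a handler for AudioLevelChanged might do with the audio level, the following JavaScript sketch converts a level into an opacity for a volume indicator. The two branches copy the arithmetic of the C# handler in the example later on this page. The function name volumeToOpacity and the clamp to the range 0 to 1 are additions made here for illustration, not part of the Bing.Speech API, and the range of values the audio level can take is not stated on this page, so the clamp is only a precaution.

```javascript
// Sketch: convert an audio level into an opacity between 0 and 1.
// The branch arithmetic mirrors the C# sample; the clamp is an assumption
// added so that unexpected levels never produce an out-of-range opacity.
function volumeToOpacity(level) {
    var raw = level > 0
        ? level / 50                      // positive levels: 50 gives full opacity
        : Math.abs((level - 50) / 100);   // zero and negative levels
    return Math.min(1, Math.max(0, raw));
}
```

For example, volumeToOpacity(25) returns 0.5, and any level whose raw result exceeds 1, such as 200, is clamped to 1.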

Example

The following code sample creates a complete custom speech UI, using a series of StackPanel objects (XAML) or DIVs (HTML) that are made visible at different times to reflect different phases of the speech recognition process. Before adding the code to your application, you must complete the steps described in How to: Enable a project for the Bing Speech Recognition Control.

<!DOCTYPE html>
<!--This application demonstrates a complete custom speech recognition UI-->
<html>
<head>
    <meta charset="utf-8" />
    <title>SpeechCustomUI_JS</title>

    <!-- WinJS references -->
    <link href="//Microsoft.WinJS.2.0/css/ui-dark.css" rel="stylesheet" />
    <script src="//Microsoft.WinJS.2.0/js/base.js"></script>
    <script src="//Microsoft.WinJS.2.0/js/ui.js"></script>

    <!-- SpeechCustomUI_JS references -->
    <link href="/css/default.css" rel="stylesheet" />
    <script src="/js/default.js"></script>

    <link href="Bing.Speech/css/voiceuicontrol.css" rel="stylesheet" />
    <script src="Bing.Speech/js/voiceuicontrol.js"></script>

    <style>
        body {
            text-align: center;
        }
        .panel {
            display: none;
        }
        .instructionText {
            font-size: x-large;
        }
        .explanatory {
            font-size: large;
        }
        .spokenText {
            font-size: large;
            font-style: italic;
        }
        .bigText {
            font-size: xx-large;
        }
        .biggestText {
            font-size: xx-large;
            font-weight: bold;
        }
        .buttonDiv {
            width: 2em;
            margin: 0 auto;
            font-size: xx-large;
        }
        .subTitle {
            font-size: medium;
        }
        .listBox {
            margin: 0 auto;
            display: block;
        }

    </style>
</head>
<body onload="Body_OnLoad();" >

    <!--Panel to show at application start and after cancel.-->  
    <div id="StartPanel" class="panel">
        <p class="instructionText">
            Click the microphone and get ready to say something
        </p>
        <div>
            <!--Starts speech recognition, but may not be ready immediately. -->
            <div id="SpeakButton" onclick="SpeakButton_Click();" class="buttonDiv">
                &#xe1d6;
            </div> 
        </div>
    </div>

    <!--Panel to show while initializing the SpeechRecognizer.
        This panel may not be seen if initialization happens quickly.-->
    <div id="InitPanel" class="panel">
        <p class="instructionText">
            Ready, set...
        </p>
    </div>

    <!--Panel to show while listening for user speech.-->    
    <div id="ListenPanel" class="panel">
        <p class="biggestText">
            Speak!
        </p>
        <!--Shows at different opacity levels depending on speech volume.--> 
        <div id="VolumeMeter" class="bigText" style="opacity:0">
            Volume
        </div>

        <!--Click when done speaking, or wait for app to recognize end of 
            speech.-->
        <div>
            <div id="StopButton" onclick="StopButton_Click();" 
                 class="buttonDiv">
                &#xe15b;
            </div>
            <div class="subTitle">
                Stop
            </div>
        </div>
    </div>

    <!--Panel to show while interpreting speech input.--> 
    <div id="ThinkPanel" class="panel">
        <p class="instructionText">
            Thinking...
        </p>
    </div>

    <!--Panel to show when speech recognition complete.
        May also be shown in case of exceptions.-->
    <div id="CompletePanel" class="panel">
        <p class="instructionText">
            Done.
        </p>
        <br />
        <!--Displays confidence level of final result.-->
        <div id="ConfidenceText" class="explanatory"></div>
        <!--Displays final result text.-->
        <div id="ResultText" class="spokenText"></div>
        <br />
        <!--Displays alternate results. Copies selected text to
        ResultText.-->
        <div id="AlternatesArea">
            <div class="explanatory">But you might have said:</div>
            <div>
                <select id="AlternatesListBox" class="spokenText listBox" 
                        onchange="AlternatesListBox_SelectionChanged();">
                </select>
            </div>
        </div>
    </div>

    <!--Shows possible text before deciding on final interpretation.
        May flash too quickly to see for easy phrases.-->
    <div id="IntResults" class="panel">
        <div class="instructionText">You might have said...</div>
        <div id="IntermediateResults" class="spokenText"></div>
    </div>

    <!--Cancel button, to be shown in all states except for application
        start -->
    <div style="position: absolute; bottom: 0; width: 100%">
        <div id="CancelButton" onclick="CancelButton_Click();" 
             class="buttonDiv">
            &#xe10a;
            <div class="subTitle">
                Cancel
            </div>
        </div>
    </div>

</body>
</html>
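
The markup above wires its buttons to handlers such as SpeakButton_Click and expects them in /js/default.js, which is not reproduced here. As a partial sketch of that script, the following JavaScript covers only the panel switching, modeled on the SetPanel method and the state switch in the C# example on this page. It does not call Bing.Speech. The names PANEL_IDS, setPanel, and panelForState are hypothetical, and passing the capture state as a plain string is an assumption about how a caller would report it. The IntResults panel is left out, as it is in the C# SetPanel method.

```javascript
// Ids of the panel DIVs declared in the markup above.
var PANEL_IDS = ["StartPanel", "InitPanel", "ListenPanel",
                 "ThinkPanel", "CompletePanel"];

// Hide every panel, show the requested one, and reveal the cancel button.
// `doc` is any object that offers getElementById; in the page it would be
// `document`. Returns the id now showing so the caller can remember it.
function setPanel(doc, panelId) {
    PANEL_IDS.forEach(function (id) {
        doc.getElementById(id).style.display = "none";
    });
    doc.getElementById(panelId).style.display = "block";
    doc.getElementById("CancelButton").style.display = "block";
    return panelId;
}

// Choose the panel for a capture state, or null when nothing should change.
// State names follow the C# enumeration used in the example on this page.
function panelForState(stateName, currentPanel) {
    switch (stateName) {
        case "Initializing": return "InitPanel";
        case "Listening": return "ListenPanel";
        case "Thinking": return "ThinkPanel";
        case "Complete":
            // Only move to Complete from the listening or thinking panels.
            return (currentPanel === "ListenPanel" ||
                    currentPanel === "ThinkPanel") ? "CompletePanel" : null;
        default: return null;
    }
}
```

In the page, a state-change handler could compute `panelForState(stateName, uiState)` and, when the result is not null, store the value returned by `setPanel(document, result)` in `uiState`.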

The <AppBarButton> XAML element is only supported in Windows 8.1 and later. If your XAML app will support Windows 8, you must either replace the <AppBarButton> elements with regular Button elements and define your own styles, or do the following additional steps.

To recreate the AppBarButtons

  1. From Solution Explorer, expand the Common folder and open StandardStyles.xaml.

  2. The middle portion of the file consists of <Style> elements which have been commented out. These <Style> elements define standard styles for use in Windows Store applications, and are identified by the x:Key attribute.

    Uncomment the style definitions for MicrophoneAppBarButtonStyle, StopAppBarButtonStyle, and ClosePaneAppBarButtonStyle, and then save and close the file.

  3. In MainPage.xaml, replace the AppBarButton elements with the following Button elements.

    
    <Button x:Name="SpeakButton" Click="SpeakButton_Click" 
            Style="{StaticResource MicrophoneAppBarButtonStyle}" 
            HorizontalAlignment="Center" />
    
    <Button x:Name="StopButton" Click="StopButton_Click"
            Style="{StaticResource StopAppBarButtonStyle}"  
            AutomationProperties.Name="Done"  
            HorizontalAlignment="Center" Margin="0,70, 0, 0" />
    
    <Button x:Name="CancelButton" Visibility="Collapsed" Content="&#xE10A;" 
            Style="{StaticResource ClosePaneAppBarButtonStyle}" 
            AutomationProperties.Name="Cancel" HorizontalAlignment="Center" 
            VerticalAlignment="Bottom" Click="CancelButton_Click" />
    
    

Example

The following code creates a SpeechRecognizer object and handles its events. It shows or hides the different panels to reflect UI state changes, and then displays the final text in ResultText (named FinalResult in the C# code) and alternate text in AlternatesListBox. Replace the ClientId and ClientSecret placeholder values with your own before building the project.

using System;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;
using Bing.Speech;

namespace SpeechCustomUi
{
    public sealed partial class MainPage : Page
    {
        public MainPage()
        {
            this.InitializeComponent();
            this.Loaded += MainPage_Loaded;
        }

        SpeechRecognizer SR;
        private void MainPage_Loaded(object sender, RoutedEventArgs e)
        {
            // Apply credentials from the Windows Azure Data Marketplace.
            var credentials = new SpeechAuthorizationParameters();
            credentials.ClientId = "<YOUR CLIENT ID>";
            credentials.ClientSecret = "<YOUR CLIENT SECRET>";

            // Initialize the speech recognizer.
            SR = new SpeechRecognizer("en-US", credentials);

            // Add speech recognition event handlers.
            SR.AudioCaptureStateChanged += SR_AudioCaptureStateChanged;
            SR.AudioLevelChanged += SR_AudioLevelChanged;
            SR.RecognizerResultReceived += SR_RecognizerResultReceived;
        }

        void SR_RecognizerResultReceived(SpeechRecognizer sender,
            SpeechRecognitionResultReceivedEventArgs args)
        {
            IntermediateResults.Text = args.Text;
        }

        void SR_AudioLevelChanged(SpeechRecognizer sender,
            SpeechRecognitionAudioLevelChangedEventArgs args)
        {
            var v = args.AudioLevel;
            if (v > 0) VolumeMeter.Opacity = v / 50;
            else VolumeMeter.Opacity = Math.Abs((v - 50) / 100);
        }

        void SR_AudioCaptureStateChanged(SpeechRecognizer sender,
            SpeechRecognitionAudioCaptureStateChangedEventArgs args)
        {
            // Show the panel that corresponds to the current state.
            switch (args.State)
            {
                case SpeechRecognizerAudioCaptureState.Complete:
                    if (uiState == "ListenPanel" || uiState == "ThinkPanel")
                    {
                        SetPanel(CompletePanel);  
                    }
                    break;
                case SpeechRecognizerAudioCaptureState.Initializing:
                    SetPanel(InitPanel);
                    break;
                case SpeechRecognizerAudioCaptureState.Listening:
                    SetPanel(ListenPanel);
                    break;
                case SpeechRecognizerAudioCaptureState.Thinking:
                    SetPanel(ThinkPanel);
                    break;
                default:
                    break;
            }
        }

        string uiState = "";
        private void SetPanel(StackPanel panel)
        {
            // Hide all the panels.
            InitPanel.Visibility = Visibility.Collapsed;
            ListenPanel.Visibility = Visibility.Collapsed;
            ThinkPanel.Visibility = Visibility.Collapsed;
            CompletePanel.Visibility = Visibility.Collapsed;
            StartPanel.Visibility = Visibility.Collapsed;

            // Show the selected panel and the cancel button.
            panel.Visibility = Visibility.Visible;
            CancelButton.Visibility = Visibility.Visible;

            uiState = panel.Name;
        }


        private async void SpeakButton_Click(object sender, RoutedEventArgs e)
        {
            // Always use a try block because RecognizeSpeechToTextAsync
            // depends on a web service.
            try
            {
                // Start speech recognition.
                var result = await SR.RecognizeSpeechToTextAsync();

                // Display the text.
                FinalResult.Text = result.Text;

                // Show the TextConfidence.
                ShowConfidence(result.TextConfidence);

                // Fill a string array with the alternate results.
                var alternates = result.GetAlternates(5);
                if (alternates.Count > 1)
                {
                    // Size the array for the items after index 0 so that
                    // no slot is left null.
                    string[] s = new string[alternates.Count - 1];
                    for (int i = 1; i < alternates.Count; i++)
                    {
                        s[i - 1] = alternates[i].Text;
                    }

                    // Populate the alternates ListBox with the array.
                    AlternatesListBox.ItemsSource = s;
                    AlternatesTitle.Visibility = Visibility.Visible;
                }
                else
                {
                    AlternatesTitle.Visibility = Visibility.Collapsed;
                }

                //AlternatesListBox.ItemsSource = result.GetAlternates(5);
            }
            catch (Exception ex)
            {
                // Show any exception other than a cancellation in the
                // Complete panel.
                if (!(ex is OperationCanceledException))
                {
                    FinalResult.Text = string.Format("{0}: {1}",
                                ex.GetType().ToString(), ex.Message);
                    SetPanel(CompletePanel); 
                }
            }
        }

        private void ShowConfidence(SpeechRecognitionConfidence confidence)
        {
            switch (confidence)
            {
                case SpeechRecognitionConfidence.High:
                    ConfidenceText.Text = "I am almost sure you said:";
                    break;
                case SpeechRecognitionConfidence.Medium:
                    ConfidenceText.Text = "I think you said:";
                    break;
                case SpeechRecognitionConfidence.Low:
                    ConfidenceText.Text = "I think you might have said:";
                    break;
                case SpeechRecognitionConfidence.Rejected:
                    ConfidenceText.Text = "I'm sorry, I couldn't understand you."
                    + " Please click the Cancel button and try again.";
                    break;
            }
        }

        private void CancelButton_Click(object sender, RoutedEventArgs e)
        {
            // Cancel the current speech session and return to start.
            SR.RequestCancelOperation();
            SetPanel(StartPanel);
            CancelButton.Visibility = Visibility.Collapsed;
        }

        private void StopButton_Click(object sender, RoutedEventArgs e)
        {
            // Stop listening and move to Thinking state.
            SR.StopListeningAndProcessAudio();
        }

        private void AlternatesListBox_SelectionChanged(object sender, 
            SelectionChangedEventArgs e)
        {
            // Check in case the ListBox is still empty.
            if (null != AlternatesListBox.SelectedItem)
            {
                // Put the selected text in FinalResult and clear ConfidenceText.
                FinalResult.Text = AlternatesListBox.SelectedItem.ToString();
                ConfidenceText.Text = ""; 
            }
        }
    }
}
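
The alternates loop in SpeakButton_Click starts at index 1 and so skips the first alternate. For the HTML version of the sample, the same step could be sketched in JavaScript as follows. The function name alternateTexts is hypothetical, and the shape of the input (an array of objects with a `text` property) is an assumption made so the sketch runs without Bing.Speech. Pushing onto a growing list avoids leaving an empty first slot.

```javascript
// Sketch: collect the text of every alternate after the first.
// `alternates` is assumed to be an array of objects with a `text` property.
function alternateTexts(alternates) {
    var texts = [];
    for (var i = 1; i < alternates.length; i++) {
        texts.push(alternates[i].text);
    }
    return texts;
}
```

The returned array could then be used to add one option per entry to the AlternatesListBox select element, and an empty result signals that the alternates area can stay hidden.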

Requirements

Minimum Supported Client: Windows 8

Required Extensions: Bing.Speech

Namespace: Bing.Speech
