SpeechRecognizer class


The SpeechRecognizer class provides the means to start, stop, and monitor speech recognition in an application. You can bind it to a SpeechRecognizerUx control or create a custom UI to expose its methods and events.

public sealed class SpeechRecognizer : IDisposable, ISpeechRecognizerStateControl

The SpeechRecognizer class has the following members.

Constructors

SpeechRecognizer(string, SpeechAuthorizationParameters)
    Initializes a new instance of the SpeechRecognizer class.

Methods

RecognizeSpeechToTextAsync()
    Starts a speech recognition session, which captures and interprets user speech, and then returns the results as a SpeechRecognitionResult object.

StopListeningAndProcessAudio()
    Interrupts the current audio capture and starts analysis of the captured audio data.

RequestCancelOperation()
    Interrupts speech recognition and returns control to the caller. This method can be called at any point in the speech recognition process.

Dispose()
    Removes the current SpeechRecognizer instance and all speech data artifacts from memory.
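As a rough illustration of how the start, stop, and cancel methods relate, the following sketch wraps a recognizer-like object in a small session helper. The names createSpeechSession and createFakeRecognizer are hypothetical and not part of Bing.Speech. The fake stands in for a real SpeechRecognizer so the sketch can run outside a Windows Store app, and the method names use the JavaScript casing shown in the example later on this page.

```javascript
// Wraps a recognizer-like object. The recognizer is passed in so the sketch
// can run against a fake; in an app it would be a SpeechRecognizer instance.
function createSpeechSession(recognizer) {
    var cancelled = false;
    return {
        // Starts a session. The callback yields the recognized text, or null
        // when the session was cancelled or no usable text came back.
        start: function () {
            cancelled = false;
            return recognizer.recognizeSpeechToTextAsync().then(function (result) {
                if (cancelled || typeof result.text !== "string") {
                    return null;
                }
                return result.text;
            });
        },
        // Ends audio capture early and moves on to analysis.
        stop: function () {
            recognizer.stopListeningAndProcessAudio();
        },
        // Abandons the session; this can be requested at any point.
        cancel: function () {
            cancelled = true;
            recognizer.requestCancelOperation();
        },
        isCancelled: function () {
            return cancelled;
        }
    };
}

// Fake recognizer for illustration only (an assumption, not a Bing.Speech type).
// Its then() runs the callback immediately and returns the callback's value,
// which a real promise would not do.
function createFakeRecognizer(text) {
    var calls = [];
    return {
        calls: calls,
        recognizeSpeechToTextAsync: function () {
            calls.push("recognizeSpeechToTextAsync");
            return { then: function (onResult) { return onResult({ text: text }); } };
        },
        stopListeningAndProcessAudio: function () {
            calls.push("stopListeningAndProcessAudio");
        },
        requestCancelOperation: function () {
            calls.push("requestCancelOperation");
        }
    };
}
```

Keeping the cancelled flag inside the helper, rather than in a global variable, means a late result from a cancelled session is ignored without extra checks in each event handler. How the real recognizer completes a cancelled operation is not described here, so an application should still supply an error callback.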

Events

AudioCaptureStateChanged
    Raised when the current speech recognition session moves from one state to another.

AudioLevelChanged
    Raised when the user changes their speaking volume. Use the SpeechRecognitionAudioLevelChangedEventArgs object associated with this event to get the current audio level.

RecognizerResultReceived
    Raised when the SpeechRecognizer identifies a possible interpretation of user speech.
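A custom UI usually reacts to AudioCaptureStateChanged by showing one panel per state. The following sketch isolates that decision in a pure function so it can be checked without a recognizer. The state names are written as plain strings purely for illustration, and the panel ids match the example on this page. The function panelForState and the lookup table are hypothetical helpers, and a real handler would compare against the capture state enumeration values instead of strings.

```javascript
// Hypothetical state names, used as plain strings so the sketch runs anywhere.
// In an app, compare against the capture state enumeration values instead.
var PANEL_FOR_STATE = {
    initializing: "InitPanel",
    listening: "ListenPanel",
    thinking: "ThinkPanel",
    complete: "CompletePanel"
};

// Returns the id of the panel to show for a capture state, or null when the
// UI should be left alone (unknown state, or a session the user cancelled).
function panelForState(state, cancelled) {
    if (cancelled && state === "complete") {
        return null;
    }
    return PANEL_FOR_STATE.hasOwnProperty(state) ? PANEL_FOR_STATE[state] : null;
}
```

An event handler could then call panelForState with the current state and the cancel flag, and change the visible panel only when the return value is not null.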

Example

The following code sample creates a complete custom speech UI, using a series of StackPanel objects (XAML) or DIVs (HTML) that are made visible at different times to reflect different phases of the speech recognition process. Before adding the code to your application, you must complete the steps described in How to: Enable a project for the Bing Speech Recognition Control.

<Page
    x:Class="SpeechCustomUi.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:local="using:SpeechCustomUi"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d">

    <!--This application demonstrates a complete custom speech recognition UI-->
    <Grid Background="{StaticResource ApplicationPageBackgroundThemeBrush}">
        <Grid.ColumnDefinitions>
            <ColumnDefinition Width="632*"/>
            <ColumnDefinition Width="51*"/>
        </Grid.ColumnDefinitions>

        <!--Panel to show at application start and after cancel.-->
        <StackPanel x:Name="StartPanel" Visibility="Visible">
            <TextBlock Text="Click the microphone and get ready to say something" 
                       FontSize="50" HorizontalAlignment="Center" 
                       VerticalAlignment="Center" />
            <!--Starts speech recognition, but may not be ready immediately. -->
            <AppBarButton x:Name="SpeakButton" Icon="Microphone" Click="SpeakButton_Click"></AppBarButton>
        </StackPanel>

        <!--Panel to show while initializing the SpeechRecognizer.
            This panel may not be seen if initialization happens quickly.-->
        <StackPanel x:Name="InitPanel" Visibility="Collapsed" >
            <TextBlock Text="Ready, set..." FontSize="50" 
                       HorizontalAlignment="Center" VerticalAlignment="Center" />
        </StackPanel>

        <!--Panel to show while listening for user speech.-->        
        <StackPanel x:Name="ListenPanel" Visibility="Collapsed" >
            <TextBlock Text="Speak!" FontSize="80" 
                       HorizontalAlignment="Center" />
            <!--Shows at different opacity levels depending on speech volume.-->            
            <TextBlock x:Name="VolumeMeter" Text="Volume" FontSize="60" 
                       HorizontalAlignment="Center" Margin="0,80,0,0" />
            <!--Click when done speaking, or wait for app to recognize end of 
                speech.-->
            <AppBarButton x:Name="StopButton" Icon="Stop" 
                          HorizontalAlignment="Center" Margin="0,70, 0, 0" 
                          Click="StopButton_Click">Done
            </AppBarButton>
        </StackPanel>

        <!--Panel to show while interpreting speech input.-->        
        <StackPanel x:Name="ThinkPanel" Visibility="Collapsed" >
            <TextBlock Text="Thinking..." FontSize="60" 
                       HorizontalAlignment="Center" />
            <TextBlock Text="You might have said:" FontSize="40" 
                       HorizontalAlignment="Center" Margin="0,50,0,0" />
            <!--Shows possible text before deciding on final interpretation.
                May flash too quickly to see for easy phrases.-->
            <TextBlock x:Name="IntermediateResults" FontSize="40" 
                       HorizontalAlignment="Center" Margin="0,30,0,0" />
        </StackPanel>

        <!--Panel to show when speech recognition complete.
            May also be shown in case of exceptions.-->
        <StackPanel x:Name="CompletePanel" Visibility="Collapsed" >
            <TextBlock Text="Done." FontSize="60" 
                       HorizontalAlignment="Center" />
            <!--Displays confidence level of final result.-->
            <TextBlock x:Name="ConfidenceText" FontSize="40" 
                       HorizontalAlignment="Center" Margin="0,50,0,0" />
            <!--Displays final result text.-->
            <TextBlock x:Name="FinalResult" FontSize="40" 
                       HorizontalAlignment="Center" Margin="0,30,0,0" />
            <TextBlock x:Name="AlternatesTitle" Text="But you might have said:" 
                       FontSize="40" HorizontalAlignment="Center" 
                       Margin="0,50,0,0" />
            <!--Displays alternate results. Copies selected text to 
                FinalResult.-->
            <ListBox x:Name="AlternatesListBox" HorizontalAlignment="Center" 
                     SelectionChanged="AlternatesListBox_SelectionChanged" />
        </StackPanel>

        <!--Cancel button, to be shown in all states except for application
            start -->
        <AppBarButton x:Name="CancelButton" Icon="Cancel" Click="CancelButton_Click"
                      VerticalAlignment="Bottom" HorizontalAlignment="Center"
                      Visibility="Collapsed">
        </AppBarButton>
    </Grid>
</Page>

The <AppBarButton> XAML element is supported only in Windows 8.1 and later. If your XAML app must support Windows 8, either replace the <AppBarButton> elements with regular Button elements and define your own styles, or complete the following additional steps.

To recreate the AppBarButtons

  1. In Solution Explorer, expand the Common folder and open StandardStyles.xaml.

  2. The middle portion of the file consists of <Style> elements that have been commented out. These <Style> elements define standard styles for use in Windows Store applications, and each is identified by its x:Key attribute.

    Uncomment the style definitions for MicrophoneAppBarButtonStyle, StopAppBarButtonStyle, and ClosePaneAppBarButtonStyle, and then save and close the file.

  3. In MainPage.xaml, replace the AppBarButton elements with the following Button elements.

    
    <Button x:Name="SpeakButton" Click="SpeakButton_Click" 
            Style="{StaticResource MicrophoneAppBarButtonStyle}" 
            HorizontalAlignment="Center" />
    
    <Button x:Name="StopButton" Click="StopButton_Click"
            Style="{StaticResource StopAppBarButtonStyle}"  
            AutomationProperties.Name="Done"  
            HorizontalAlignment="Center" Margin="0,70, 0, 0" />
    
    <Button x:Name="CancelButton" Visibility="Collapsed" Content="&#xE10A;" 
            Style="{StaticResource ClosePaneAppBarButtonStyle}" 
            AutomationProperties.Name="Cancel" HorizontalAlignment="Center" 
            VerticalAlignment="Bottom" Click="CancelButton_Click" />
    
    

Example

The following JavaScript code creates a SpeechRecognizer object and handles its events. It shows or hides the different panels to reflect UI state changes, and then displays the final text in ResultText and any alternate text in AlternatesListBox. Replace the clientId and clientSecret placeholder values with your own credentials before building the project.

var speechRecognizer;
function Body_OnLoad() {
    // Show start panel.
    document.getElementById("StartPanel").style.display = "block";

    // Apply credentials from the Windows Azure Data Marketplace.
    var credentials = new Bing.Speech.SpeechAuthorizationParameters();
    credentials.clientId = "<YOUR CLIENT ID>";
    credentials.clientSecret = "<YOUR CLIENT SECRET>";

    // Initialize the speech recognizer.
    speechRecognizer = new Bing.Speech.SpeechRecognizer("en-US", credentials);

    // Add speech recognition event handlers.
    speechRecognizer.onaudiocapturestatechanged = SpeechRecognizer_AudioCaptureStateChanged;
    speechRecognizer.onaudiolevelchanged = SpeechRecognizer_AudioLevelChanged;
    speechRecognizer.onrecognizerresultreceived = SpeechRecognizer_RecognizerResultReceived;
}

var cancelled;
function SpeechRecognizer_AudioCaptureStateChanged(args) {
    // Show the div that corresponds to the current state.
    // Enumeration members are camel-cased in JavaScript and must be
    // qualified with the Bing.Speech namespace.
    switch (args.state) {
        case Bing.Speech.SpeechRecognizerAudioCaptureState.complete:
            document.getElementById("IntResults").style.display = "none";
            if (!cancelled) SetPanel("CompletePanel");
            break;
        case Bing.Speech.SpeechRecognizerAudioCaptureState.initializing:
            SetPanel("InitPanel");
            break;
        case Bing.Speech.SpeechRecognizerAudioCaptureState.listening:
            SetPanel("ListenPanel");
            break;
        case Bing.Speech.SpeechRecognizerAudioCaptureState.thinking:
            SetPanel("ThinkPanel");
            break;
        default:
            break;
    }
}

function SetPanel(panelId) {
    // Hide all the Panels.
    document.getElementById("InitPanel").style.display = "none";
    document.getElementById("ListenPanel").style.display = "none";
    document.getElementById("ThinkPanel").style.display = "none";
    document.getElementById("CompletePanel").style.display = "none";
    document.getElementById("StartPanel").style.display = "none";

    // Show the selected panel.
    document.getElementById(panelId).style.display = "block";
}

function SpeakButton_Click() {
    // Reset the cancel state.
    cancelled = false;
    document.getElementById("CancelButton").style.display = "block";
    document.getElementById("StopButton").style.display = "block";

    // Clear the alternates list.
    document.getElementById("AlternatesListBox").innerHTML = "";
    document.getElementById("AlternatesArea").style.display = "none";

    // Declare a string to hold the result text.
    var s = "";

    // Start speech recognition.
    speechRecognizer.recognizeSpeechToTextAsync()
            .then(
                // Write the result to the string.
                function (result) {

                    /* result.text should return a string, but if the user speaks too quietly
                    or is unclear, result.text will return an error object, so we have 
                    to catch it here to prevent interruption. */
                    if (typeof result.text === "string") {
                        s = result.text;

                        // Show text confidence.
                        ShowConfidence(result.textConfidence);

                        // If there are alternate results, put them into AlternatesListBox.
                        var alternates = result.getAlternates(5);
                        if (alternates.length > 1) {
                            var listBox = document.getElementById("AlternatesListBox");
                            for (var i = 0; i < alternates.length; i++) {
                                var opt = document.createElement("option");
                                // Use textContent so recognized text is never parsed as markup.
                                opt.textContent = alternates[i].text;
                                listBox.appendChild(opt);
                            }
                            // Reveal the alternates area once, after the list is filled.
                            document.getElementById("AlternatesArea").style.display = "block";
                        }
                    }
                    else {
                        // Handle speech that is too quiet or unclear.
                        s = "I'm sorry. I couldn't understand you.";
                    }
                },
                // If there's another error, write the error number and message to the string.
                function (error) {
                    s = "Error: (" + error.number + ") " + error.message;
                }
            )
        .done(
        // Write the string to ResultText.
        function (result) {
            document.getElementById("ResultText").innerHTML = window.toStaticHTML(s);
        }
    );
}

function AlternatesListBox_SelectionChanged(sender, e) {
    // Set ResultText to display the selected alternate.
    var alts = document.getElementById("AlternatesListBox");
    var item = alts.childNodes[alts.selectedIndex];
    document.getElementById("ResultText").innerText = item.textContent;
    document.getElementById("ConfidenceText").style.display = "none";
}

function CancelButton_Click(sender, e) {
    // Set the cancelled flag and hide the cancel button.
    cancelled = true;
    document.getElementById("CancelButton").style.display = "none";

    // Cancel the current speech session and return to start.
    speechRecognizer.requestCancelOperation();
    SetPanel("StartPanel");
}

function StopButton_Click(sender, e) {
    // Clear the stop button and stop the audio stream.
    document.getElementById("StopButton").style.display = "none";
    speechRecognizer.stopListeningAndProcessAudio();
}

function SpeechRecognizer_AudioLevelChanged(args) {
    // Set the opacity of the volume meter to match the sound coming in.
    var volumeMeter = document.getElementById("VolumeMeter");
    var v = args.audioLevel;
    if (v > 0) volumeMeter.style.opacity = v / 50;
    else volumeMeter.style.opacity = Math.abs((v - 50) / 100);
}

function SpeechRecognizer_RecognizerResultReceived(args) {
    // Write intermediate results to the screen as they come in.
    document.getElementById("IntResults").style.display = "block";
    if (typeof (args.text) == "string") {
        document.getElementById("IntermediateResults").innerText = args.text;
    }
}

function ShowConfidence(confidence) {
    var confidenceText = document.getElementById("ConfidenceText");
    confidenceText.style.display = "block";

    switch (confidence) {
        case Bing.Speech.SpeechRecognitionConfidence.high:
            confidenceText.innerText = "I am almost sure you said:";
            break;
        case Bing.Speech.SpeechRecognitionConfidence.medium:
            confidenceText.innerText = "I think you said:";
            break;
        case Bing.Speech.SpeechRecognitionConfidence.low:
            confidenceText.innerText = "I think you might have said:";
            break;
        case Bing.Speech.SpeechRecognitionConfidence.rejected:
            confidenceText.innerText = "I'm sorry, I couldn't understand you."
            + "\nPlease click the Cancel button and try again.";
            break;
    }
}
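
The SpeechRecognizer_AudioLevelChanged handler above converts the audio level to an opacity using fixed constants. As a possible alternative, the following sketch rescales a level linearly and clamps the result so that it always stays between 0 and 1. The range of audioLevel values is not stated on this page, so minLevel and maxLevel are hypothetical calibration parameters that you would need to determine by observation. The function name levelToOpacity is also an assumption.

```javascript
// Linearly rescales `level` from [minLevel, maxLevel] to an opacity in [0, 1].
// minLevel and maxLevel are hypothetical calibration values, not documented
// properties of the audio level event.
function levelToOpacity(level, minLevel, maxLevel) {
    // Guard against an empty or inverted range.
    if (maxLevel <= minLevel) {
        return 0;
    }
    var ratio = (level - minLevel) / (maxLevel - minLevel);
    // Clamp so that out-of-range levels never produce an invalid opacity.
    return Math.min(1, Math.max(0, ratio));
}
```

A handler could assign the returned value to volumeMeter.style.opacity directly, because the clamping guarantees a valid CSS opacity for any numeric level.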

Requirements

Minimum Supported Client
    Windows 8

Required Extensions
    Bing.Speech

Namespace
    Bing.Speech
