How to manage issues with audio input (HTML)

Learn how to manage problems with speech-recognition accuracy that are caused by the quality of the audio input.

What you need to know

Technologies

Prerequisites

This topic builds on Quickstart: Speech recognition.

To complete this tutorial, have a look through these topics to get familiar with the technologies discussed here:

Instructions

Step 1: Assess audio-input quality

When speech recognition is active, use the RecognitionQualityDegrading event of your speech recognizer (recognitionqualitydegrading in JavaScript) to determine whether an audio issue might be interfering with speech input. The event argument (SpeechRecognitionQualityDegradingEventArgs) provides the problem property, a SpeechRecognitionAudioProblem value that describes the issue detected with the audio input.

Recognition can be affected by too much background noise, a muted microphone, and the volume or speed of the speaker.
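The conditions listed above fall into three groups: the environment (background noise), the microphone (no signal), and the speaker (volume or speed). The following is a minimal sketch of how an app might group problem values that way before deciding how to respond. The helper classifyAudioProblem and the stand-in object fakeAudioProblem are hypothetical and exist only so the sketch runs outside a Windows app. Apart from tooQuiet, which the handler in Step 2 uses, the member names and numeric values are assumptions that should be checked against the SpeechRecognitionAudioProblem reference.

```javascript
// Stand-in for Windows.Media.SpeechRecognition.SpeechRecognitionAudioProblem.
// Member names other than tooQuiet, and all numeric values, are assumptions.
var fakeAudioProblem = {
    none: 0,
    tooNoisy: 1,
    noSignal: 2,
    tooLoud: 3,
    tooQuiet: 4,
    tooFast: 5,
    tooSlow: 6
};

// Hypothetical helper: groups a problem value by its likely source.
// problemEnum is passed in so the real enumeration can be used in an app.
function classifyAudioProblem(problem, problemEnum) {
    switch (problem) {
        case problemEnum.none:
            return "none";
        case problemEnum.tooNoisy:
            return "environment";
        case problemEnum.noSignal:
            return "microphone";
        case problemEnum.tooLoud:
        case problemEnum.tooQuiet:
        case problemEnum.tooFast:
        case problemEnum.tooSlow:
            return "speaker";
        default:
            // Value not covered by this sketch.
            return "unknown";
    }
}
```

In an app, the second argument would be Windows.Media.SpeechRecognition.SpeechRecognitionAudioProblem rather than the stand-in.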

Here, we configure a speech recognizer and start listening for the recognitionqualitydegrading event.

function buttonSpeechRecognizerQualityDegradingClick() {
    // Create an instance of a speech recognizer.
    var speechRecognizer =
      new Windows.Media.SpeechRecognition.SpeechRecognizer();
    // Listen for audio input issues.
    speechRecognizer.addEventListener(
        "recognitionqualitydegrading",
        onRecognitionQualityDegrading,
        false);
    // Compile the default dictation grammar.
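    speechRecognizer.compileConstraintsAsync().then(
      // Success function.
      function (result) {
          // Start recognition, and release the recognizer only after
          // recognition has finished (successfully or not).
          speechRecognizer.recognizeWithUIAsync().done(
              function (recognitionResult) {
                  speechRecognizer.close();
              },
              function (err) {
                  WinJS.log && WinJS.log("Speech recognition failed.");
                  speechRecognizer.close();
              });
      },
      // Error function.
      function (err) {
          WinJS.log && WinJS.log("Constraint compilation failed.");
          speechRecognizer.close();
      });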
}

Step 2: Manage the speech-recognition experience

Use the description provided by the problem property to help the user improve conditions for recognition.

Here, we create a handler for the recognitionqualitydegrading event that checks for a low volume level. We then use a SpeechSynthesizer object to suggest that the user try speaking louder.

function onRecognitionQualityDegrading(args) {
    // If input speech is too quiet, prompt the user to speak louder.
    if (args.problem === Windows.Media.SpeechRecognition.SpeechRecognitionAudioProblem.tooQuiet) {
        // Create an instance of a speech synthesis engine (voice).
        var speechSynthesizer =
            new Windows.Media.SpeechSynthesis.SpeechSynthesizer();

        // Create an object for controlling and playing an audio stream.
        var audio = new Audio();

        // Generate the audio stream from plain text.
        speechSynthesizer.synthesizeTextToStreamAsync("Try speaking louder").done(
            // Success function.
            function (markersStream) {
                // Convert the stream to a Blob.
                // WinRT properties are camel-cased in JavaScript (contentType).
                var blob = MSApp.createBlobFromRandomAccessStream(markersStream.contentType, markersStream);

                // Send the Blob to the audio object.
                audio.src = URL.createObjectURL(blob, { oneTimeOnly: true });

                // Start at beginning.
                markersStream.seek(0);
                audio.play();

                // Release the synthesizer now that synthesis is complete.
                speechSynthesizer.close();
            },
            // Error function.
            function (err) {
                WinJS.log && WinJS.log("Speech synthesis failed.");
                speechSynthesizer.close();
            });
    }
}
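
The handler above responds only to tooQuiet. As a rough sketch of how the same approach might extend to other problems, the following builds a lookup from problem value to a spoken suggestion. The helper createPromptLookup, the stand-in object fakeAudioProblem, and the prompt wording are all hypothetical. Apart from tooQuiet, the member names and numeric values are assumptions that should be checked against the SpeechRecognitionAudioProblem reference.

```javascript
// Stand-in for Windows.Media.SpeechRecognition.SpeechRecognitionAudioProblem.
// Member names other than tooQuiet, and all numeric values, are assumptions.
var fakeAudioProblem = {
    none: 0,
    tooNoisy: 1,
    noSignal: 2,
    tooLoud: 3,
    tooQuiet: 4,
    tooFast: 5,
    tooSlow: 6
};

// Hypothetical helper: returns a function that maps a problem value to a
// suggestion to speak to the user, or null when no prompt applies.
function createPromptLookup(problemEnum) {
    var prompts = {};
    prompts[problemEnum.tooNoisy] = "Try moving to a quieter place";
    prompts[problemEnum.noSignal] = "Check that your microphone is not muted";
    prompts[problemEnum.tooLoud] = "Try speaking more softly";
    prompts[problemEnum.tooQuiet] = "Try speaking louder";
    prompts[problemEnum.tooFast] = "Try speaking more slowly";
    prompts[problemEnum.tooSlow] = "Try speaking a little faster";

    return function (problem) {
        return Object.prototype.hasOwnProperty.call(prompts, problem)
            ? prompts[problem]
            : null;
    };
}

// Build the lookup once; in an app, pass the real enumeration instead.
var promptFor = createPromptLookup(fakeAudioProblem);
```

In the event handler, the result of promptFor(args.problem) could replace the hard-coded "Try speaking louder" string, with synthesis skipped when the result is null.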

Responding to speech interactions

Designers

Speech design guidelines