Creating Speech Recognition Calculators in UCMA 3.0: Introduction (Part 1 of 4)

01/20/2015

Summary: Add speech recognition and speech synthesis to your Microsoft Unified Communications Managed API (UCMA) 3.0 application by incorporating the recognition and synthesis APIs of Microsoft Speech Platform SDK.

Applies to: Microsoft Unified Communications Managed API (UCMA) 3.0 Core SDK | Microsoft Speech Platform SDK

Published: November 2011 | Provided by: Mark Parker, Microsoft | About the Author

Contents

Introduction
Application Description
Part 2
Additional Resources

This article is the first in a four-part series about how to create a calculator that uses speech recognition and speech synthesis.

Introduction

UCMA 3.0 provides two classes, SpeechRecognitionConnector and SpeechSynthesisConnector, that developers can use to incorporate speech recognition and speech synthesis into their applications.

Application Description

This set of articles describes a UCMA 3.0 application that responds to a spoken question from a user in the form of an arithmetic expression, such as “how much is six times nine.” After the application determines the semantic content of the question, it speaks the computed answer to the user. The application can respond to multiple questions, and stops when the user says “exit.”

The speech recognition engine compares the audio of the user’s utterance with the speech recognition grammar. If the user’s speech matches the rules of the grammar, the grammar returns semantic results that correspond to the spoken arithmetic expression. The application evaluates the arithmetic expression, and uses speech synthesis to speak the answer.
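
The evaluation step described above can be illustrated with a minimal sketch. It is written in Python only to show the logic and is not UCMA or Speech Platform code. The dictionary keys ("left", "operator", "right", "command"), the operator names, and the answer_question function are hypothetical; the grammar developed later in this series may label its semantic results differently.

```python
import operator

# Hypothetical operator names; the actual grammar may use different values.
OPERATIONS = {
    "plus": operator.add,
    "minus": operator.sub,
    "times": operator.mul,
    "divided by": operator.truediv,
}


def answer_question(semantics):
    """Return the sentence to synthesize, or None when the user said "exit".

    `semantics` stands in for the semantic results of one recognition;
    its shape is an assumption made for this sketch.
    """
    if semantics.get("command") == "exit":
        return None

    left = semantics["left"]
    right = semantics["right"]
    name = semantics["operator"]

    # Guard the one case that would otherwise raise ZeroDivisionError.
    if name == "divided by" and right == 0:
        return "I cannot divide by zero."

    result = OPERATIONS[name](left, right)

    # Division yields a float; report whole-number answers without ".0".
    if isinstance(result, float) and result.is_integer():
        result = int(result)

    return f"{left} {name} {right} is {result}."


if __name__ == "__main__":
    # "how much is six times nine", after recognition maps words to numbers
    print(answer_question({"left": 6, "operator": "times", "right": 9}))
```

In the application that this series describes, the returned sentence would be passed to speech synthesis, and a None result would end the question-and-answer loop.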

Part 2

Creating Speech Recognition Calculators in UCMA 3.0: Grammar Creation (Part 2 of 4)

Additional Resources

For more information, see the following resources:

About the Author

Mark Parker is a programming writer at Microsoft whose current responsibility is the UCMA SDK documentation. Mark previously worked on the Microsoft Speech Server 2007 documentation.