Note

Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

Microsoft Speech Platform

SPPHRASEELEMENT

SPPHRASEELEMENT contains the information for a spoken word.

<pre IsFakePre="true" xmlns="http://www.w3.org/1999/xhtml"><strong>typedef struct SPPHRASEELEMENT</strong>
<strong>{</strong>
    <strong>ULONG</strong>            <em>ulAudioTimeOffset</em>;
    <strong>ULONG</strong>            <em>ulAudioSizeTime</em>;
    <strong>ULONG</strong>            <em>ulAudioStreamOffset</em>;
    <strong>ULONG</strong>            <em>ulAudioSizeBytes</em>;
    <strong>ULONG</strong>            <em>ulRetainedStreamOffset</em>;
    <strong>ULONG</strong>            <em>ulRetainedSizeBytes</em>;
    <strong>LPCWSTR</strong>          <em>pszDisplayText</em>;
    <strong>LPCWSTR</strong>          <em>pszLexicalForm</em>;
    <strong>const SPPHONEID</strong> *<em>pszPronunciation</em>;
    <strong>BYTE</strong>             <em>bDisplayAttributes</em>;
    <strong>char</strong>             <em>RequiredConfidence</em>;
    <strong>char</strong>             <em>ActualConfidence</em>;
    <strong>BYTE</strong>             <em>Reserved</em>;
    <strong>float</strong>            <em>SREngineConfidence</em>;
<strong>} SPPHRASEELEMENT;</strong></pre>

Members

  • ulAudioTimeOffset
    The starting offset of the element in 100-nanosecond units of time relative to the start of the phrase.
  • ulAudioSizeTime
    The length of the element in 100-nanosecond units of time.
  • ulAudioStreamOffset
    The starting offset of the element in bytes relative to the start of the phrase in the original input stream.
  • ulAudioSizeBytes
    The size of the element in bytes in the original input stream.
  • ulRetainedStreamOffset
    The starting offset of the element in bytes relative to the start of the phrase in the retained audio stream.
  • ulRetainedSizeBytes
    The size of the element in bytes in the retained audio stream.
  • pszDisplayText
    The display text for this element (for example, ",").
  • pszLexicalForm
    The lexical form of this element (for example, "comma" for ",").
  • pszPronunciation
    The pronunciation for this element as a null-terminated array of SPPHONEID.
  • bDisplayAttributes
    A bit field of SPDISPLAYATTRIBUTES flags defining extra display information that the application should honor when displaying this word.
  • RequiredConfidence
    The required confidence for this element (either SP_LOW_CONFIDENCE, SP_NORMAL_CONFIDENCE, or SP_HIGH_CONFIDENCE). A word prefixed with '-' (minus) has a RequiredConfidence of SP_LOW_CONFIDENCE, and a word prefixed with '+' (plus) has a RequiredConfidence of SP_HIGH_CONFIDENCE (for example, "This -is -a +test").
  • ActualConfidence
    The actual confidence for this element (either SP_LOW_CONFIDENCE, SP_NORMAL_CONFIDENCE, or SP_HIGH_CONFIDENCE). This is always at least the RequiredConfidence.
  • Reserved
    Reserved for future use.
  • SREngineConfidence
    The confidence score computed by the speech recognition (SR) engine. The value range is engine dependent. This value can be used to tune an application for a specific engine, but doing so will more than likely make the application behave worse with other engines, so it should be used with care. It is most useful with speaker-independent engines, where a large corpus of recorded usage can be used to optimize the overall accuracy of the application.
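
As a rough illustration of how the time and confidence members might be consumed, the following C++ sketch converts the 100-nanosecond values to milliseconds and filters an element by its ActualConfidence. It is a sketch under stated assumptions, not the SDK's own code: ElementSketch is a hypothetical subset of SPPHRASEELEMENT, the helper functions are invented for this example, and the numeric values of the k...Confidence constants are assumed stand-ins for the SP_LOW_CONFIDENCE, SP_NORMAL_CONFIDENCE, and SP_HIGH_CONFIDENCE macros. The stand-ins only let the code build without the SDK headers.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in confidence levels so the sketch builds without the SDK headers.
// The numeric values are an assumption; real code should use the
// SP_LOW_CONFIDENCE / SP_NORMAL_CONFIDENCE / SP_HIGH_CONFIDENCE macros.
constexpr signed char kLowConfidence = -1;
constexpr signed char kNormalConfidence = 0;
constexpr signed char kHighConfidence = 1;

// Hypothetical subset of SPPHRASEELEMENT with only the members used here.
struct ElementSketch {
    std::uint32_t ulAudioTimeOffset;  // start, 100-ns units from phrase start
    std::uint32_t ulAudioSizeTime;    // length, 100-ns units
    signed char RequiredConfidence;
    signed char ActualConfidence;
};

// One millisecond is 10,000 units of 100 nanoseconds.
constexpr double kTicksPerMillisecond = 10000.0;

double TicksToMilliseconds(std::uint64_t ticks) {
    return static_cast<double>(ticks) / kTicksPerMillisecond;
}

// End of the element relative to the phrase start, in milliseconds.
// The sum is widened to 64 bits so that offset + length cannot wrap.
double ElementEndMs(const ElementSketch& e) {
    const std::uint64_t end =
        static_cast<std::uint64_t>(e.ulAudioTimeOffset) + e.ulAudioSizeTime;
    return TicksToMilliseconds(end);
}

// True when the element was recognized with at least the given level.
bool MeetsConfidence(const ElementSketch& e, signed char minimum) {
    // Documented invariant: ActualConfidence is at least RequiredConfidence.
    assert(e.ActualConfidence >= e.RequiredConfidence);
    return e.ActualConfidence >= minimum;
}

// Human-readable label for a confidence level, for logging.
const char* ConfidenceName(signed char level) {
    switch (level) {
        case kLowConfidence:    return "low";
        case kNormalConfidence: return "normal";
        case kHighConfidence:   return "high";
        default:                return "unknown";
    }
}
```

With the real structure, the same helpers would read the members of each SPPHRASEELEMENT directly. The three-level ActualConfidence is the portable choice for filtering; SREngineConfidence is left out of the sketch because its range depends on the engine.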