Using Speech Dictionaries to Improve Handwriting Recognition Results

 

Stefan Wick
Microsoft Corporation

June 2004

Applies to:
   Microsoft Tablet PC Platform SDK
   Windows XP Tablet PC Edition 2005
   Handwriting recognition
   Custom dictionaries

Summary: This article describes how the handwriting recognizer uses speech dictionaries. In addition, this article illustrates how to programmatically modify the dictionaries in order to improve results for handwriting recognition in your applications. The corresponding sample, written in C++, uses the Microsoft Tablet PC Platform SDK version 1.7, currently in Beta. Some example code is given in C#. (15 printed pages)

Contents

Introduction
Overview
Using the Sample
Managing the User Dictionary
Installing an Application Dictionary
Removing an Application Dictionary
Testing Handwriting Recognition Improvements
Accessing Speech Dictionaries from Managed Code
Conclusions

Introduction

The handwriting recognizers for western languages in Windows XP Tablet PC Edition 2005 are designed to take advantage of dictionaries. They can recognize known words with higher confidence than unknown words, such as e-mail names or abbreviations.

There are three types of dictionaries:

  • System Dictionary—The system dictionary is part of the handwriting recognizer and contains a typical subset of words in the supported languages.
  • User Dictionary—The user dictionary is a customizable dictionary for each user. Users can add words to and remove words from the user dictionary. This dictionary is empty by default.
  • Application Dictionary—The application dictionary is typically installed with an application. The words in such a dictionary are available to all users on the system. By default, no application dictionary is installed.

The handwriting recognizers share the infrastructure for user and application dictionaries with the Microsoft Speech SDK (SAPI 5.1).

Note   Words added to the current user's dictionary or installed as part of an application dictionary will be available in any application that uses handwriting or speech recognition.

Overview

When your application instantiates a RecognizerContext object, the system loads the current user's dictionary and any installed application dictionary. Each word in these dictionaries is stored with an associated language identifier, which is either a locale ID or LANG_INVARIANT. A word stored with LANG_INVARIANT is used by every language recognizer. All entries whose language is supported by the current handwriting recognizer are added to an internal word list, and the recognizer recognizes words on that list with higher accuracy than words not on it. For this reason, you can increase recognition accuracy by adding words that are not in the system dictionary (such as e-mail names or abbreviations) to the current user's dictionary or to an application dictionary.
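The language matching described above can be pictured with a small, portable C++11 sketch. It is only a conceptual model and not the recognizer's actual implementation: the DictionaryEntry type, the buildWordList function, and the idea of passing the supported languages as a set are all assumptions made for illustration. The constant mirrors the value of LANG_INVARIANT (0x7f) from the Windows SDK headers.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Mirrors the value of LANG_INVARIANT in the Windows SDK headers (0x7f).
constexpr std::uint16_t kLangInvariant = 0x7f;

// Hypothetical stand-in for one user or application dictionary entry.
struct DictionaryEntry {
    std::string word;
    std::uint16_t langId;  // a language identifier or kLangInvariant
};

// Collects the words whose language identifier is either invariant or one
// of the languages that the (hypothetical) recognizer supports.
std::vector<std::string> buildWordList(
    const std::vector<DictionaryEntry>& entries,
    const std::set<std::uint16_t>& supportedLangs)
{
    std::vector<std::string> wordList;
    for (const DictionaryEntry& entry : entries) {
        const bool matches = entry.langId == kLangInvariant ||
                             supportedLangs.count(entry.langId) > 0;
        if (matches) {
            wordList.push_back(entry.word);
        }
    }
    return wordList;
}
```

In this model, a recognizer that supports English (US, 0x0409) and English (UK, 0x0809) would pick up an English (UK) entry and a LANG_INVARIANT entry, but would skip an entry stored for German (0x0407).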

The following illustration details the creation of a RecognizerContext object:

Note   The procedure of creating the internal word list for handwriting recognition has been significantly improved in Microsoft Windows® XP Tablet PC Edition 2005. In the previous version the handwriting recognizer only used the current user's dictionary and ignored the language identifier. Furthermore, dramatic performance improvements in this area now allow you to add thousands of words to a dictionary without causing a noticeable delay when creating a RecognizerContext object.

The Tablet PC Input Panel of Windows XP Tablet PC Edition 2005 enables users to modify the current user's dictionary through the correction UI, as shown in the following screenshot. This article focuses on programmatic access to this data.

Using the Sample

In order to compile the sample source code, you must have the following components installed on your computer:

  • Microsoft Visual Studio® .NET 2003.
  • Microsoft Tablet PC Platform SDK version 1.7, currently in Beta.
  • Microsoft Speech SDK 5.1.

The sample application has a user interface that enables you to:

  • View the content of the current user's dictionary.
  • View the content of any installed application dictionary.
  • Filter the content of a selected dictionary by language.
  • Add words to the current user's dictionary.
  • Remove words from the current user's dictionary.
  • Install and remove application dictionaries.
  • Test handwriting recognition results for the current locale and dictionary configuration.

Note   The Add Lexicon and Import Words buttons prompt for a text file containing the words to be added. In the text file, each entry must be on a separate line, and the file must be saved in ANSI encoding.

The following illustration shows the UI for the DictionarySample application.

Important   Tablet PC Input Panel does not add newly installed application dictionaries dynamically. You need to log off from Windows and then log on again to ensure that Input Panel uses any new dictionary.

Managing the User Dictionary

Managing the current user's dictionary can be divided into three main sections:

  • Viewing the content of the user dictionary.
  • Adding words to the user dictionary.
  • Removing words from the user dictionary.

Viewing the Content of the User Dictionary

Use the ISpLexicon::GetWords API in order to retrieve words from user or application dictionaries. The following example code illustrates how to retrieve all words for a given locale ID (specified by currentLCID) from the current user's dictionary. The DictionarySample application uses this code in its CUserDictionarySampleDlg::PopulateDictList method.

#include "sapi.h"
#include "sphelper.h" 
...
CComPtr<ISpLexicon> spLex;
HRESULT hr = S_FALSE;
SPWORDLIST speechWords;
DWORD dwGeneration = 0;
DWORD dwCookie = 0;

if SUCCEEDED(spLex.CoCreateInstance(CLSID_SpLexicon))
{
   while (SUCCEEDED(hr))
   {
      memset(&speechWords, 0, sizeof(speechWords));
      hr = spLex->GetWords(eLEXTYPE_USER,
                           &dwGeneration,
                           &dwCookie,
                           &speechWords);
      if FAILED (hr)
      {
         // handle error for failed call to GetWords
         break;
      }
      for (SPWORD *pword = speechWords.pFirstWord;
           pword != NULL;
           pword = pword->pNextWord)
      {
         if (pword->LangID == currentLCID)
         {
            // add word to UI if it matches the current locale
         }
      }
      CoTaskMemFree(speechWords.pvBuffer);
      if (hr == S_OK) break;  // nothing more to retrieve
   }
}

Adding Words to the User Dictionary

Use the ISpLexicon::AddPronunciation API in order to add words to the current user's dictionary. The following example code illustrates how to add a word (specified by pszWord) for a given locale ID (specified by currentLCID) to the current user's dictionary. Use LANG_INVARIANT instead of a locale ID in order to add the word for all languages. The DictionarySample application uses this code when you click Add Word or Import Words.

#include "sapi.h"
#include "sphelper.h" 
...
CComPtr<ISpLexicon> spLex;
if SUCCEEDED(spLex.CoCreateInstance(CLSID_SpLexicon))
{
   if FAILED(spLex->AddPronunciation(pszWord,
                                     (WORD)currentLCID,
                                     SPPS_Unknown,
                                     NULL))
   {
      // handle error for failed call to AddPronunciation
   }
}
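The (WORD) cast in the call above keeps the low 16 bits of the locale ID, which hold the language identifier; the upper bits carry the sort order. The following portable sketch mirrors what the Windows SDK macros LANGIDFROMLCID, PRIMARYLANGID, and SUBLANGID compute. The function names are hypothetical and chosen for illustration only; in Windows code you would normally use the macros themselves.

```cpp
#include <cstdint>

// Low 16 bits of a locale ID: the language identifier (LANGIDFROMLCID).
constexpr std::uint16_t langIdFromLcid(std::uint32_t lcid)
{
    return static_cast<std::uint16_t>(lcid & 0xFFFFu);
}

// Low 10 bits of a language identifier: the primary language (PRIMARYLANGID).
constexpr std::uint16_t primaryLangId(std::uint16_t langId)
{
    return static_cast<std::uint16_t>(langId & 0x3FFu);
}

// High 6 bits of a language identifier: the sublanguage (SUBLANGID).
constexpr std::uint16_t subLangId(std::uint16_t langId)
{
    return static_cast<std::uint16_t>(langId >> 10);
}
```

For example, the locale ID 0x00010407 (German with phone book sorting) reduces to the language identifier 0x0407, so a word added under that locale is stored for German regardless of the sort order.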

Removing Words from the User Dictionary

Use the ISpLexicon::RemovePronunciation API in order to remove words from the current user's dictionary. The following example code illustrates how to remove a word (specified by pszWord) for a given locale ID (specified by currentLCID) from the current user's dictionary. Use LANG_INVARIANT instead of a locale ID in order to remove the word for all languages.

Note   Internally, LANG_INVARIANT is treated as a regular locale ID. This means that if you add a word as English (UK) and again as LANG_INVARIANT, removing it for LANG_INVARIANT does not remove the entry for English (UK).
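The behavior in this note can be pictured with a small in-memory model in which every (word, language) pair is a separate entry. This is a hypothetical sketch in portable C++11, not the SAPI implementation; the ModelLexicon class and its methods are invented for illustration, and only the constant value 0x7f is taken from the Windows SDK definition of LANG_INVARIANT.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>

constexpr std::uint16_t kLangInvariant = 0x7f;  // same value as LANG_INVARIANT
constexpr std::uint16_t kEnglishUK = 0x0809;    // English (UK)

// Hypothetical model: each (word, language) pair is stored as its own entry.
class ModelLexicon {
public:
    void add(const std::string& word, std::uint16_t langId) {
        entries_.insert(std::make_pair(word, langId));
    }
    // Removes only the entry stored under exactly this language identifier.
    // Returns true if an entry was removed.
    bool remove(const std::string& word, std::uint16_t langId) {
        return entries_.erase(std::make_pair(word, langId)) > 0;
    }
    bool contains(const std::string& word, std::uint16_t langId) const {
        return entries_.count(std::make_pair(word, langId)) > 0;
    }
private:
    std::set<std::pair<std::string, std::uint16_t>> entries_;
};
```

In this model, adding "jdoe" for both kEnglishUK and kLangInvariant and then removing it for kLangInvariant leaves the kEnglishUK entry in place, which matches the behavior the note describes.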

The DictionarySample application uses this code when you click Remove Word or Clear Dictionary.

Note   After removing words from the user dictionary, the Tablet PC Input Panel still keeps those words on its internal wordlist until the user logs off from Windows.

#include "sapi.h"
#include "sphelper.h" 
...
CComPtr<ISpLexicon> spLex;
if SUCCEEDED(spLex.CoCreateInstance(CLSID_SpLexicon))
{
   if FAILED(spLex->RemovePronunciation(pszWord,
                                        (WORD)currentLCID,
                                        SPPS_Unknown,
                                        NULL))
   {
      // handle error for failed call to RemovePronunciation
   }
}

Installing an Application Dictionary

Microsoft Speech SDK (SAPI 5.1) provides two helper functions that make the creation of an application dictionary straightforward: SpCreateNewTokenEx and SpCreateObjectFromToken. The following example code illustrates how these helper functions are used to install an application dictionary and add words to it for a given locale ID (specified by currentLCID). Use LANG_INVARIANT instead of a locale ID in order to add a word for all languages. The DictionarySample application uses this code when you click Add Lexicon.

Note   Once an application dictionary has been installed, it is read-only. To add or remove words, remove the application dictionary, make the necessary changes, and reinstall the application dictionary.

#include "sapi.h"
#include "sphelper.h" 
...
CComPtr<ISpLexicon> spLexicon;
CComPtr<ISpObjectToken> spToken;

WCHAR szTempAppLexCategoryId[MAX_PATH];
wcscpy(szTempAppLexCategoryId, SPREG_LOCAL_MACHINE_ROOT);
wcscat(szTempAppLexCategoryId, L"\\AppLexicons");
if SUCCEEDED(SpCreateNewTokenEx(szTempAppLexCategoryId,
                                NULL,
                                &CLSID_SpUnCompressedLexicon,
                                NULL,
                                0,
                                NULL,
                                &spToken,
                                NULL))
{
   if SUCCEEDED(SpCreateObjectFromToken(spToken, &spLexicon))
   {
      // add each word by calling AddPronunciation
      spLexicon->AddPronunciation(pszWord1,
                                  (WORD)currentLCID,
                                  SPPS_Unknown,
                                  NULL);
      spLexicon->AddPronunciation(pszWord2,
                                  (WORD)currentLCID,
                                  SPPS_Unknown,
                                  NULL);
      // ...
   }
}

Removing an Application Dictionary

Passing NULL to ISpObjectToken::Remove removes the application dictionary specified by its respective token object. The following example code illustrates how to remove an application dictionary (specified by index lLexIndex). The sample application uses this code when the user clicks Remove Lexicon.

Note   As long as there is a running application that holds a reference to a RecognizerContext object, no application dictionary can be removed, except if the application dictionary was installed after the RecognizerContext was instantiated. Because of this, you cannot remove any application dictionary that has been added before the last logon, as Tablet PC Input Panel maintains a RecognizerContext object throughout its lifetime. To work around this, stop the TabTip.exe process before removing the application dictionary. Then reboot your system to ensure that the TabTip.exe process gets restarted properly.

#include "sapi.h"
#include "sphelper.h"
...
CComPtr<ISpObjectTokenCategory> spCat;
CComPtr<IEnumSpObjectTokens> spTokens;
CComPtr<ISpObjectToken> spToken;
HRESULT hr;

if SUCCEEDED(spCat.CoCreateInstance(CLSID_SpObjectTokenCategory))
{
   if SUCCEEDED(spCat->SetId(SPCAT_APPLEXICONS, TRUE))
   {
      if SUCCEEDED(spCat->EnumTokens(NULL, NULL, &spTokens))
      {
         // lLexIndex is the zero-based index of the dictionary to remove
         if SUCCEEDED(spTokens->Item((ULONG)lLexIndex, &spToken))
         {
            hr = spToken->Remove(NULL);
            if (SUCCEEDED(hr))
            {
               // dictionary removed successfully
            }
            else if (hr == SPERR_TOKEN_IN_USE)
            {
               // dictionary in use by another application
            }
         }
      }
   }
}

Testing Handwriting Recognition Improvements

There is no direct access to the internal word list that the RecognizerContext object creates based on the dictionaries and the supported languages; however, you can programmatically check whether or not a given instance of the RecognizerContext class references a certain word. When you click String Supported?, the sample application uses the following code:

Note   The IInkRecognizerContext::IsStringSupported method returns true if the word is contained in the system dictionary, user dictionary, or application dictionary for the supported languages; otherwise false.

#include "msinkaut_i.c"
#include "msinkaut.h"
#include "sapi.h"
#include "sphelper.h"
...
CComPtr<IInkRecognizerContext> spRecoCtxt;
CComPtr<IInkRecognizers> spRecos;
CComPtr<IInkRecognizer> spReco;
HRESULT hr;
VARIANT_BOOL varBool;
CComBSTR bstr = L"John123";

// create collection of installed handwriting recognizers
if SUCCEEDED(spRecos.CoCreateInstance(CLSID_InkRecognizers))
{
   // create a recognizer context for the current locale
   if SUCCEEDED(spRecos->GetDefaultRecognizer(currentLCID, &spReco))
   {
      if SUCCEEDED(spReco->CreateRecognizerContext(&spRecoCtxt))
      {
         // check if string is supported
         if SUCCEEDED(spRecoCtxt->IsStringSupported(bstr,
                                                    &varBool))
         {
            if (varBool == VARIANT_TRUE)
               MessageBox(L"Supported");
            else
               MessageBox(L"Not supported");
         }
      }
   }
}

You can verify handwriting recognition results by writing in the Handwriting Test field in the DictionarySample form. This field is an InkPicture control. The recognition alternates (an IInkRecognitionAlternates collection) are displayed in the ListBox control on the bottom right of the form and are listed in order of confidence. The following example code illustrates how to recognize strokes collected by an InkPicture control and how to retrieve the recognition alternates.

CComPtr<IInkDisp> spInk;
CComPtr<IInkStrokes> spStrokes;
CComPtr<IInkRecognitionResult> spRecoResult;
InkRecognitionStatus recoStatus;

// obtain the Ink object from the InkPicture control
if SUCCEEDED(spInkPicture->get_Ink(&spInk))
{
   // obtain the strokes collection from the Ink object
   if SUCCEEDED(spInk->get_Strokes(&spStrokes))
   {
      // assign strokes to the recognizer context and recognize them
      if SUCCEEDED(spRecoCtxt->putref_Strokes(spStrokes))
      {
         if SUCCEEDED(spRecoCtxt->Recognize(&recoStatus, &spRecoResult))
         {
            if (recoStatus == IRS_NoError)
            {
               CComPtr<IInkRecognitionAlternates> spAlts;
               // get the recognition alternates
               if SUCCEEDED(spRecoResult->AlternatesFromSelection(
                  IRAS_Start,
                  IRAS_All,
                  IRAS_DefaultCount,
                  &spAlts))
               {
                  long lAltCount = 0;
                  if SUCCEEDED(spAlts->get_Count(&lAltCount))
                  {
                     for (long l=0; l<lAltCount; l++)
                     {
                        CComPtr<IInkRecognitionAlternate> spAlt;
                        if SUCCEEDED(spAlts->Item(l, &spAlt))
                        {
                           CComBSTR RecoString;
                           if SUCCEEDED(spAlt->get_String(&RecoString))
                           {
                              // display RecoString in your UI
                           }                           
                        }
                     }
                  }
               }
            }
         }
      }
   }
}
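One way to quantify an improvement is to note where the intended word ranks among the alternates before and after you add it to a dictionary. The following portable C++11 helper is a sketch of that idea. It assumes that you have copied each RecoString from the loop above into a std::vector<std::wstring> in the order returned; the function name rankOfExpected is hypothetical and not part of the sample.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the zero-based position of `expected` in `alternates`
// (ordered best first), or -1 if it is not among the alternates.
int rankOfExpected(const std::vector<std::wstring>& alternates,
                   const std::wstring& expected)
{
    for (std::size_t i = 0; i < alternates.size(); ++i) {
        if (alternates[i] == expected) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

A rank that moves toward 0 after the word has been added suggests that the dictionary entry is being used by the recognizer context.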

Accessing Speech Dictionaries from Managed Code

From managed code, you can access the speech dictionaries by calling speech automation APIs through COM Interop. To add the required reference to your project:

  1. In Visual Studio, on the Project menu, click Add Reference.
  2. In the Add Reference dialog, on the COM tab, click Microsoft Speech Object Library, click Select, and then click OK.

Viewing the Contents of the User Dictionary by Using C#

The following C# example code illustrates how to retrieve all words from the current user's dictionary by calling the GetWords method on the SpLexiconClass object.

using SpeechLib;
...
ISpeechLexiconWords splexWords;
SpLexiconClass lex = new SpLexiconClass();
int generationId = 0;
splexWords = lex.GetWords(SpeechLexiconType.SLTUser, out generationId);
foreach(ISpeechLexiconWord splexWord in splexWords)
{
   // display splexWord.Word in your UI
}

Adding Words to the User Dictionary by Using C#

The following C# example code illustrates how to add a word (specified by newWord) for a given locale ID (specified by currentLCID) to the current user's dictionary by calling the AddPronunciation method on the SpLexiconClass object.

using SpeechLib;
...
SpLexiconClass lex = new SpLexiconClass();
lex.AddPronunciation(newWord,
                     currentLCID,
                     SpeechPartOfSpeech.SPSVerb,
                     null);

Removing Words from the User Dictionary by Using C#

The following C# example code illustrates how to remove a word (specified by wordToRemove) for a given locale ID (specified by currentLCID) from the current user's dictionary by calling the RemovePronunciation method on the SpLexiconClass object.

using SpeechLib;
...
SpLexiconClass lex = new SpLexiconClass();
lex.RemovePronunciation(wordToRemove,
                        currentLCID,
                        SpeechPartOfSpeech.SPSVerb,
                        null);      

Conclusions

  • Words that are in the system dictionary, user dictionary, or application dictionary are recognized with higher confidence and accuracy by the handwriting recognizer.
  • Words can be programmatically added to and removed from the current user's dictionary by using Speech APIs.
  • Application dictionaries can be programmatically installed and removed by using Speech APIs.
  • If you want to add words to the dictionary for all users on the system, install an application dictionary. Otherwise, add them to the user's dictionary.
  • If a word applies to all languages, use LANG_INVARIANT when adding the word to the dictionary.
  • Speech automation APIs can be accessed from managed code through COM Interop.