Add a Bilingual Translation Dictionary Service to Microsoft Office System Applications

 

André McQuaid
Richard Bready
Microsoft Corporation

May 2004

Applies to:
    Microsoft® Office 2003 Editions
    Microsoft Office Excel 2003
    Microsoft Office OneNote™ 2003
    Microsoft Office Outlook® 2003
    Microsoft Office PowerPoint® 2003
    Microsoft Office Publisher 2003
    Microsoft Office Visio® 2003
    Microsoft Office Word 2003

Summary: The Research task pane, in many Microsoft Office System applications, translates words and phrases in any of several different language pairs, using the content of bilingual dictionaries. Dictionaries for some language pairs are provided with Microsoft Office System applications. You can add bilingual dictionary content for other language pairs to this task pane by using a downloadable code sample. A link to that code sample appears in this document. You can use that code sample to build additional content, to build a new translation dictionary service based on that added content, and to install that new service for use with Microsoft Office System applications. (23 printed pages)

Download the odc_ofbiltransdict2003.exe file. (215 KB)

Contents

Introduction
Overview
Scenario
Adding a New Translation Dictionary Service
Formatting Content Using XML-equivalent Tags
Interfaces, Helper and Subordinate Objects, and Object Chaining
Building the Service
Creating a Windows Installer Setup Package
Conclusion

Introduction

Business today is international. Business communication and business documents may contain words and phrases from many languages. The Microsoft Office System makes it easy to translate words and phrases from one language to another, without leaving the screen that you are working in. The Research task pane, available in many Office System applications, enables Office users to input a word or phrase in one language, select one of several other languages, and quickly learn the second-language equivalent for that word or phrase.

Best of all, the number of languages available for translation can grow. If you are a publisher, a translation service, or a software company interested in adding bilingual content to Microsoft Office System applications, you can do that yourself. By using the code sample and the instructions in this article, you can create a translation dictionary service that works in the Research task pane in Microsoft Office System applications. Whether for your own use, or as a service that you make available to others, such a translation dictionary service adds your bilingual content to those applications.

You can deploy the working code sample that this article refers to quickly, with little or no customization. (You need to ensure that your bilingual dictionary content is ready for the code to accept.) The code sample uses a simple and easily comprehensible process involving familiar technologies.

Overview

This article, used with the code sample provided in the download, enables you to add a new translation dictionary service to the Research task pane of a Microsoft Office System application.

When you input a word or phrase in one language into the Research task pane, and then specify a second language, the task pane displays a translation of that word or phrase. The translation dictionary service searches the content of bilingual dictionaries in response to the user input, and then it displays dictionary content as the output translation. If an input word is not in the content of a bilingual dictionary used by the Research task pane, this task pane does not show results.

The Research task pane can translate only to languages and from languages for which it has bilingual dictionary content. The Microsoft Office System distributes a number of bilingual dictionaries. You can add a bilingual dictionary or several bilingual dictionaries for use with the Research task pane.

This article explains how you can create a translation dictionary service for use with the Research task pane of Microsoft Office System applications. The process of creating a translation dictionary service includes preparing dictionary content so that it is properly tagged with XML-equivalent tags capable of being processed into Rich Text Format (RTF) by the display text converter. The process also includes the engineering that is necessary to build the content files, and the development effort necessary to build a translation dictionary service that functions in Microsoft Office System applications.

This article is designed as a resource for Solution Providers (SP), Independent Software Vendors (ISV), and developers interested in building translation dictionary services for Microsoft Office System applications. To use this article effectively, you should be familiar with the following:

  • Component Object Model (COM)
  • Microsoft Visual Studio®, Microsoft Visual C++®, or an equivalent development environment

The code sample contains program files that you can use to prepare a functional translation dictionary service. A translation dictionary service using only those files has limited functionality. You can extend the functionality of a translation dictionary service by building on the sample source code provided in the download. This article explains how you can add customized functionality to the translation dictionary service. This article does not tell you how to add any specific functionality beyond that provided in the sample source code.

If you are interested in creating a translation dictionary service for the Translate task pane in Word 2002, this article will be useful to you, but you must download the code sample for Word 2002 from the Translation Dictionaries Content Development Kit for Word 2002.

Note   You can install only one local translation dictionary service for any particular language pair in the Research task pane of a Microsoft Office System application.

You can access more than one online translation dictionary service for a language pair from that task pane. A local, offline version of the translation dictionary service is installed on the user's computer, and content is queried directly in the computer. These local translation dictionary services are the ones described in this document. An online translation dictionary service, on the other hand, queries the content of a registered Web service. These online services are typically created using the Web-service-oriented Research Software Development Kit (SDK).

An English-to-French translation dictionary service is already installed, so if another English-to-French translation dictionary service is then installed, it replaces the functionality offered by the original one, even if the one you wish to add has its own name and a different vocabulary. That is because the task pane uses an ID-based system to recognize installed languages, and it has only a single identifier for each language. Therefore, after you install a Spanish-to-Swahili translation dictionary service in an application's Research task pane, you cannot install a second Spanish-to-Swahili translation dictionary service in that same task pane.
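The replacement behavior described in this note can be pictured as a table with one slot per language pair. The following sketch is an illustration only: the ServiceRegistry class and the language names used as keys are hypothetical, and this article does not document how the task pane actually stores its identifiers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical illustration: one slot per (source, target) language pair.
class ServiceRegistry {
public:
    // Installs a local service. Returns true if a service that was already
    // installed for the same language pair has been replaced.
    bool Install(const std::string& from, const std::string& to,
                 const std::string& serviceName) {
        const auto key = std::make_pair(from, to);
        const bool replaced = services_.count(key) != 0;
        services_[key] = serviceName;
        return replaced;
    }

    // Returns the installed service name, or an empty string if none.
    std::string Find(const std::string& from, const std::string& to) const {
        const auto it = services_.find(std::make_pair(from, to));
        return it == services_.end() ? std::string() : it->second;
    }

private:
    std::map<std::pair<std::string, std::string>, std::string> services_;
};
```

In this model, installing a second English-to-French service overwrites the first entry, whatever name or vocabulary the new service has.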

Scenario

Raman works for a publisher that wants to make its bilingual medical dictionaries available to Microsoft Office System applications users who have paid a fee to obtain this translation service. The publisher cooperates with Office Marketplace to create a listing that promotes this content to customers. Raman and his team prepare the dictionary files by mapping the tags already in those files to the XML-equivalent tags needed for display in the Research task pane. They process copies of the dictionary files to remove the previous tags and add the XML-equivalent tags.

Raman investigates the possibility of adding functionality to the basic translation dictionary service. Word stemming, which removes inflected forms from lookup words, would be a plus, but stemming requires additional content for each language, so the publisher decides to use only the basic functionality. With the code sample available from Office Online Downloads, Raman's team builds the original bilingual dictionary files into a service that enables users of Microsoft Office System to look up medical terms in Swedish and see translations in Hindi, Urdu, or Sinhalese.

Hospitals, physicians, and other medical workers contact the publisher through Office Marketplace and purchase the right to install this service on their computers. Those users are then able to access these bilingual medical dictionaries in the Research task pane of Word 2003, PowerPoint 2003, and other Microsoft Office System applications.

Adding a New Translation Dictionary Service

Overview

A translation dictionary service must be built in order for the service to function correctly. You begin the build process by creating a source file that contains your bilingual dictionary content, properly tagged with the XML-equivalent tags listed in Table 1. You then determine what functions your translation dictionary service provides, and you create any additional program objects necessary to deliver those functions. (The minimal objects required for a fully functioning translation dictionary service are provided in the code sample.) Finally, you use BuildLexRefService.exe, which is in the code sample, to transform the content source file into a set of destination files. During that process, you also build a sequence of program objects to operate on the destination files. You can then install those destination files and program objects in the Research task pane of several Microsoft Office System applications to enable your translation dictionary service to function.

Detailed Steps

Here is a more detailed description of the actions you perform to add a new translation dictionary service:

  1. Produce XML-equivalent tags for the text of the content and the index.
  2. Determine what helper objects are needed for the translation dictionary service, and create any that you do not already have.
  3. Build the service, by creating an initialization file that describes the service, and then running BuildLexRefService.exe. You can download that .exe file by using the link in the Introduction to this article.
  4. Create a Microsoft Windows® Installer setup package that installs and registers the service.

Formatting Content Using XML-equivalent Tags

Overview

The code sample builds a translation dictionary service from a plain-text source file. To use this code sample, you must create a text file that contains your bilingual dictionary content and that is tagged using the XML-equivalent tags shown in Table 1.

Note   If you choose to use an alternative source-file format or destination-file format, you need to write your own code to build those files, and you need to use your own code instead of the code sample, or modify the code sample accordingly. It is possible to create a translation dictionary service by other methods than those described here, but this article and this code sample do not provide assistance in any such effort.

Tagging the Content

The content of a new translation dictionary service should consist of a list of headwords in one language linked to translations in a second language. Each headword has one or more translations. In addition to second-language words that are equivalents for the headwords, the second-language translation content may include other classes of information, such as part-of-speech identifications, inflected forms and other grammatical information, phrases, idioms, samples of usage, context information, etc. The additional information (if any) that is actually included depends on your original bilingual dictionary content.

You must use the XML-equivalent tags listed in Table 1 in content for a new translation dictionary service created with this code sample and this article. The content that you are using to create a new translation dictionary service may start with a different set of tags than the ones listed in the table. Some dictionaries are tagged in detail, on a linguistic basis. Other dictionaries are tagged only to indicate fonts and other information needed to print the content in book form.

Because the range of possible content tagging systems is great and unpredictable, the tags listed below are not actual well-formed XML. They are XML-equivalent tags, as simple and as general as possible. They contain all the information necessary to tag bilingual dictionary content for use in the Research task pane. The only function of these XML-equivalent tags is to govern the display of text in the Rich Text Format (RTF). These tags are only for display purposes, and they are specific to the system of converting text into RTF that is used by the task pane.

These tags are not well-formed XML. There is no Document Type Definition (DTD) or XML schema associated with these tags. You cannot use true XML for the translation dictionary service, because the structure of the potential content is unpredictable and unknown. Because the potential content includes many already published bilingual dictionaries, it is impossible to require that the content conform to the requirements of the application. The translation dictionary service is designed to work with the widest possible range of content, and the XML-equivalent tags are designed as a method of displaying content in the task pane regardless of the content's data structure.

Your dictionary content may use different tags or a different classification system than these XML-equivalent tags. If so, you must convert your content into a version that uses these tags. By following the system for tagging that is shown below, you can map tags or information in your dictionary content to one or more of the XML-equivalent tags. Then you must run a conversion process that inserts these XML-equivalent tags into a version of your dictionary content, removing any other tags.
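The mapping step can be sketched as a simple lookup from your own tag names to the tag names in Table 1. This is a minimal sketch under stated assumptions: the publisher tag names on the left ("entry", "gram", and so on) are hypothetical, and the function handles tag names only, not the tag syntax used in the source file (see Target.txt in the download for that).

```cpp
#include <cassert>
#include <map>
#include <string>

// Maps a publisher's own tag name to one of the Table 1 tag names.
// The publisher tag names used as keys are hypothetical examples.
// Returns an empty string for tags that have no Table 1 equivalent;
// the conversion process should remove those tags.
std::string MapToTable1Tag(const std::string& publisherTag) {
    static const std::map<std::string, std::string> kTagMap = {
        {"entry", "Headword"},
        {"gram",  "Pos"},
        {"usage", "Label"},
        {"trans", "Text"},
        {"xref",  "Headword2"},
    };
    const auto it = kTagMap.find(publisherTag);
    return it == kTagMap.end() ? std::string() : it->second;
}
```

A conversion tool would call such a function for every tag it encounters, write the Table 1 tag when a mapping exists, and drop the tag otherwise.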

If your dictionary contains additional forms of information and classification that cannot be shown by these simple XML-equivalent tags, that additional information is not preserved—as such—in the content of the translation dictionary service. For example, if your dictionary has specific labels for slang, archaic language, and other information about usage, those labels can be tagged only to control their appearance in the RTF display, for example as bold, italic, and so on. The XML-to-RTF converter recognizes only these XML-equivalent tags. (The XML-to-RTF converter transforms the content into RTF, a format used by Microsoft Office System applications for display in the Research task pane.) If your content contains spelled-out pronunciations or any other content in a form that is not supported by Microsoft Office System RTF, you need to remove that content.

For examples of tagged content, see the Target.txt file included in the download, odc_ofbiltransdict2003.exe, which is available from the Microsoft Download Center.

Table 1. XML-equivalent tags for Rich Text Format (RTF) display of bilingual dictionary content

Tag | Required or Optional | Description
Headword | Required | Text following this tag is the first-language word to be looked up and translated. The headword is the target of the lookup process. The text of the headword appears in boldface type at the top of the article.
Num | Optional | Text following this tag is a number, in boldface type, on a new line, indicating one of two or more numbered items in the text of the second-language translation. Do NOT use this tag if a number is only a part of the translation information text. In that case, the number should appear as a lightface text character, without forcing a new line.
NEWLINE | Optional | Text following this tag begins on a new line.
Homograph | Optional | Text following this tag is a first-language headword that has the same spelling as a different first-language headword. (For example, in English, "tear" is a verb meaning to rip, and "tear" is a noun meaning a drop of liquid from the eye, and these are two different words, homographs.) The dictionary service looks up homographs together, and it displays homographs at the same time, but it shows that they are different words. The number of homographs that are displayed is not limited by the service; it depends on the content of your dictionary.
B | Optional | Text following this tag is in boldface.
I | Optional | Text following this tag is in italics. Note: You may combine B and I tags; text following both tags is in both boldface and italics.
Label | Optional | Text following this tag is a label indicating special limits on information in the translation (for example, a meaning for a special subject such as LAW or MEDICINE; or a special usage such as SLANG or OBSOLETE). Text following this tag is in small capital letters.
Pos | Optional | Text following this tag is grammatical part-of-speech information. Text following this tag is in italics.
Text | Optional | Text following this tag is in roman lightface. It is not in boldface. It is not in italics. It is not a number that indicates a numbered item of content.
Super | Optional | Text following this tag is superscript. It is slightly above the level of the text next to it (for example, the "2" in E=mc2 is a superscript number).
Sub | Optional | Text following this tag is subscript. It is slightly below the level of the text next to it (for example, the "2" in H2O is a subscript number).
Scaps | Optional | Text following this tag is all in capital letters that are smaller than the capital letters used in the other text.
Headword2 | Optional | Text following this tag is a cross-reference to a first-language headword. Text following this tag appears in boldface. The cross-reference is not an active link. The lookup process for the translation dictionary service does not recognize text following this tag as a headword for lookup purposes, unless the user enters it in a new separate process.

This tag is used when necessary to indicate a variant spelling or variant form of the lookup word in the first language, when this variant spelling or variant form is shown in the translation information. For example, if you try to translate "theater" in the English-to-French dictionary service, the task pane displays this text:

américanisme voir theatre

Here the boldface word theatre is a headword in English that has a full translation.
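To make the display rules in Table 1 concrete, the following sketch pairs each tag with standard RTF control words (\b, \i, \scaps, \super, \sub, \par) and wraps the tagged text in an RTF group. It is an illustration under stated assumptions, not the converter that ships with Office: the function names are hypothetical, the exact RTF that the real XML-to-RTF converter emits is not documented in this article, and the sketch ignores combined B and I tags, the Homograph tag, and non-ASCII characters.

```cpp
#include <cassert>
#include <map>
#include <string>

// Escapes the three characters that have special meaning in RTF.
std::string EscapeRtf(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '\\' || c == '{' || c == '}') out += '\\';
        out += c;
    }
    return out;
}

// Wraps `text` in an RTF group that carries the formatting Table 1
// describes for `tag`. Tags missing from the map get no formatting.
std::string FormatRun(const std::string& tag, const std::string& text) {
    static const std::map<std::string, std::string> kControls = {
        {"Headword",  "\\b "},      {"Headword2", "\\b "},
        {"Num",       "\\par\\b "}, {"NEWLINE",   "\\par "},
        {"B",         "\\b "},      {"I",         "\\i "},
        {"Pos",       "\\i "},      {"Label",     "\\scaps "},
        {"Scaps",     "\\scaps "},  {"Super",     "\\super "},
        {"Sub",       "\\sub "},    {"Text",      ""},
    };
    const auto it = kControls.find(tag);
    const std::string controls =
        it == kControls.end() ? std::string() : it->second;
    return "{" + controls + EscapeRtf(text) + "}";
}
```

Because each run is enclosed in its own group, the formatting ends with the closing brace and does not leak into the text that follows.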

Interfaces, Helper and Subordinate Objects, and Object Chaining

ILexRefService COM Interface

A single main COM interface named ILexRefService underpins the architecture of the translation dictionary service. This interface is implemented by COM objects that make up a translation dictionary service. The following methods are present on this interface:

Figure 1. Methods present on ILexRefService interface

For translation dictionary service functionality, two of these methods are of greatest importance:

SetAttribute: This method is used to specify translation dictionary service attributes of objects in this service. For example, you can call this method during initialization to specify the location of the lookup dictionary. That specified location is used in subsequent lookup operations. This method is also called during the build process, when the build tool is invoked, to specify the location of the source content and the location of built target content for the build object.

Exchange: This method performs the actual work in the operations of a translation dictionary service. In the lookup operation, this method is called to initiate a lookup and return the translation (if any). In the build operation, this function is called to generate the built target content.

Most of the other functions in the code sample contain only default implementations or minimal implementations. Therefore, you can use them as written, without any modification.

Helper Objects and Subordinate Objects

A translation dictionary service is composed of a number of functional objects that implement the ILexRefService interface. All objects that are required during the operations of a translation dictionary service are called helper objects. The code sample in the download contains only two helper objects: the lookup object, and the build object. The lookup object performs the fundamental act of checking to see if a word is a headword in a translation dictionary. The build object is discussed in the section titled The Build Object--a Special Case. These two helper objects are necessary and sufficient to create a translation dictionary service with the simplest functionality, and so you can choose to use only the code sample, if you wish.

Alternatively, you can create your own helper objects, including your own lookup object and build object. You can also add as many helper objects as you choose to create. Your decision to create additional helper objects depends on the complexity of the content and functionality that you wish to provide to the end user of your translation dictionary service.

A typical example of an additional helper object is a stemmer object. A stemmer reduces input words to their root forms before lookup, and that reduction widens the scope for potential matches. For example, a simple stemmer for English removes "-ed" and "-ing" from input words, so that "walked" and "walking" match "walk." You may wish to add a stemmer helper object to the translation dictionary service. If so, you must create that helper object, which is not in the sample. You may not use any of the stemmer objects supplied with Microsoft Office System, because those supplied stemmer objects are proprietary intellectual property.
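The suffix-stripping idea can be sketched in a few lines. The function below is a hypothetical illustration of the "-ed" and "-ing" rule only. It is not part of the code sample, and it does not implement the ILexRefService interface that a real stemmer helper object requires.

```cpp
#include <cassert>
#include <string>

// Removes a trailing "ing" or "ed" from `word`, if present.
// Words that consist only of the suffix are returned unchanged.
std::string StemEnglish(const std::string& word) {
    static const std::string kSuffixes[] = {"ing", "ed"};
    for (const std::string& suffix : kSuffixes) {
        if (word.size() > suffix.size() &&
            word.compare(word.size() - suffix.size(), suffix.size(),
                         suffix) == 0) {
            return word.substr(0, word.size() - suffix.size());
        }
    }
    return word;
}
```

A rule this simple also shortens words such as "red" and "sing", which is one reason a production stemmer needs additional language-specific content.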

Adding a stemmer or any other extra helper object to your translation dictionary service is a straightforward process. All of the helper objects that you create must implement the ILexRefService COM interface. This interface is described in full in lexrefservice.idl, which is included with the code sample, and lexrefservice.idl can be included as part of any additional helper objects.

You must describe all helper objects in the initialization file as a part of the service's object chain, and you must install and register their .dll files in the setup package described below.

In addition, you can create so-called subordinate objects that extend the functionality of the helper objects for the translation dictionary service. A subordinate object is a piece of functionality called directly by one or more of the helper objects. A helper object may have more than one subordinate object; its subordinate objects do not form part of the object chain (see below for a discussion of object chaining). A typical example of a subordinate object is a word breaker, which uses rules (usually language-specific rules) to search for words contained in longer character strings, thereby increasing the scope for potential matches by the lookup helper object.
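As a rough illustration of what a word breaker does, the following sketch splits a character string into candidate lookup words. The function name and the splitting rule are assumptions made for this example: letters, digits, apostrophes, hyphens, and any non-ASCII byte are treated as part of a word, and everything else is treated as a separator. A real word breaker would apply language-specific rules.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Splits `text` into words. Bytes of 0x80 and above are kept inside
// words so that multi-byte (for example, UTF-8) letters are not split.
std::vector<std::string> BreakWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (c >= 0x80 || std::isalnum(c) || c == '\'' || c == '-') {
            current += static_cast<char>(c);
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```

A lookup helper object could then try each returned word in turn, which increases the chance of a match when the user enters a phrase.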

The code sample does not contain any subordinate objects, or any methods for linking subordinate objects to the helper objects. If you wish to use subordinate objects in your translation dictionary service, you must create them yourself, and link them to the helper objects yourself. The use of subordinate objects is not a requirement of a translation dictionary service. However, subordinate objects can perform useful tasks (for example, word breaking) that are known to add valuable functionality at different points of the service operation. Because any subordinate object must implement ILexRefService, you can use any subordinate object also as a helper object in the main object chain.

Object Chaining

The helper objects that implement the ILexRefService interface are "chained" together to form a functional translation dictionary service. The term "object chaining" is used because these objects are called in sequence during translation dictionary service lookups, and the output of each object is passed on to the next object in the chain. The object chain sequence is initially described for a new translation dictionary service in the service's initialization file, which is the primary input to the build process. The sequence of helper objects in the chain is also specified in the build process. One of the outputs of the build process is a binary .its file, which describes the service, including in that description the object chain sequence.

For example, consider the following object chain: Stemmer TO Lookup TO XMLToRTF. When the translation dictionary service in this example is queried for a word, the Stemmer object is called first to return the root form of that word. It outputs the root form, which is passed to the Lookup object. The Lookup object is called to determine if a translation exists. If a translation does exist, the Lookup object outputs that content and passes it to the last helper object in the chain. The XMLToRTF object is called to transform the translation content, tagged with XML-equivalent tags, into RTF, a format suitable for the client Office System application.

Figure 2. Sample object chain for helper objects implementing the ILexRefService interface

Note   Two of the helper objects in this example are not included in the code sample. The code sample does not contain a stemmer object or the XMLToRTF object. If you wish to include a stemmer object, you must create one. The XMLToRTF object is supplied with Microsoft Office System applications containing the Research task pane.

The following example code illustrates the operation of the service manager for the translation dictionary during a lookup operation:

   For Each Helper Object in the Object Chain
      Call ILexRefService::Exchange( Input Term, Output Term )
         Making use of any subordinate objects if required
      Input Term = Output Term
   Next

"Input Term" is the original text being looked up in the translation dictionary. On completion, "Output Term" contains the translation, if any exists in the content.
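The same loop can be written as portable C++ to show how the output of one helper object becomes the input of the next. This is a simplified sketch, not the service manager itself: the Helper interface, the two-string Exchange signature, and the SuffixStemmer and Lookup classes are stand-ins invented for this example. The real ILexRefService::Exchange signature is defined in lexrefservice.idl and is not reproduced here.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for a helper object in the chain.
struct Helper {
    virtual ~Helper() = default;
    // Returns false when the helper has no output for this input.
    virtual bool Exchange(const std::string& input, std::string& output) = 0;
};

// Strips a trailing "ing" so that inflected input can match a headword.
struct SuffixStemmer : Helper {
    bool Exchange(const std::string& input, std::string& output) override {
        output = input;
        if (input.size() > 3 && input.compare(input.size() - 3, 3, "ing") == 0)
            output = input.substr(0, input.size() - 3);
        return true;
    }
};

// Looks up a headword in an in-memory dictionary.
struct Lookup : Helper {
    std::map<std::string, std::string> entries;
    bool Exchange(const std::string& input, std::string& output) override {
        const auto it = entries.find(input);
        if (it == entries.end()) return false;
        output = it->second;
        return true;
    }
};

// Calls each helper in order, feeding each output to the next helper.
bool RunChain(const std::vector<std::shared_ptr<Helper>>& chain,
              std::string term, std::string& result) {
    for (const auto& helper : chain) {
        std::string output;
        if (!helper->Exchange(term, output)) return false;  // no translation
        term = output;  // Input Term = Output Term
    }
    result = term;
    return true;
}
```

With a chain of SuffixStemmer followed by Lookup, and a dictionary that maps "walk" to "marcher", a query for "walking" yields "marcher", and a query for a word that is not in the dictionary stops the chain without a result.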

The relationship between helper objects and subordinate objects is similar to the relationship between the translation dictionary service and its component helper objects. The important difference between the two sorts of relationship is that two helper objects (Lookup and XMLToRTF) are required for the successful function of a translation dictionary service, but no subordinate objects are required for the successful function of a translation dictionary service. A helper object can have zero or more subordinate objects. The helper object maintains its subordinate object chain (which it can do in any way that functions effectively). The internal manager for the translation dictionary service maintains the helper object chain. Figure 3 illustrates the relationship between a helper object and subordinate objects; dotted lines indicate optional components.

Figure 3. Relationship between a sample helper object and two sample subordinate objects implementing the ILexRefService interface

On service startup, each helper object is notified of its related subordinate objects, if any. This notification occurs through an ILexRefService::SetAttribute call on the helper object, with each subordinate object's IUnknown interface pointer passed as a parameter. The helper object should query that pointer for the ILexRefService interface. A successful call to QueryInterface increments the subordinate object's reference count, which keeps the subordinate object alive for as long as the helper object holds the interface. The helper object should release the interface when it no longer needs it. After that query, the helper object can use the subordinate object's interface as part of its own operation, typically by calling the subordinate object's ILexRefService::Exchange method.

A helper object's subordinate objects (if any) are configured as part of the project configuration (discussed below). A subordinate object's attributes are also configured as part of the project configuration. Any subordinate object attributes are passed to the subordinate object on service startup. A helper object can query subordinate object attributes if ILexRefService::GetAttribute is properly implemented.

The code sample below shows how a helper object is notified of one of its subordinate objects. If a helper object has multiple subordinate objects, multiple SetAttribute calls occur. You can make calls to the subordinate object's GetAttribute method to differentiate between the subordinate objects. For example, the subordinate objects may have different name attributes, which you can query at run time.

const IID IID_ILexRefService = {0x00000002,0x0000,0x0000,{0xB0,0x18,0x41,0xA1,0x4A,0x57,0x36,0xCB}};

STDMETHODIMP CHelperObject::SetAttribute(VARIANT *pvarKey, VARIANT *pvarData)
{
   HRESULT hr = S_OK;
   
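   // Reject null arguments before dereferencing them
   if (NULL == pvarKey || NULL == pvarData)
      return E_POINTER;

   // Are we being notified of a subordinate object?
   // Note VT_LEXREFSLAVEOBJECT is defined in lexrefservice.idl
   if (VT_LEXREFSLAVEOBJECT == pvarKey->vt && VT_UNKNOWN == pvarData->vt)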
   {
      ILexRefService * pLexRefServiceTemp = NULL;
      
      // Query the interface for the known ILexRefService
      // QueryInterface will call AddRef to increment its ref count
      // The object will therefore not be released and destroyed when the fn returns
      if (FAILED(hr = pvarData->punkVal->QueryInterface(IID_ILexRefService, (void **)&pLexRefServiceTemp)))
      {
         // Handle error condition

         return hr;
      }

      // The pLexRefServiceTemp interface can now be cached for future use
      // Usually as part of CHelperObject::Exchange

   }
   
   return hr;
}
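When a helper object has several subordinate objects, it needs some way to tell them apart after each notification. The following portable sketch shows one possible approach, which files each subordinate object under its name attribute. Everything in it is an assumption made for illustration: the Subordinate and HelperSketch types stand in for COM objects, the "name" key is hypothetical, and std::shared_ptr plays the role that AddRef and Release play in COM.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for a subordinate object that exposes attributes.
struct Subordinate {
    std::map<std::string, std::string> attributes;

    // Returns the attribute value, or an empty string if it is not set.
    std::string GetAttribute(const std::string& key) const {
        const auto it = attributes.find(key);
        return it == attributes.end() ? std::string() : it->second;
    }
};

// Simplified stand-in for a helper object that caches its subordinates.
class HelperSketch {
public:
    // Called once per subordinate object at service startup.
    void SetSubordinate(std::shared_ptr<Subordinate> sub) {
        if (!sub) return;
        const std::string name = sub->GetAttribute("name");
        byName_[name] = std::move(sub);  // keeps the subordinate alive
    }

    // Returns the cached subordinate with this name, or nullptr.
    std::shared_ptr<Subordinate> Find(const std::string& name) const {
        const auto it = byName_.find(name);
        if (it == byName_.end()) return nullptr;
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<Subordinate>> byName_;
};
```

The helper object can later retrieve the subordinate it needs by name, for example during its own Exchange processing.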

The Build Object—a Special Case

The build object used as part of the build process is a special case. This object implements the ILexRefService interface, and thus it is a helper object. However, it is not used again after the build process. This object is used in the construction of the target dictionary file. It transforms the source content file into the destination file, and it performs error checking. It is not used in subsequent dictionary lookup operations that call the target file. During the build process, the SetAttribute method is used to pass information from the project initialization file to the build object (such information, for example, as the location of the source data and the location of the target data). During the build process, the Exchange method is called on the build object to verify and process each source term and store it in the built target file.

Typically any updates to the build object occur in these two methods only. The SetAttribute method should cache attributes relevant to each build step. The Exchange method verifies and produces the target lexical data. Most of the source-code updates required for the build object therefore occur in the Exchange method, because attributes are typically only cached in the SetAttribute call for later use as part of the Exchange call.
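The division of labor between the two methods can be sketched as follows. This is a portable illustration, not the build object from the code sample: the BuildSketch class, its string-based method signatures, and the attribute key used in the usage note are hypothetical, and the verification step is reduced to rejecting empty fields.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

class BuildSketch {
public:
    // Caches a build attribute for later use by Exchange
    // (for example, a source or target location).
    void SetAttribute(const std::string& key, const std::string& value) {
        attributes_[key] = value;
    }

    // Returns the cached attribute, or an empty string if it is not set.
    std::string GetAttribute(const std::string& key) const {
        const auto it = attributes_.find(key);
        return it == attributes_.end() ? std::string() : it->second;
    }

    // Verifies one source term and stores it in the in-memory target.
    // Returns false for entries that fail verification.
    bool Exchange(const std::string& headword, const std::string& translation) {
        if (headword.empty() || translation.empty()) return false;
        target_.emplace_back(headword, translation);
        return true;
    }

    std::size_t StoredCount() const { return target_.size(); }

private:
    std::map<std::string, std::string> attributes_;
    std::vector<std::pair<std::string, std::string>> target_;
};
```

In this model, a build tool would call SetAttribute once per setting read from the initialization file (for instance, a hypothetical "source" key), and then call Exchange once for each entry in the source file.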

Building the Service

The first two steps in creating a translation dictionary service with this code sample are:

  1. Create a source file that contains your dictionary's entire content, properly formatted and tagged with the XML-equivalent tags shown in the table above.
  2. Design the object chain for your translation dictionary service to use, and create any objects needed for your object chain (except the lookup object and the build object provided in the code sample, if you choose to use those in your object chain, and except the XMLToRTF object, which is supplied with Microsoft Office System).

Those two steps are described in the previous sections of this article.

The next three steps in creating a translation dictionary service with this code sample are:

  1. Create an initialization file that describes your service's object chain. (The initialization file may also contain any extra attributes needed by your objects.)

  2. Check to be sure that you are building on a computer on which both the Microsoft Office System and the Translation Dictionaries feature are installed.

    If the Translation Dictionaries feature was not installed by default, you can easily install it: In any of the Microsoft Office System applications listed in this article's "Applies to" section, on the Tools menu, point to Language and click Translate. Then perform a translation by using one of the included translation dictionary services. However, if translation options are set to query only the online translation dictionaries, performing a translation using the method described here does not install the offline Translation Dictionaries feature. It may be necessary to go through Office Setup if local, offline translation dictionaries are set to "Not Available".

  3. Run BuildLexRefService.exe in the code sample to create the destination files of your service.

BuildLexRefService.exe requires the following items as input:

  • A source file that contains the dictionary content as text properly tagged with the tags listed in Table 1.

  • A project initialization file describing the translation dictionary service, including in that description all attributes and the object chain information.

  • An empty .its file (this file stores attributes and related information for your translation dictionary service, such as the service's file name and GUID, once the service has been built). The .its file is a binary file that describes a translation dictionary service in a format known to Microsoft Office System applications.

  • The build object dynamic link library (DLL) specified in the project initialization file.

    This DLL needs to be registered before the build process. You can use the regsvr32.exe tool to register the DLL. This utility is normally located in the Windows System32 folder. Run regsvr32 from the command line, passing the full path to the DLL you are registering as its only parameter. For example,

    c:\Windows\System32>regsvr32 "c:\Dev\Sample\BuildObject\ReleaseMinDependency\BuildObject.dll"

Figure 4. Sample build process for destination files for the translation dictionary service

Creating the Initialization File

For the purposes of the build process, the entire translation dictionary service is configured in one project initialization file. In fact, the full path to this initialization file is the only parameter required by the build tool (BuildLexRefService.exe). The format of this project initialization file is that of a standard Windows initialization text file (.ini file) with a number of sections (delimited by using '[' and ']'), where each section can contain zero or more key, value pairs (delimited by using the '=' character). For example, note the following (the text after each dash is an annotation, not part of the file):

[Attributes] – Section

Name = SAMPLE – Key = Value

Description = Sample Translation Dictionary Service – Key = Value
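Because the project initialization file uses the standard .ini layout, a general-purpose .ini reader can inspect it. As a minimal sketch, the fragment above (without its annotations) can be read with Python's standard configparser module. Python is used here only for illustration; it is not part of the code sample, and the build tool does its own parsing.

```python
import configparser

INI_TEXT = """\
[Attributes]
Name = SAMPLE
Description = Sample Translation Dictionary Service
"""

# interpolation=None keeps '%' characters literal, and delimiters=("=",)
# treats only '=' as the key/value separator.
parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
parser.optionxform = str   # keep key names as written instead of lowercasing
parser.read_string(INI_TEXT)

for section in parser.sections():
    print(f"section: {section}")
    for key, value in parser[section].items():
        print(f"  key: {key!r}  value: {value!r}")
```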

A project initialization file contains a number of standard sections, each section containing standard key names:

The [Attributes] Section

The attributes section contains details of the translation dictionary service as a whole, and contains the following keys:

  • Name. The translation dictionary service name
  • Description. A description of the translation dictionary service
  • GUID. A unique identifier for the translation dictionary service in GUID (Globally Unique IDentifier) format
  • Locator. This key specifies the location of the project binary .its file. The build process populates the project binary .its file with all project information specified in the project initialization file. By default, the .its file path in the Locator ends with "!Attributes", which identifies the section in the .its file where this project information is to be stored. The .its file is a binary file in a proprietary format; when the build tool is invoked, the file is accessed and modified through the ILexRefServiceAttribute interface.
  • Loader ProgID. This key specifies the ProgID of the COM class that implements the ILexRefServiceAttribute interface that is used to populate the project binary .its file that is specified in the Locator key above.

The [Object Chains] section

This section as described in this document contains only one key, called "Main", with no value. At least one object chain must be specified, and there must always be a key called "Main," because that name is used to denote the default object chain. Note that a project initialization file can contain more than one object chain. If more than one object chain is used, additional items are contained and named in this section.

The [Object Chain:Main] section

The Object Chain section contains information pertaining to an individual object chain (in this case the object chain called "Main"). The Object Chain section typically contains a key called "Object Sequences" that names the ordered objects that make up the object chain. The build process looks for an object chain called "Main" as the default object chain.

This object chain section also contains two key, value pairs:

;UUID_ULexRefPremiumServiceKeyforOffice_RichEditText

GUID = {127BA6C6-22D4-4B84-996D-FAE36135FA01}

;UUID_ULexRefPremiumServicePrivilegeKey

Privilege = {127BA6C6-22D4-4B84-996D-FAE36135FA00}

Note the use of ';' to denote a comment. These key, value pairs are stored in the .its file as part of the build process. When the client Office System application starts the translation dictionary service, they are read from the service's .its file and passed to the underlying manager for the translation dictionary service. These particular attributes grant the translation dictionary service particular rights; if they are not specified, the service fails to start.

The [Object:AnObject] section

This section contains attributes specific to an individual named object. The "ProgID" key is required for this section; its value is the COM programmatic identifier (ProgID) of an individual class. The "ProgID" key allows the object to be created for use as part of an object chain used in a translation dictionary service. When an individual named object is created, the attributes contained in this section (for example, the location of a file) are passed to the object by using the ILexRefService::SetAttribute COM interface call. The object can cache these attributes, if required, and refer to them when called upon to perform work, typically by a call to its implementation of ILexRefService::Exchange during a request from a client Office System application to the translation dictionary service.

The individual object section can specify a number of additional optional key, value pairs, including:

  • Name. This key names the individual object.
  • Slave Objects. If this object (so named in the code) makes use of one or more subordinate objects, this key specifies those subordinate objects, using an ordered comma-separated list of their names.
  • Build Object. This key specifies the name of the object used to build and verify the lexical data for use as part of the translation dictionary service. This information is specified once as part of the lookup object section. The build process analyzes the project configuration to determine which objects are referenced, and the build object reference in the main lookup object section ensures a valid reference to this object for the purposes of the build.

The [Build Object:BuildObj] section

Similar to the "Object:" section, the [Build Object:BuildObj] section contains a required ProgID key, together with any attributes required by the build object. The build object is invoked only during the one-off build process using BuildLexRefService.exe. The attribute keys in this section specify build-specific attributes for use in the building and verification of the final lexical data used in the translation dictionary service.

Following are the minimum requirements for an initialization file.

  • The file must have an Attributes section and an Object Chains section.
  • The Attributes section must have Name, Description, GUID, Locator, and Loader ProgID attributes.
  • The Object Chains section must have Main declared for default object chain.
  • The Object Chain:name section must exist for every object chain that is declared in the Object Chains section.
  • Every Object Chain:name section must have an Object Sequences attribute consisting of the names of objects delimited by commas.
  • The Object:name section must exist for every object that is declared in the Object Sequences attributes.
  • Any Object:name section may have the Build Object attribute defined, or not defined.
  • The Build Object:name section must exist for any build object defined in any Build Object attribute.
  • Every Object:name section and every Build Object:name section must have the ProgID attribute defined. The ProgID is the COM ProgID of Lexical Reference Service Code.
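The requirements above lend themselves to a mechanical check before the build tool is run. The following is a minimal sketch in Python, under the assumption that the file is plain .ini text without inline annotations. The function names are hypothetical, the check covers only the rules listed above (it does not follow Slave Objects references), and it is not part of the code sample.

```python
import configparser

REQUIRED_ATTRIBUTES = ("Name", "Description", "GUID", "Locator", "Loader ProgID")


def split_names(value):
    """Split a comma-delimited list of object names."""
    return [name.strip() for name in (value or "").split(",") if name.strip()]


def check_project_file(text):
    """Return a list of problems found; an empty list means none were found."""
    ini = configparser.ConfigParser(
        allow_no_value=True,   # "Main" in [Object Chains] has no value
        interpolation=None,
        delimiters=("=",),
    )
    ini.optionxform = str      # keep key names as written
    ini.read_string(text)
    problems = []

    for section in ("Attributes", "Object Chains"):
        if not ini.has_section(section):
            problems.append(f"missing [{section}] section")
    if ini.has_section("Attributes"):
        for key in REQUIRED_ATTRIBUTES:
            if not ini.has_option("Attributes", key):
                problems.append(f"[Attributes] lacks {key}")
    chains = list(ini["Object Chains"]) if ini.has_section("Object Chains") else []
    if "Main" not in chains:
        problems.append("[Object Chains] does not declare Main")

    for chain in chains:
        section = f"Object Chain:{chain}"
        if not ini.has_section(section):
            problems.append(f"missing [{section}] section")
            continue
        objects = split_names(ini.get(section, "Object Sequences", fallback=""))
        if not objects:
            problems.append(f"[{section}] lacks Object Sequences")
        for name in objects:
            obj = f"Object:{name}"
            if not ini.has_section(obj):
                problems.append(f"missing [{obj}] section")
                continue
            if not ini.has_option(obj, "ProgID"):
                problems.append(f"[{obj}] lacks ProgID")
            build = ini.get(obj, "Build Object", fallback=None)
            if build:
                build_section = f"Build Object:{build}"
                if not ini.has_section(build_section):
                    problems.append(f"missing [{build_section}] section")
                elif not ini.has_option(build_section, "ProgID"):
                    problems.append(f"[{build_section}] lacks ProgID")
    return problems


# A deliberately incomplete project file: ObjectA has no ProgID, and the
# build object it names has no section of its own.
SAMPLE = """\
[Attributes]
Name = Sample
Description = Sample service
GUID = {127BA6C6-22D4-4B84-996D-FAE36135FA10}
Locator = C:\\Sample.ITS!Attributes
Loader ProgID = LR.LexRefBilingualServiceAttribute.1.0

[Object Chains]
Main

[Object Chain:Main]
Object Sequences = ObjectA

[Object:ObjectA]
Build Object = BuildObjectA
"""
problems = check_project_file(SAMPLE)
for problem in problems:
    print(problem)
```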

Structure of the Initialization File in a Blank Sample

The blank sample initialization file that is shown below can run through the BuildLexRefService.exe. This blank sample initialization file shows the minimum requirements for valid description of an object chain. This sample file also demonstrates the use of subordinate objects (SubordinateA and SubordinateB).

[Attributes] required
Name = Sample Project File required
Description = This is a Sample Project File required
GUID = {127BA6C6-22D4-4B84-996D-FAE36135FA10} required, must be a unique identifier for every service, and can be created using GUIDGen.exe, a tool available for download at Microsoft.com, or as part of Microsoft development tool suites
Locator = C:\Sample.ITS!Attributes required, all information contained in this project file gets persisted to this location using ILlexicalReferenceServiceAttributeLoader interface through Loader ProgID
Loader ProgID = LR.LexRefBilingualServiceAttribute.1.0 required
 
[Object Chains] required
Main required
 
[Object Chain:Main] required
Object Sequences = ObjectA, ObjectB required
 
[Object:ObjectA] required because of Object Sequences in [Object Chain:Main] section
ProgID = ObjectA ProgID required in Object section
Attribute1 = Setting for Attribute1
Attribute2 = Setting for Attribute2
AttributeN = Setting for AttributeN
 
[Object:ObjectB] required because of Object Sequences in [Object Chain:Main] section
Build Object = BuildObjectB
Name = Object B
ProgID = ObjectB ProgID required in Object section
Slave Objects = SubordinateA, SubordinateB
Attribute1 = Setting for Attribute1
Attribute2 = Setting for Attribute2
AttributeN = Setting for AttributeN
 
[Object:SubordinateA] required because of Slave Objects in [Object:ObjectB] section
ProgID = SubordinateA ProgID required in Object section
Attribute1 = Setting for Attribute1
Attribute2 = Setting for Attribute2
AttributeN = Setting for AttributeN
 
[Object:SubordinateB] required because of Slave Objects in [Object:ObjectB] section
ProgID = SubordinateB ProgID required in Object section
Attribute1 = Setting for Attribute1
Attribute2 = Setting for Attribute2
AttributeN = Setting for AttributeN
 
[Build Object:BuildObjectB] required because of Build Object in [Object:ObjectB] section
ProgID = BuildObjectB ProgID required in Build Object section
Attribute1 = Setting for Attribute1
Attribute2 = Setting for Attribute2
AttributeN = Setting for AttributeN

Structure of the Initialization File in a Populated Sample

The sample initialization file that is shown below was used to create an English-to-French translation dictionary service. The service described by this sample initialization file can accept an English word as input for lookup, and it can return a French translation as output for display. The service described by this sample initialization file has an added helper object—a word stemmer—in its object chain.

[Attributes]
Name = English to French
Description = Test English to French Content Service
GUID = {127BA6C6-22D4-4B84-996D-FAE36135FA10}
Locator = C:\BILINGUALS10\CONTENTBUILD\TESTENFR.ITS!Attributes
Loader ProgID = LR.LexRefBilingualServiceAttribute.1.0

[Object Chains]
Main

[Object Chain:Main]
Object Sequences = EnglishToForeignStemmer, EnglishToForeign, XMLToRTF

;UUID_ULexRefPremiumServiceKeyforOffice_RichEditText
GUID = {127BA6C6-22D4-4B84-996D-FAE36135FA01}

;UUID_ULexRefPremiumServicePrivilegeKey
Privilege = {127BA6C6-22D4-4B84-996D-FAE36135FA00}

[Object:EnglishToForeignStemmer]
ProgID = LR.LexRefEnglishStemmer.1.0
Slave Objects = EnglishToForeign

[Object:EnglishToForeign]
Build Object = BuildEnglishToForeign
Name = English to Foreign Bilinguals
ProgID = LR.LexRefBilingualService.1.0

[Object:XMLToRTF]
ProgID = LR.LexRefSampleObject.1.0

[Build Object:BuildEnglishToForeign]
ProgID = LR.LexRefBilingualBuildService.1.0
MDB File Path = C:\BILINGUALS10\CONTENTBUILD\TEST.MDB
ITS File Path = C:\BILINGUALS10\CONTENTBUILD\TESTENFR.ITS
Project File Path = C:\BILINGUALS10\CONTENTBUILD\TEST.ITP
SQL Query = SELECT field1, field2 FROM Hw_enfr;
Index Name = TESTWW
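To see how the sections of this file refer to one another, the following minimal sketch walks the "Main" object chain and collects, for each object, its ProgID, any subordinate objects, and the ProgID of any build object it names. The text is an abridged copy of the sample above; the function names are hypothetical, and the sketch only illustrates the cross-references, not how Office System applications or the build tool actually read the file.

```python
import configparser

PROJECT = """\
[Object Chains]
Main

[Object Chain:Main]
Object Sequences = EnglishToForeignStemmer, EnglishToForeign, XMLToRTF

[Object:EnglishToForeignStemmer]
ProgID = LR.LexRefEnglishStemmer.1.0
Slave Objects = EnglishToForeign

[Object:EnglishToForeign]
Build Object = BuildEnglishToForeign
Name = English to Foreign Bilinguals
ProgID = LR.LexRefBilingualService.1.0

[Object:XMLToRTF]
ProgID = LR.LexRefSampleObject.1.0

[Build Object:BuildEnglishToForeign]
ProgID = LR.LexRefBilingualBuildService.1.0
"""


def names(value):
    """Split a comma-delimited list of object names."""
    return [part.strip() for part in (value or "").split(",") if part.strip()]


def describe_chain(text, chain="Main"):
    """Return one dictionary per object in the chain, in chain order."""
    ini = configparser.ConfigParser(
        allow_no_value=True, interpolation=None, delimiters=("=",)
    )
    ini.optionxform = str   # keep key names as written
    ini.read_string(text)
    rows = []
    for name in names(ini.get(f"Object Chain:{chain}", "Object Sequences")):
        section = ini[f"Object:{name}"]
        build = section.get("Build Object")
        rows.append({
            "object": name,
            "progid": section["ProgID"],
            "slaves": names(section.get("Slave Objects")),
            "build_progid": ini[f"Build Object:{build}"]["ProgID"] if build else None,
        })
    return rows


rows = describe_chain(PROJECT)
for row in rows:
    print(row)
```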

Using the .its File

An .its file stores certain attributes of your service, such as the service's file name and GUID, once the service is built. An unused .its file, not populated with any stored attributes, is needed as an input file for the build process. When the service is built, this file is automatically altered so that it can be installed with the rest of your service files. An unused, unpopulated .its file with no stored attributes (SAMPLE.its) is provided in the code sample to serve this purpose.

Note   The SAMPLE.its file is not re-usable. Each time that you build or rebuild a service, you must copy this file to the folder from which you are building your service. If you want to rebuild your service or create another service, you must use a new copy of the unpopulated .its file at build time.
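Because each build alters the .its file, it can help to script the step that puts a fresh copy in place. The following is a minimal sketch in Python; the function name and the file and folder names in the demonstration are hypothetical, and the demonstration uses placeholder bytes in a temporary folder rather than a real .its file.

```python
import shutil
import tempfile
from pathlib import Path


def stage_fresh_its(pristine_its, build_folder, target_name):
    """Copy the unpopulated .its file into the build folder under target_name."""
    destination = Path(build_folder) / target_name
    if destination.exists():
        destination.unlink()   # discard the copy altered by an earlier build
    shutil.copyfile(Path(pristine_its), destination)
    return destination


# Demonstration with placeholder content.
with tempfile.TemporaryDirectory() as tmp:
    pristine = Path(tmp) / "SAMPLE.its"
    pristine.write_bytes(b"pristine")
    build = Path(tmp) / "build"
    build.mkdir()
    (build / "TESTENFR.ITS").write_bytes(b"altered by an earlier build")
    staged = stage_fresh_its(pristine, build, "TESTENFR.ITS")
    restored = staged.read_bytes() == pristine.read_bytes()
print(restored)
```

The pristine file is only ever read, so the same unpopulated original can serve every later build.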

The Build Process

Once the project initialization file and accompanying files (source lexical data, empty .its file, and build object DLL) are completed, the translation dictionary service is ready to build. The build process populates the empty binary .its file with all of the service attributes necessary to give this file a form that the client Office System application can understand. The build process also validates the source lexical data and produces the final lexical data for use by the service.

BuildLexRefService.exe is the build tool. In order to build a translation dictionary service, you must copy all of the necessary files (project initialization file, source lexical data, empty .its file, and build object DLL) into a single folder. You must also ensure that the build object DLL is registered before you start the actual build process. Finally, you must open the command prompt and type "BuildLexRefService path" where path is the full path to the initialization file.
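These manual steps can be wrapped in a small script. The following Python sketch checks that the expected files are present in the build folder and then starts the build tool with the full path to the initialization file as its only parameter. The function names are hypothetical, the script assumes that BuildLexRefService.exe can be found on the PATH or in the current folder, and it makes no assumption about the tool's exit code; the console output still needs to be read.

```python
import subprocess
from pathlib import Path


def build_command(ini_path):
    """Return the command line: the tool name plus the full path to the .ini file."""
    return ["BuildLexRefService", str(Path(ini_path).resolve())]


def missing_inputs(folder, required_names):
    """List the required file names that are not present in the build folder."""
    folder = Path(folder)
    return [name for name in required_names if not (folder / name).is_file()]


def run_build(ini_path, required_names):
    """Check the build folder, then run the build tool and return its result."""
    ini_path = Path(ini_path).resolve()
    missing = missing_inputs(ini_path.parent, required_names)
    if missing:
        raise FileNotFoundError(f"missing build inputs: {missing}")
    # The build object DLL must already be registered at this point.
    return subprocess.run(build_command(ini_path), check=False)


print(build_command("TEST.ini"))
```

A call might look like run_build(r"C:\Build\TEST.ini", ["TEST.ini", "TEST.MDB", "TESTENFR.ITS", "BuildObject.dll"]), where every path and file name is a placeholder for your own.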

After you type that command to start the build process, the build tool parses the project initialization file and outputs the results to the command prompt window. You should check this output to make sure that the parsing has not found errors. Errors may cause the build to fail and may cause the .its file to remain unmodified. You need to correct any errors in the project initialization file before you run the build process again.

Next, the build tool creates an instance of the build object, using the ProgID attribute in the [Build Object:x] section. All of the attributes in this section are then passed to the object by the ILexRefService::SetAttribute call. Among those attributes are, for example, the source lexical data file name and the target lexical data file name. The build object should store any attributes required for the build process at this point. The build process is then invoked by an ILexRefService::Exchange call on the build object. The build process next validates the consistency of the source data and produces the final lexical data.

To complete the build process, the project information for the translation dictionary service is stored in the empty .its file. Almost all of the information in the project initialization file is stored in the .its file in a binary format recognizable to Office System applications. The information stored in the .its file includes all attributes, the object chain definition, and all object information. On service start-up (typically when the translation dictionary service is first used), the .its file is read, and this stored information is used to construct the service object chain with component helper objects (and subordinate objects, if any). Attributes are also passed to the appropriate objects at service start-up. The translation dictionary service is then available for use.

Before a newly-installed translation dictionary service can be set up and made available to Microsoft Office System applications, those applications need to locate the .its file that describes the newly installed service. This information is stored in the registry, and it is the responsibility of the installation package for the translation dictionary service to ensure that the registry is populated correctly.

Creating a Windows Installer Setup Package

The final step in creating a translation dictionary service is to construct a Windows Installer-based setup package containing the files and settings for the service.

For more information about Windows Installer, see Windows Installer in the Microsoft Windows Platform SDK. You can use the Orca tool in that SDK to browse the contents of the sample's Windows Installer Database (.msi) file.

Your setup package must install all of the files used by your service, and it must also register all of the .dll files that you installed. Steps to register .dll files are described in the first part of this article's section, Building the Service. Also, you must add the information in the Registry table below to the Registry table of the setup package, and you must add the information in the PublishComponent table below to the PublishComponent table of the setup package.

Note the following information:

  • Some of the GUIDs listed below are distinct, even though they differ by only a single digit.
  • If a GUID is not explicitly provided here, the GUID in the sample .msi file should be used exactly as it is provided in the sample .msi file.
  • {FAD573D7-E564-11D3-8F5D-00C04F9CF4A0} is the main GUID, taken from the Attributes section of the initialization file that is used during building. This GUID must be identical in the initialization file, the relevant setup tables, and the relevant registry entries. If it is not identical in all of these places, the translation dictionary service does not function properly.

Table 2. Registry Table: GUID information used by the translation dictionary service to find the .its file

Key | Name | Value
SOFTWARE\Microsoft\Microsoft Reference\Bilinguals 1.0\{FAD573D7-E564-11D3-8F5D-00C04F9CF4A0} | Loader ProgID | LR.LexRefBilingualServiceAttribute.1.0
SOFTWARE\Microsoft\Microsoft Reference\Bilinguals 1.0\{FAD573D7-E564-11D3-8F5D-00C04F9CF4A0} | (empty) | Sample Translation Dictionary Service
SOFTWARE\Microsoft\Microsoft Reference\Bilinguals 1.0\{FAD573D7-E564-11D3-8F5D-00C04F9CF4A0} | Locator | [INSTALLDIR]sample.its!Attributes

The three rows of the Registry table tell the translation dictionary service where to find the .its file that now contains the content service attributes.

Table 3. PublishComponent Table: GUID and LCID information used to identify a new translation dictionary service

ComponentID | Qualifier | Component_ | AppData | Feature_
{C3C48C3D-37B6-4C96-859A-C84F57D2D108} | 1033/1033 | SAMPLE.FAD573D7_E564_11D3_8F5D_00C04F9CF4AC | {FAD573D7-E564-11D3-8F5D-00C04F9CF4A0} | YOUR_MSI_FEATURE

The PublishComponent table tells Microsoft Office System about the new translation dictionary service. The Component_ column contains the component ID of your setup package. The locale IDs (LCIDs) in the Qualifier column determine which languages show in the drop-down list of available translation dictionary services in the Research task pane. The LCID information shown in the PublishComponent table specifies English (U.S.), twice.

Note   Because the LCID system governs the drop-down list, you can install only one translation dictionary service for any particular language pair in the Research task pane of a particular Microsoft Office System application.

The ComponentID column in the PublishComponent table contains a GUID that refers to the Office System translation dictionary feature. Office System applications containing this feature look under a known key in the registry, which includes this ComponentID, to identify installed translation dictionary services. Those applications use the information under that key to locate the service's .its file. You can find this information for Microsoft Office System applications at the following locations:

HKCR\Installer\Components\D3C84C3C6B7369C458A98CF4752D1D80
HKLM\SOFTWARE\Classes\Installer\Components\D3C84C3C6B7369C458A98CF4752D1D80
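The 32 hexadecimal characters at the end of these keys are the ComponentID from Table 3, rearranged into the packed form in which Windows Installer writes GUIDs to the registry. The following minimal Python sketch reproduces that rearrangement, so that the key can be checked against the ComponentID; the function name is hypothetical, and the sketch is offered only as an aid for verifying the values shown in this article.

```python
def pack_guid(guid):
    """Convert '{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}' to its 32-character packed form."""
    parts = guid.strip("{}").upper().split("-")
    if [len(part) for part in parts] != [8, 4, 4, 4, 12]:
        raise ValueError(f"not a GUID: {guid!r}")
    # The first three groups are reversed character by character.
    head = "".join(part[::-1] for part in parts[:3])
    # In the last two groups, the two characters of each pair are swapped.
    tail_source = parts[3] + parts[4]
    tail = "".join(tail_source[i + 1] + tail_source[i] for i in range(0, 16, 2))
    return head + tail


print(pack_guid("{C3C48C3D-37B6-4C96-859A-C84F57D2D108}"))
# prints D3C84C3C6B7369C458A98CF4752D1D80
```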

For more information, see the sample .msi file. Note, however, that you can use the sample .msi file once only. You cannot use this .msi file as a base for multiple installations of the translation dictionary service, because each individual installation requires unique identifiers in the .msi file used for the installation process.

After you create and install your setup package on any computer that has Microsoft Office System previously installed, your new translation dictionary service appears in the drop-down list in the Research task pane, and users can activate it there. To open that task pane in a Microsoft Office System application, on the Tools menu point to Language, and then click Translate.

Conclusion

The customizable nature of the architecture for the translation dictionary service makes it possible for the owners of bilingual dictionaries or specialized bilingual lexicons to make their content available to users of Microsoft Office System through the Research task pane. It is a straightforward process to prepare bilingual dictionary content for display in this task pane. To do so, tag it with the XML-equivalent tags that govern the Rich Text Format display of the content. After the content is tagged, the downloadable code sample enables content owners to create a basic translation service without the need for any other code. After it is installed on a computer with Microsoft Office System also installed, the translation dictionary service displays the newly built content in several Microsoft Office System applications, listed in this article's "Applies to" section.

Those who wish to do so may add functionality to a translation dictionary service by creating additional helper objects, such as stemmer objects, or by creating subordinate objects such as word breakers. While that additional functionality is not provided in this code sample, this article describes the logic of the method by which other objects can be inserted into a service's object chain.

The increasing amount of business dealing and intellectual contact between people with different native languages makes it ever more important and more valuable to obtain quick translations of words or phrases during the course of work. The Research task pane enables Microsoft Office System users to find translations without leaving the work screen of many applications. Making additional bilingual dictionary content available to those users opens a business opportunity to the owners of that content, while extending the range and agility of the applications. Bilingual dictionaries help increase the productivity and the accuracy of information workers in contact with colleagues, data, and markets around the world.

© Microsoft Corporation. All rights reserved.