Global Appeal

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

Sound+Vision

Access 2000 Internationalization and Localization

By Michael Kaplan and Rebecca Riordan

Whether you're building applications for commercial distribution, or you're an in-house developer for a multi-national company, the time has probably come to consider deployment in multiple languages. Microsoft Office has learned some important lessons from its sales, and although lying is often considered the leading cause of statistics, we'll throw a few true statistics at you to put this in perspective.

Over 60 percent of Office sales are to non-US countries, and between 70 and 100 percent of the people in those countries find that, when given a choice between a localized product and the English product, they use the localized product. Clearly, users are saying they want the product's interface to be in their own language. If it's your intention to "crack" an international market, ignore this request at your own (or your product's) peril.

Office 2000 has taken some big steps forward in this regard. Excel, Word, and Access 2000 now have the same .exe files for all languages, and they simply plug in different language DLLs depending on the language. In fact, Microsoft actually allows companies to buy the "Language Pack" set for Office 2000, which is a set of eight CDs that allows you to switch languages without reinstalling the product each time (very similar to the Windows 2000 Multilanguage UI Pack). Although this product is only available to large corporations, a smaller, openly available product contains the proofing tools (spelling/thesaurus) for all languages (including many languages into which no localization is currently done, such as Catalan, Galician, and Ukrainian).

Access 2000 stores data as Unicode. This is critical for multi-language applications, because it means that a single .mdb can store data in multiple languages. (In Access 97, the Jet engine stored data as ANSI, which meant you could only store data on a single code page.) In fact, Access 2000 uses Uniscribe internally, which allows languages like Arabic, Hebrew, and Thai to be displayed, even if you're on an operating system that doesn't provide a way to type in the data. Uniscribe can also display many of the new "Unicode only" languages supported in Windows 2000: You need Windows 2000 to easily type in languages like Hindi and Georgian, but display of the data in the database is easy on any platform.

Internationalization and Localization

This takes care of the underlying platform on which your application is run. But what about making your application leverage all of this work? Creating an application that can be successfully deployed in multiple languages/locales requires two distinct processes: internationalization and localization. Internationalization, sometimes called "globalization," is the process of ensuring that the database schema, user interface, and underlying code base will support multiple languages. Localization, on the other hand, is the actual translation of the product for a specific locale.

Note that we said "locale" here, not "language." Locale is a broader concept than language, and it includes such issues as the local time, currency, and the conventional display of dates and numbers. For example, a user in Paris and one in Quebec might both be speaking French, but the former writes dates as dd/mm/yyyy and uses a 24-hour clock, and the latter uses the mm/dd/yyyy and am/pm conventions. Fortunately, Windows makes most locale information easily available to applications. All you have to do is ensure that your system can handle all the varieties of format and structure. We'll be looking at that a little later in this article.
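
As a rough illustration of how little code this can take, the sketch below leans on VBA's named formats, which follow the user's regional settings instead of a hard-coded pattern. The procedure name and sample values are ours, made up for this example.

```vba
' Hypothetical demo procedure: prints one date/time and one number
' using named formats that follow the Control Panel regional settings.
Public Sub ShowLocaleFormats()
    Dim dtSample As Date
    dtSample = DateSerial(2000, 7, 4) + TimeSerial(15, 30, 0)

    ' Output differs from machine to machine; that is the point.
    Debug.Print Format$(dtSample, "Short Date")
    Debug.Print Format$(dtSample, "Long Time")
    Debug.Print Format$(1234.5, "Currency")
    Debug.Print Format$(1234.5, "Standard")
End Sub
```

Run it from the Immediate window under two different regional settings to compare the results.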

The internationalization and localization processes correspond roughly to the design and implementation phases of the development process. The rule that "the later in the development process a problem is found, the more expensive it is to fix" is especially true in this case. You don't want to wait until user testing to discover that your way-cool universal error handler needs another parameter in order to support Japanese, a change that requires editing 2,375 distinct procedures.

Sometimes you may decide to delay the localization process until after the base language version has been successfully deployed, or even until the next version of the product. Even so, if there's any chance the application will be deployed in multiple languages, internationalization should be built in from the outset. As a rule, internationalization is the more important of the two processes, as few organizations will be impressed by a localized product that doesn't work properly.

In cases where the localization is done later, the implementation phase is actually broken into two parts: the base language implementation, and the localized implementation. Even if localization will be delayed, however, it's a good idea to consider some of the localization issues now, so that the future move to a localized product is easier. As always in system design, you must be certain that it will be possible to implement the application. So even though we'll be concentrating on internationalization here, we'll look at a few pertinent implementation techniques.

Internationalization can potentially affect all areas of an application, and it must be considered throughout the design process. However, it will most likely affect the database schema and the user interface, so we'll concentrate on these areas.

Database Schema

It's fairly unlikely that internationalization will affect the core logical structure of a data model. Orders still consist of multiple line items, whether they're placed in New York or Tokyo. However, when designing the database schema, you should pay careful attention to the length of the fields.

English is a relatively compact language. You can expect character data to grow by anything from 10 to 100 percent in other languages, depending on the length of the original string and the destination language. As a general rule, the shorter the string, the more it's likely to grow, proportionally, in another language. German is a good test case for this; on average, a German translation of an English string is about 30 percent longer than its English source.
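
To make the arithmetic concrete, here is a minimal sketch of a helper that pads an English length by a growth percentage and rounds up. The function name and the default of 30 percent are our assumptions, taken from the German rule of thumb; pick a larger percentage for short strings.

```vba
' Hypothetical helper: returns a field or label length padded for translation.
' Integer arithmetic is used so the result always rounds up.
Public Function ExpandedLength(ByVal cchEnglish As Long, _
  Optional ByVal lngGrowthPercent As Long = 30) As Long
    ExpandedLength = (cchEnglish * (100 + lngGrowthPercent) + 99) \ 100
End Function
```

For example, ExpandedLength(20) returns 26, and ExpandedLength(7, 100) returns 14.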

Because Jet uses variable length fields for character data, there won't be any performance or space hits, so you can afford to be generous with the defined length of character fields. (Microsoft SQL Server also supports variable length data types, but there is a slight performance hit during edits.)

Numeric data lengths may also need to be adjusted in international applications. A currency amount expressed in Italian Lire will be several orders of magnitude larger than one in US Dollars, for example.

In fact, Access itself has an interesting feature with regard to the "Currency" format of the Currency data type. When a database moves to a different locale, Access replaces the named format called "Currency" with a hard-coded string representing the original format in the language on which the format string was set. This makes sense, as an application with data in it might suffer real problems if $100.00 suddenly meant £100.00. For example, when a database created on a US English locale was taken to a French locale, all occurrences of the format fields that said "Currency" were replaced with "$ # ##0,00;($# ##0,00)" throughout the database. Access remembers that this really was a Currency format on US English, and you can easily see it change to "$ # ##0.00;($# ##0.00)" for German (Germany) and then back to "Currency" in US English. Internally, there must be some property that lets Access know the language on which the "Currency" format was originally set. It's too bad that this "property" isn't something we can set!

This can cause a small problem for international applications in which you really do want the actual Currency format of the current locale, no matter what the language. The workaround is to set the format to Currency explicitly in your code at run time, rather than rely on this property at the table or query level.
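
A minimal sketch of that workaround, placed in a form's module, might look like the following. The control name txtOrderTotal is hypothetical; substitute whichever controls display currency values.

```vba
' Assumes a text box named txtOrderTotal (a made-up name) bound to a
' Currency field. Setting the named format at load time asks for the
' current locale's Currency format instead of a stored literal string.
Private Sub Form_Load()
    Me!txtOrderTotal.Format = "Currency"
End Sub
```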

Another important issue, if your application is intended for most of Europe, is support for the Euro. Access provides two separate means of support. One is a Euro format that works in the same way as the Currency format, except that the Control Panel's currency symbol is replaced with the Euro (€) symbol. The other is the EuroConvert function, which converts between any currency in the European Union and the Euro, using the number of decimal places that the locale requires for accuracy.
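
As a hedged sketch, a call to EuroConvert might look like the following. It assumes the function is available in your installation of Access, and that it takes an amount followed by the ISO codes of the source and target currencies; check Help for the exact argument list in your version. The procedure name is ours.

```vba
' Hypothetical demo: converts 100 Deutsche Marks to Euros.
' "DEM" and "EUR" are ISO currency codes.
Public Sub ShowEuroAmount()
    Dim dblMarks As Double
    dblMarks = 100
    Debug.Print EuroConvert(dblMarks, "DEM", "EUR")
End Sub
```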

If any of the tables in your database schema support financial calculations, such as tax tables for example, you must expect these to change drastically between locales. Sometimes it will be possible to make the database schema sufficiently generic to support multiple locales. An example of this sort of adjustment can be seen in FIGURES 1 and 2. The original structure, shown in FIGURE 1, allows a single "SalesTax" percentage to be applied to the order total. This is appropriate to the United States, but may not work elsewhere.

FIGURE 1: One way to view relationships.

The second structure, shown in FIGURE 2, allows multiple taxes and fees (e.g. duty) to be assigned to an order, and also allows the specific calculation to be determined at run time by calling the function specified in the Calculation field.

FIGURE 2: Adjusting to support multiple locales.
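
One possible way to call the function named in the Calculation field is the Run method of the Access Application object, sketched below. The procedure names and the five percent rate are invented for illustration; the figure doesn't specify how the calculation functions are written.

```vba
' Calls whichever calculation function is named in the Calculation field.
' strCalculation is the text read from that field at run time.
Public Function ApplyCharge(ByVal strCalculation As String, _
  ByVal curOrderTotal As Currency) As Currency
    ApplyCharge = CCur(Application.Run(strCalculation, curOrderTotal))
End Function

' Hypothetical calculation function: a flat 5 percent charge.
Public Function CalcFlatPercent(ByVal curOrderTotal As Currency) As Currency
    CalcFlatPercent = CCur(curOrderTotal * 0.05)
End Function
```

With these in a standard module, ApplyCharge("CalcFlatPercent", 200) returns 10. A locale that needs a tiered duty simply supplies a different function name in its row.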

User Interface

Internationalizing the user interface of your application means that the application must be capable of being translated into multiple languages, and provide comfortable input for users speaking those languages. In practical terms, you must consider two aspects of the user interface: the display of static text and graphics, and the display and input of dynamic text.

Static Text and Graphics

When allowing for the translation of text displayed to users from English to another language, the primary issue is, of course, one of allowing sufficient display space. Unfortunately, the exact amount of space required is impossible to predict. Given that screen real estate is always scarce, this can often require considerable ingenuity in your form layouts.

One problem is leaving sufficient space for the expansion of labels without leaving an unattractive (and potentially confusing) gap between the label and the control. Stacking the label on top of the control, rather than next to it, can help, but at the cost of vertical space, which may require implementing a tab control. Then the length of the tab captions may force the tabs into two rows, which takes more vertical space, which ... you get the idea.

Unfortunately, there's no magic wand for these problems. You can only err on the side of spaciousness, hope for the best, and be prepared to adjust the form layout during the localization phase. As a general rule, the 30 percent heuristic derived from German translations is probably a good minimum, with the understanding that it's a good idea to leave as much extra space as you comfortably can without sacrificing the quality of the user interface.

For Far East languages, Windows 2000 supports Input Method Editors (IMEs) on all language versions, so there are very few special issues with Far East text that don't also exist in other languages. The only major exception to this is font size: The typical eight-point font setting isn't adequate for these languages; try a minimum of nine or 10 points instead. Access 2000 even adds a Vertical property to labels and textboxes for when vertical text is needed.

Localization into bi-directional languages, such as Arabic, Hebrew, and Yiddish, is slightly more complicated, because users who read these languages will expect the user interface to be "flipped" to right-to-left (see FIGURES 3 and 4). The important properties that control what the user will see in the controls themselves are TextAlign and ReadingOrder. In prior versions, the setting of these properties would be ignored if you were not on Hebrew or Arabic Windows, but in Access 2000 they're always useable.

FIGURE 3: A standard UI.

FIGURE 4: A "flipped" UI.

Although it's possible to do complex work to make this sort of "flipped UI" possible, simple code, such as the function shown in FIGURE 5, provides an easy way to allow localization of this sort. It relies on the "Office UI language" and changes its behavior if the UI language is Arabic or Hebrew (for many add-ins to Access, following the UI language of Access makes the most sense).

Public Sub LocalizeForm(frm As Access.Form)

  On Error Resume Next

  Dim ctl As Access.Control
  Dim fBidi As Boolean
  Dim UILang As Long

  UILang = LanguageSettings.LanguageID(msoLanguageIDUI)

  ' Set the BiDi flag for Arabic or Hebrew.
  fBidi = ((UILang = 1025) Or (UILang = 1037))
  If fBidi Then frm.Orientation = 1
  ' CONSIDER: Fix the form's caption, etc.?

  For Each ctl In frm.Controls
    If IsNumeric(ctl.Tag) Then
      Select Case ctl.ControlType
        Case acCommandButton, acToggleButton, _
             acPage, acLabel
          ' Change the caption based on an ID in the tag.
          ctl.Caption = ' Some way to look up strings!
      End Select
    End If
    If fBidi Then
      ' Flip controls, RTL.
      ctl.Left = (frm.Width - ctl.Left - ctl.Width)
      ' Handle TextAlign.
      Select Case ctl.ControlType
        Case acComboBox, acLabel, acListBox, acTextBox
          ctl.TextAlign = 3  ' Right.
      End Select
      Select Case ctl.ControlType
        Case acCheckBox, acComboBox, acCommandButton, _
             acLabel, acListBox, acOptionButton, _
             acTextBox, acToggleButton
          ctl.ReadingOrder = 2  ' rtl.
      End Select
    End If
  Next ctl
End Sub

FIGURE 5: Allowing for localization at the form level.
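
The listing leaves "some way to look up strings" open. One possibility, sketched here under our own assumptions, is a local table of translated captions queried with DLookup. The table tblStrings and its fields StringID, LangID, and StringText are hypothetical names, not part of the original listing.

```vba
' Hypothetical lookup: returns the caption for a string ID in the
' requested language, falling back to US English (1033) if no
' translation exists. Returns "" if the ID isn't found at all.
Public Function LookupString(ByVal lngStringID As Long, _
  ByVal lngLangID As Long) As String
    Dim varText As Variant

    varText = DLookup("[StringText]", "tblStrings", _
      "[StringID] = " & lngStringID & " AND [LangID] = " & lngLangID)
    If IsNull(varText) Then
        varText = DLookup("[StringText]", "tblStrings", _
          "[StringID] = " & lngStringID & " AND [LangID] = 1033")
    End If
    LookupString = Nz(varText, "")
End Function
```

With such a function in place, the placeholder line could read ctl.Caption = LookupString(CLng(ctl.Tag), UILang). A table is only one option; a resource DLL or a text file would serve the same purpose.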

The internationalization process can affect the graphics used in your application, as well as the text. Purely decorative graphics are unlikely to be a problem, but you will need to be careful about any images that are intended to be meaningful to the user, such as toolbar graphics. As a general rule, you should avoid including text in your graphics, because it would need to be translated. You should also avoid images that are only symbolically, rather than directly, meaningful. A sort button with "ABC" on it (such as those seen in some Access wizards!) won't be very helpful to users whose alphabet doesn't contain these letters. Similarly, a football goalpost used to call the "Sales Goals" form is only going to be mnemonically helpful to people familiar with American football. That being said, however, most people are pretty good at associating images with concepts, no matter how tenuous the cultural association might be. So while the goalpost might not be immediately obvious, learning its meaning won't be difficult for most users.

Images that might be laughable, confusing, or actively offensive are a more worrisome issue. Urban legends abound regarding marketing disasters of this type. Some countries actually have laws that can prevent the sale of your product if specific references appear to be made in your software. An example of the latter that many users of early betas of Office 97 saw was the SuperPup Office Assistant making an upward pointing gesture that resembled a Nazi salute. Later beta versions and the shipping version of Office 97 removed this gesture; if it had been left in, the sale of the product in Germany may well have been banned entirely.

Unfortunately, this kind of information can be difficult to come by. The best option is to run the images by someone with personal knowledge of the locales in which your application will be deployed, and to do so early enough in the development process so they can still be changed without causing delivery delays or cost overruns.

Dynamic Text

The capture and display of dynamic text - primarily control contents - is typically less problematic. You must ensure the controls are long enough to accept the values, and that format and validation definitions are either sufficiently generic, or identified as part of the localization process. This is especially true for date formats, where misunderstandings between mm/dd/yyyy and dd/mm/yyyy formats can wreak havoc on an application's data.
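
One place where the two date conventions commonly collide is SQL text built in code: Jet reads a date literal between # signs in US month/day/year order regardless of the regional settings. A minimal sketch of a helper that always produces that form follows; the function name is our own.

```vba
' Hypothetical helper: formats a date as a Jet SQL literal.
' The backslashes force literal # and / characters, so the result
' doesn't change with the user's regional date settings.
Public Function SqlDate(ByVal dt As Date) As String
    SqlDate = Format$(dt, "\#mm\/dd\/yyyy\#")
End Function
```

For example, SqlDate(DateSerial(2000, 7, 4)) returns #07/04/2000# on any locale, which can then be concatenated into a WHERE clause.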

For Far East locales, the various IME properties control the behavior of the control. The IMEHold property determines whether the user will be able to use the IME at all (an example of a case where Windows itself disallows the IME is in the Windows logon password TextBox). Assuming the IME is allowed in a particular control, the IMEMode property controls the default mode in which the IME should be opened, and the IMESentenceMode property determines what type of additional conversion to allow by default. The choices for "Sentence Mode" are Normal, Plural (supports additional dictionaries with name, geographic, and postal data), Speaking (supports conversational language), and No Conversion (characters are settled without extra conversion).

Another interesting property is the FELineBreak property, which helps prevent line breaks that separate punctuation from the text they're delimiting. One important difference between Access 2000 and prior versions is that these properties are always present through code, even if they don't appear in the property sheet. A full discussion of user expectations regarding these properties is beyond the scope of this article, but luckily Access 2000 Help covers all three properties. Again, someone with personal knowledge of the locales should prove invaluable here.

For bi-directional languages, the TextAlign and ReadingOrder properties can be handled with logic, such as that shown in the procedure in FIGURE 5 (the LocalizeForm procedure handles static and dynamic controls). You can set the KeyboardLanguage property if you want to change the keyboard language from the user's current settings to English, Arabic, or Hebrew when the control gets focus. If you don't set the property, then the user can choose which keyboard to use.

In most other circumstances, these issues are fairly straightforward, and, provided they're addressed early enough in the development process, not too difficult to resolve. It's sometimes the case, however, that the order of the controls on a form, or the flow of work between forms, needs to be adjusted for different locales. The most common example of this is an address component. The name, order, and structure of addresses vary between locales. If you need to treat addresses as structured data, there are several alternatives.

You can, of course, ignore the issue completely. A couple of controls being slightly out of their optimum order isn't usually the end of the world. In the Netherlands, where Rebecca lives, for example, postal codes are conventionally placed before the city name. But an envelope with the postal code in the last place will still be delivered (Amazon.com does it all the time).

If the application needs to accommodate users who primarily do data entry from paper forms (and, therefore, are not looking at the screen), an application that has controls in an awkward order goes from being a minor inconvenience to a major pain. If this situation applies to your application, you should give serious consideration to localizing the layout of the controls. Access does this in its Database Wizard by generating forms and reports in either a "city state zip" or a "zip city state" layout, depending on the machine's regional settings. To accomplish this, the wizard calls the GetUserDefaultLCID API function, and then uses the "city state zip" format for all locales listed in the table in FIGURE 6.

Language                  Locale ID
English (US)              &H409
English (UK)              &H809
English (Australia)       &HC09
English (Canada)          &H1009
English (New Zealand)     &H1409
English (Ireland)         &H1809
English (South Africa)    &H1C09
Portuguese (Brazil)       &H416
Portuguese (Portugal)     &H816

FIGURE 6: Languages/locales that prefer a "city state zip" address order.

All other languages default to the (more common) "zip city state" layout. This is by no means a perfect solution (e.g. some countries don't use postal codes, and in the United States the post office prefers the text to be in all capital letters with no punctuation), but it does provide a starting point.
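
The check against the table in FIGURE 6 can be sketched as a simple Select Case. The function name is ours, and this is only a guess at the shape of the logic, not the wizard's actual code. Pass it the value returned by the GetUserDefaultLCID API function.

```vba
' Hypothetical helper: True if the locale ID is one of those listed
' in FIGURE 6 as preferring a "city state zip" address order.
Public Function UsesCityStateZip(ByVal lcid As Long) As Boolean
    Select Case lcid
        Case &H409, &H809, &HC09, &H1009, &H1409, _
             &H1809, &H1C09, &H416, &H816
            UsesCityStateZip = True
        Case Else
            ' Everything else gets "zip city state".
            UsesCityStateZip = False
    End Select
End Function
```

For example, UsesCityStateZip(&H409) is True for English (US), while UsesCityStateZip(&H413), Dutch (Netherlands), is False.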

Other examples of internationally friendly behavior from the Access wizards that you can borrow include:

  • Input masks displayed by the Input Mask wizard are "country based" (see the CountryCode function in the listing in FIGURE 7 for an example of getting this information).
  • The Mailing Label Wizard uses a procedure similar to the FEnglishMeasurements procedure in the listing in FIGURE 7 to determine whether to show metric labels by default.
  • The Database Wizard reports are sized such that the information on the pages will fit whether the default printer's paper size is A4 or 8.5x11.

Const LOCALE_IMEASURE = &HD  ' 0 = metric, 1 = US.
Const LOCALE_ICOUNTRY = &H5  ' Country code.

Const IMEASURE_ENGLISH = 1
Const CTRY_DEFAULT = 0
Const CTRY_AUSTRALIA = 61
Const CTRY_AUSTRIA = 43
Const CTRY_BELGIUM = 32
Const CTRY_BRAZIL = 55
Const CTRY_CANADA = 2
Const CTRY_DENMARK = 45
Const CTRY_FINLAND = 358
Const CTRY_FRANCE = 33
Const CTRY_GERMANY = 49
Const CTRY_ICELAND = 354
Const CTRY_IRELAND = 353
Const CTRY_ITALY = 39
Const CTRY_JAPAN = 81
Const CTRY_MEXICO = 52
Const CTRY_NETHERLANDS = 31
Const CTRY_NEW_ZEALAND = 64
Const CTRY_NORWAY = 47
Const CTRY_PORTUGAL = 351
Const CTRY_PRCHINA = 86
Const CTRY_SOUTH_KOREA = 82
Const CTRY_SPAIN = 34
Const CTRY_SWEDEN = 46
Const CTRY_SWITZERLAND = 41
Const CTRY_TAIWAN = 886
Const CTRY_UNITED_KINGDOM = 44
Const CTRY_UNITED_STATES = 1

' cchData is a 32-bit int in the API, so it is declared As Long.
Declare Function GetLocaleInfo Lib "kernel32" _
  Alias "GetLocaleInfoA" (ByVal lcid As Long, _
  ByVal LCTYPE As Long, lpData As Any, _
  ByVal cchData As Long) As Long

Declare Function GetUserDefaultLCID _
  Lib "kernel32" () As Long

' True if the user's locale uses US (English) measurements.
Function FEnglishMeasurements() As Boolean
  FEnglishMeasurements = (Val(StGetLocaleInfo( _
    LOCALE_IMEASURE)) = IMEASURE_ENGLISH)
End Function

' Returns the country code for the user's locale.
Function CountryCode() As Long
  CountryCode = Val(StGetLocaleInfo(LOCALE_ICOUNTRY))
End Function

' Returns the requested locale setting as a string.
Function StGetLocaleInfo(ByVal LCTYPE As Long) As String
  Dim lcid As Long
  Dim cch As Long
  Dim stBuff As String * 255

  ' Get current language ID.
  lcid = GetUserDefaultLCID()
  ' Ask for the locale info. The count returned in cch
  ' includes the terminating null, so drop it.
  cch = GetLocaleInfo( _
    lcid, LCTYPE, ByVal stBuff, Len(stBuff))
  If cch > 0 Then StGetLocaleInfo = Left$(stBuff, cch - 1)
End Function

FIGURE 7: Using the GetUserDefaultLCID API function.

Conclusion

Today's worldwide marketplace provides interesting challenges, and lucrative opportunities, for the applications you develop. With Windows 2000 and Office 2000 leading the way, users will come to expect a user interface that takes international issues into account. This means you must address these issues, or users may look around for a product that does. In addition to the specific issues and tips discussed here, this article will hopefully inspire a "global" frame of mind and allow you to create applications that will behave well in any locale to which you wish to deploy.

Michael Kaplan is the owner and lead developer of Trigeminal Software, Inc., a consulting firm that focuses on all types of solutions in Microsoft Visual Basic, Access, and SQL Server, especially relating to replication and multinational applications. A former member of the Microsoft Access development team, he has spoken at many conferences and contributed to several publications and books on VB, Access, and SQL Server development. You can reach Michael at michka@trigeminal.com or visit him on the Web at his truly worldwide (localized!) Web site at http://www.trigeminal.com/.

Rebecca Riordan is an independent consultant specializing in the design of database and work support systems. With 17 years of experience in the field, Rebecca has earned an international reputation for designing and implementing computer systems that are technically sound, reliable, and effectively meet her clients' needs. She is the author of Designing Relational Database Systems [Microsoft Press, 1999]. You can reach Rebecca at rebeccar@attglobal.net.
