Global Appeal
Sound+Vision
Access 2000 Internationalization and Localization
By Michael Kaplan and Rebecca Riordan
Whether you're building applications for commercial
distribution, or you're an in-house developer for a multi-national
company, the time has probably come to consider deployment in multiple
languages. Microsoft Office has learned some important lessons from its
sales, and although lying is often considered the leading cause of
statistics, we'll throw a few true statistics at you to put this in
perspective.
Over 60 percent of Office sales are to non-US countries,
and between 70 and 100 percent of the people in those countries find that,
when given a choice between a localized product and the English product,
they use the localized product. Clearly, users are saying they want the
product's interface to be in their own language. If it's your intention to
"crack" an international market, ignore this request at your own (or your
product's) peril.
Office 2000 has taken some big steps forward in this
regard. Excel, Word, and Access 2000 now have the same .exe files for all
languages, and they simply plug in different language DLLs depending on
the language. In fact, Microsoft actually allows companies to buy the
"Language Pack" set for Office 2000, which is a set of eight CDs that
allows you to switch languages without reinstalling the product each time
(very similar to the Windows 2000 Multilanguage UI Pack). Although this
product is only available to large corporations, a smaller, openly
available product contains the proofing tools (spelling/thesaurus) for all
languages (including many languages into which no localization is
currently done, such as Catalan, Galician, and Ukrainian).
Access 2000 stores data as Unicode. This is critical for
multi-language applications, because it means that a single .mdb can store
data in multiple languages. (In Access 97, the Jet engine stored data as
ANSI, which meant you could only store data on a single code page.) In
fact, Access 2000 uses Uniscribe internally, which allows languages like
Arabic, Hebrew, and Thai to be displayed, even if you're on an operating
system that doesn't provide a way to type in the data. Uniscribe can also
display many of the new "Unicode only" languages supported in Windows
2000: You need Windows 2000 to easily type in languages like Hindi and
Georgian, but display of the data in the database is easy on any platform.
Internationalization and Localization
This takes care of the underlying platform on which your
application is run. But what about making your application leverage all of
this work? Creating an application that can be successfully deployed in
multiple languages/locales requires two distinct processes:
internationalization and localization. Internationalization, sometimes
called "globalization," is the process of ensuring that the database
schema, user interface, and underlying code base will support multiple
languages. Localization, on the other hand, is the actual translation of
the product for a specific locale.
Note that we said "locale" here, not "language." Locale
is a broader concept than language, and it includes such issues as the
local time, currency, and the conventional display of dates and numbers.
For example, a user in Paris and one in Quebec might both be speaking
French, but the former writes dates as dd/mm/yyyy and uses a 24-hour
clock, and the latter uses the mm/dd/yyyy and am/pm conventions.
Fortunately, Windows makes most locale information easily available to
applications. All you have to do is ensure that your system can handle all
the varieties of format and structure. We'll be looking at that a little
later in this article.
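As a minimal sketch of what "letting Windows supply the locale information" can look like in VBA: the named formats of the Format$ function follow the user's regional settings, whereas a literal pattern does not. The procedure name and sample values below are illustrative assumptions, not part of the original listings.

```vba
' Illustrative only: ShowLocaleAwareFormats is a hypothetical name.
Public Sub ShowLocaleAwareFormats()
    Dim dtSample As Date
    Dim dblSample As Double

    ' DateSerial and TimeSerial avoid ambiguous date literals.
    dtSample = DateSerial(2000, 3, 4) + TimeSerial(15, 30, 0)
    dblSample = 1234.5

    ' Named formats follow the regional settings in Control Panel.
    Debug.Print Format$(dtSample, "Short Date")
    Debug.Print Format$(dtSample, "Long Time")
    Debug.Print Format$(dblSample, "Currency")
    Debug.Print Format$(dblSample, "Standard")

    ' Escaped separators force one fixed layout on every locale.
    Debug.Print Format$(dtSample, "mm\/dd\/yyyy")
End Sub
```

The first four lines of output change when the regional settings change; the last one should not, which is exactly why hard-coded patterns are best kept out of anything the user sees.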
The internationalization and localization processes
correspond roughly to the design and implementation phases of the
development process. The rule that "the later in the development process a
problem is found, the more expensive it is to fix" is especially true in
this case. You don't want to wait until user testing to discover that your
way-cool universal error handler needs another parameter in order to
support Japanese, a change that requires editing 2,375 distinct procedures.
Sometimes you may decide to delay the localization
process until after the base language version has been successfully
deployed, or even until the next version of the product. Even so, if
there's any chance the application will be deployed in multiple languages,
internationalization should be built in from the outset. As a rule, the
internationalization process is the most important, as few organizations
will be impressed by a localized product that doesn't work properly.
In cases where the localization is done later, the
implementation phase is actually broken into two parts: the base language
implementation, and the localized implementation. Even if localization
will be delayed, however, it is a good idea to consider some of the
localization issues up front, so that the later move to a localized
product is easier. As always in system design, you must be
certain that it will be possible to implement the application. So even
though we'll be concentrating on internationalization here, we'll look at
a few pertinent implementation techniques.
Internationalization can potentially affect all areas of
an application, and it must be considered throughout the
design process. However, it will most likely affect the database schema
and the user interface, so we'll concentrate on these areas.
Database Schema
It's fairly unlikely that internationalization will
affect the core logical structure of a data model. Orders still consist of
multiple line items, whether they're placed in New York or Tokyo. However,
when designing the database schema, you should pay careful attention to
the length of the fields.
English is a relatively compact language. You can expect
character data to grow by anything from 10 to 100 percent in other
languages, depending on the length of the original string and the
destination language. As a general rule, the shorter the string, the
more it's likely to grow, proportionally, in another language. German is
a good test case for this; on average, a German translation of an
English string is about 30 percent longer than its English source.
Because Jet uses variable length fields for character
data, there won't be any performance or space hits, so you can afford to
be generous with the defined length of character fields. (Microsoft SQL
Server also supports variable length data types, but there is a slight
performance hit during edits.)
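To make the arithmetic concrete, here is a small sketch that pads an English string length by a growth percentage and rounds up. The function name, the default of 30 percent (borrowed from the German figure above), and the cap at 255 characters (the largest Jet Text field size) are our assumptions for illustration, not a rule from Access itself.

```vba
' Hypothetical helper: pads an English string length for translation growth.
Public Function SuggestedFieldSize(ByVal cchEnglish As Long, _
    Optional ByVal lngGrowthPercent As Long = 30) As Long

    Const MAX_JET_TEXT As Long = 255   ' Largest Jet Text field size.
    Dim cch As Long

    ' Integer arithmetic, rounded up to the next whole character.
    cch = (cchEnglish * (100 + lngGrowthPercent) + 99) \ 100
    If cch > MAX_JET_TEXT Then cch = MAX_JET_TEXT
    SuggestedFieldSize = cch
End Function
```

For example, SuggestedFieldSize(20) returns 26, and SuggestedFieldSize(20, 100) returns 40. Treat the result as a floor; because the storage is variable length, rounding further up costs nothing.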
Numeric data lengths may also need to be adjusted in
international applications. A currency amount expressed in Italian Lire
will be several orders of magnitude larger than one in US Dollars, for
example.
In fact, Access itself has an interesting feature with
regard to the "Currency" format of the Currency data type. It will
replace the named format called "Currency" with a hard-coded string
representing the original format in the language on which the format
string was set. This really makes sense, as an application with data in it
might suffer from real problems if suddenly $100.00 meant £100.00. For
example, when taking a database created on a US English locale to one on a
French locale, all occurrences of the format fields that said "Currency"
were replaced with "$ # ##0,00;($# ##0,00)" throughout the database.
Internally, Access remembers that this really was a Currency format on US
English, and you can easily see it change to "$ # ##0.00;($# ##0.00)" for
German (Germany) and then back to "Currency" in US English. Internally,
there must be some property that lets Access know the language on which
the "Currency" field was originally set. It's too bad that this "property"
isn't something we can set!
This can cause a small problem for international
applications in which you really want the actual Currency format of the
current locale to be used, no matter what the language. The workaround is
to set the format to Currency explicitly in your code at run time,
rather than rely on this property at the table or query level.
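One way to apply that workaround, sketched under a few assumptions: the helper name is hypothetical, and so is the convention of marking currency text boxes with the Tag value "Currency". Any other way of identifying the controls would do just as well.

```vba
' Hypothetical helper: call it from a form's Load event, for example
'   Private Sub Form_Load()
'     ApplyCurrencyFormat Me
'   End Sub
' Assumes each currency text box carries the (made-up) Tag value "Currency".
Public Sub ApplyCurrencyFormat(frm As Access.Form)
    Dim ctl As Access.Control

    For Each ctl In frm.Controls
        If ctl.ControlType = acTextBox Then
            If ctl.Tag = "Currency" Then
                ' Re-apply the named format at run time so it follows
                ' the regional settings of the machine running the form.
                ctl.Format = "Currency"
            End If
        End If
    Next ctl
End Sub
```

Because the format is assigned when the form opens, it never has a chance to be frozen into a hard-coded string at the table or query level.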
Another important issue, if your application is intended
for most of Europe, is support for the Euro. Access supports it in two
ways. One is a Euro format that works in the same way as the
Currency format, except that the control panel's currency symbol is
replaced with the Euro (€) symbol. The other is the EuroConvert function,
which converts between any currency in the European Union and the
Euro, using the number of decimal places that the locale requires for
accuracy.
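A minimal sketch of calling EuroConvert from VBA follows. The procedure name and the sample amount are hypothetical, and we assume the three-letter ISO codes "DEM", "FRF", and "EUR" for the source and target currencies; check Access Help for the full list of codes and the optional precision arguments before relying on it.

```vba
' Sketch only: the procedure name and sample amount are hypothetical.
Public Sub ShowEuroConversion()
    Dim curMarks As Currency
    curMarks = 100

    ' Convert Deutsche Marks to Euros using ISO currency codes.
    Debug.Print EuroConvert(curMarks, "DEM", "EUR")

    ' Convert between two member currencies.
    Debug.Print EuroConvert(curMarks, "DEM", "FRF")
End Sub
```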
If any of the tables in your database schema support
financial calculations, such as tax tables for example, you must expect
these to change drastically between locales. Sometimes it will be possible
to make the database schema sufficiently generic to support multiple
locales. An example of this sort of adjustment can be seen in FIGURES 1
and 2. The original structure, shown in FIGURE 1, allows a single
"SalesTax" percentage to be applied to the order total. This is
appropriate to the United States, but may not work elsewhere.
FIGURE 1: One way to view relationships.
The second structure, shown in FIGURE 2, allows multiple
taxes and fees (e.g. duty) to be assigned to an order, and also allows the
specific calculation to be determined at run time by calling the function
specified in the Calculation field.
FIGURE 2: Adjusting to support multiple locales.
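Calling the function named in the Calculation field can be sketched with Application.Run, which invokes a public procedure by name at run time. Only the Calculation field comes from the structure described above; the table name OrderCharges, the OrderID field, and the FlatRateDuty function are hypothetical stand-ins. The sketch also assumes a reference to the Microsoft DAO 3.6 Object Library, which Access 2000 does not set by default.

```vba
' Sketch only. OrderCharges, OrderID, and FlatRateDuty are hypothetical
' names; only the Calculation field is named in the article.
' Requires a reference to the Microsoft DAO 3.6 Object Library.
Public Function TotalCharges(ByVal lngOrderID As Long, _
    ByVal curOrderTotal As Currency) As Currency

    Dim db As DAO.Database
    Dim rs As DAO.Recordset
    Dim curCharges As Currency

    Set db = CurrentDb()
    Set rs = db.OpenRecordset( _
        "SELECT Calculation FROM OrderCharges WHERE OrderID = " & _
        lngOrderID, dbOpenSnapshot)

    Do Until rs.EOF
        ' Application.Run calls a public procedure by name at run time.
        curCharges = curCharges + Application.Run( _
            CStr(rs!Calculation.Value), curOrderTotal)
        rs.MoveNext
    Loop

    rs.Close
    Set rs = Nothing
    Set db = Nothing
    TotalCharges = curCharges
End Function

' One example of a calculation that a Calculation field could name.
Public Function FlatRateDuty(ByVal curOrderTotal As Currency) As Currency
    FlatRateDuty = curOrderTotal * 0.05
End Function
```

The appeal of this design is that supporting a new locale becomes a matter of adding rows and one function per rule, rather than changing the order-processing code.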
User Interface
Internationalizing the user interface of your
application means that the application must be capable of being translated
into multiple languages, and provide comfortable input for users speaking
those languages. In practical terms, you must consider two aspects of the
user interface: the display of static text and graphics, and the display
and input of dynamic text.
Static Text and Graphics
When allowing for the translation of text displayed to
users from English to another language, the primary issue is, of course,
one of allowing sufficient display space. Unfortunately, the exact amount
of space required is impossible to predict. Given that screen real estate
is always scarce, this can often require considerable ingenuity in your
form layouts.
One problem is leaving sufficient space for the
expansion of labels without leaving an unattractive (and potentially
confusing) gap between the label and the control. Stacking the label on
top of the control, rather than next to it, can help, but at the cost of
vertical space, which may require implementing a tab control. Then the
length of the tab captions may force the tabs into two rows, which takes
more vertical space, which ... you get the idea.
Unfortunately, there's no magic wand for these problems.
You can only err on the side of spaciousness, hope for the best, and be
prepared to adjust the form layout during the localization phase. As a
general rule, the 30 percent heuristic derived from German translations is
probably a good minimum, with the understanding that it's a good idea to
leave as much extra space as you comfortably can without sacrificing the
quality of the user interface.
For Far East languages, Windows 2000 supports Input
Method Editors (IMEs) on all versions, so there are very few special
issues with Far East text that don't also exist in other languages. The
only major exception to this is font size: The typical eight-point font
setting isn't adequate for these languages; a nine- to 10-point minimum is
something to try instead. Access 2000 even adds a Vertical property
to labels and textboxes for when vertical text is needed.
Localization into bi-directional languages, such as
Arabic, Hebrew, and Yiddish, is slightly more complicated, because users
who read these languages will expect the user interface to be "flipped" to
right-to-left (see FIGURES 3 and 4). The important properties that control
what the user will see in the controls themselves are TextAlign and
ReadingOrder. In prior versions, the setting of these properties
would be ignored if you were not on Hebrew or Arabic Windows, but in
Access 2000 they're always usable.
FIGURE 3: A standard UI.
FIGURE 4: A "flipped" UI.
Although it's possible to do complex work to make this
sort of "flipped UI" possible, simple code, such as the function shown in
FIGURE 5, provides an easy way to allow localization of this sort. It
relies on the "Office UI language" and changes its behavior if the UI
language is Arabic or Hebrew (for many add-ins to Access, following the UI
language of Access makes the most sense).
Public Sub LocalizeForm(frm As Access.Form)
  On Error Resume Next
  Dim ctl As Access.Control
  Dim fBidi As Boolean
  Dim UILang As Long

  UILang = LanguageSettings.LanguageID(msoLanguageIDUI)
  ' Set the BiDi flag for Arabic (1025) or Hebrew (1037).
  fBidi = ((UILang = 1025) Or (UILang = 1037))
  If fBidi Then frm.Orientation = 1
  ' CONSIDER: Fix the form's caption, etc.?

  For Each ctl In frm.Controls
    If IsNumeric(ctl.Tag) Then
      Select Case ctl.ControlType
        Case acCommandButton, acToggleButton, _
             acPage, acLabel
          ' Change the caption based on an ID in the tag.
          ' ctl.Caption = <some way to look up strings>
      End Select
    End If

    If fBidi Then
      ' Flip controls, RTL.
      ctl.Left = (frm.Width - ctl.Left - ctl.Width)
      ' Handle TextAlign.
      Select Case ctl.ControlType
        Case acComboBox, acLabel, acListBox, acTextBox
          ctl.TextAlign = 3 ' Right.
      End Select
      Select Case ctl.ControlType
        Case acCheckBox, acComboBox, acCommandButton, _
             acLabel, acListBox, acOptionButton, _
             acTextBox, acToggleButton
          ctl.ReadingOrder = 2 ' rtl.
      End Select
    End If
  Next ctl
End Sub
FIGURE 5: Allowing for localization at the form level.
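FIGURE 5 leaves the string lookup itself open. One possible approach, sketched here purely as an assumption, is a local table of translated strings read with DLookup. The table tblStrings, its fields StringID, LangID, and StringText, and the function name LookupString are all hypothetical; none of them is part of Access.

```vba
' Sketch only. The table tblStrings and its fields StringID, LangID, and
' StringText are hypothetical; LookupString is not part of Access.
Public Function LookupString(ByVal lngStringID As Long, _
    ByVal lngLangID As Long) As String

    Dim varText As Variant

    ' Try the requested UI language first.
    varText = DLookup("StringText", "tblStrings", _
        "StringID = " & lngStringID & " AND LangID = " & lngLangID)

    ' Fall back to US English (1033) when no translation exists.
    If IsNull(varText) Then
        varText = DLookup("StringText", "tblStrings", _
            "StringID = " & lngStringID & " AND LangID = 1033")
    End If

    LookupString = Nz(varText, "")
End Function
```

Inside LocalizeForm, the caption assignment could then read ctl.Caption = LookupString(CLng(ctl.Tag), UILang). A table keeps the translations editable by a translator who never has to open the code.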
The internationalization process can affect the graphics
used in your application, as well as the text. Purely decorative graphics
are unlikely to be a problem, but you will need to be careful about any
images that are intended to be meaningful to the user, such as toolbar
graphics. As a general rule, you should avoid including text in your
graphics, because it would need to be translated. You should also
avoid images that are only symbolically, rather than directly, meaningful.
A sort button with "ABC" on it (such as those seen in some Access
wizards!) won't be very helpful to users whose alphabet doesn't contain
these letters. Similarly, a football goalpost used to call the "Sales
Goals" form, for example, is only going to be mnemonically helpful to
people familiar with American football. That being said, however, most
people are pretty good at associating images with concepts, no matter how
tenuous the cultural association might be. So while the goalpost might not
be immediately obvious, learning its meaning won't be difficult for most
users.
Images that might be laughable, confusing, or actively
offensive are a more worrisome issue. Urban legends abound regarding
marketing disasters of this type. Some countries actually have laws that
can prevent the sale of your product if specific references appear to be
made in your software. An example of the latter that many users of early
betas of Office 97 saw was the SuperPup Office Assistant making an upward
pointing gesture that resembled a Nazi salute. Later beta versions and the
shipping version of Office 97 removed this gesture; if it had been left
in, the sale of the product in Germany may well have been banned entirely.
Unfortunately, this kind of information can be difficult
to come by. The best option is to run the images by someone with personal
knowledge of the locales in which your application will be deployed, and
to do so early enough in the development process so they can still be
changed without causing delivery delays or cost overruns.
Dynamic Text
The capture and display of dynamic text - primarily
control contents - is typically less problematic. You must ensure the
controls are long enough to accept the values, and that format and
validation definitions are either sufficiently generic, or identified as
part of the localization process. This is especially true for date
formats, where misunderstandings between mm/dd/yyyy and dd/mm/yyyy formats
can wreak havoc on an application's data.
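One place where this bites is in SQL strings built in code: Jet reads an ambiguous date literal between # signs in the US month/day/year order, whatever the regional settings are. A minimal sketch of a helper that sidesteps the problem follows; the name SqlDate is hypothetical.

```vba
' Hypothetical helper: builds a Jet SQL date literal that does not
' depend on the regional settings of the machine running the code.
Public Function SqlDate(ByVal dt As Date) As String
    ' Backslashes make # and / literal, so the locale's own date
    ' separator is never substituted.
    SqlDate = Format$(dt, "\#mm\/dd\/yyyy\#")
End Function
```

For example, SqlDate(DateSerial(2000, 3, 4)) returns #03/04/2000# on any locale, so it can be concatenated safely into a WHERE clause.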
For Far East locales, the various IME properties control
the behavior of the control. The IMEHold property determines
whether the user will be able to use IME at all (an example of a case
where Windows itself disallows the IME is in the Windows logon password
TextBox). Assuming the IME is allowed in a particular control, the
IMEMode property controls the default mode in which the IME should
be opened, and the IMESentenceMode property determines what type of
additional conversion to allow by default. The choices for "Sentence Mode"
are Normal, Plural (supports additional dictionaries with
name, geographic, and postal data), Speaking (supports
conversational language), and No Conversion (characters are settled
without extra conversion).
Another interesting property is the FELineBreak
property, which helps prevent line breaks that separate punctuation from
the text they're delimiting. One important difference between Access 2000
and prior versions is that these properties are always present through
code, even if they don't appear in the property sheet. A full discussion
of user expectations regarding these properties is beyond the scope of
this article, but luckily Access 2000 Help covers all three properties.
Again, someone with personal knowledge of the locales should prove
invaluable here.
For bi-directional languages, the TextAlign and
ReadingOrder properties can be handled with logic, such as that
shown in the procedure in FIGURE 5 (the LocalizeForm procedure handles
static and dynamic controls). You can set the KeyboardLanguage
property if you want to change the keyboard language from the user's
current settings to English, Arabic, or Hebrew when the control gets
focus. If you don't set the property, then the user can choose which
keyboard to use.
In most other circumstances, these issues are fairly
straightforward, and, provided they're addressed early enough in the
development process, not too difficult to resolve. It's sometimes the
case, however, that the order of the controls on a form, or the flow of
work between forms, needs to be adjusted for different locales. The most
common example of this is an address component. The name, order, and
structure of addresses vary between locales. If you need to treat
addresses as structured data, there are several alternatives.
You can, of course, ignore the issue completely. A
couple of controls being slightly out of their optimum order isn't usually
the end of the world. In The Netherlands, where Rebecca lives, for
example, postal codes are conventionally placed before the city name. But
an envelope with the postal code in the last place will still be delivered
(Amazon.com does it all the time).
If the application needs to accommodate users who
primarily do data entry from paper forms (and, therefore, are not looking
at the screen), an application that has controls in an awkward order goes from
being a minor inconvenience to a major pain. If this situation applies to
your application, you should give serious consideration to localizing the
layout of the controls. Access does this in its Database Wizard by
generating forms and reports with either a "city state zip" or a "zip city
state" layout, depending on the machine's regional settings. To accomplish this,
the wizard calls the GetUserDefaultLCID API function, and then uses
the "city state zip" format for all locales listed in the table in FIGURE
6.
| Language | Locale ID |
| English (US) | &H409 |
| English (UK) | &H809 |
| English (Australia) | &HC09 |
| English (Canada) | &H1009 |
| English (New Zealand) | &H1409 |
| English (Ireland) | &H1809 |
| English (South Africa) | &H1C09 |
| Portuguese (Brazil) | &H416 |
| Portuguese (Portugal) | &H816 |
FIGURE 6: Languages/locales that prefer a "city state
zip" address order.
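The test described above can be sketched as follows. This is our own illustration, not the wizard's actual code: the function names are hypothetical, and only the locale IDs come from FIGURE 6. The Declare is marked Private so that it stays local to the module it lives in; if the same module already declares GetUserDefaultLCID, drop this one.

```vba
' Sketch only: the function names are hypothetical. The locale IDs are
' the ones listed in FIGURE 6.
Private Declare Function GetUserDefaultLCID Lib "kernel32" () As Long

Public Function UsesCityStateZip(ByVal lcid As Long) As Boolean
    ' Keep only the low word (the language ID); drop any sort ID.
    Select Case (lcid And &HFFFF&)
        Case &H409, &H809, &HC09, &H1009, &H1409, _
             &H1809, &H1C09, &H416, &H816
            UsesCityStateZip = True
        Case Else
            UsesCityStateZip = False
    End Select
End Function

Public Function CurrentUserUsesCityStateZip() As Boolean
    CurrentUserUsesCityStateZip = UsesCityStateZip(GetUserDefaultLCID())
End Function
```

Keeping the list in one pure function makes it easy to test without changing the machine's regional settings: UsesCityStateZip(&H409) is True, while UsesCityStateZip(&H407), German (Germany), is False.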
All other languages default to the (more common) "zip
city state" layout. This is by no means a perfect solution (e.g. some
countries don't use postal codes, and in the United States the post office
prefers the text to be in all capital letters with no punctuation), but it
does provide a starting point.
Other examples of internationally friendly behavior from
the Access wizards that you can borrow include:
- Input masks displayed by the Input Mask wizard are "country based"
(see the CountryCode function in the listing in FIGURE 7 for an
example of getting this information).
- The Mailing Label Wizard uses a procedure similar to the
FEnglishMeasurements procedure in the listing in FIGURE 7 to
determine whether to show metric labels by default.
- The Database Wizard reports are sized such that the information on
the pages will fit whether the default printer's paper size is A4 or
8.5x11.
Const LOCALE_IMEASURE = &HD ' 0 = metric, 1 = US.
Const LOCALE_ICOUNTRY = &H5 ' Country code.
Const IMEASURE_ENGLISH = 1
Const CTRY_DEFAULT = 0
Const CTRY_AUSTRALIA = 61
Const CTRY_AUSTRIA = 43
Const CTRY_BELGIUM = 32
Const CTRY_BRAZIL = 55
Const CTRY_CANADA = 2
Const CTRY_DENMARK = 45
Const CTRY_FINLAND = 358
Const CTRY_FRANCE = 33
Const CTRY_GERMANY = 49
Const CTRY_ICELAND = 354
Const CTRY_IRELAND = 353
Const CTRY_ITALY = 39
Const CTRY_JAPAN = 81
Const CTRY_MEXICO = 52
Const CTRY_NETHERLANDS = 31
Const CTRY_NEW_ZEALAND = 64
Const CTRY_NORWAY = 47
Const CTRY_PORTUGAL = 351
Const CTRY_PRCHINA = 86
Const CTRY_SOUTH_KOREA = 82
Const CTRY_SPAIN = 34
Const CTRY_SWEDEN = 46
Const CTRY_SWITZERLAND = 41
Const CTRY_TAIWAN = 886
Const CTRY_UNITED_KINGDOM = 44
Const CTRY_UNITED_STATES = 1
Declare Function GetLocaleInfo Lib "kernel32" _
  Alias "GetLocaleInfoA" (ByVal lcid As Long, _
  ByVal LCTYPE As Long, lpData As Any, _
  ByVal cchData As Long) As Long
Declare Function GetUserDefaultLCID _
  Lib "kernel32" () As Long

Function FEnglishMeasurements() As Boolean
  FEnglishMeasurements = (Val(StGetLocaleInfo( _
    LOCALE_IMEASURE)) = IMEASURE_ENGLISH)
End Function

Function CountryCode() As Long
  CountryCode = Val(StGetLocaleInfo(LOCALE_ICOUNTRY))
End Function

Function StGetLocaleInfo(ByVal LCTYPE As Long) As String
  Dim lcid As Long
  Dim cch As Long
  Dim stBuff As String * 255
  ' Get the current user's locale ID.
  lcid = GetUserDefaultLCID()
  ' Ask for the locale info; cch counts the terminating null.
  cch = GetLocaleInfo( _
    lcid, LCTYPE, ByVal stBuff, Len(stBuff))
  ' Return the text without the terminating null character.
  If cch > 0 Then StGetLocaleInfo = Left$(stBuff, cch - 1)
End Function
FIGURE 7: Using the GetUserDefaultLCID API
function.
Conclusion
Today's worldwide marketplace provides interesting
challenges, and lucrative opportunities, for the applications you develop.
With Windows 2000 and Office 2000 leading the way, users will no longer
accept a user interface that doesn't take international issues into
account. This means you must address these issues, or users may look
around for a product that does. In addition to the specific issues and
tips discussed here, this article will hopefully inspire a "global" frame
of mind and allow you to create applications that will behave well on any
locale to which you wish to deploy.
Michael Kaplan is the owner and lead developer of
Trigeminal Software, Inc., a consulting firm that focuses on all types of
solutions in Microsoft Visual Basic, Access, and SQL Server, especially
relating to replication and multinational applications. A former member of
the Microsoft Access development team, he has spoken at many conferences
and contributed to several publications and books on VB, Access, and SQL
Server development. You can reach Michael at michka@trigeminal.com or
visit him on the Web at his truly worldwide (localized!) Web site at http://www.trigeminal.com/.
Rebecca Riordan is an independent consultant specializing
in the design of database and work support systems. With 17 years of
experience in the field, Rebecca has earned an international reputation
for designing and implementing computer systems that are technically
sound, reliable, and effectively meet her clients' needs. She is the
author of Designing Relational Database Systems [Microsoft Press, 1999].
You can reach Rebecca at rebeccar@attglobal.net.