Supporting Multilingual Text Display with Big Fonts

Article
02/06/2008

The multiple input languages that Windows 95 supports might fall into a single charset or they might span several charsets. To support the display of text that spans multiple charsets, the system uses "big fonts." As you can see in Appendix H, each Windows charset (code page) contains the set of ASCII characters. Though a big font can represent multiple Windows charsets, it contains only one set of glyphs for ASCII. Just as applications identify Arial Bold and Arial Italic as separate logical fonts, they identify Arial Greek and Arial Russian as separate logical fonts as well. Arial can now have a Greek or Russian property in the same way that it can have a bold or italic style; just as you would save font style information in your document files, you should now save font charset information. The glyphs are all contained in a single font file to maintain consistency of font metrics.
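As a rough illustration of saving charset information alongside style information, the sketch below stores a per-run font record as text. The FontRun structure, the function names, and the "face;bold;italic;charset" format are all hypothetical and not part of any Windows API; the only Windows-specific detail is that the charset field holds a Windows charset ID (for example, 161 for GREEK_CHARSET).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-run font record; not a Windows structure. */
typedef struct {
    char face[32];   /* typeface name, e.g. "Arial" */
    int  bold;       /* 1 if bold */
    int  italic;     /* 1 if italic */
    int  charset;    /* Windows charset ID, e.g. 161 for Greek */
} FontRun;

/* Write the record as "face;bold;italic;charset". Returns 0 on success,
   -1 if the output buffer is too small. */
int font_run_save(const FontRun *run, char *out, size_t out_size)
{
    int n;
    assert(run != NULL && out != NULL);
    n = snprintf(out, out_size, "%s;%d;%d;%d",
                 run->face, run->bold, run->italic, run->charset);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Read a record written by font_run_save. Returns 0 on success,
   -1 if the text does not contain all four fields. */
int font_run_load(const char *in, FontRun *run)
{
    int fields;
    assert(in != NULL && run != NULL);
    fields = sscanf(in, "%31[^;];%d;%d;%d",
                    run->face, &run->bold, &run->italic, &run->charset);
    return fields == 4 ? 0 : -1;
}
```

With a record like this, "Arial, italic, Greek" and "Arial, italic, Russian" round-trip as distinct logical fonts even though both refer to the same font file.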

Font, Locale, and Charset Signatures

The TrueType Open specification defines the standard for big fonts, which Windows 95 and Windows NT share. Font vendors now need to tag each font they create with a font signature, which is stored in the OS/2 table of a TrueType Open font. Windows and Windows-based applications use the following structure to exchange font signature information:

typedef struct tagFONTSIGNATURE {
    DWORD fsUsb[4];    // Unicode subrange bits
    DWORD fsCsb[2];    // [0] = Windows code pages, [1] = OEM code pages
} FONTSIGNATURE;

A font signature contains two sets of bits, one for Unicode subranges and one for Windows and OEM code pages. The lower 32 bits of fsCsb refer to Windows code pages, and the upper 32 bits refer to OEM code pages. The fsUsb field contains an extra bit that is reserved for the future in case additional Unicode subranges are added. If a font contains glyphs to represent characters in a particular charset, the font vendor will set the bits that identify the charset and the corresponding Unicode range. Strictly speaking, font signatures identify code pages, but you can use the table in Figure 6-5 of the previous section to determine which code-page ID corresponds to a particular charset. (See Appendix M for the list of font signature bit values.)

Windows 95 introduces two other signatures that are closely related to font signatures. The first is the charset signature, which contains a charset ID, the corresponding code-page ID, and a generic font signature whose bit fields identify the charset/code page in question.

typedef struct tagCHARSETINFO {
    UINT ciCharset;      // charset ID
    UINT ciACP;          // corresponding code-page ID
    FONTSIGNATURE fs;    // bits that identify this charset/code page
} CHARSETINFO;
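To make the relationship among the three fields concrete, here is a small lookup table written as a sketch. CharsetRow and lookup_by_charset are hypothetical names, and the table is only a stand-in for what the system provides; it covers eight single-byte charsets, pairing each charset ID with its code page and the matching bit in the lower 32 bits of fsCsb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row: charset ID, Windows code page, and the mask for
   that code page in the lower 32 bits of fsCsb. */
typedef struct {
    unsigned charset;
    unsigned codepage;
    uint32_t csb0;
} CharsetRow;

static const CharsetRow charset_rows[] = {
    {   0, 1252, 1u << 0 },  /* ANSI_CHARSET */
    { 238, 1250, 1u << 1 },  /* EASTEUROPE_CHARSET */
    { 204, 1251, 1u << 2 },  /* RUSSIAN_CHARSET */
    { 161, 1253, 1u << 3 },  /* GREEK_CHARSET */
    { 162, 1254, 1u << 4 },  /* TURKISH_CHARSET */
    { 177, 1255, 1u << 5 },  /* HEBREW_CHARSET */
    { 178, 1256, 1u << 6 },  /* ARABIC_CHARSET */
    { 186, 1257, 1u << 7 },  /* BALTIC_CHARSET */
};

/* Returns the row for a charset ID, or NULL if the sketch's table
   does not list it. Charset IDs fit in one byte. */
const CharsetRow *lookup_by_charset(unsigned charset)
{
    size_t i;
    assert(charset <= 255);
    for (i = 0; i < sizeof charset_rows / sizeof charset_rows[0]; i++) {
        if (charset_rows[i].charset == charset)
            return &charset_rows[i];
    }
    return NULL;
}
```

Looking up 161 (Greek) yields code page 1253 and a mask with only bit 3 set, which is the kind of triple a charset signature carries.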

The other signature is the locale signature, which identifies the Unicode subranges and the charsets that can express the characters used in the locale.

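typedef struct tagLOCALEFONTSIGNATURE {
    DWORD lsUsb[4];             // Unicode subranges used by the locale
    DWORD lsCsbDefault[2];      // the locale's default charset (one bit set)
    DWORD lsCsbSupported[2];    // every charset that works for the locale
} LOCALEFONTSIGNATURE;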

Only one bit is set in the lsCsbDefault field—the bit for the charset most commonly associated with the locale. In contrast, the lsCsbSupported field sets the bits for all charsets that will work for the locale. For example, the default charset for English locales is ANSI, but because all Windows charsets contain English letters, numbers, and basic punctuation, all Windows charsets support English. Therefore, all bits are set in the lsCsbSupported field for English locale signatures.

The WM_INPUTLANGCHANGE message conveniently hands you the default charset ID of the new input language so that you don't have to look for it. When you trap this message, you can find the charset value in wParam.

You can use font, locale, and charset signatures to determine whether or not to accept a WM_INPUTLANGCHANGEREQUEST message. For example, suppose your application allows the user to change to only the input languages that the current font can accommodate. The code below shows how you can retrieve various signatures and compare them. The lParam of WM_INPUTLANGCHANGEREQUEST contains the keyboard layout handle for the input language the user is requesting. The low word of this handle is a language ID, which the sample code passes to GetLocaleInfo in order to retrieve a locale signature.

If your application stores the font's charset whenever the font changes (in this example, in a variable called iCurrFontCharset), it can call the API function TranslateCharsetInfo to retrieve a charset signature. You can call TranslateCharsetInfo in three ways: pass in a charset (TCI_SRCCHARSET), a code page (TCI_SRCCODEPAGE), or a generic font signature (TCI_SRCFONTSIG). The function fills in the other charset signature fields. Once the sample code below has the signature for the charset of the current font, it compares that signature with the locale signature's charset information to see whether the current font can accommodate the requested input language. If it can, the code passes the message to DefWindowProc; otherwise it returns zero, which rejects the change.

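int iCurrFontCharset;       // charset of the current font, saved on each font change
LOCALEFONTSIGNATURE ls;
CHARSETINFO cs;

switch (wMsg)
{

...

    case WM_INPUTLANGCHANGEREQUEST:
        // The low word of lParam is the language ID of the requested layout.
        // The buffer size passed to GetLocaleInfo is counted in TCHARs.
        // With TCI_SRCCHARSET, the charset value itself is passed in the
        // pointer argument of TranslateCharsetInfo.
        if (GetLocaleInfo(MAKELCID(LOWORD(lParam), SORT_DEFAULT),
                          LOCALE_FONTSIGNATURE,
                          (LPTSTR)&ls, sizeof(ls) / sizeof(TCHAR)) &&
            TranslateCharsetInfo((DWORD *)(DWORD_PTR)iCurrFontCharset,
                                 &cs, TCI_SRCCHARSET) &&
            (cs.fs.fsCsb[0] & ls.lsCsbSupported[0]))
            return DefWindowProc(hwnd, wMsg, wParam, lParam);  // accept
        return 0;                                              // reject

...

}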

You can achieve the same results by comparing the locale signature with the font signature of the current font. The following code calls GetTextCharsetInfo on the current device context to retrieve the font signature for the active font:

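FONTSIGNATURE fs;
LOCALEFONTSIGNATURE ls;

switch (wMsg)
{

...

    case WM_INPUTLANGCHANGEREQUEST:
        // The buffer size passed to GetLocaleInfo is counted in TCHARs.
        if (!GetLocaleInfo(MAKELCID(LOWORD(lParam), SORT_DEFAULT),
                           LOCALE_FONTSIGNATURE,
                           (LPTSTR)&ls, sizeof(ls) / sizeof(TCHAR)))
            return 0;                                          // reject
        // Font signature of the font currently selected into hdc.
        GetTextCharsetInfo(hdc, &fs, 0);
        if (fs.fsCsb[0] & ls.lsCsbSupported[0])
            return DefWindowProc(hwnd, wMsg, wParam, lParam);  // accept
        else
            return 0;                                          // reject

...

}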
