Support for Multibyte Character Sets (MBCSs)
Multibyte character sets (MBCSs) are an alternative to Unicode for supporting character sets, like Japanese and Chinese, that cannot be represented in a single byte. If you are programming for an international market, consider using either Unicode or MBCS, or enabling your program so you can build it for either by changing a switch.
The most common MBCS implementation is double-byte character sets (DBCSs). Visual C++ in general, and MFC in particular, is fully enabled for DBCS.
For samples, see the MFC source code files.
For platforms used in markets whose languages use large character sets, the best alternative to Unicode is MBCS. MFC supports MBCS by using internationalizable data types and C run-time functions. You should do the same in your code.
Under MBCS, characters are encoded in either 1 or 2 bytes. In a 2-byte character, the first byte, or lead byte, signals that both it and the following byte are to be interpreted as one character. The lead byte comes from a range of codes reserved for that purpose. Which ranges of bytes can be lead bytes depends on the code page in use. For example, Japanese code page 932 uses the ranges 0x81 through 0x9F and 0xE0 through 0xFC as lead bytes, but Korean code page 949 uses a different range.
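The lead-byte rule can be illustrated with a short, portable sketch that counts characters rather than bytes. The helper names is_lead_byte_cp932 and count_chars_cp932 are hypothetical, and the hard-coded lead-byte ranges (0x81 through 0x9F and 0xE0 through 0xFC, for code page 932) are an assumption of the sketch. Production code would normally ask the run-time library instead of hard-coding ranges.

```cpp
#include <cstddef>
#include <string>

// Assumption: lead-byte ranges for Japanese code page 932 only.
inline bool is_lead_byte_cp932(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Counts characters (not bytes) in a string assumed to be encoded
// in code page 932.
inline std::size_t count_chars_cp932(const std::string& s) {
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        const bool lead = is_lead_byte_cp932(static_cast<unsigned char>(s[i]));
        // A lead byte and its trail byte form one character. A lead byte
        // at the very end of the string (truncated input) is counted as
        // a single 1-byte character instead of reading past the end.
        i += (lead && i + 1 < s.size()) ? 2 : 1;
        ++count;
    }
    return count;
}
```

For example, the 4-byte string "A\x82\xA0" "B" holds three characters under this scheme, because 0x82 is a lead byte that pairs with 0xA0.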
Consider all the following in your MBCS programming.
Note: Behavior is undefined if you define both _UNICODE and _MBCS.
The Mbctype.h and Mbstring.h header files define MBCS-specific functions and macros, which you might need in some cases. For example, _ismbblead tells you whether a specific byte in a string is a lead byte.
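As a rough sketch of how _ismbblead might be used, the code below steps from one character to the next in a multibyte string. The wrapper lead_byte and the helper next_char are hypothetical names. On compilers other than Visual C++, where Mbctype.h is not available, the sketch falls back to hard-coded code page 932 ranges, which is an assumption made only so the example stays portable. With Visual C++, the result depends on the multibyte code page currently in use.

```cpp
#include <cstddef>
#if defined(_MSC_VER)
#include <mbctype.h>  // declares _ismbblead
#endif

// Hypothetical wrapper around the lead-byte test.
inline bool lead_byte(unsigned char b) {
#if defined(_MSC_VER)
    return _ismbblead(b) != 0;  // consults the current multibyte code page
#else
    // Fallback assumption: lead-byte ranges of code page 932.
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
#endif
}

// Returns the byte offset of the character that follows the one
// starting at offset i in the null-terminated string s.
// Offset i must point at the start of a character.
inline std::size_t next_char(const char* s, std::size_t i) {
    if (s[i] == '\0') {
        return i;  // already at the terminator
    }
    if (lead_byte(static_cast<unsigned char>(s[i])) && s[i + 1] != '\0') {
        return i + 2;  // lead byte plus trail byte
    }
    return i + 1;  // single-byte character
}
```

Note that the lead-byte test is only meaningful on a byte known to begin a character, because a trail byte can have a value that also falls in a lead-byte range. That is why next_char always walks forward from a known character boundary.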
For international portability, code your program with Unicode or multibyte character sets (MBCSs).