Character Sets

A "character set" is a mapping of characters to their identifying code values. The character set most commonly used in computers today is Unicode, a global standard for character encoding. Internally, Windows applications use the UTF-16 encoding of Unicode. In UTF-16, most characters are identified by a single two-byte code. The less commonly used supplementary characters are each represented by a surrogate pair, which is a pair of two-byte codes (four bytes in total). For more information, see Surrogates and Supplementary Characters.
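The difference between a single two-byte code and a surrogate pair can be seen by encoding characters to UTF-16 and inspecting the resulting 16-bit values. The following is a minimal sketch in Python, not Windows API code; the helper names `utf16_code_units` and `is_surrogate_pair` are hypothetical and chosen only for illustration.

```python
def utf16_code_units(ch):
    """Return the 16-bit code units that UTF-16 uses for the given text."""
    # "utf-16-le" encodes little-endian without a byte-order mark.
    data = ch.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_surrogate_pair(units):
    """True if the code units are a high surrogate followed by a low surrogate."""
    return (
        len(units) == 2
        and 0xD800 <= units[0] <= 0xDBFF   # high (lead) surrogate range
        and 0xDC00 <= units[1] <= 0xDFFF   # low (trail) surrogate range
    )


basic = utf16_code_units("A")                   # U+0041, one two-byte code
supplementary = utf16_code_units("\U0001F600")  # U+1F600, a supplementary character

print([hex(u) for u in basic])          # ['0x41']
print([hex(u) for u in supplementary])  # ['0xd83d', '0xde00']
print(is_surrogate_pair(supplementary))  # True
```

The supplementary character occupies two code units, so any logic that counts characters by counting two-byte codes would overcount it.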

Some Windows applications must work with the older character sets that are native to Windows Me/98/95. Windows code pages allow your application to work with these character sets. These character sets can be divided into:

  • Single-byte character sets (SBCS). In an SBCS, each character is identified by a value one byte wide.
  • Multibyte character sets (MBCS), in particular the double-byte character sets (DBCS). In a DBCS, each character is identified by a value one or two bytes wide. Multibyte character sets provide a means to represent the large number of characters in many Asian languages.
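The two categories above can be compared by encoding text with a single-byte and a double-byte code page and measuring how many bytes each character takes. This is a rough sketch using Python's built-in codecs rather than the Windows API; it assumes code page 1252 (Western European) as the SBCS example and code page 932 (Japanese Shift-JIS) as the DBCS example, and `encoded_widths` is a hypothetical helper name.

```python
def encoded_widths(text, code_page):
    """Return the number of bytes each character occupies in the given code page."""
    return [len(ch.encode(code_page)) for ch in text]


# SBCS: every character in code page 1252 is one byte wide.
# "caf\u00e9" is "cafe" with an acute accent on the final letter.
sbcs_widths = encoded_widths("caf\u00e9", "cp1252")

# DBCS: in code page 932, ASCII letters take one byte and
# Japanese characters such as U+3042 (hiragana "a") take two.
dbcs_widths = encoded_widths("A\u3042", "cp932")

print(sbcs_widths)  # [1, 1, 1, 1]
print(dbcs_widths)  # [1, 2]
```

Because character widths vary within a DBCS string, code that steps through such text one byte at a time can split a two-byte character in half.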

Related topics

About Unicode and Character Sets

© 2017 Microsoft