Encoding Support for Code Pages

The use of Unicode in the .NET Framework simplifies the development of world-ready applications because you no longer need to reference a code page.

A code page is a list of selected character codes (characters represented as code points) in a certain order. Code pages are usually defined to support specific languages or groups of languages that share common writing systems. Single-byte Windows code pages contain 256 code points, numbered from 0 through 255. In most code pages, the code points 0 through 127 represent the same characters, which preserves continuity with legacy code. The code points 128 through 255 differ significantly between code pages.
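The split between the shared lower range and the code-page-specific upper range can be checked directly. The following is a minimal sketch, not part of the original topic: the class name CodePageRanges and the method CountDifferences are hypothetical, and code pages 1252 and 1253 are used only as sample inputs. The RegisterProvider call is an assumption about the runtime: it is needed on .NET Core and .NET 5 or later, and can be removed on the .NET Framework.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class CodePageRanges
{
    static CodePageRanges()
    {
        // Assumption: required on .NET Core / .NET 5+ so that Windows code
        // pages are available. Not needed on the .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Counts the byte values in [first, last] that decode to different
    // characters under the two code pages.
    public static int CountDifferences(int codePageA, int codePageB, int first, int last)
    {
        Encoding a = Encoding.GetEncoding(codePageA);
        Encoding b = Encoding.GetEncoding(codePageB);
        int differences = 0;
        for (int value = first; value <= last; value++)
        {
            byte[] single = { (byte)value };
            if (a.GetString(single) != b.GetString(single))
            {
                differences++;
            }
        }
        return differences;
    }

    public static void Main()
    {
        // The lower half is identical in both code pages; the upper half is not.
        Console.WriteLine("Differences in 0-127:   " + CountDifferences(1252, 1253, 0, 127));
        Console.WriteLine("Differences in 128-255: " + CountDifferences(1252, 1253, 128, 255));
    }
}
```

The first count is zero for this pair of code pages, while the second is greater than zero.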

For example, code page 1253 provides character codes that are required in the Greek writing system. Code page 1252 provides the characters for Latin writing systems including English, German, and French. The last 128 code points in code page 1253 contain the Greek characters, and the last 128 code points in code page 1252 contain the accented characters. As a result, you cannot store Greek and German in the same data stream unless you include an identifier that indicates the referenced code page.
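To illustrate why the identifier is needed, the sketch below decodes the same two bytes under both code pages. The TaggedText class is a hypothetical container invented for this example; it simply pairs the raw bytes with the code page number needed to read them. The RegisterProvider call is only required on .NET Core and .NET 5 or later.

```csharp
using System;
using System.Text;

// Hypothetical container: raw bytes plus the code page needed to read them.
public sealed class TaggedText
{
    static TaggedText()
    {
        // Assumption: required on .NET Core / .NET 5+; not needed on the .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    public int CodePage { get; private set; }
    public byte[] Data { get; private set; }

    public TaggedText(int codePage, byte[] data)
    {
        CodePage = codePage;
        Data = data;
    }

    // Decodes the bytes using the code page stored alongside them.
    public string Decode()
    {
        return Encoding.GetEncoding(CodePage).GetString(Data);
    }

    // Formats each character as U+XXXX so the result does not depend
    // on the console's own encoding.
    public static string ToCodePoints(string text)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in text)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        byte[] data = { 0x41, 0xE1 };

        // Latin: 'A' followed by 'a' with acute accent.
        Console.WriteLine(TaggedText.ToCodePoints(new TaggedText(1252, data).Decode())); // U+0041 U+00E1

        // Greek: 'A' followed by lowercase alpha.
        Console.WriteLine(TaggedText.ToCodePoints(new TaggedText(1253, data).Decode())); // U+0041 U+03B1
    }
}
```

Byte 0x41 lies in the shared lower range and decodes to the same character both times; byte 0xE1 lies in the upper range and its meaning depends entirely on the code page tag.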

The Double-Byte Character Sets (DBCS) scheme was developed for languages such as Chinese, Japanese, and Korean, whose writing systems contain more than 256 characters. In DBCS, a pair of code points (a double byte) represents each character. When handling DBCS data, the first byte of a DBCS character (the lead byte) is not processed by itself. It is processed in combination with the trail byte that follows immediately after it. This scheme still does not allow for the combination of two languages, such as Japanese and Chinese, in the same data stream because one pair of double-byte code points could represent different characters depending on the code page.
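The ambiguity described above can be observed by decoding one lead byte and trail byte pair under two DBCS code pages. This is a minimal sketch under stated assumptions: code page 932 (Japanese) and code page 936 (Simplified Chinese) are chosen as examples, the byte pair 0x88 0xEA is an arbitrary sample, and the class name DbcsDemo is hypothetical. The RegisterProvider call is only required on .NET Core and .NET 5 or later.

```csharp
using System;
using System.Text;

// Hypothetical demo class for illustration only.
public static class DbcsDemo
{
    static DbcsDemo()
    {
        // Assumption: required on .NET Core / .NET 5+; not needed on the .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Decodes the bytes with the given code page.
    public static string Decode(int codePage, byte[] data)
    {
        return Encoding.GetEncoding(codePage).GetString(data);
    }

    public static void Main()
    {
        // One lead byte followed by one trail byte.
        byte[] pair = { 0x88, 0xEA };

        string japanese = Decode(932, pair);
        string chinese = Decode(936, pair);

        // In code page 932 the pair forms a single character, U+4E00.
        Console.WriteLine("932 yields U+" + ((int)japanese[0]).ToString("X4")
            + " (" + japanese.Length + " character)");

        // The same two bytes do not decode to the same text in code page 936.
        Console.WriteLine("Same text in 936? " + (japanese == chinese));

        // DBCS code pages are not single-byte encodings.
        Console.WriteLine("932 is single-byte? " + Encoding.GetEncoding(932).IsSingleByte);
    }
}
```

Because the decoder consumes the lead byte and trail byte together, splitting a buffer between them would corrupt the character; any code that slices DBCS data has to respect pair boundaries.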

The .NET Framework provides support for characters encoded using code pages. You can use the Encoding.GetEncoding Method (Int32) to create a target encoding object for a specified code page. Specify a code page number as the Int32 parameter. The following code example creates an Encoding object named enc for code page 1252.

Encoding enc = Encoding.GetEncoding(1252);

After you create an encoding object that corresponds to a specified code page, you can use the object to perform other operations supported by the System.Text.Encoding class. For an example of using the Encoding class, see the "Using the Encoding Class" subtopic of the Using Unicode Encoding topic.
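As one possible illustration of such operations, the sketch below uses a code page 1252 encoding object to convert a string to bytes, decode it back, and transcode the bytes to UTF-8 with Encoding.Convert. The class name EncodingOperations, the helper ToUtf8, and the sample string are hypothetical. The RegisterProvider call is only required on .NET Core and .NET 5 or later.

```csharp
using System;
using System.Text;

// Hypothetical demo class for illustration only.
public static class EncodingOperations
{
    static EncodingOperations()
    {
        // Assumption: required on .NET Core / .NET 5+; not needed on the .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Transcodes bytes stored in the given code page to UTF-8.
    public static byte[] ToUtf8(int sourceCodePage, byte[] data)
    {
        Encoding source = Encoding.GetEncoding(sourceCodePage);
        return Encoding.Convert(source, Encoding.UTF8, data);
    }

    public static void Main()
    {
        Encoding enc = Encoding.GetEncoding(1252);

        // "Gruesse" spelled with u-umlaut and sharp s, written with escapes
        // so the source file encoding does not matter.
        string text = "Gr\u00FC\u00DFe";

        byte[] cp1252Bytes = enc.GetBytes(text);       // one byte per character
        string roundTrip = enc.GetString(cp1252Bytes); // back to a .NET string
        byte[] utf8Bytes = ToUtf8(1252, cp1252Bytes);  // two bytes for each non-ASCII letter

        Console.WriteLine(cp1252Bytes.Length); // 5
        Console.WriteLine(utf8Bytes.Length);   // 7
        Console.WriteLine(roundTrip == text);  // True
    }
}
```

The round trip succeeds here because every character in the sample string exists in code page 1252; text containing characters outside the code page would not survive the same round trip unchanged.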

See Also

Concepts

Unicode in the .NET Framework

Other Resources

Encoding and Localization