
Encoding Class

Represents a character encoding.

Namespace:  System.Text
Assembly:  mscorlib (in mscorlib.dll)

[SerializableAttribute]
[ComVisibleAttribute(true)]
public abstract class Encoding : ICloneable

The Encoding type exposes the following members.

Constructors

| Kind | Name | Description | Supported by |
| --- | --- | --- | --- |
| Protected method | Encoding() | Initializes a new instance of the Encoding class. | XNA Framework, Portable Class Library |
| Protected method | Encoding(Int32) | Initializes a new instance of the Encoding class that corresponds to the specified code page. | XNA Framework |

Properties

| Kind | Name | Description | Supported by |
| --- | --- | --- | --- |
| Public static property | ASCII | Gets an encoding for the ASCII (7-bit) character set. | XNA Framework |
| Public static property | BigEndianUnicode | Gets an encoding for the UTF-16 format that uses the big endian byte order. | XNA Framework, Portable Class Library |
| Public property | BodyName | When overridden in a derived class, gets a name for the current encoding that can be used with mail agent body tags. | |
| Public property | CodePage | When overridden in a derived class, gets the code page identifier of the current Encoding. | XNA Framework |
| Public property | DecoderFallback | Gets or sets the DecoderFallback object for the current Encoding object. | |
| Public static property | Default | Gets an encoding for the operating system's current ANSI code page. | XNA Framework |
| Public property | EncoderFallback | Gets or sets the EncoderFallback object for the current Encoding object. | |
| Public property | EncodingName | When overridden in a derived class, gets the human-readable description of the current encoding. | |
| Public property | HeaderName | When overridden in a derived class, gets a name for the current encoding that can be used with mail agent header tags. | |
| Public property | IsBrowserDisplay | When overridden in a derived class, gets a value indicating whether the current encoding can be used by browser clients for displaying content. | |
| Public property | IsBrowserSave | When overridden in a derived class, gets a value indicating whether the current encoding can be used by browser clients for saving content. | |
| Public property | IsMailNewsDisplay | When overridden in a derived class, gets a value indicating whether the current encoding can be used by mail and news clients for displaying content. | |
| Public property | IsMailNewsSave | When overridden in a derived class, gets a value indicating whether the current encoding can be used by mail and news clients for saving content. | |
| Public property | IsReadOnly | When overridden in a derived class, gets a value indicating whether the current encoding is read-only. | |
| Public property | IsSingleByte | When overridden in a derived class, gets a value indicating whether the current encoding uses single-byte code points. | |
| Public static property | Unicode | Gets an encoding for the UTF-16 format using the little endian byte order. | XNA Framework, Portable Class Library |
| Public static property | UTF32 | Gets an encoding for the UTF-32 format using the little endian byte order. | |
| Public static property | UTF7 | Gets an encoding for the UTF-7 format. | XNA Framework |
| Public static property | UTF8 | Gets an encoding for the UTF-8 format. | XNA Framework, Portable Class Library |
| Public property | WebName | When overridden in a derived class, gets the name registered with the Internet Assigned Numbers Authority (IANA) for the current encoding. | XNA Framework, Portable Class Library |
| Public property | WindowsCodePage | When overridden in a derived class, gets the Windows operating system code page that most closely corresponds to the current encoding. | |

Methods

| Kind | Name | Description | Supported by |
| --- | --- | --- | --- |
| Public method | Clone | When overridden in a derived class, creates a shallow copy of the current Encoding object. | XNA Framework |
| Public static method | Convert(Encoding, Encoding, Byte[]) | Converts an entire byte array from one encoding to another. | XNA Framework, Portable Class Library |
| Public static method | Convert(Encoding, Encoding, Byte[], Int32, Int32) | Converts a range of bytes in a byte array from one encoding to another. | XNA Framework, Portable Class Library |
| Public method | Equals | Determines whether the specified Object is equal to the current instance. (Overrides Object.Equals(Object).) | XNA Framework, Portable Class Library |
| Protected method | Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.) | XNA Framework, Portable Class Library |
| Public method | GetByteCount(Char[]) | When overridden in a derived class, calculates the number of bytes produced by encoding all the characters in the specified character array. | XNA Framework, Portable Class Library |
| Public method | GetByteCount(String) | When overridden in a derived class, calculates the number of bytes produced by encoding the characters in the specified string. | XNA Framework, Portable Class Library |
| Public method | GetByteCount(Char*, Int32) | When overridden in a derived class, calculates the number of bytes produced by encoding a set of characters starting at the specified character pointer. | |
| Public method | GetByteCount(Char[], Int32, Int32) | When overridden in a derived class, calculates the number of bytes produced by encoding a set of characters from the specified character array. | XNA Framework, Portable Class Library |
| Public method | GetBytes(Char[]) | When overridden in a derived class, encodes all the characters in the specified character array into a sequence of bytes. | XNA Framework, Portable Class Library |
| Public method | GetBytes(String) | When overridden in a derived class, encodes all the characters in the specified string into a sequence of bytes. | XNA Framework, Portable Class Library |
| Public method | GetBytes(Char[], Int32, Int32) | When overridden in a derived class, encodes a set of characters from the specified character array into a sequence of bytes. | XNA Framework, Portable Class Library |
| Public method | GetBytes(Char*, Int32, Byte*, Int32) | When overridden in a derived class, encodes a set of characters starting at the specified character pointer into a sequence of bytes that are stored starting at the specified byte pointer. | |
| Public method | GetBytes(Char[], Int32, Int32, Byte[], Int32) | When overridden in a derived class, encodes a set of characters from the specified character array into the specified byte array. | XNA Framework, Portable Class Library |
| Public method | GetBytes(String, Int32, Int32, Byte[], Int32) | When overridden in a derived class, encodes a set of characters from the specified string into the specified byte array. | XNA Framework, Portable Class Library |
| Public method | GetCharCount(Byte[]) | When overridden in a derived class, calculates the number of characters produced by decoding all the bytes in the specified byte array. | XNA Framework, Portable Class Library |
| Public method | GetCharCount(Byte*, Int32) | When overridden in a derived class, calculates the number of characters produced by decoding a sequence of bytes starting at the specified byte pointer. | |
| Public method | GetCharCount(Byte[], Int32, Int32) | When overridden in a derived class, calculates the number of characters produced by decoding a sequence of bytes from the specified byte array. | XNA Framework, Portable Class Library |
| Public method | GetChars(Byte[]) | When overridden in a derived class, decodes all the bytes in the specified byte array into a set of characters. | XNA Framework, Portable Class Library |
| Public method | GetChars(Byte[], Int32, Int32) | When overridden in a derived class, decodes a sequence of bytes from the specified byte array into a set of characters. | XNA Framework, Portable Class Library |
| Public method | GetChars(Byte*, Int32, Char*, Int32) | When overridden in a derived class, decodes a sequence of bytes starting at the specified byte pointer into a set of characters that are stored starting at the specified character pointer. | |
| Public method | GetChars(Byte[], Int32, Int32, Char[], Int32) | When overridden in a derived class, decodes a sequence of bytes from the specified byte array into the specified character array. | XNA Framework, Portable Class Library |
| Public method | GetDecoder | When overridden in a derived class, obtains a decoder that converts an encoded sequence of bytes into a sequence of characters. | XNA Framework, Portable Class Library |
| Public method | GetEncoder | When overridden in a derived class, obtains an encoder that converts a sequence of Unicode characters into an encoded sequence of bytes. | XNA Framework, Portable Class Library |
| Public static method | GetEncoding(Int32) | Returns the encoding associated with the specified code page identifier. | XNA Framework |
| Public static method | GetEncoding(String) | Returns the encoding associated with the specified code page name. | XNA Framework, Portable Class Library |
| Public static method | GetEncoding(Int32, EncoderFallback, DecoderFallback) | Returns the encoding associated with the specified code page identifier. Parameters specify an error handler for characters that cannot be encoded and byte sequences that cannot be decoded. | |
| Public static method | GetEncoding(String, EncoderFallback, DecoderFallback) | Returns the encoding associated with the specified code page name. Parameters specify an error handler for characters that cannot be encoded and byte sequences that cannot be decoded. | |
| Public static method | GetEncodings | Returns an array that contains all encodings. | |
| Public method | GetHashCode | Returns the hash code for the current instance. (Overrides Object.GetHashCode().) | XNA Framework, Portable Class Library |
| Public method | GetMaxByteCount | When overridden in a derived class, calculates the maximum number of bytes produced by encoding the specified number of characters. | XNA Framework, Portable Class Library |
| Public method | GetMaxCharCount | When overridden in a derived class, calculates the maximum number of characters produced by decoding the specified number of bytes. | XNA Framework, Portable Class Library |
| Public method | GetPreamble | When overridden in a derived class, returns a sequence of bytes that specifies the encoding used. | XNA Framework, Portable Class Library |
| Public method | GetString(Byte[]) | When overridden in a derived class, decodes all the bytes in the specified byte array into a string. | |
| Public method | GetString(Byte[], Int32, Int32) | When overridden in a derived class, decodes a sequence of bytes from the specified byte array into a string. | XNA Framework, Portable Class Library |
| Public method | GetType | Gets the Type of the current instance. (Inherited from Object.) | XNA Framework, Portable Class Library |
| Public method | IsAlwaysNormalized() | Gets a value indicating whether the current encoding is always normalized, using the default normalization form. | |
| Public method | IsAlwaysNormalized(NormalizationForm) | When overridden in a derived class, gets a value indicating whether the current encoding is always normalized, using the specified normalization form. | |
| Protected method | MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.) | XNA Framework, Portable Class Library |
| Public method | ToString | Returns a string that represents the current object. (Inherited from Object.) | XNA Framework, Portable Class Library |

Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. In contrast, decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters. For information about the Unicode Transformation Formats (UTFs) and other encodings supported by Encoding, see Character Encoding in the .NET Framework.

Note that Encoding is intended to operate on Unicode characters instead of arbitrary binary data, such as byte arrays. If your application must encode arbitrary binary data into text, it should use a binary-to-text scheme such as Base64, which is implemented by methods such as Convert.ToBase64CharArray.
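
The following is a minimal sketch of that distinction, not an official sample. The class and method names (BinaryToText, ToText, FromText) are hypothetical; Convert.ToBase64String and Convert.FromBase64String are used here as the string-based counterparts of Convert.ToBase64CharArray. The last lines assume the default replacement behavior of Encoding.UTF8 to show why decoding arbitrary bytes as text does not round-trip.

```csharp
using System;
using System.Text;

public static class BinaryToText
{
    // Represents arbitrary bytes as Base64 text (hypothetical helper).
    public static string ToText(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Recovers the original bytes from Base64 text (hypothetical helper).
    public static byte[] FromText(string text)
    {
        return Convert.FromBase64String(text);
    }

    public static void Main()
    {
        // 0xFF and 0xFE are not valid UTF-8 sequences on their own.
        byte[] data = { 0x00, 0xFF, 0xFE, 0x41 };

        string text = ToText(data);
        Console.WriteLine(text);                                 // AP/+QQ==
        Console.WriteLine(BitConverter.ToString(FromText(text))); // 00-FF-FE-41

        // Treating the same bytes as UTF-8 text loses information:
        // the invalid bytes are replaced while decoding.
        byte[] damaged = Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(data));
        Console.WriteLine(damaged.Length == data.Length);        // False
    }
}
```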

The .NET Framework provides the following implementations of the Encoding class to support current Unicode encodings and other encodings:

  • ASCIIEncoding encodes Unicode characters as single 7-bit ASCII characters. This encoding only supports character values between U+0000 and U+007F. Code page 20127. Also available through the ASCII property.

  • UTF7Encoding encodes Unicode characters using the UTF-7 encoding. This encoding supports all Unicode character values. Code page 65000. Also available through the UTF7 property.

  • UTF8Encoding encodes Unicode characters using the UTF-8 encoding. This encoding supports all Unicode character values. Code page 65001. Also available through the UTF8 property.

  • UnicodeEncoding encodes Unicode characters using the UTF-16 encoding. Both little endian (code page 1200) and big endian (code page 1201) byte orders are supported. Also available through the Unicode property and the BigEndianUnicode property.

  • UTF32Encoding encodes Unicode characters using the UTF-32 encoding. Both little endian (code page 12000) and big endian (code page 12001) byte orders are supported. Also available through the UTF32 property.

The Encoding class is primarily intended to convert between different encodings and Unicode. Often one of the derived Unicode classes is the correct choice for your application.

Use the GetEncoding method to obtain other encodings, and the GetEncodings method to get a list of all available encodings.
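
As a rough illustration of these lookups, the sketch below obtains encodings by name and by code page and then enumerates what the runtime reports. The class name EncodingLookup and the helper ByteCount are hypothetical, and only names that appear in the table of supported encodings (us-ascii, utf-8, utf-16, utf-32) are used. The list printed by GetEncodings depends on the platform and runtime.

```csharp
using System;
using System.Text;

public static class EncodingLookup
{
    // Number of bytes the named encoding produces for the text (hypothetical helper).
    public static int ByteCount(string encodingName, string text)
    {
        Encoding encoding = Encoding.GetEncoding(encodingName);
        return encoding.GetByteCount(text);
    }

    public static void Main()
    {
        string text = "Pi (\u03a0)"; // six characters, one of them outside ASCII

        // Look up encodings by name and compare their output sizes.
        foreach (string name in new[] { "us-ascii", "utf-8", "utf-16", "utf-32" })
        {
            Encoding e = Encoding.GetEncoding(name);
            Console.WriteLine("{0,-10} code page {1,5}: {2} bytes",
                e.WebName, e.CodePage, e.GetByteCount(text));
        }

        // Look up an encoding by code page identifier.
        Console.WriteLine(Encoding.GetEncoding(65001).WebName);

        // Enumerate the encodings that this runtime reports.
        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            Console.WriteLine("{0,6} {1}", info.CodePage, info.Name);
        }
    }
}
```

With this input, the ASCII encoding yields 6 bytes (the Pi character is replaced), UTF-8 yields 7, UTF-16 yields 12, and UTF-32 yields 24.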

The following table lists the supported encodings and their associated code pages. An asterisk in the last column indicates that the code page is natively supported by the .NET Framework, regardless of the underlying platform.

| Code page | Name | Display name | Natively supported |
| --- | --- | --- | --- |
| 37 | IBM037 | IBM EBCDIC (US-Canada) | |
| 437 | IBM437 | OEM United States | |
| 500 | IBM500 | IBM EBCDIC (International) | |
| 708 | ASMO-708 | Arabic (ASMO 708) | |
| 720 | DOS-720 | Arabic (DOS) | |
| 737 | ibm737 | Greek (DOS) | |
| 775 | ibm775 | Baltic (DOS) | |
| 850 | ibm850 | Western European (DOS) | |
| 852 | ibm852 | Central European (DOS) | |
| 855 | IBM855 | OEM Cyrillic | |
| 857 | ibm857 | Turkish (DOS) | |
| 858 | IBM00858 | OEM Multilingual Latin I | |
| 860 | IBM860 | Portuguese (DOS) | |
| 861 | ibm861 | Icelandic (DOS) | |
| 862 | DOS-862 | Hebrew (DOS) | |
| 863 | IBM863 | French Canadian (DOS) | |
| 864 | IBM864 | Arabic (864) | |
| 865 | IBM865 | Nordic (DOS) | |
| 866 | cp866 | Cyrillic (DOS) | |
| 869 | ibm869 | Greek, Modern (DOS) | |
| 870 | IBM870 | IBM EBCDIC (Multilingual Latin-2) | |
| 874 | windows-874 | Thai (Windows) | |
| 875 | cp875 | IBM EBCDIC (Greek Modern) | |
| 932 | shift_jis | Japanese (Shift-JIS) | |
| 936 | gb2312 | Chinese Simplified (GB2312) | * |
| 949 | ks_c_5601-1987 | Korean | |
| 950 | big5 | Chinese Traditional (Big5) | |
| 1026 | IBM1026 | IBM EBCDIC (Turkish Latin-5) | |
| 1047 | IBM01047 | IBM Latin-1 | |
| 1140 | IBM01140 | IBM EBCDIC (US-Canada-Euro) | |
| 1141 | IBM01141 | IBM EBCDIC (Germany-Euro) | |
| 1142 | IBM01142 | IBM EBCDIC (Denmark-Norway-Euro) | |
| 1143 | IBM01143 | IBM EBCDIC (Finland-Sweden-Euro) | |
| 1144 | IBM01144 | IBM EBCDIC (Italy-Euro) | |
| 1145 | IBM01145 | IBM EBCDIC (Spain-Euro) | |
| 1146 | IBM01146 | IBM EBCDIC (UK-Euro) | |
| 1147 | IBM01147 | IBM EBCDIC (France-Euro) | |
| 1148 | IBM01148 | IBM EBCDIC (International-Euro) | |
| 1149 | IBM01149 | IBM EBCDIC (Icelandic-Euro) | |
| 1200 | utf-16 | Unicode | * |
| 1201 | unicodeFFFE | Unicode (Big endian) | * |
| 1250 | windows-1250 | Central European (Windows) | |
| 1251 | windows-1251 | Cyrillic (Windows) | |
| 1252 | Windows-1252 | Western European (Windows) | * |
| 1253 | windows-1253 | Greek (Windows) | |
| 1254 | windows-1254 | Turkish (Windows) | |
| 1255 | windows-1255 | Hebrew (Windows) | |
| 1256 | windows-1256 | Arabic (Windows) | |
| 1257 | windows-1257 | Baltic (Windows) | |
| 1258 | windows-1258 | Vietnamese (Windows) | |
| 1361 | Johab | Korean (Johab) | |
| 10000 | macintosh | Western European (Mac) | |
| 10001 | x-mac-japanese | Japanese (Mac) | |
| 10002 | x-mac-chinesetrad | Chinese Traditional (Mac) | |
| 10003 | x-mac-korean | Korean (Mac) | * |
| 10004 | x-mac-arabic | Arabic (Mac) | |
| 10005 | x-mac-hebrew | Hebrew (Mac) | |
| 10006 | x-mac-greek | Greek (Mac) | |
| 10007 | x-mac-cyrillic | Cyrillic (Mac) | |
| 10008 | x-mac-chinesesimp | Chinese Simplified (Mac) | * |
| 10010 | x-mac-romanian | Romanian (Mac) | |
| 10017 | x-mac-ukrainian | Ukrainian (Mac) | |
| 10021 | x-mac-thai | Thai (Mac) | |
| 10029 | x-mac-ce | Central European (Mac) | |
| 10079 | x-mac-icelandic | Icelandic (Mac) | |
| 10081 | x-mac-turkish | Turkish (Mac) | |
| 10082 | x-mac-croatian | Croatian (Mac) | |
| 12000 | utf-32 | Unicode (UTF-32) | * |
| 12001 | utf-32BE | Unicode (UTF-32 Big endian) | * |
| 20000 | x-Chinese-CNS | Chinese Traditional (CNS) | |
| 20001 | x-cp20001 | TCA Taiwan | |
| 20002 | x-Chinese-Eten | Chinese Traditional (Eten) | |
| 20003 | x-cp20003 | IBM5550 Taiwan | |
| 20004 | x-cp20004 | TeleText Taiwan | |
| 20005 | x-cp20005 | Wang Taiwan | |
| 20105 | x-IA5 | Western European (IA5) | |
| 20106 | x-IA5-German | German (IA5) | |
| 20107 | x-IA5-Swedish | Swedish (IA5) | |
| 20108 | x-IA5-Norwegian | Norwegian (IA5) | |
| 20127 | us-ascii | US-ASCII | * |
| 20261 | x-cp20261 | T.61 | |
| 20269 | x-cp20269 | ISO-6937 | |
| 20273 | IBM273 | IBM EBCDIC (Germany) | |
| 20277 | IBM277 | IBM EBCDIC (Denmark-Norway) | |
| 20278 | IBM278 | IBM EBCDIC (Finland-Sweden) | |
| 20280 | IBM280 | IBM EBCDIC (Italy) | |
| 20284 | IBM284 | IBM EBCDIC (Spain) | |
| 20285 | IBM285 | IBM EBCDIC (UK) | |
| 20290 | IBM290 | IBM EBCDIC (Japanese katakana) | |
| 20297 | IBM297 | IBM EBCDIC (France) | |
| 20420 | IBM420 | IBM EBCDIC (Arabic) | |
| 20423 | IBM423 | IBM EBCDIC (Greek) | |
| 20424 | IBM424 | IBM EBCDIC (Hebrew) | |
| 20833 | x-EBCDIC-KoreanExtended | IBM EBCDIC (Korean Extended) | |
| 20838 | IBM-Thai | IBM EBCDIC (Thai) | |
| 20866 | koi8-r | Cyrillic (KOI8-R) | |
| 20871 | IBM871 | IBM EBCDIC (Icelandic) | |
| 20880 | IBM880 | IBM EBCDIC (Cyrillic Russian) | |
| 20905 | IBM905 | IBM EBCDIC (Turkish) | |
| 20924 | IBM00924 | IBM Latin-1 | |
| 20932 | EUC-JP | Japanese (JIS 0208-1990 and 0212-1990) | |
| 20936 | x-cp20936 | Chinese Simplified (GB2312-80) | * |
| 20949 | x-cp20949 | Korean Wansung | * |
| 21025 | cp1025 | IBM EBCDIC (Cyrillic Serbian-Bulgarian) | |
| 21866 | koi8-u | Cyrillic (KOI8-U) | |
| 28591 | iso-8859-1 | Western European (ISO) | * |
| 28592 | iso-8859-2 | Central European (ISO) | |
| 28593 | iso-8859-3 | Latin 3 (ISO) | |
| 28594 | iso-8859-4 | Baltic (ISO) | |
| 28595 | iso-8859-5 | Cyrillic (ISO) | |
| 28596 | iso-8859-6 | Arabic (ISO) | |
| 28597 | iso-8859-7 | Greek (ISO) | |
| 28598 | iso-8859-8 | Hebrew (ISO-Visual) | * |
| 28599 | iso-8859-9 | Turkish (ISO) | |
| 28603 | iso-8859-13 | Estonian (ISO) | |
| 28605 | iso-8859-15 | Latin 9 (ISO) | |
| 29001 | x-Europa | Europa | |
| 38598 | iso-8859-8-i | Hebrew (ISO-Logical) | * |
| 50220 | iso-2022-jp | Japanese (JIS) | * |
| 50221 | csISO2022JP | Japanese (JIS-Allow 1 byte Kana) | * |
| 50222 | iso-2022-jp | Japanese (JIS-Allow 1 byte Kana - SO/SI) | * |
| 50225 | iso-2022-kr | Korean (ISO) | * |
| 50227 | x-cp50227 | Chinese Simplified (ISO-2022) | * |
| 51932 | euc-jp | Japanese (EUC) | * |
| 51936 | EUC-CN | Chinese Simplified (EUC) | * |
| 51949 | euc-kr | Korean (EUC) | * |
| 52936 | hz-gb-2312 | Chinese Simplified (HZ) | * |
| 54936 | GB18030 | Chinese Simplified (GB18030) | * |
| 57002 | x-iscii-de | ISCII Devanagari | * |
| 57003 | x-iscii-be | ISCII Bengali | * |
| 57004 | x-iscii-ta | ISCII Tamil | * |
| 57005 | x-iscii-te | ISCII Telugu | * |
| 57006 | x-iscii-as | ISCII Assamese | * |
| 57007 | x-iscii-or | ISCII Oriya | * |
| 57008 | x-iscii-ka | ISCII Kannada | * |
| 57009 | x-iscii-ma | ISCII Malayalam | * |
| 57010 | x-iscii-gu | ISCII Gujarati | * |
| 57011 | x-iscii-pa | ISCII Punjabi | * |
| 65000 | utf-7 | Unicode (UTF-7) | * |
| 65001 | utf-8 | Unicode (UTF-8) | * |

If the data to be converted is available only in sequential blocks (such as data read from a stream) or if the amount of data is so large that it needs to be divided into smaller blocks, your application should use the Decoder or the Encoder provided by the GetDecoder method or the GetEncoder method, respectively, of a derived class.
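
The sketch below shows why a stateful Decoder matters for block-wise input. It assumes the data arrives as an array of byte blocks; the class name ChunkedDecoding and the helper DecodeBlocks are hypothetical. The sample input splits the two-byte UTF-8 sequence for U+03A0 (CE A0) across two blocks, which the Decoder handles because it remembers the pending byte between calls.

```csharp
using System;
using System.Text;

public static class ChunkedDecoding
{
    // Decodes the blocks in order with a single Decoder so that a character
    // whose bytes span two blocks is still decoded correctly (hypothetical helper).
    public static string DecodeBlocks(Encoding encoding, byte[][] blocks)
    {
        Decoder decoder = encoding.GetDecoder();
        StringBuilder result = new StringBuilder();

        foreach (byte[] block in blocks)
        {
            // GetCharCount takes the decoder's pending state into account.
            char[] chars = new char[decoder.GetCharCount(block, 0, block.Length)];
            int count = decoder.GetChars(block, 0, block.Length, chars, 0);
            result.Append(chars, 0, count);
        }

        return result.ToString();
    }

    public static void Main()
    {
        byte[][] blocks =
        {
            new byte[] { 0x50, 0x69, 0xCE }, // "Pi" plus the first byte of U+03A0
            new byte[] { 0xA0, 0x21 }        // the second byte of U+03A0 plus "!"
        };

        string decoded = DecodeBlocks(Encoding.UTF8, blocks);
        Console.WriteLine(decoded == "Pi\u03a0!"); // True
    }
}
```

Calling Encoding.UTF8.GetString on each block separately would instead treat the split bytes as invalid data, since each call starts with no memory of the previous block.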

The UTF-16 and the UTF-32 encoders can use the big endian byte order (most significant byte first) or the little endian byte order (least significant byte first). For example, the Latin Capital Letter A (U+0041) is serialized as follows (in hexadecimal):

  • UTF-16 big endian byte order: 00 41

  • UTF-16 little endian byte order: 41 00

  • UTF-32 big endian byte order: 00 00 00 41

  • UTF-32 little endian byte order: 41 00 00 00
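
The byte sequences listed above can be reproduced with a short sketch such as the following. The class name ByteOrderDemo and the helper Hex are hypothetical. Big endian UTF-32 is obtained here through the UTF32Encoding(bigEndian, byteOrderMark) constructor, because the UTF32 property returns the little endian form.

```csharp
using System;
using System.Text;

public static class ByteOrderDemo
{
    // Encodes the text and formats the bytes as space-separated hex (hypothetical helper).
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text)).Replace("-", " ");
    }

    public static void Main()
    {
        string a = "A"; // Latin Capital Letter A (U+0041)

        Console.WriteLine("UTF-16 big endian:    " + Hex(Encoding.BigEndianUnicode, a));        // 00 41
        Console.WriteLine("UTF-16 little endian: " + Hex(Encoding.Unicode, a));                 // 41 00
        Console.WriteLine("UTF-32 big endian:    " + Hex(new UTF32Encoding(true, false), a));   // 00 00 00 41
        Console.WriteLine("UTF-32 little endian: " + Hex(Encoding.UTF32, a));                   // 41 00 00 00
    }
}
```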

It is generally more efficient to store Unicode characters using the native byte order. For example, it is better to use the little endian byte order on little endian platforms, such as Intel computers.

The GetPreamble method retrieves an array of bytes that includes the byte order mark (BOM). If this byte array is prefixed to an encoded stream, it helps the decoder to identify the encoding format used.
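
As an illustration, the sketch below prints the preamble of several encodings obtained from the static properties and prefixes one to encoded data. The class name PreambleDemo and the helpers PreambleHex and EncodeWithPreamble are hypothetical. Note that GetBytes does not emit the preamble itself, so the caller has to write it explicitly.

```csharp
using System;
using System.Text;

public static class PreambleDemo
{
    // Formats an encoding's preamble as hex; empty if there is none (hypothetical helper).
    public static string PreambleHex(Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetPreamble());
    }

    // Returns the preamble followed by the encoded text (hypothetical helper).
    public static byte[] EncodeWithPreamble(Encoding encoding, string text)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);
        byte[] result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }

    public static void Main()
    {
        Console.WriteLine("UTF-8:     " + PreambleHex(Encoding.UTF8));             // EF-BB-BF
        Console.WriteLine("UTF-16 LE: " + PreambleHex(Encoding.Unicode));          // FF-FE
        Console.WriteLine("UTF-16 BE: " + PreambleHex(Encoding.BigEndianUnicode)); // FE-FF
        Console.WriteLine("UTF-32 LE: " + PreambleHex(Encoding.UTF32));            // FF-FE-00-00
        Console.WriteLine("ASCII:     " + PreambleHex(Encoding.ASCII));            // (empty)

        byte[] data = EncodeWithPreamble(Encoding.UTF8, "A");
        Console.WriteLine(BitConverter.ToString(data));                            // EF-BB-BF-41
    }
}
```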

For more information on byte order and the byte order mark, see The Unicode Standard at the Unicode home page.

Note that the encoding classes can handle characters that cannot be encoded, and byte sequences that cannot be decoded, in the following ways:

  • Silently replace the data with a "?" character.

  • Substitute a "best fit" character.

  • Apply an application-specific behavior through the EncoderFallback and DecoderFallback classes, for example by substituting the U+FFFD Unicode replacement character.

It is recommended that your applications throw exceptions on all data stream errors. To do so, either use a "throwonerror" flag when applicable or use the EncoderExceptionFallback and DecoderExceptionFallback classes. Best fit fallback is often not recommended because it can cause data loss or confusion and is slower than simple character replacements. For ANSI encodings, the best fit behavior is the default.
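
A minimal sketch of the exception-based approach follows. It uses the GetEncoding(String, EncoderFallback, DecoderFallback) overload with the exception fallback classes; the class name StrictEncoding and the helper CanEncode are hypothetical, and the choice of us-ascii as the target is only for illustration.

```csharp
using System;
using System.Text;

public static class StrictEncoding
{
    // Returns true if every character in the text can be represented
    // in the named encoding (hypothetical helper).
    public static bool CanEncode(string encodingName, string text)
    {
        Encoding strict = Encoding.GetEncoding(
            encodingName,
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());

        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            // Thrown instead of silently substituting a replacement character.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanEncode("us-ascii", "Pi"));        // True
        Console.WriteLine(CanEncode("us-ascii", "Pi \u03a0")); // False

        // For comparison, Encoding.ASCII replaces the character silently.
        byte[] replaced = Encoding.ASCII.GetBytes("\u03a0");
        Console.WriteLine(Encoding.ASCII.GetString(replaced)); // ?
    }
}
```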

The following example converts a string from one encoding to another.

Note

The byte[] array is the only type in this example that contains the encoded data. The .NET Char and String types are themselves Unicode, so the GetChars call decodes the data back to Unicode.


using System;
using System.Text;

class Example
{
   static void Main()
   {
      string unicodeString = "This string contains the unicode character Pi (\u03a0)";

      // Create two different encodings.
      Encoding ascii = Encoding.ASCII;
      Encoding unicode = Encoding.Unicode;

      // Convert the string into a byte array.
      byte[] unicodeBytes = unicode.GetBytes(unicodeString);

      // Perform the conversion from one encoding to the other.
      byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);

      // Convert the new byte[] into a char[] and then into a string.
      char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
      ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
      string asciiString = new string(asciiChars);

      // Display the strings created before and after the conversion.
      Console.WriteLine("Original string: {0}", unicodeString);
      Console.WriteLine("Ascii converted string: {0}", asciiString);
   }
}
// The example displays the following output:
//    Original string: This string contains the unicode character Pi (Π)
//    Ascii converted string: This string contains the unicode character Pi (?)


.NET Framework

Supported in: 4, 3.5, 3.0, 2.0, 1.1, 1.0

.NET Framework Client Profile

Supported in: 4, 3.5 SP1

Portable Class Library

Supported in: Portable Class Library

Windows 7, Windows Vista SP1 or later, Windows XP SP3, Windows XP SP2 x64 Edition, Windows Server 2008 (Server Core not supported), Windows Server 2008 R2 (Server Core supported with SP1 or later), Windows Server 2003 SP2

The .NET Framework does not support all versions of every platform. For a list of the supported versions, see .NET Framework System Requirements.

Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

© 2014 Microsoft