
UnicodeEncoding Class

Represents a UTF-16 encoding of Unicode characters.

System.Object
  System.Text.Encoding
    System.Text.UnicodeEncoding

Namespace:  System.Text
Assembly:  mscorlib (in mscorlib.dll)
[SerializableAttribute]
[ComVisibleAttribute(true)]
public class UnicodeEncoding : Encoding

The UnicodeEncoding type exposes the following members.

  Name Description
Public method Supported by the XNA Framework Supported by Portable Class Library UnicodeEncoding() Initializes a new instance of the UnicodeEncoding class.
Public method Supported by the XNA Framework Supported by Portable Class Library UnicodeEncoding(Boolean, Boolean) Initializes a new instance of the UnicodeEncoding class. Parameters specify whether to use the big endian byte order and whether to provide a Unicode byte order mark.
Public method Supported by the XNA Framework Supported by Portable Class Library UnicodeEncoding(Boolean, Boolean, Boolean) Initializes a new instance of the UnicodeEncoding class. Parameters specify whether to use the big endian byte order, whether to provide a Unicode byte order mark, and whether to throw an exception when an invalid encoding is detected.
  Name Description
Public property BodyName When overridden in a derived class, gets a name for the current encoding that can be used with mail agent body tags. (Inherited from Encoding.)
Public property Supported by the XNA Framework CodePage When overridden in a derived class, gets the code page identifier of the current Encoding. (Inherited from Encoding.)
Public property DecoderFallback Gets or sets the DecoderFallback object for the current Encoding object. (Inherited from Encoding.)
Public property EncoderFallback Gets or sets the EncoderFallback object for the current Encoding object. (Inherited from Encoding.)
Public property EncodingName When overridden in a derived class, gets the human-readable description of the current encoding. (Inherited from Encoding.)
Public property HeaderName When overridden in a derived class, gets a name for the current encoding that can be used with mail agent header tags. (Inherited from Encoding.)
Public property IsBrowserDisplay When overridden in a derived class, gets a value indicating whether the current encoding can be used by browser clients for displaying content. (Inherited from Encoding.)
Public property IsBrowserSave When overridden in a derived class, gets a value indicating whether the current encoding can be used by browser clients for saving content. (Inherited from Encoding.)
Public property IsMailNewsDisplay When overridden in a derived class, gets a value indicating whether the current encoding can be used by mail and news clients for displaying content. (Inherited from Encoding.)
Public property IsMailNewsSave When overridden in a derived class, gets a value indicating whether the current encoding can be used by mail and news clients for saving content. (Inherited from Encoding.)
Public property IsReadOnly When overridden in a derived class, gets a value indicating whether the current encoding is read-only. (Inherited from Encoding.)
Public property IsSingleByte When overridden in a derived class, gets a value indicating whether the current encoding uses single-byte code points. (Inherited from Encoding.)
Public property Supported by the XNA Framework Supported by Portable Class Library WebName When overridden in a derived class, gets the name registered with the Internet Assigned Numbers Authority (IANA) for the current encoding. (Inherited from Encoding.)
Public property WindowsCodePage When overridden in a derived class, gets the Windows operating system code page that most closely corresponds to the current encoding. (Inherited from Encoding.)
  Name Description
Public method Supported by the XNA Framework Clone When overridden in a derived class, creates a shallow copy of the current Encoding object. (Inherited from Encoding.)
Public method Supported by the XNA Framework Supported by Portable Class Library Equals Determines whether the specified Object is equal to the current UnicodeEncoding object. (Overrides Encoding.Equals(Object).)
Protected method Supported by the XNA Framework Supported by Portable Class Library Finalize Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
Public method Supported by the XNA Framework Supported by Portable Class Library GetByteCount(Char[]) When overridden in a derived class, calculates the number of bytes produced by encoding all the characters in the specified character array. (Inherited from Encoding.)
Public method Supported by the XNA Framework Supported by Portable Class Library GetByteCount(String) Calculates the number of bytes produced by encoding the characters in the specified String. (Overrides Encoding.GetByteCount(String).)
Public method GetByteCount(Char*, Int32) Calculates the number of bytes produced by encoding a set of characters starting at the specified character pointer. (Overrides Encoding.GetByteCount(Char*, Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetByteCount(Char[], Int32, Int32) Calculates the number of bytes produced by encoding a set of characters from the specified character array. (Overrides Encoding.GetByteCount(Char[], Int32, Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetBytes(Char[]) When overridden in a derived class, encodes all the characters in the specified character array into a sequence of bytes. (Inherited from Encoding.)
Public method Supported by the XNA Framework Supported by Portable Class Library GetBytes(String) When overridden in a derived class, encodes all the characters in the specified string into a sequence of bytes. (Inherited from Encoding.)
Public method Supported by the XNA Framework Supported by Portable Class Library GetBytes(Char[], Int32, Int32) When overridden in a derived class, encodes a set of characters from the specified character array into a sequence of bytes. (Inherited from Encoding.)
Public method GetBytes(Char*, Int32, Byte*, Int32) Encodes a set of characters starting at the specified character pointer into a sequence of bytes that are stored starting at the specified byte pointer. (Overrides Encoding.GetBytes(Char*, Int32, Byte*, Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetBytes(Char[], Int32, Int32, Byte[], Int32) Encodes a set of characters from the specified character array into the specified byte array. (Overrides Encoding.GetBytes(Char[], Int32, Int32, Byte[], Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetBytes(String, Int32, Int32, Byte[], Int32) Encodes a set of characters from the specified String into the specified byte array. (Overrides Encoding.GetBytes(String, Int32, Int32, Byte[], Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetCharCount(Byte[]) When overridden in a derived class, calculates the number of characters produced by decoding all the bytes in the specified byte array. (Inherited from Encoding.)
Public method GetCharCount(Byte*, Int32) Calculates the number of characters produced by decoding a sequence of bytes starting at the specified byte pointer. (Overrides Encoding.GetCharCount(Byte*, Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetCharCount(Byte[], Int32, Int32) Calculates the number of characters produced by decoding a sequence of bytes from the specified byte array. (Overrides Encoding.GetCharCount(Byte[], Int32, Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetChars(Byte[]) When overridden in a derived class, decodes all the bytes in the specified byte array into a set of characters. (Inherited from Encoding.)
Public method Supported by the XNA Framework Supported by Portable Class Library GetChars(Byte[], Int32, Int32) When overridden in a derived class, decodes a sequence of bytes from the specified byte array into a set of characters. (Inherited from Encoding.)
Public method GetChars(Byte*, Int32, Char*, Int32) Decodes a sequence of bytes starting at the specified byte pointer into a set of characters that are stored starting at the specified character pointer. (Overrides Encoding.GetChars(Byte*, Int32, Char*, Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetChars(Byte[], Int32, Int32, Char[], Int32) Decodes a sequence of bytes from the specified byte array into the specified character array. (Overrides Encoding.GetChars(Byte[], Int32, Int32, Char[], Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetDecoder Obtains a decoder that converts a UTF-16 encoded sequence of bytes into a sequence of Unicode characters. (Overrides Encoding.GetDecoder().)
Public method Supported by the XNA Framework Supported by Portable Class Library GetEncoder Obtains an encoder that converts a sequence of Unicode characters into a UTF-16 encoded sequence of bytes. (Overrides Encoding.GetEncoder().)
Public method Supported by the XNA Framework Supported by Portable Class Library GetHashCode Returns the hash code for the current instance. (Overrides Encoding.GetHashCode().)
Public method Supported by the XNA Framework Supported by Portable Class Library GetMaxByteCount Calculates the maximum number of bytes produced by encoding the specified number of characters. (Overrides Encoding.GetMaxByteCount(Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetMaxCharCount Calculates the maximum number of characters produced by decoding the specified number of bytes. (Overrides Encoding.GetMaxCharCount(Int32).)
Public method Supported by the XNA Framework Supported by Portable Class Library GetPreamble Returns a Unicode byte order mark encoded in UTF-16 format, if the constructor for this instance requests a byte order mark. (Overrides Encoding.GetPreamble().)
Public method GetString(Byte[]) When overridden in a derived class, decodes all the bytes in the specified byte array into a string. (Inherited from Encoding.)
Public method Supported by the XNA Framework Supported by Portable Class Library GetString(Byte[], Int32, Int32) Decodes a range of bytes from a byte array into a string. (Overrides Encoding.GetString(Byte[], Int32, Int32).)

In XNA Framework 3.0, this member is inherited from Encoding.GetString(Byte[], Int32, Int32).
In Portable Class Library, this member is inherited from Encoding.GetString(Byte[], Int32, Int32).
Public method Supported by the XNA Framework Supported by Portable Class Library GetType Gets the Type of the current instance. (Inherited from Object.)
Public method IsAlwaysNormalized() Gets a value indicating whether the current encoding is always normalized, using the default normalization form. (Inherited from Encoding.)
Public method IsAlwaysNormalized(NormalizationForm) When overridden in a derived class, gets a value indicating whether the current encoding is always normalized, using the specified normalization form. (Inherited from Encoding.)
Protected method Supported by the XNA Framework Supported by Portable Class Library MemberwiseClone Creates a shallow copy of the current Object. (Inherited from Object.)
Public method Supported by the XNA Framework Supported by Portable Class Library ToString Returns a string that represents the current object. (Inherited from Object.)
  Name Description
Public field Static member Supported by the XNA Framework CharSize Represents the Unicode version 2.0 character size in bytes. This field is a constant.

Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. Decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters.

The Unicode Standard assigns a code point (a number) to each character in every supported script. A Unicode Transformation Format (UTF) is a way to encode that code point. The Unicode Standard uses the following UTFs:

  • UTF-8, which represents each code point as a sequence of one to four bytes.

  • UTF-16, which represents each code point as a sequence of one or two 16-bit integers.

  • UTF-32, which represents each code point as a 32-bit integer.
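The size differences among the three UTFs can be seen by counting the bytes each one needs for the same text. The following is a minimal sketch, not part of the original sample; the class name UtfSizeSketch and the helper ByteCounts are hypothetical names chosen for illustration, and the sample characters are arbitrary.

```csharp
using System;
using System.Text;

public static class UtfSizeSketch
{
    // Hypothetical helper: returns the byte counts for UTF-8, UTF-16, and UTF-32, in that order.
    public static int[] ByteCounts(string text)
    {
        return new int[]
        {
            Encoding.UTF8.GetByteCount(text),
            Encoding.Unicode.GetByteCount(text),   // UTF-16, little endian
            Encoding.UTF32.GetByteCount(text)
        };
    }

    public static void Main()
    {
        // U+0041 (A), U+03A0 (Pi), and U+1D11E, which lies outside the
        // Basic Multilingual Plane and needs a surrogate pair in UTF-16.
        string[] samples = { "A", "\u03a0", "\U0001D11E" };
        foreach (string s in samples)
        {
            int[] counts = ByteCounts(s);
            Console.WriteLine("UTF-8: {0}, UTF-16: {1}, UTF-32: {2}",
                counts[0], counts[1], counts[2]);
        }
        // Prints:
        // UTF-8: 1, UTF-16: 2, UTF-32: 4
        // UTF-8: 2, UTF-16: 2, UTF-32: 4
        // UTF-8: 4, UTF-16: 4, UTF-32: 4
    }
}
```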

Note

The UTF-7 encoding supports certain protocols for which it is required, most often e-mail or newsgroup protocols. Since UTF-7 is not particularly secure or robust, it should generally not be used. UTF-8 should normally be preferred to UTF-7.

For more information about the UTFs and other encodings supported by System.Text, see Character Encoding in the .NET Framework and Using Unicode Encoding.

The GetByteCount method determines how many bytes result from encoding a set of Unicode characters, and the GetBytes method performs the actual encoding.

Likewise, the GetCharCount method determines how many characters result from decoding a sequence of bytes, and the GetChars and GetString methods perform the actual decoding.
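One common way to combine these methods is to call the counting method first, allocate a buffer of exactly that size, and then convert into it. The sketch below illustrates that pattern under simple assumptions (the whole input is converted in one call); the class name CountThenConvertSketch and the helper RoundTrip are hypothetical.

```csharp
using System;
using System.Text;

public static class CountThenConvertSketch
{
    // Hypothetical helper: encodes text to UTF-16 bytes and decodes it back,
    // sizing each buffer with the matching counting method.
    public static string RoundTrip(string text)
    {
        UnicodeEncoding unicode = new UnicodeEncoding();
        char[] chars = text.ToCharArray();

        // Size the byte buffer exactly, then encode into it.
        byte[] bytes = new byte[unicode.GetByteCount(chars, 0, chars.Length)];
        unicode.GetBytes(chars, 0, chars.Length, bytes, 0);

        // Size the character buffer exactly, then decode into it.
        char[] decoded = new char[unicode.GetCharCount(bytes, 0, bytes.Length)];
        unicode.GetChars(bytes, 0, bytes.Length, decoded, 0);

        return new string(decoded);
    }

    public static void Main()
    {
        string original = "Pi (\u03a0) and Sigma (\u03a3)";
        Console.WriteLine(RoundTrip(original) == original);   // True
    }
}
```

When the exact size does not matter, GetMaxByteCount and GetMaxCharCount give an upper bound without inspecting the data.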

UnicodeEncoding corresponds to the Windows code pages 1200 (little endian byte order) and 1201 (big endian byte order).

The encoder can use the big endian byte order (most significant byte first) or the little endian byte order (least significant byte first). For example, the Latin Capital Letter A (code point U+0041) is serialized as follows (in hexadecimal):

  • Big endian byte order: 00 41

  • Little endian byte order: 41 00
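The two UTF-16 byte orders can be checked directly by constructing one encoder of each kind and printing the bytes for the same character. This is a small illustrative sketch; ByteOrderSketch and Hex are hypothetical names, and BitConverter.ToString is used only to format the bytes as hexadecimal.

```csharp
using System;
using System.Text;

public static class ByteOrderSketch
{
    // Hypothetical helper: encodes text as UTF-16 in the requested byte order
    // (no byte order mark) and returns the bytes as dash-separated hexadecimal.
    public static string Hex(bool bigEndian, string text)
    {
        UnicodeEncoding encoding = new UnicodeEncoding(bigEndian, false);
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        Console.WriteLine("Big endian:    " + Hex(true, "A"));    // 00-41
        Console.WriteLine("Little endian: " + Hex(false, "A"));   // 41-00
    }
}
```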

It is generally more efficient to store Unicode characters using the native byte order. For example, it is better to use the little endian byte order on little endian platforms, such as Intel computers.

Optionally, the UnicodeEncoding object provides a preamble, which is an array of bytes that can be prefixed to the sequence of bytes resulting from the encoding process. If the preamble contains a byte order mark (BOM), it helps the decoder determine the byte order and the transformation format or UTF. The GetPreamble method retrieves an array of bytes that can include the BOM.
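As a rough illustration of the preamble, the sketch below prints what GetPreamble returns for different constructor arguments and shows one way to prefix it to encoded data by hand, since GetBytes itself does not emit the preamble. The names PreambleSketch, PreambleHex, and EncodeWithPreamble are hypothetical.

```csharp
using System;
using System.Text;

public static class PreambleSketch
{
    // Hypothetical helper: returns the preamble as dash-separated hexadecimal.
    public static string PreambleHex(bool bigEndian, bool byteOrderMark)
    {
        UnicodeEncoding encoding = new UnicodeEncoding(bigEndian, byteOrderMark);
        return BitConverter.ToString(encoding.GetPreamble());
    }

    // Hypothetical helper: prefixes the preamble (if any) to the encoded text.
    public static byte[] EncodeWithPreamble(UnicodeEncoding encoding, string text)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);
        byte[] result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(PreambleHex(false, true));          // FF-FE
        Console.WriteLine(PreambleHex(true, true));           // FE-FF
        Console.WriteLine(PreambleHex(false, false).Length);  // 0 (empty preamble)

        byte[] data = EncodeWithPreamble(new UnicodeEncoding(false, true), "A");
        Console.WriteLine(BitConverter.ToString(data));       // FF-FE-41-00
    }
}
```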

Note

To enable error detection and to make the class instance more secure, the application should use the UnicodeEncoding constructor that takes a throwOnInvalidBytes parameter, and set that parameter to true. With error detection, a method that detects an invalid sequence of characters or bytes throws an ArgumentException. Without error detection, no exception is thrown, and the invalid sequence is generally ignored.
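The effect of the throwOnInvalidBytes parameter can be observed by decoding a deliberately malformed byte sequence. The sketch below is an illustration only: the input (a lone high surrogate followed by the letter A) is an assumed test case, and ErrorDetectionSketch and ThrowsOnLoneSurrogate are hypothetical names. It reports only whether an exception occurred, because what the decoder does with the invalid sequence when error detection is off may differ between framework versions.

```csharp
using System;
using System.Text;

public static class ErrorDetectionSketch
{
    // Hypothetical helper: returns true if decoding the malformed input throws.
    public static bool ThrowsOnLoneSurrogate(bool throwOnInvalidBytes)
    {
        // Little endian, with byte order mark, error detection as requested.
        UnicodeEncoding encoding = new UnicodeEncoding(false, true, throwOnInvalidBytes);

        // Little endian bytes for U+D800 (a high surrogate with no low
        // surrogate after it) followed by U+0041 (A).
        byte[] invalid = { 0x00, 0xD8, 0x41, 0x00 };

        try
        {
            encoding.GetString(invalid);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine("With error detection:    " + ThrowsOnLoneSurrogate(true));   // True
        Console.WriteLine("Without error detection: " + ThrowsOnLoneSurrogate(false));  // False
    }
}
```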

The following example demonstrates how to encode a string of Unicode characters into a byte array, using UnicodeEncoding. The byte array is decoded into a string to demonstrate that there is no loss of data.


using System;
using System.Text;

class UnicodeEncodingExample {
    public static void Main() {
        // The encoding.
        UnicodeEncoding unicode = new UnicodeEncoding();

        // Create a string that contains Unicode characters.
        String unicodeString =
            "This Unicode string contains two characters " +
            "with codes outside the traditional ASCII code range, " +
            "Pi (\u03a0) and Sigma (\u03a3).";
        Console.WriteLine("Original string:");
        Console.WriteLine(unicodeString);

        // Encode the string.
        Byte[] encodedBytes = unicode.GetBytes(unicodeString);
        Console.WriteLine();
        Console.WriteLine("Encoded bytes:");
        foreach (Byte b in encodedBytes) {
            Console.Write("[{0}]", b);
        }
        Console.WriteLine();

        // Decode bytes back to string.
        // Notice Pi and Sigma characters are still present.
        String decodedString = unicode.GetString(encodedBytes);
        Console.WriteLine();
        Console.WriteLine("Decoded string:");
        Console.WriteLine(decodedString);
    }
}


.NET Framework

Supported in: 4, 3.5, 3.0, 2.0, 1.1, 1.0

.NET Framework Client Profile

Supported in: 4, 3.5 SP1

Portable Class Library

Supported in: Portable Class Library

Windows 7, Windows Vista SP1 or later, Windows XP SP3, Windows XP SP2 x64 Edition, Windows Server 2008 (Server Core not supported), Windows Server 2008 R2 (Server Core supported with SP1 or later), Windows Server 2003 SP2

The .NET Framework does not support all versions of every platform. For a list of the supported versions, see .NET Framework System Requirements.
Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.