
UTF8Encoding Class

Represents a UTF-8 encoding of Unicode characters.

Namespace: System.Text
Assembly: mscorlib (in mscorlib.dll)

C#

[SerializableAttribute]
[ComVisibleAttribute(true)]
public class UTF8Encoding : Encoding

J#

/** @attribute SerializableAttribute() */
/** @attribute ComVisibleAttribute(true) */
public class UTF8Encoding extends Encoding

JScript

SerializableAttribute
ComVisibleAttribute(true)
public class UTF8Encoding extends Encoding

Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. Decoding is the reverse process: transforming a sequence of encoded bytes into a set of Unicode characters.

The Unicode Standard assigns a code point (a number) to each character in every supported script. A Unicode Transformation Format (UTF) is a way to encode that code point. The Unicode Standard version 3.2 uses the following UTFs:

  • UTF-8, which represents each code point as a sequence of one to four bytes.

  • UTF-16, which represents each code point as a sequence of one or two 16-bit integers.

  • UTF-32, which represents each code point as a 32-bit integer.
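The one-to-four-byte range for UTF-8 can be observed directly. The following is a minimal sketch, not part of the original reference: the class name Utf8WidthSketch, the helper Utf8Length, and the four sample code points are chosen here purely for illustration. It assumes a runtime that provides Char.ConvertFromUtf32 (.NET Framework 2.0 or later).

```csharp
using System;
using System.Text;

class Utf8WidthSketch
{
    // Hypothetical helper: number of UTF-8 bytes needed for one code point.
    public static int Utf8Length(int codePoint)
    {
        // ConvertFromUtf32 yields one or two UTF-16 code units.
        string s = char.ConvertFromUtf32(codePoint);
        return new UTF8Encoding().GetByteCount(s);
    }

    public static void Main()
    {
        // Sample code points: Latin A, Greek Pi, Euro sign, musical G clef.
        int[] samples = { 0x41, 0x3A0, 0x20AC, 0x1D11E };
        foreach (int cp in samples)
        {
            string s = char.ConvertFromUtf32(cp);
            Console.WriteLine("U+{0:X4}: {1} UTF-8 byte(s), {2} UTF-16 code unit(s)",
                cp, Utf8Length(cp), s.Length);
        }
    }
}
```

The four samples take 1, 2, 3, and 4 bytes in UTF-8 respectively. Only the last one, which lies outside the Basic Multilingual Plane, needs two 16-bit units in UTF-16.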

The GetByteCount method determines how many bytes result from encoding a set of Unicode characters, and the GetBytes method performs the actual encoding.

Likewise, the GetCharCount method determines how many characters result from decoding a sequence of bytes, and the GetChars and GetString methods perform the actual decoding.
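The count methods are typically used to size a buffer before the conversion methods fill it. The sketch below shows one way to pair them. The class name CountThenConvertSketch and the helpers Encode and Decode are illustrative names, not part of the framework.

```csharp
using System;
using System.Text;

class CountThenConvertSketch
{
    // Hypothetical helper: size the byte buffer with GetByteCount, then fill it.
    public static byte[] Encode(string text)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        char[] chars = text.ToCharArray();
        byte[] bytes = new byte[utf8.GetByteCount(chars, 0, chars.Length)];
        utf8.GetBytes(chars, 0, chars.Length, bytes, 0);
        return bytes;
    }

    // Hypothetical helper: size the char buffer with GetCharCount, then fill it.
    public static string Decode(byte[] bytes)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        char[] chars = new char[utf8.GetCharCount(bytes, 0, bytes.Length)];
        utf8.GetChars(bytes, 0, bytes.Length, chars, 0);
        return new string(chars);
    }

    public static void Main()
    {
        string original = "Pi (\u03a0) and Sigma (\u03a3)";
        byte[] encoded = Encode(original);
        string decoded = Decode(encoded);

        Console.WriteLine("{0} chars -> {1} bytes -> {2} chars",
            original.Length, encoded.Length, decoded.Length);
        Console.WriteLine("Round trip preserved the text: {0}", decoded == original);
    }
}
```

When a preallocated buffer is not needed, GetBytes(String) and GetString(Byte[]) perform the same conversions in a single call.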

Optionally, a UTF8Encoding object provides a preamble, which is an array of bytes that you can prefix to the sequence of bytes resulting from the encoding process. If the preamble contains a byte order mark (code point U+FEFF), it helps the decoder determine the byte order and the transformation format or UTF. In UTF-8, the Unicode byte order mark is serialized as EF BB BF (in hexadecimal). The GetPreamble method returns an array of bytes containing the byte order mark if the object was constructed to provide one; otherwise, it returns an empty array.
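As an illustration of the preamble, the sketch below compares two instances created with the UTF8Encoding(Boolean) constructor and prefixes the preamble to encoded data by hand. The class name PreambleSketch and the helper EncodeWithPreamble are illustrative names introduced here, not framework members.

```csharp
using System;
using System.Text;

class PreambleSketch
{
    // Hypothetical helper: GetBytes does not emit the preamble itself,
    // so the caller prefixes it explicitly.
    public static byte[] EncodeWithPreamble(UTF8Encoding encoding, string text)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);
        byte[] result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }

    public static void Main()
    {
        UTF8Encoding withBom = new UTF8Encoding(true);     // provides a byte order mark
        UTF8Encoding withoutBom = new UTF8Encoding(false); // provides an empty preamble

        Console.WriteLine(BitConverter.ToString(withBom.GetPreamble()));            // EF-BB-BF
        Console.WriteLine(withoutBom.GetPreamble().Length);                         // 0
        Console.WriteLine(BitConverter.ToString(EncodeWithPreamble(withBom, "A"))); // EF-BB-BF-41
    }
}
```

Classes such as StreamWriter write the preamble automatically at the start of a stream, so prefixing it by hand is mainly useful when working with raw byte arrays.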

For more information on Unicode encoding, byte order, and the byte order mark, see The Unicode Standard at www.unicode.org.

Note

To enable error detection and to make the class instance more secure, use the UTF8Encoding constructor that takes a throwOnInvalidBytes parameter and set that parameter to true. With error detection, a method that detects an invalid sequence of characters or bytes throws an ArgumentException. Without error detection, no exception is thrown, and the invalid sequence is generally ignored.
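The following sketch shows error detection in use, assuming the two-parameter constructor UTF8Encoding(Boolean, Boolean), whose second argument is throwOnInvalidBytes. The class name StrictDecodingSketch, the helper TryDecode, and the sample byte arrays are illustrative and not part of the framework.

```csharp
using System;
using System.Text;

class StrictDecodingSketch
{
    // Hypothetical helper: returns false instead of propagating the exception
    // that a strict UTF8Encoding throws for an invalid byte sequence.
    public static bool TryDecode(byte[] bytes, out string text)
    {
        // First argument: do not provide a preamble.
        // Second argument: throwOnInvalidBytes = true.
        UTF8Encoding strict = new UTF8Encoding(false, true);
        try
        {
            text = strict.GetString(bytes);
            return true;
        }
        catch (ArgumentException)
        {
            text = null;
            return false;
        }
    }

    public static void Main()
    {
        byte[] valid = { 0xCE, 0xA0 };         // UTF-8 encoding of U+03A0 (Pi)
        byte[] invalid = { 0x41, 0xFF, 0x42 }; // 0xFF never appears in well-formed UTF-8

        string text;
        Console.WriteLine("valid bytes decoded: {0}", TryDecode(valid, out text));
        Console.WriteLine("invalid bytes decoded: {0}", TryDecode(invalid, out text));
    }
}
```

Catching ArgumentException also covers the more specific exception types that derive from it in later framework versions.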

UTF8Encoding corresponds to the Windows code page 65001.
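The code page correspondence can be checked through the CodePage property and the Encoding.GetEncoding(Int32) method. This is a small sketch; the class name CodePageSketch is illustrative.

```csharp
using System;
using System.Text;

class CodePageSketch
{
    public static void Main()
    {
        // The code page identifier reported by a UTF8Encoding instance.
        UTF8Encoding utf8 = new UTF8Encoding();
        Console.WriteLine(utf8.CodePage); // 65001

        // Requesting code page 65001 returns a UTF-8 encoding.
        Encoding byCodePage = Encoding.GetEncoding(65001);
        Console.WriteLine(byCodePage.WebName); // utf-8
    }
}
```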

Note

The state of a UTF8Encoding object is not preserved if the object is serialized and deserialized using different .NET Framework versions.

The following example demonstrates how to use a UTF8Encoding to encode a string of Unicode characters and store them in a byte array. Notice that when encodedBytes is decoded back to a string there is no loss of data.

using System;
using System.Text;

class UTF8EncodingExample {
    public static void Main() {
        // Create a UTF-8 encoding.
        UTF8Encoding utf8 = new UTF8Encoding();
        
        // A Unicode string with two characters outside an 8-bit code range.
        String unicodeString =
            "This unicode string contains two characters " +
            "with codes outside an 8-bit code range, " +
            "Pi (\u03a0) and Sigma (\u03a3).";
        Console.WriteLine("Original string:");
        Console.WriteLine(unicodeString);

        // Encode the string.
        Byte[] encodedBytes = utf8.GetBytes(unicodeString);
        Console.WriteLine();
        Console.WriteLine("Encoded bytes:");
        foreach (Byte b in encodedBytes) {
            Console.Write("[{0}]", b);
        }
        Console.WriteLine();
        
        // Decode bytes back to string.
        // Notice Pi and Sigma characters are still present.
        String decodedString = utf8.GetString(encodedBytes);
        Console.WriteLine();
        Console.WriteLine("Decoded bytes:");
        Console.WriteLine(decodedString);
    }
}

// J# version of the preceding example.
import System.*;
import System.Text.*;

class UTF8EncodingExample
{
    public static void main(String[] args)
    {
        // Create a UTF-8 encoding.
        UTF8Encoding utf8 = new UTF8Encoding();

        // A Unicode string with two characters outside an 8-bit code range.
        String unicodeString = "This unicode string contains two characters "
            + "with codes outside an 8-bit code range, " 
            + "Pi (\u03a0) and Sigma (\u03a3).";
        Console.WriteLine("Original string:");
        Console.WriteLine(unicodeString);

        // Encode the string.
        ubyte encodedBytes[] = utf8.GetBytes(unicodeString);
        Console.WriteLine();
        Console.WriteLine("Encoded bytes:");
        for (int iCtr = 0; iCtr < encodedBytes.length; iCtr++) {
            ubyte b = encodedBytes[iCtr];
            Console.Write("[{0}]", String.valueOf(b));
        }
        Console.WriteLine();

        // Decode bytes back to string.
        // Notice Pi and Sigma characters are still present.
        String decodedString = utf8.GetString(encodedBytes);
        Console.WriteLine();
        Console.WriteLine("Decoded bytes:");
        Console.WriteLine(decodedString);
    } //main
} //UTF8EncodingExample

System.Object
  System.Text.Encoding
    System.Text.UTF8Encoding

Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

Windows 98, Windows 2000 SP4, Windows CE, Windows Millennium Edition, Windows Mobile for Pocket PC, Windows Mobile for Smartphone, Windows Server 2003, Windows XP Media Center Edition, Windows XP Professional x64 Edition, Windows XP SP2, Windows XP Starter Edition

The .NET Framework does not support all versions of every platform. For a list of the supported versions, see System Requirements.

.NET Framework

Supported in: 2.0, 1.1, 1.0

.NET Compact Framework

Supported in: 2.0, 1.0