Windows Dev Center

UTF8Encoding Class

Represents a UTF-8 encoding of Unicode characters.

System.Object
  System.Text.Encoding
    System.Text.UTF8Encoding

Namespace:  System.Text
Assembly:  mscorlib (in mscorlib.dll)

public class UTF8Encoding : Encoding

The UTF8Encoding type exposes the following members.

Constructors

Kind | Name | Description
Public method | UTF8Encoding() | Initializes a new instance of the UTF8Encoding class.
Public method | UTF8Encoding(Boolean) | Initializes a new instance of the UTF8Encoding class. A parameter specifies whether to provide a Unicode byte order mark.
Public method | UTF8Encoding(Boolean, Boolean) | Initializes a new instance of the UTF8Encoding class. Parameters specify whether to provide a Unicode byte order mark and whether to throw an exception when an invalid encoding is detected.

Properties

Kind | Name | Description
Public property | WebName | When overridden in a derived class, gets the name registered with the Internet Assigned Numbers Authority (IANA) for the current encoding. (Inherited from Encoding.)

Methods

Kind | Name | Description
Public method | Clone | When overridden in a derived class, creates a shallow copy of the current Encoding object. (Inherited from Encoding.)
Public method | Equals | Determines whether the specified Object is equal to the current UTF8Encoding object. (Overrides Encoding.Equals(Object).)
Protected method | Finalize | Allows an object to try to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection. (Inherited from Object.)
Public method | GetByteCount(Char[]) | When overridden in a derived class, calculates the number of bytes produced by encoding all the characters in the specified character array. (Inherited from Encoding.)
Public method | GetByteCount(String) | Calculates the number of bytes that would be produced by encoding the characters in the specified String. (Overrides Encoding.GetByteCount(String).)
Public method | GetByteCount(Char[], Int32, Int32) | Calculates the number of bytes that would be produced by encoding a set of characters from the specified character array. (Overrides Encoding.GetByteCount(Char[], Int32, Int32).)
Public method | GetBytes(Char[]) | When overridden in a derived class, encodes all the characters in the specified character array into a sequence of bytes. (Inherited from Encoding.)
Public method | GetBytes(String) | When overridden in a derived class, encodes all the characters in the specified string into a sequence of bytes. (Inherited from Encoding.)
Public method | GetBytes(Char[], Int32, Int32) | When overridden in a derived class, encodes a set of characters from the specified character array into a sequence of bytes. (Inherited from Encoding.)
Public method | GetBytes(Char*, Int32, Byte*, Int32) | Security Critical. Encodes a set of characters starting at the specified character pointer into a sequence of bytes that are stored starting at the specified byte pointer. (Overrides Encoding.GetBytes(Char*, Int32, Byte*, Int32).)
Public method | GetBytes(Char[], Int32, Int32, Byte[], Int32) | Encodes a set of characters from the specified character array into the specified byte array. (Overrides Encoding.GetBytes(Char[], Int32, Int32, Byte[], Int32).)
Public method | GetBytes(String, Int32, Int32, Byte[], Int32) | Encodes a set of characters from the specified string into the specified byte array. (Overrides Encoding.GetBytes(String, Int32, Int32, Byte[], Int32).)
Public method | GetCharCount(Byte[]) | When overridden in a derived class, calculates the number of characters produced by decoding all the bytes in the specified byte array. (Inherited from Encoding.)
Public method | GetCharCount(Byte[], Int32, Int32) | Calculates the number of characters produced by decoding a sequence of bytes from the specified byte array. (Overrides Encoding.GetCharCount(Byte[], Int32, Int32).)
Public method | GetChars(Byte[]) | When overridden in a derived class, decodes all the bytes in the specified byte array into a set of characters. (Inherited from Encoding.)
Public method | GetChars(Byte[], Int32, Int32) | When overridden in a derived class, decodes a sequence of bytes from the specified byte array into a set of characters. (Inherited from Encoding.)
Public method | GetChars(Byte[], Int32, Int32, Char[], Int32) | Decodes a sequence of bytes from the specified byte array into the specified character array. (Overrides Encoding.GetChars(Byte[], Int32, Int32, Char[], Int32).)
Public method | GetDecoder | Obtains a decoder that converts a UTF-8 encoded sequence of bytes into a sequence of Unicode characters. (Overrides Encoding.GetDecoder().)
Public method | GetEncoder | Obtains an encoder that converts a sequence of Unicode characters into a UTF-8 encoded sequence of bytes. (Overrides Encoding.GetEncoder().)
Public method | GetHashCode | Returns the hash code for the current instance. (Overrides Encoding.GetHashCode().)
Public method | GetMaxByteCount | Calculates the maximum number of bytes produced by encoding the specified number of characters. (Overrides Encoding.GetMaxByteCount(Int32).)
Public method | GetMaxCharCount | Calculates the maximum number of characters produced by decoding the specified number of bytes. (Overrides Encoding.GetMaxCharCount(Int32).)
Public method | GetPreamble | Returns a Unicode byte order mark encoded in UTF-8 format. (Overrides Encoding.GetPreamble().)
Public method | GetString | Decodes a range of bytes from a byte array into a string. (Overrides Encoding.GetString(Byte[], Int32, Int32).)
Public method | GetType | Gets the Type of the current instance. (Inherited from Object.)
Protected method | MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
Public method | ToString | Returns a string that represents the current object. (Inherited from Object.)
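The object returned by GetDecoder keeps state between calls, which matters when a multi-byte sequence is split across buffers (for example, when reading from a stream in chunks). The following is a minimal sketch of that use, not a definitive implementation. The class name Utf8StreamingDecodeSketch, the method DecodeChunks, and the use of a console program are assumptions made for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the UTF8Encoding API.
static class Utf8StreamingDecodeSketch
{
    // Decodes a series of byte chunks with one Decoder so that a character
    // split across two chunks is still decoded as a single character.
    public static string DecodeChunks(byte[][] chunks)
    {
        Decoder decoder = new UTF8Encoding().GetDecoder();
        StringBuilder text = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            // The decoder accounts for bytes it is still holding from earlier chunks.
            char[] chars = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
            int written = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            text.Append(chars, 0, written);
        }
        return text.ToString();
    }

    static void Main()
    {
        // Pi (U+03A0) is encoded as CE A0; the two bytes arrive in separate chunks.
        byte[][] chunks = { new byte[] { 0xCE }, new byte[] { 0xA0, 0x21 } };
        Console.WriteLine(DecodeChunks(chunks) == "\u03a0!");
    }
}
```

Calling GetString separately on each chunk would treat the lone 0xCE byte as an invalid sequence, which is why a single Decoder instance is reused here.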

Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. Decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters. UTF-8 encoding represents each code point as a sequence of one to four bytes.
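The one-to-four-byte range can be checked directly with GetByteCount. The sketch below is illustrative only; the class name Utf8WidthSketch, the method ByteWidth, and the choice of sample characters are assumptions, and a console program is used for brevity.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the UTF8Encoding API.
static class Utf8WidthSketch
{
    // Returns the number of UTF-8 bytes needed to encode the given text.
    public static int ByteWidth(string text)
    {
        return new UTF8Encoding().GetByteCount(text);
    }

    static void Main()
    {
        Console.WriteLine(ByteWidth("A"));            // U+0041: 1 byte
        Console.WriteLine(ByteWidth("\u03a0"));       // U+03A0 (Pi): 2 bytes
        Console.WriteLine(ByteWidth("\u20ac"));       // U+20AC (euro sign): 3 bytes
        Console.WriteLine(ByteWidth("\ud83d\ude00")); // U+1F600 as a surrogate pair: 4 bytes
    }
}
```

Note that the last case is one code point but two Char values in a .NET string, so the character count and the code point count differ.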

You can instantiate a UTF8Encoding object in any of the following ways:

  • By retrieving the UTF8Encoding object returned by the Encoding.UTF8 property.

  • By calling the Encoding.GetEncoding method with "utf-8" as the value of its name parameter.

  • By calling one of the overloads of the UTF8Encoding class constructor. The other two approaches return a default UTF8Encoding object; the constructor overloads let you specify whether the encoding provides a preamble and whether an exception is thrown when an invalid encoding is encountered.
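The three approaches listed above might look like the following. This is a sketch under stated assumptions: the class name Utf8InstantiationSketch and the method CreateAll are hypothetical, and the constructor arguments (no byte order mark, throw on invalid input) are just one possible choice.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the UTF8Encoding API.
static class Utf8InstantiationSketch
{
    // Returns one encoding per approach, in the order listed above.
    public static Encoding[] CreateAll()
    {
        return new Encoding[]
        {
            Encoding.UTF8,                 // shared default instance
            Encoding.GetEncoding("utf-8"), // lookup by name
            new UTF8Encoding(false, true)  // no byte order mark, throw on invalid input
        };
    }

    static void Main()
    {
        foreach (Encoding encoding in CreateAll())
        {
            // All three report the same IANA name; the preamble length depends
            // on how the instance was configured.
            Console.WriteLine(encoding.WebName + ", preamble bytes: " + encoding.GetPreamble().Length);
        }
    }
}
```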

The GetByteCount method determines how many bytes result from encoding a set of Unicode characters, and the GetBytes method performs the actual encoding.
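A common pattern is to size the destination array with GetByteCount and then fill it with GetBytes. The following sketch shows one way to do that; the class name Utf8EncodeSketch and the method Encode are assumptions for illustration, and a console program is used for brevity.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the UTF8Encoding API.
static class Utf8EncodeSketch
{
    // Encodes text into an array sized exactly by GetByteCount.
    public static byte[] Encode(string text)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        byte[] buffer = new byte[utf8.GetByteCount(text)];

        // This overload writes into an existing array and returns the
        // number of bytes actually written.
        int written = utf8.GetBytes(text, 0, text.Length, buffer, 0);
        if (written != buffer.Length)
            throw new InvalidOperationException("Unexpected byte count.");
        return buffer;
    }

    static void Main()
    {
        // Five single-byte characters plus two bytes for Pi.
        byte[] bytes = Encode("Pi (\u03a0)");
        Console.WriteLine(bytes.Length); // 7
    }
}
```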

Likewise, the GetCharCount method determines how many characters result from decoding a sequence of bytes, and the GetChars and GetString methods perform the actual decoding.
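The decoding side mirrors the encoding side. The sketch below decodes the same bytes once into a character array sized by GetCharCount and once directly into a string; the class name Utf8DecodeSketch and its method names are assumptions for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the UTF8Encoding API.
static class Utf8DecodeSketch
{
    // Decodes into a character array sized exactly by GetCharCount.
    public static char[] DecodeToChars(byte[] bytes)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        char[] chars = new char[utf8.GetCharCount(bytes, 0, bytes.Length)];
        utf8.GetChars(bytes, 0, bytes.Length, chars, 0);
        return chars;
    }

    // Decodes directly into a string.
    public static string DecodeToString(byte[] bytes)
    {
        return new UTF8Encoding().GetString(bytes, 0, bytes.Length);
    }

    static void Main()
    {
        // CE A0 is capital Pi, 3D is '=', CF 80 is small pi.
        byte[] bytes = { 0xCE, 0xA0, 0x3D, 0xCF, 0x80 };
        Console.WriteLine(DecodeToChars(bytes).Length); // 3: five bytes decode to three characters
        Console.WriteLine(DecodeToString(bytes) == "\u03a0=\u03c0");
    }
}
```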

Optionally, the UTF8Encoding object provides a preamble, which is an array of bytes that can be prefixed to the sequence of bytes resulting from the encoding process. If the preamble contains a byte order mark (BOM), it helps the decoder determine the byte order and the transformation format (UTF). The GetPreamble method retrieves an array of bytes that can include the BOM. For more information on byte order and the byte order mark, see The Unicode Standard at the Unicode home page.
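GetBytes does not add the preamble on its own, so code that wants a BOM in its output has to prefix it explicitly. The following sketch shows one way to do so; the class name Utf8PreambleSketch and the method EncodeWithPreamble are assumptions for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the UTF8Encoding API.
static class Utf8PreambleSketch
{
    // Returns the encoding's preamble (possibly empty) followed by the encoded text.
    public static byte[] EncodeWithPreamble(UTF8Encoding utf8, string text)
    {
        byte[] preamble = utf8.GetPreamble();
        byte[] body = utf8.GetBytes(text);
        byte[] result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }

    static void Main()
    {
        // Constructed with true: GetPreamble returns the UTF-8 BOM.
        byte[] withBom = EncodeWithPreamble(new UTF8Encoding(true), "A");
        Console.WriteLine(BitConverter.ToString(withBom));    // EF-BB-BF-41

        // Constructed with false: GetPreamble returns an empty array.
        byte[] withoutBom = EncodeWithPreamble(new UTF8Encoding(false), "A");
        Console.WriteLine(BitConverter.ToString(withoutBom)); // 41
    }
}
```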

Note:

To enable error detection and to make the class instance more secure, the application should use the UTF8Encoding constructor that takes a throwOnInvalidBytes parameter and set that parameter to true. With error detection, a method that detects an invalid sequence of characters or bytes throws an ArgumentException. Without error detection, no exception is thrown, and the invalid sequence is generally ignored.
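With throwOnInvalidBytes set to true, the exception can be used to validate input before it is processed further. The sketch below is one possible approach, not a definitive implementation; the class name Utf8ValidationSketch and the method IsValidUtf8 are assumptions for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the UTF8Encoding API.
static class Utf8ValidationSketch
{
    // Returns true when the bytes decode cleanly as UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // Second argument is throwOnInvalidBytes; first disables the byte order mark.
        UTF8Encoding strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetCharCount(bytes, 0, bytes.Length);
            return true;
        }
        catch (ArgumentException)
        {
            // Raised by the strict encoding when it meets an invalid sequence.
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsValidUtf8(new byte[] { 0xCE, 0xA0 })); // True: encoded Pi
        Console.WriteLine(IsValidUtf8(new byte[] { 0xFF }));       // False: 0xFF never occurs in UTF-8
    }
}
```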

The following example demonstrates how to use a UTF8Encoding to encode a string of Unicode characters and store them in a byte array. Notice that when encodedBytes is decoded back to a string there is no loss of data.


using System;
using System.Text;

class Example
{
   public static void Demo(System.Windows.Controls.TextBlock outputBlock)
   {
      // Create a UTF-8 encoding.
      UTF8Encoding utf8 = new UTF8Encoding();

      // A Unicode string with two characters outside an 8-bit code range.
      String unicodeString =
          "This unicode string contains two characters " +
          "with codes outside an 8-bit code range, " +
          "Pi (\u03a0) and Sigma (\u03a3).";
      outputBlock.Text += "Original string:" + "\n";
      outputBlock.Text += unicodeString + "\n";

      // Encode the string.
      Byte[] encodedBytes = utf8.GetBytes(unicodeString);
      outputBlock.Text += "\n";
      outputBlock.Text += "Encoded bytes:" + "\n";
      foreach (Byte b in encodedBytes)
      {
         outputBlock.Text += String.Format("[{0}]", b);
      }
      outputBlock.Text += "\n";

      // Decode bytes back to string.
      // Notice Pi and Sigma characters are still present.
      String decodedString = utf8.GetString(encodedBytes, 0, encodedBytes.Length);
      outputBlock.Text += "\n";
      outputBlock.Text += "Decoded bytes:" + "\n";
      outputBlock.Text += decodedString + "\n";
   }
}


Version Information

Windows Phone OS
Supported in: 8.1, 8.0, 7.1, 7.0

Platforms

Windows Phone

Thread Safety

Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

© 2015 Microsoft