
Encoding.UTF8 Property

Gets an encoding for the UTF-8 format.

Namespace:  System.Text
Assembly:  mscorlib (in mscorlib.dll)

public static Encoding UTF8 { get; }

Property Value

Type: System.Text.Encoding
An encoding for the UTF-8 format.

This property returns a UTF8Encoding object that encodes Unicode (UTF-16-encoded) characters into a sequence of one to four bytes per character, and that decodes a UTF-8-encoded byte array to Unicode (UTF-16-encoded) characters. For information about the character encodings supported by the .NET Framework and a discussion of which Unicode encoding to use, see Character Encoding in the .NET Framework.
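The "one to four bytes per character" behavior described above can be observed with a short round trip. The following is a minimal sketch rather than part of the reference material: the class name Utf8RoundTrip, its helper methods, and the sample code points are chosen for illustration only.

```csharp
using System;
using System.Text;

public static class Utf8RoundTrip
{
   // Returns the number of UTF-8 bytes needed for a single Unicode code point.
   public static int BytesForCodePoint(int codePoint)
   {
      string s = Char.ConvertFromUtf32(codePoint);
      return Encoding.UTF8.GetByteCount(s);
   }

   // Encodes text to UTF-8 bytes and decodes the bytes back to a string.
   public static string RoundTrip(string text)
   {
      byte[] bytes = Encoding.UTF8.GetBytes(text);
      return Encoding.UTF8.GetString(bytes);
   }

   public static void Main()
   {
      // One sample code point from each UTF-8 length class (1, 2, 3, and 4 bytes).
      int[] codePoints = { 0x0041, 0x00E9, 0x20AC, 0x1F600 };
      foreach (int cp in codePoints)
         Console.WriteLine("U+{0:X4}: {1} byte(s)", cp, BytesForCodePoint(cp));

      // Well-formed text survives an encode/decode round trip unchanged.
      string original = "A\u00E9\u20AC" + Char.ConvertFromUtf32(0x1F600);
      Console.WriteLine(RoundTrip(original) == original);
   }
}
```

The last sample code point lies outside the Basic Multilingual Plane, so it occupies two UTF-16 code units in the string but is still encoded as a single four-byte UTF-8 sequence.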

The UTF8Encoding object that this property returns might not behave the way your application needs, for two reasons:

  • It returns a UTF8Encoding object that provides a Unicode byte order mark (BOM). To instantiate a UTF-8 encoding that doesn't provide a BOM, call the parameterless UTF8Encoding constructor, or call an overload that takes an encoderShouldEmitUTF8Identifier argument and pass false.

  • It returns a UTF8Encoding object that uses replacement fallback: each character sequence that it can't encode and each byte sequence that it can't decode is replaced with the REPLACEMENT CHARACTER (U+FFFD), which many consoles display as a question mark ("?"). Instead, you can call the UTF8Encoding.UTF8Encoding(Boolean, Boolean) constructor with throwOnInvalidBytes set to true to instantiate a UTF8Encoding object that throws an EncoderFallbackException or a DecoderFallbackException when it encounters invalid data, as the following example illustrates.

    using System;
    using System.Text;
    
    public class Example
    {
       public static void Main()
       {
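          // The second argument (throwOnInvalidBytes) selects exception fallback.
          Encoding enc = new UTF8Encoding(true, true);
          // U+D802 is a high surrogate that is not followed by a low surrogate.
          string value = "\u00C4 \uD802\u0033 \u00AE";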
    
          try {
             byte[] bytes = enc.GetBytes(value);
             foreach (var byt in bytes)
                Console.Write("{0:X2} ", byt);
             Console.WriteLine();
    
             string value2 = enc.GetString(bytes);
             Console.WriteLine(value2);
          }
          catch (EncoderFallbackException e) {
             Console.WriteLine("Unable to encode {0} at index {1}", 
                               e.IsUnknownSurrogate() ? 
                                  String.Format("U+{0:X4} U+{1:X4}", 
                                                Convert.ToUInt16(e.CharUnknownHigh),
                                                Convert.ToUInt16(e.CharUnknownLow)) :
                                  String.Format("U+{0:X4}", 
                                                Convert.ToUInt16(e.CharUnknown)),
                               e.Index);
          }                     
       }
    }
    // The example displays the following output: 
    //        Unable to encode U+D802 at index 2
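Both caveats in the list above can also be checked directly. The following is a minimal sketch under stated assumptions: the class name Utf8BehaviorCheck and its helpers PreambleLength and TryDecode are hypothetical names introduced for illustration, and the byte 0xFF is used as sample input only because it never appears in well-formed UTF-8.

```csharp
using System;
using System.Text;

public static class Utf8BehaviorCheck
{
   // Length in bytes of the BOM (preamble) that an encoding reports.
   public static int PreambleLength(Encoding enc)
   {
      return enc.GetPreamble().Length;
   }

   // Decodes the bytes; returns null if the decoder throws on invalid input.
   public static string TryDecode(Encoding enc, byte[] bytes)
   {
      try {
         return enc.GetString(bytes);
      }
      catch (DecoderFallbackException) {
         return null;
      }
   }

   public static void Main()
   {
      // The preamble is what stream writers emit at the start of a file.
      // GetBytes itself never prepends it.
      Console.WriteLine(PreambleLength(Encoding.UTF8));            // 3 (EF BB BF)
      Console.WriteLine(PreambleLength(new UTF8Encoding()));       // 0
      Console.WriteLine(PreambleLength(new UTF8Encoding(false)));  // 0
      Console.WriteLine(PreambleLength(new UTF8Encoding(true)));   // 3

      // Sample input (an assumption for this sketch): 'A', an invalid byte, 'B'.
      byte[] invalid = { 0x41, 0xFF, 0x42 };

      // Replacement fallback substitutes a character for the invalid byte.
      string lenient = TryDecode(Encoding.UTF8, invalid);
      Console.WriteLine("Replacement: U+{0:X4}", (int)lenient[1]);

      // Exception fallback rejects the input instead.
      string strict = TryDecode(new UTF8Encoding(false, true), invalid);
      Console.WriteLine(strict == null ? "Rejected" : strict);
   }
}
```

Whether silent replacement or an exception is preferable depends on the application: replacement keeps processing going but loses information, while the exception fallback surfaces corrupt input at the point where it is read.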
    

The following example defines an array that consists of the following characters:

  • LATIN SMALL LETTER Z (U+007A)

  • LATIN SMALL LETTER A (U+0061)

  • COMBINING BREVE (U+0306)

  • LATIN SMALL LETTER AE WITH ACUTE (U+01FD)

  • GREEK SMALL LETTER BETA (U+03B2)

  • A surrogate pair (U+D800 U+DD54) that forms GREEK ACROPHONIC ATTIC ONE THOUSAND STATERS (U+10154).

It displays the bytes of the array's little-endian UTF-16 encoding, determines both the exact and the maximum number of bytes that a UTF-8 encoder requires to encode the character array, and then encodes the characters and displays the resulting UTF-8-encoded bytes.

using System;
using System.Text;

public class Example
{
   public static void Main()  
   {
      // Create a character array. 
      string gkNumber = Char.ConvertFromUtf32(0x10154);
      char[] chars = new char[] { 'z', 'a', '\u0306', '\u01FD', '\u03B2', 
                                  gkNumber[0], gkNumber[1] };

      // Get the UTF-8 and little-endian UTF-16 encodings.
      Encoding utf8 = Encoding.UTF8;
      Encoding utf16 = Encoding.Unicode;

      // Display the original characters' code units.
      Console.WriteLine("Original UTF-16 code units:");
      byte[] utf16Bytes = utf16.GetBytes(chars);
      foreach (var utf16Byte in utf16Bytes)
         Console.Write("{0:X2} ", utf16Byte);
      Console.WriteLine();

      // Display the number of bytes required to encode the array. 
      int reqBytes = utf8.GetByteCount(chars);
      Console.WriteLine("\nExact number of bytes required: {0}", 
                        reqBytes);

      // Display the worst-case byte count for an array of this length
      // (an upper bound for buffer sizing, not the actual encoded size).
      int maxBytes = utf8.GetMaxByteCount(chars.Length);
      Console.WriteLine("Maximum number of bytes required: {0}\n", 
                        maxBytes);

      // Encode the array of chars. 
      byte[] utf8Bytes = utf8.GetBytes(chars);

      // Display all the UTF-8-encoded bytes.
      Console.WriteLine("UTF-8-encoded code units:");
      foreach (var utf8Byte in utf8Bytes)
         Console.Write("{0:X2} ", utf8Byte);
      Console.WriteLine();
   }
}
// The example displays the following output: 
//       Original UTF-16 code units: 
//       7A 00 61 00 06 03 FD 01 B2 03 00 D8 54 DD 
//        
//       Exact number of bytes required: 12 
//       Maximum number of bytes required: 24 
//        
//       UTF-8-encoded code units: 
//       7A 61 CC 86 C7 BD CE B2 F0 90 85 94
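
The surrogate pair in the character list above (U+D800 U+DD54 for U+10154) follows from the standard UTF-16 arithmetic, which can be sketched as follows. The class name SurrogateMath and the method ToSurrogates are hypothetical names introduced for illustration. In practice, Char.ConvertFromUtf32 performs this conversion, as the example above shows.

```csharp
using System;

public static class SurrogateMath
{
   // Splits a supplementary code point (U+10000 through U+10FFFF)
   // into its UTF-16 high and low surrogates.
   public static char[] ToSurrogates(int codePoint)
   {
      int offset = codePoint - 0x10000;                // 20-bit value
      char high = (char)(0xD800 + (offset >> 10));     // upper 10 bits
      char low = (char)(0xDC00 + (offset & 0x3FF));    // lower 10 bits
      return new[] { high, low };
   }

   public static void Main()
   {
      char[] pair = ToSurrogates(0x10154);
      Console.WriteLine("U+{0:X4} U+{1:X4}", (int)pair[0], (int)pair[1]);

      // Cross-check against the framework's own conversion.
      Console.WriteLine(Char.ConvertToUtf32(pair[0], pair[1]) == 0x10154);
   }
}
// The sketch displays the following output:
//       U+D800 U+DD54
//       True
```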

.NET Framework

Supported in: 4.5, 4, 3.5, 3.0, 2.0, 1.1, 1.0

.NET Framework Client Profile

Supported in: 4, 3.5 SP1

Portable Class Library

Supported in: Portable Class Library

.NET for Windows Store apps

Supported in: Windows 8

.NET for Windows Phone apps

Supported in: Windows Phone 8.1, Windows Phone Silverlight 8.1, Windows Phone Silverlight 8

Platforms

Windows Phone 8.1, Windows Phone 8, Windows 8.1, Windows Server 2012 R2, Windows 8, Windows Server 2012, Windows 7, Windows Vista SP2, Windows Server 2008 (Server Core Role not supported), Windows Server 2008 R2 (Server Core Role supported with SP1 or later; Itanium not supported)

The .NET Framework does not support all versions of every platform. For a list of the supported versions, see .NET Framework System Requirements.
