IdnMapping.GetAscii Method (String, Int32, Int32)
Encodes the specified number of characters in a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.
Namespace: System.Globalization
Assembly: mscorlib (in mscorlib.dll)
Parameters
- unicode
- Type: System.String
The string to convert, which consists of one or more domain name labels delimited with label separators.
- index
- Type: System.Int32
A zero-based offset into unicode that specifies the start of the substring.
- count
- Type: System.Int32
The number of characters to convert in the substring that starts at the position specified by index in the unicode string.
Return Value
Type: System.String
The equivalent of the substring specified by the unicode, index, and count parameters, consisting of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E) and formatted according to the IDNA standard.
| Exception | Condition |
|---|---|
| ArgumentNullException | unicode is null. |
| ArgumentOutOfRangeException | index or count is less than zero. -or- index is greater than the length of unicode. -or- index is greater than the length of unicode minus count. |
| ArgumentException | unicode is invalid based on the AllowUnassigned and UseStd3AsciiRules properties, and the IDNA standard. |
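The argument checks in the table above can be exercised with a short sketch. The class name ArgumentChecks and the helper Describe are hypothetical names used only for illustration; the exception types are the ones listed in the table. Because ArgumentNullException and ArgumentOutOfRangeException both derive from ArgumentException, the more specific catch clauses come first.

```csharp
using System;
using System.Globalization;

public static class ArgumentChecks
{
    // Hypothetical helper: calls GetAscii and reports which exception
    // type, if any, was thrown for the given arguments.
    public static string Describe(string unicode, int index, int count)
    {
        IdnMapping idn = new IdnMapping();
        try
        {
            idn.GetAscii(unicode, index, count);
            return "no exception";
        }
        catch (ArgumentNullException) { return "ArgumentNullException"; }
        catch (ArgumentOutOfRangeException) { return "ArgumentOutOfRangeException"; }
        catch (ArgumentException) { return "ArgumentException"; }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(null, 0, 0));               // unicode is null
        Console.WriteLine(Describe("www.adatum.com", -1, 3));  // index is less than zero
        Console.WriteLine(Describe("www.adatum.com", 4, 20));  // index > length minus count
        Console.WriteLine(Describe("www.adatum.com", 0, 14));  // valid arguments
    }
}
```

Validating index and count before the call, rather than relying on the catch blocks, is usually the cleaner choice in production code; the try/catch form is used here only to make each condition visible.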
The unicode, index, and count parameters define a substring with one or more labels that consist of valid Unicode characters. The labels are separated by label separators. The substring cannot begin with a label separator, but it can contain separators and can optionally end with one. The label separators are FULL STOP (period, U+002E), IDEOGRAPHIC FULL STOP (U+3002), FULLWIDTH FULL STOP (U+FF0E), and HALFWIDTH IDEOGRAPHIC FULL STOP (U+FF61). For example, the domain name "www.adatum.com" consists of the labels "www", "adatum", and "com", separated by periods.
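As a minimal sketch of how index and count select the labels to encode, the following code converts a whole name and then only its trailing labels. The class name SubstringSample and the helper EncodeFrom are hypothetical; the input string is the one used in the example later in this topic.

```csharp
using System;
using System.Globalization;

public static class SubstringSample
{
    // Hypothetical helper: encodes everything from 'index' to the end of 'name'.
    public static string EncodeFrom(string name, int index)
    {
        IdnMapping idn = new IdnMapping();
        return idn.GetAscii(name, index, name.Length - index);
    }

    public static void Main()
    {
        // Labels U+03C0, U+03B8, and "com", separated by
        // IDEOGRAPHIC FULL STOP (U+3002) and FULLWIDTH FULL STOP (U+FF0E).
        string name = "\u03C0\u3002\u03B8\uFF0Ecom";

        // Encode the whole name.
        Console.WriteLine(EncodeFrom(name, 0));

        // Start after the first label and its separator, so that the
        // substring does not begin with a label separator.
        Console.WriteLine(EncodeFrom(name, 2));
    }
}
```

Based on the example output shown in this topic, the first call is expected to produce "xn--1xa.xn--txa.com". Because each label is encoded independently, the second call is expected to produce only the last two labels, "xn--txa.com".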
A label cannot contain any of the following characters:
- Unicode control characters from U+0001 through U+001F, and U+007F.
- Unassigned Unicode characters, depending on the value of the AllowUnassigned property.
- Non-standard characters in the US-ASCII character range, such as the SPACE (U+0020), EXCLAMATION MARK (U+0021), and LOW LINE (U+005F) characters, depending on the value of the UseStd3AsciiRules property.
- Characters that are prohibited by a specific version of the IDNA standard. For more information about prohibited characters, see RFC 3454: Preparation of Internationalized Strings ("stringprep") for IDNA 2003, and RFC 5892: The Unicode Code Points and Internationalized Domain Names for Applications (IDNA) for IDNA 2008.
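The effect of the UseStd3AsciiRules property on the list above can be probed with a sketch like the following. The class name Std3Sample, the helper IsAccepted, and the host name "my_host.adatum.com" are assumptions made for illustration. The expectation, based on the list above, is that a label containing LOW LINE (U+005F) is accepted when the property is false and rejected with an ArgumentException when it is true; exact behavior may vary with the runtime and operating system.

```csharp
using System;
using System.Globalization;

public static class Std3Sample
{
    // Hypothetical helper: returns true if GetAscii accepts 'name'
    // under the given UseStd3AsciiRules setting, false if it throws.
    public static bool IsAccepted(string name, bool useStd3AsciiRules)
    {
        IdnMapping idn = new IdnMapping();
        idn.UseStd3AsciiRules = useStd3AsciiRules;
        try
        {
            idn.GetAscii(name, 0, name.Length);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }

    public static void Main()
    {
        string name = "my_host.adatum.com"; // contains LOW LINE (U+005F)

        Console.WriteLine("Relaxed rules accept the name: {0}", IsAccepted(name, false));
        Console.WriteLine("STD3 rules accept the name:    {0}", IsAccepted(name, true));
    }
}
```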
The GetAscii method converts all label separators to FULL STOP (period, U+002E). If the substring contains no characters outside the US-ASCII character range, and no characters within the US-ASCII character range are prohibited, the method returns the substring unchanged.
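The two behaviors described above, pass-through of plain ASCII names and normalization of label separators, can be checked with a short sketch. The class name PassThroughSample and the helper Encode are hypothetical. A lowercase name is used deliberately so that the result does not depend on how a particular runtime treats letter case.

```csharp
using System;
using System.Globalization;

public static class PassThroughSample
{
    // Hypothetical helper: encodes the whole string with default settings.
    public static string Encode(string name)
    {
        IdnMapping idn = new IdnMapping();
        return idn.GetAscii(name, 0, name.Length);
    }

    public static void Main()
    {
        // An all-ASCII name with no prohibited characters.
        string asciiName = "www.adatum.com";
        Console.WriteLine("Unchanged: {0}", Encode(asciiName) == asciiName);

        // The same labels separated by IDEOGRAPHIC FULL STOP (U+3002)
        // and HALFWIDTH IDEOGRAPHIC FULL STOP (U+FF61).
        string mixedSeparators = "www\u3002adatum\uFF61com";
        Console.WriteLine(Encode(mixedSeparators));
    }
}
```

According to the description above, the second call should print the name with both separators replaced by FULL STOP (U+002E).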
Notes to Callers
In the .NET Framework 4.5, the IdnMapping class supports different versions of the IDNA standard, depending on the operating system in use:
- When run on Windows 8, it supports the 2008 version of the IDNA standard outlined by RFC 5891: Internationalized Domain Names in Applications (IDNA): Protocol.
- When run on earlier versions of the Windows operating system, it supports the 2003 version of the standard outlined by RFC 3490: Internationalizing Domain Names in Applications (IDNA).
See Unicode Technical Standard #46: IDNA Compatibility Processing for the differences in the way these standards handle particular sets of characters.
The following example uses the GetAscii(String, Int32, Int32) method to convert an internationalized domain name to a domain name that complies with the IDNA standard. The GetUnicode(String, Int32, Int32) method then converts the standardized domain name back into the original domain name, but replaces the original label separators with the standard label separator.
    // This example demonstrates the GetAscii and GetUnicode methods.
    // For sake of illustration, this example uses the most complex
    // form of those methods, not the most convenient.

    using System;
    using System.Globalization;

    class Sample
    {
        public static void Main()
        {
            /*
               Define a domain name consisting of the labels: GREEK SMALL LETTER
               PI (U+03C0); IDEOGRAPHIC FULL STOP (U+3002); GREEK SMALL LETTER
               THETA (U+03B8); FULLWIDTH FULL STOP (U+FF0E); and "com".
            */
            string name = "\u03C0\u3002\u03B8\uFF0Ecom";
            string international;
            string nonInternational;

            string msg1 = "the original non-internationalized \ndomain name:";
            string msg2 = "Allow unassigned characters?:     {0}";
            string msg3 = "Use non-internationalized rules?: {0}";
            string msg4 = "Convert the non-internationalized domain name to international format...";
            string msg5 = "Display the encoded domain name:\n\"{0}\"";
            string msg6 = "the encoded domain name:";
            string msg7 = "Convert the internationalized domain name to non-international format...";
            string msg8 = "the reconstituted non-internationalized \ndomain name:";
            string msg9 = "Visually compare the code points of the reconstituted string to the " +
                          "original.\n" +
                          "Note that the reconstituted string contains standard label " +
                          "separators (U+002e).";

            // ----------------------------------------------------------------------------
            Console.Clear();
            CodePoints(name, msg1);

            // ----------------------------------------------------------------------------
            IdnMapping idn = new IdnMapping();
            Console.WriteLine(msg2, idn.AllowUnassigned);
            Console.WriteLine(msg3, idn.UseStd3AsciiRules);
            Console.WriteLine();

            // ----------------------------------------------------------------------------
            Console.WriteLine(msg4);
            international = idn.GetAscii(name, 0, name.Length);
            Console.WriteLine(msg5, international);
            Console.WriteLine();
            CodePoints(international, msg6);

            // ----------------------------------------------------------------------------
            Console.WriteLine(msg7);
            nonInternational = idn.GetUnicode(international, 0, international.Length);
            CodePoints(nonInternational, msg8);
            Console.WriteLine(msg9);
        }

        // ----------------------------------------------------------------------------
        static void CodePoints(string value, string title)
        {
            Console.WriteLine("Display the Unicode code points of {0}", title);
            foreach (char c in value)
            {
                Console.Write("{0:x4} ", Convert.ToInt32(c));
            }
            Console.WriteLine();
            Console.WriteLine();
        }
    }

    /*
    This code example produces the following results:

    Display the Unicode code points of the original non-internationalized
    domain name:
    03c0 3002 03b8 ff0e 0063 006f 006d

    Allow unassigned characters?:     False
    Use non-internationalized rules?: False

    Convert the non-internationalized domain name to international format...
    Display the encoded domain name:
    "xn--1xa.xn--txa.com"

    Display the Unicode code points of the encoded domain name:
    0078 006e 002d 002d 0031 0078 0061 002e 0078 006e 002d 002d 0074 0078 0061 002e 0063 006f 006d

    Convert the internationalized domain name to non-international format...
    Display the Unicode code points of the reconstituted non-internationalized
    domain name:
    03c0 002e 03b8 002e 0063 006f 006d

    Visually compare the code points of the reconstituted string to the original.
    Note that the reconstituted string contains standard label separators (U+002e).
    */
Windows 8, Windows Server 2012, Windows 7, Windows Vista SP2, Windows Server 2008 (Server Core Role not supported), Windows Server 2008 R2 (Server Core Role supported with SP1 or later; Itanium not supported)
The .NET Framework does not support all versions of every platform. For a list of the supported versions, see .NET Framework System Requirements.