String.Normalize Method ()



Returns a new string whose textual value is the same as this string, but whose binary representation is in Unicode normalization form C.

Namespace:   System
Assembly:  mscorlib (in mscorlib.dll)

member Normalize : unit -> string

Return Value

Type: System.String

A new, normalized string whose textual value is the same as this string, but whose binary representation is in normalization form C.

Exceptions

Exception: ArgumentException
Condition: The current instance contains invalid Unicode characters.

Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. For example, any of the following code point sequences can represent the letter "ắ":

  • U+1EAF

  • U+0103 U+0301

  • U+0061 U+0306 U+0301

The existence of multiple representations for a single character complicates searching, sorting, matching, and other operations.
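To make the problem concrete, the following is a minimal sketch rather than an official sample. It builds the three representations of "ắ" listed above, shows that an ordinal comparison treats them as different strings, and then shows that Normalize() maps all of them to the same string. The binding names (precomposed, partlyComposed, decomposed) are illustrative assumptions.

```fsharp
open System

// The three encodings of the letter "ắ" from the list above.
let precomposed = "\u1EAF"
let partlyComposed = "\u0103\u0301"
let decomposed = "\u0061\u0306\u0301"

// Ordinal comparison looks at the raw UTF-16 code units, so these differ.
printfn "%b" (String.Equals(precomposed, partlyComposed, StringComparison.Ordinal)) // false
printfn "%b" (String.Equals(precomposed, decomposed, StringComparison.Ordinal))     // false

// Normalize() uses normalization form C, which composes each sequence to U+1EAF.
let normalized =
    [ precomposed; partlyComposed; decomposed ]
    |> List.map (fun s -> s.Normalize())

let allEqual =
    normalized
    |> List.forall (fun s -> String.Equals(s, precomposed, StringComparison.Ordinal))

printfn "%b" allEqual // true
```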

The Unicode standard defines a process called normalization that returns one binary representation when given any of the equivalent binary representations of a character. Normalization can be performed with several algorithms, called normalization forms, that obey different rules. The .NET Framework supports the four normalization forms (C, D, KC, and KD) that are defined by the Unicode standard. When two strings are represented in the same normalization form, they can be compared by using ordinal comparison.

To normalize and compare two strings, do the following:

  1. Obtain the strings to be compared from an input source, such as a file or a user input device.

  2. Call the Normalize() method to normalize the strings to normalization form C.

  3. To compare two strings, call a method that supports ordinal string comparison, such as the Compare(String, String, StringComparison) method, and supply a value of StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase as the StringComparison argument. To sort an array of normalized strings, pass a comparer value of StringComparer.Ordinal or StringComparer.OrdinalIgnoreCase to an appropriate overload of Array.Sort.

  4. Emit the strings in the sorted output based on the order indicated by the previous step.
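The four steps above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the input array is hard-coded in place of a real file or user input, and the codeUnits helper is a hypothetical name introduced only to print the results unambiguously.

```fsharp
open System

// Hypothetical helper: render a string as hex UTF-16 code units.
let codeUnits (s: string) =
    s |> Seq.map (fun c -> sprintf "U+%04X" (int c)) |> String.concat " "

// Step 1: obtain the strings (hard-coded here; assumed to come from a file or user).
let inputs = [| "z"; "\u0061\u0306\u0301"; "\u1EAF"; "a" |]

// Step 2: normalize every string to normalization form C.
let normalized = inputs |> Array.map (fun s -> s.Normalize())

// Step 3: compare two normalized strings ordinally, then sort the array ordinally.
let sameLetter =
    String.Compare(normalized.[1], normalized.[2], StringComparison.Ordinal) = 0
Array.Sort<string>(normalized, StringComparer.Ordinal)

// Step 4: emit the strings in sorted order.
printfn "Elements 1 and 2 are equal after normalization: %b" sameLetter
normalized |> Array.iter (fun s -> printfn "%s" (codeUnits s))
```

With these inputs, the two encodings of "ắ" compare as equal after normalization, and the ordinal sort places "a" and "z" before both copies of U+1EAF because ordinal order follows code unit values rather than linguistic rules.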

For a description of supported Unicode normalization forms, see System.Text.NormalizationForm.

Notes to Callers:

The IsNormalized method returns false as soon as it encounters the first non-normalized character in a string. Therefore, if a string contains non-normalized characters followed by invalid Unicode characters, the Normalize method will throw an ArgumentException although IsNormalized returns false.
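The situation described in this note can be probed with a small sketch like the one below. The input string and the describe helper are hypothetical and exist only for illustration. No output is shown because the result depends on the runtime: the note above describes the .NET Framework behavior, and other .NET implementations may validate the input differently.

```fsharp
open System

// Hypothetical input: a non-normalized sequence ("a" + combining breve)
// followed by a lone high surrogate, which is not valid Unicode.
let suspect = String([| 'a'; '\u0306'; char 0xD800 |])

// Run an action and report either its result or the ArgumentException it raised.
let describe (label: string) (action: unit -> string) =
    try
        printfn "%s returned %s" label (action ())
    with
    | :? ArgumentException as ex ->
        printfn "%s threw ArgumentException: %s" label ex.Message

describe "IsNormalized" (fun () -> string (suspect.IsNormalized()))
describe "Normalize" (fun () -> sprintf "a string of %d code units" (suspect.Normalize().Length))
```

A practical consequence is that a false result from IsNormalized should not be taken as proof that Normalize will succeed, so callers that handle untrusted input may want to guard the Normalize call as well.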

The following example normalizes a string to each of four normalization forms, confirms the string was normalized to the specified normalization form, then lists the code points in the normalized string.
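A minimal sketch of such an example is given here; it is not the original sample from this page. The sample string and the codeUnits helper are assumptions chosen for illustration. The sample combines a fully decomposed "ắ" with the "fi" ligature (U+FB01) so that the canonical forms (C, D) and the compatibility forms (KC, KD) produce visibly different results. The helper prints UTF-16 code units, which coincide with code points for the characters used here.

```fsharp
open System
open System.Text

// Hypothetical helper: render a string as space-separated hex UTF-16 code units.
let codeUnits (s: string) =
    s |> Seq.map (fun c -> sprintf "%04X" (int c)) |> String.concat " "

// Assumed sample: "a" + combining breve + combining acute, then the "fi" ligature.
let original = "\u0061\u0306\u0301\uFB01"

let forms =
    [ NormalizationForm.FormC
      NormalizationForm.FormD
      NormalizationForm.FormKC
      NormalizationForm.FormKD ]

printfn "Original: %s" (codeUnits original)

for form in forms do
    // Normalize to the requested form, then confirm the result with IsNormalized.
    let normalized = original.Normalize(form)
    printfn "%A: IsNormalized = %b, code units = %s"
        form (normalized.IsNormalized(form)) (codeUnits normalized)
```

Under these assumptions, form C should yield 1EAF FB01, form D 0061 0306 0301 FB01, form KC 1EAF 0066 0069, and form KD 0061 0306 0301 0066 0069, with IsNormalized reporting true for each result.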


.NET Framework
Available since 2.0