String::IsNormalized Method
Indicates whether this string is in Unicode normalization form C.
Assembly: mscorlib (in mscorlib.dll)
| Exception | Condition |
|---|---|
| ArgumentException | The current instance contains invalid Unicode characters. |
Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. The existence of multiple representations for a single character complicates searching, sorting, matching, and other operations.
The Unicode standard defines a process called normalization that returns one binary representation when given any of the equivalent binary representations of a character. Normalization can be performed with several algorithms, called normalization forms, that obey different rules. The .NET Framework currently supports normalization forms C, D, KC, and KD.
For a description of supported Unicode normalization forms, see System.Text::NormalizationForm.
Notes to Callers
The IsNormalized method returns false as soon as it encounters the first non-normalized character in a string. Therefore, if a string contains non-normalized characters followed by invalid Unicode characters, IsNormalized returns false without reporting an error, whereas the Normalize method throws an ArgumentException.
The following example determines whether a string is successfully normalized to various normalization forms.
// This example demonstrates the String.Normalize method
// and the String.IsNormalized method
using namespace System;
using namespace System::Text;

// Writes each UTF-16 code unit of s as four hexadecimal digits.
void Show( String^ title, String^ s )
{
   Console::Write( "Characters in string {0} = ", title );
   for each ( Char c in s->ToCharArray() )
   {
      Console::Write( "{0:X4} ", static_cast<int>( c ) );
   }
   Console::WriteLine();
}

int main()
{
   // Character c; combining characters acute and cedilla; character 3/4
   array<Char>^temp0 = {L'c',L'\u0301',L'\u0327',L'\u00BE'};
   String^ s1 = gcnew String( temp0 );
   String^ s2 = nullptr;
   String^ divider = gcnew String( '-',80 );
   divider = String::Concat( Environment::NewLine, divider, Environment::NewLine );
   try
   {
      Show( "s1", s1 );
      Console::WriteLine();
      Console::WriteLine( "U+0063 = LATIN SMALL LETTER C" );
      Console::WriteLine( "U+0301 = COMBINING ACUTE ACCENT" );
      Console::WriteLine( "U+0327 = COMBINING CEDILLA" );
      Console::WriteLine( "U+00BE = VULGAR FRACTION THREE QUARTERS" );
      Console::WriteLine( divider );

      // Part A: test the original string against each normalization form.
      Console::WriteLine( "A1) Is s1 normalized to the default form (Form C)?: {0}", s1->IsNormalized() );
      Console::WriteLine( "A2) Is s1 normalized to Form C?: {0}", s1->IsNormalized( NormalizationForm::FormC ) );
      Console::WriteLine( "A3) Is s1 normalized to Form D?: {0}", s1->IsNormalized( NormalizationForm::FormD ) );
      Console::WriteLine( "A4) Is s1 normalized to Form KC?: {0}", s1->IsNormalized( NormalizationForm::FormKC ) );
      Console::WriteLine( "A5) Is s1 normalized to Form KD?: {0}", s1->IsNormalized( NormalizationForm::FormKD ) );
      Console::WriteLine( divider );

      Console::WriteLine( "Set string s2 to each normalized form of string s1." );
      Console::WriteLine();
      Console::WriteLine( "U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE" );
      Console::WriteLine( "U+0033 = DIGIT THREE" );
      Console::WriteLine( "U+2044 = FRACTION SLASH" );
      Console::WriteLine( "U+0034 = DIGIT FOUR" );
      Console::WriteLine( divider );

      // Part B: normalize s1 to each form, then test and display the result.
      s2 = s1->Normalize();
      Console::Write( "B1) Is s2 normalized to the default form (Form C)?: " );
      Console::WriteLine( s2->IsNormalized() );
      Show( "s2", s2 );
      Console::WriteLine();

      s2 = s1->Normalize( NormalizationForm::FormC );
      Console::Write( "B2) Is s2 normalized to Form C?: " );
      Console::WriteLine( s2->IsNormalized( NormalizationForm::FormC ) );
      Show( "s2", s2 );
      Console::WriteLine();

      s2 = s1->Normalize( NormalizationForm::FormD );
      Console::Write( "B3) Is s2 normalized to Form D?: " );
      Console::WriteLine( s2->IsNormalized( NormalizationForm::FormD ) );
      Show( "s2", s2 );
      Console::WriteLine();

      s2 = s1->Normalize( NormalizationForm::FormKC );
      Console::Write( "B4) Is s2 normalized to Form KC?: " );
      Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKC ) );
      Show( "s2", s2 );
      Console::WriteLine();

      s2 = s1->Normalize( NormalizationForm::FormKD );
      Console::Write( "B5) Is s2 normalized to Form KD?: " );
      Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKD ) );
      Show( "s2", s2 );
      Console::WriteLine();
   }
   catch ( Exception^ e )
   {
      Console::WriteLine( e->Message );
   }
}

/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?: False
A3) Is s1 normalized to Form D?: False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
Windows 7, Windows Vista SP1 or later, Windows XP SP3, Windows XP SP2 x64 Edition, Windows Server 2008 (Server Core not supported), Windows Server 2008 R2 (Server Core supported with SP1 or later), Windows Server 2003 SP2
The .NET Framework does not support all versions of every platform. For a list of the supported versions, see .NET Framework System Requirements.