
CharUnicodeInfo.GetUnicodeCategory Method (Char)

Gets the Unicode category of the specified character.

Namespace: System.Globalization
Assembly: mscorlib (in mscorlib.dll)

public static UnicodeCategory GetUnicodeCategory (
	char ch
)
public static function GetUnicodeCategory (
	ch : char
) : UnicodeCategory

Parameters

ch

The Unicode character for which to get the Unicode category.

Return Value

A UnicodeCategory value indicating the category of the specified character.
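Because the parameter is a single char, a character outside the Basic Multilingual Plane reaches this overload only as one half of a surrogate pair. The following is a minimal sketch, not part of the original sample, contrasting this overload with the GetUnicodeCategory(String, Int32) overload of the same class. The class name SurrogateSketch and the choice of U+1D7D8 are illustrative assumptions.

```csharp
using System;
using System.Globalization;

// Illustrative sketch; SurrogateSketch is a hypothetical name.
public static class SurrogateSketch
{
    public static void Main()
    {
        // U+1D7D8 MATHEMATICAL DOUBLE-STRUCK DIGIT ZERO is stored as two
        // UTF-16 code units (a surrogate pair).
        string text = "\U0001D7D8";

        // The char overload sees only the first code unit of the pair,
        // so it reports a surrogate rather than the character's category.
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory(text[0]));

        // The (String, Int32) overload examines the whole pair that
        // starts at index 0.
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory(text, 0));
    }
}
```

When the input might contain supplementary characters, passing the string and an index is usually the safer choice.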

Unicode characters are divided into categories, and a character's category is one of its properties. For example, a character might be an uppercase letter, a lowercase letter, a decimal digit number, a letter number, a connector punctuation, a math symbol, or a currency symbol. The UnicodeCategory enumeration identifies these categories, and the CharUnicodeInfo class returns the category of a given Unicode character. For more information on Unicode characters, see the Unicode Standard.
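As a brief illustration of the categories named above, the following sketch prints the category of one character of each kind. The class name CategorySketch and the helper Describe are hypothetical and exist only for this example.

```csharp
using System;
using System.Globalization;

// Illustrative sketch; CategorySketch and Describe are hypothetical names.
public static class CategorySketch
{
    // Formats a character's code point together with the category
    // reported by CharUnicodeInfo.GetUnicodeCategory.
    public static string Describe(char ch)
    {
        UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(ch);
        return string.Format("U+{0:X4} {1}", (int)ch, category);
    }

    public static void Main()
    {
        // Uppercase letter, lowercase letter, decimal digit number,
        // letter number (ROMAN NUMERAL EIGHT), connector punctuation,
        // math symbol, currency symbol.
        char[] samples = { 'A', 'a', '7', '\u2167', '_', '+', '$' };
        foreach (char ch in samples)
        {
            Console.WriteLine(Describe(ch));
        }
    }
}
```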

Note that CharUnicodeInfo.GetUnicodeCategory does not always return the same UnicodeCategory value as the Char.GetUnicodeCategory method when passed a particular character as a parameter. The CharUnicodeInfo.GetUnicodeCategory method is designed to reflect the current version of the Unicode standard. In contrast, although the Char.GetUnicodeCategory method usually reflects the current version of the Unicode standard, it might return a character's category based on a previous version of the standard, or it might return a category that differs from the current standard to preserve backward compatibility.
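One way to see whether CharUnicodeInfo.GetUnicodeCategory and Char.GetUnicodeCategory disagree on a given runtime is to compare them over every UTF-16 code unit. The following is a rough sketch under that assumption. The class name CategoryComparisonSketch and the helper CountDifferences are hypothetical, and the result depends on the framework version; it may well be zero.

```csharp
using System;
using System.Globalization;

// Illustrative sketch; the class and helper names are hypothetical.
public static class CategoryComparisonSketch
{
    // Counts the code units for which the two methods report different
    // categories, optionally printing each mismatch.
    public static int CountDifferences(bool print)
    {
        int differences = 0;

        // Loop over an int so the counter can pass char.MaxValue and stop.
        for (int code = char.MinValue; code <= char.MaxValue; code++)
        {
            char ch = (char)code;
            UnicodeCategory fromInfo = CharUnicodeInfo.GetUnicodeCategory(ch);
            UnicodeCategory fromChar = char.GetUnicodeCategory(ch);

            if (fromInfo != fromChar)
            {
                differences++;
                if (print)
                {
                    Console.WriteLine("U+{0:X4}: CharUnicodeInfo={1}, Char={2}",
                        code, fromInfo, fromChar);
                }
            }
        }
        return differences;
    }

    public static void Main()
    {
        int total = CountDifferences(true);
        Console.WriteLine("{0} code units differ on this runtime.", total);
    }
}
```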

The following code example shows the values returned by the GetNumericValue, GetDigitValue, GetDecimalDigitValue, and GetUnicodeCategory methods for different types of characters.

using System;
using System.Globalization;

public class SamplesCharUnicodeInfo  {

   public static void Main()  {

      Console.WriteLine( "                                        c  Num   Dig   Dec   UnicodeCategory" );

      Console.Write( "U+0061 LATIN SMALL LETTER A            " );
      PrintProperties( 'a' );

      Console.Write( "U+0393 GREEK CAPITAL LETTER GAMMA      " );
      PrintProperties( '\u0393' );

      Console.Write( "U+0039 DIGIT NINE                      " );
      PrintProperties( '9' );

      Console.Write( "U+00B2 SUPERSCRIPT TWO                 " );
      PrintProperties( '\u00B2' );

      Console.Write( "U+00BC VULGAR FRACTION ONE QUARTER     " );
      PrintProperties( '\u00BC' );

      Console.Write( "U+0BEF TAMIL DIGIT NINE                " );
      PrintProperties( '\u0BEF' );

      Console.Write( "U+0BF0 TAMIL NUMBER TEN                " );
      PrintProperties( '\u0BF0' );

      Console.Write( "U+0F33 TIBETAN DIGIT HALF ZERO         " );
      PrintProperties( '\u0F33' );

      Console.Write( "U+2788 CIRCLED SANS-SERIF DIGIT NINE   " );
      PrintProperties( '\u2788' );

   }

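   // Prints the character, its numeric, digit, and decimal digit values,
   // and its Unicode category on one line.
   public static void PrintProperties( char c )  {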
      Console.Write( " {0,-3}", c );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetNumericValue( c ) );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetDigitValue( c ) );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetDecimalDigitValue( c ) );
      Console.WriteLine( "{0}", CharUnicodeInfo.GetUnicodeCategory( c ) );
   }

}


/*
This code produces the following output.  Some characters might not display at the console.

                                        c  Num   Dig   Dec   UnicodeCategory
U+0061 LATIN SMALL LETTER A             a   -1    -1    -1   LowercaseLetter
U+0393 GREEK CAPITAL LETTER GAMMA       \u0393   -1    -1    -1   UppercaseLetter
U+0039 DIGIT NINE                       9   9     9     9    DecimalDigitNumber
U+00B2 SUPERSCRIPT TWO                  \u00B2   2     2     2    OtherNumber
U+00BC VULGAR FRACTION ONE QUARTER      \u00BC   0.25  -1    -1   OtherNumber
U+0BEF TAMIL DIGIT NINE                 \u0BEF   9     9     9    DecimalDigitNumber
U+0BF0 TAMIL NUMBER TEN                 \u0BF0   10    -1    -1   OtherNumber
U+0F33 TIBETAN DIGIT HALF ZERO          \u0F33   -0.5  -1    -1   OtherNumber
U+2788 CIRCLED SANS-SERIF DIGIT NINE    \u2788   9     9     -1   OtherNumber

*/


import System.* ;
import System.Globalization.* ;

public class SamplesCharUnicodeInfo
{
    public static void main(String[] args)
    {
        Console.WriteLine("                                        c  Num " 
            + " Dig   Dec   UnicodeCategory");

        Console.Write("U+0061 LATIN SMALL LETTER A            ");
        PrintProperties('a');

        Console.Write("U+0393 GREEK CAPITAL LETTER GAMMA      ");
        PrintProperties('\u0393');

        Console.Write("U+0039 DIGIT NINE                      ");
        PrintProperties('9');

        Console.Write("U+00B2 SUPERSCRIPT TWO                 ");
        PrintProperties('\u00B2');

        Console.Write("U+00BC VULGAR FRACTION ONE QUARTER     ");
        PrintProperties('\u00BC');

        Console.Write("U+0BEF TAMIL DIGIT NINE                ");
        PrintProperties('\u0BEF');

        Console.Write("U+0BF0 TAMIL NUMBER TEN                ");
        PrintProperties('\u0BF0');

        Console.Write("U+0F33 TIBETAN DIGIT HALF ZERO         ");
        PrintProperties('\u0F33');

        Console.Write("U+2788 CIRCLED SANS-SERIF DIGIT NINE   ");
        PrintProperties('\u2788');
    } //main
   
    public static void PrintProperties(char c)
    {
        Console.Write(" {0,-3}", System.Convert.ToString( c));
        Console.Write(" {0,-5}", 
            System.Convert.ToString(CharUnicodeInfo.GetNumericValue(c)));
        Console.Write(" {0,-5}", 
            System.Convert.ToString(CharUnicodeInfo.GetDigitValue(c)));
        Console.Write(" {0,-5}",
            System.Convert.ToString( CharUnicodeInfo.GetDecimalDigitValue(c)));
        Console.WriteLine("{0}", 
            System.Convert.ToString(CharUnicodeInfo.GetUnicodeCategory(c)));
    } //PrintProperties
} //SamplesCharUnicodeInfo

/*
This code produces the following output.  
Some characters might not display at the console.

                                        c  Num   Dig   Dec   UnicodeCategory
U+0061 LATIN SMALL LETTER A             a   -1    -1    -1   LowercaseLetter
U+0393 GREEK CAPITAL LETTER GAMMA       \u0393   -1    -1    -1   UppercaseLetter
U+0039 DIGIT NINE                       9   9     9     9    DecimalDigitNumber
U+00B2 SUPERSCRIPT TWO                  \u00B2   2     2     2    OtherNumber
U+00BC VULGAR FRACTION ONE QUARTER      \u00BC   0.25  -1    -1   OtherNumber
U+0BEF TAMIL DIGIT NINE                 \u0BEF   9     9     9    DecimalDigitNumber
U+0BF0 TAMIL NUMBER TEN                 \u0BF0   10    -1    -1   OtherNumber
U+0F33 TIBETAN DIGIT HALF ZERO          \u0F33   -0.5  -1    -1   OtherNumber
U+2788 CIRCLED SANS-SERIF DIGIT NINE    \u2788   9     9     -1   OtherNumber
*/

Windows 98, Windows Server 2000 SP4, Windows CE, Windows Millennium Edition, Windows Mobile for Pocket PC, Windows Mobile for Smartphone, Windows Server 2003, Windows XP Media Center Edition, Windows XP Professional x64 Edition, Windows XP SP2, Windows XP Starter Edition

The Microsoft .NET Framework 3.0 is supported on Windows Vista, Microsoft Windows XP SP2, and Windows Server 2003 SP1.

.NET Framework

Supported in: 3.0, 2.0

.NET Compact Framework

Supported in: 2.0

XNA Framework

Supported in: 1.0

© 2014 Microsoft