.NET Framework Class Library
Char Structure

Represents a Unicode character.

Namespace:  System
Assembly:  mscorlib (in mscorlib.dll)
Syntax

Visual Basic (Declaration)
<SerializableAttribute> _
<ComVisibleAttribute(True)> _
Public Structure Char _
    Implements IComparable, IConvertible, IComparable(Of Char),  _
    IEquatable(Of Char)
Visual Basic (Usage)
Dim instance As Char
C#
[SerializableAttribute]
[ComVisibleAttribute(true)]
public struct Char : IComparable, IConvertible, 
    IComparable<char>, IEquatable<char>
Visual C++
[SerializableAttribute]
[ComVisibleAttribute(true)]
public value class Char : IComparable, IConvertible, 
    IComparable<wchar_t>, IEquatable<wchar_t>
JScript
JScript supports the use of structures, but not the declaration of new ones.
Remarks

The .NET Framework uses the Char structure to represent a Unicode character. The Unicode Standard identifies each Unicode character with a unique 21-bit scalar number called a code point, and defines the UTF-16 encoding form that specifies how a code point is encoded into a sequence of one or more 16-bit values. Each 16-bit value ranges from hexadecimal 0x0000 through 0xFFFF and is stored in a Char structure. The value of a Char object is its 16-bit numeric (ordinal) value.
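The relationship between a code point and its 16-bit values can be sketched as follows. This is a minimal illustration, not part of the original reference: the class name CodePointEncoding and the helper DescribeUtf16 are hypothetical, and the sample assumes the input is a valid code point (Char.ConvertFromUtf32 throws otherwise).

```csharp
using System;

public static class CodePointEncoding
{
    // Hypothetical helper: lists the UTF-16 values that encode one code point.
    public static string DescribeUtf16(int codePoint)
    {
        // ConvertFromUtf32 returns a string of one Char (BMP) or two Chars (surrogate pair).
        string encoded = Char.ConvertFromUtf32(codePoint);
        string[] units = new string[encoded.Length];
        for (int i = 0; i < encoded.Length; i++)
        {
            // Casting a Char to int yields its 16-bit ordinal value.
            units[i] = "0x" + ((int)encoded[i]).ToString("X4");
        }
        return String.Join(" ", units);
    }

    public static void Main()
    {
        Console.WriteLine(DescribeUtf16(0x0041));   // Output: "0x0041"
        Console.WriteLine(DescribeUtf16(0x1D11E));  // Output: "0xD834 0xDD1E"
    }
}
```

U+0041 fits in one Char, while U+1D11E needs two, which is why the ordinal value of a single Char is not always a complete code point.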

A String object is a sequential collection of Char structures that represents a string of text. Most Unicode characters can be represented by a single Char object, but a character that is encoded as a base character, surrogate pair, and/or combining character sequence is represented by multiple Char objects. For this reason, a Char structure in a String object is not necessarily equivalent to a single Unicode character.
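To see the difference between Char objects and characters in practice, the sketch below compares String.Length with the number of text elements reported by System.Globalization.StringInfo. The class and method names are illustrative assumptions, and the sample strings were chosen only for demonstration.

```csharp
using System;
using System.Globalization;

public static class TextElementCount
{
    // Hypothetical helper: counts text elements rather than Char objects.
    public static int CountTextElements(string text)
    {
        return new StringInfo(text).LengthInTextElements;
    }

    public static void Main()
    {
        string plain = "A";             // one Char, one character
        string clef = "\U0001D11E";     // surrogate pair (two Char objects)
        string accented = "e\u0301";    // base character + combining acute accent

        Console.WriteLine("{0} {1}", plain.Length, CountTextElements(plain));       // Output: "1 1"
        Console.WriteLine("{0} {1}", clef.Length, CountTextElements(clef));         // Output: "2 1"
        Console.WriteLine("{0} {1}", accented.Length, CountTextElements(accented)); // Output: "2 1"
    }
}
```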

For more information about the Unicode Standard, see the Unicode home page.

Functionality

The Char structure provides methods to compare Char objects, convert the value of the current Char object to an object of another type, and determine the Unicode category of a Char object.
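As a brief illustration of category lookup, the following sketch calls Char.GetUnicodeCategory on a few sample characters. The class name and the choice of samples are assumptions made for this example.

```csharp
using System;
using System.Globalization;

public static class CategoryLookup
{
    public static void Main()
    {
        char[] samples = { 'a', 'A', '1', ' ', '.', '+' };
        foreach (char c in samples)
        {
            // GetUnicodeCategory returns a UnicodeCategory enumeration value.
            UnicodeCategory category = Char.GetUnicodeCategory(c);
            Console.WriteLine("'{0}' -> {1}", c, category);
        }
        // Output:
        // 'a' -> LowercaseLetter
        // 'A' -> UppercaseLetter
        // '1' -> DecimalDigitNumber
        // ' ' -> SpaceSeparator
        // '.' -> OtherPunctuation
        // '+' -> MathSymbol
    }
}
```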

Interface Implementations

This type implements the IConvertible, IComparable, IComparable<T>, and IEquatable<T> interfaces. Use the Convert class for conversions instead of this type's explicit interface member implementation of IConvertible.
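The recommendation above might look like this in practice. The sketch contrasts the Convert class with a call through the explicit IConvertible implementation; the class name is an assumption for this example.

```csharp
using System;

public static class ConvertInsteadOfIConvertible
{
    public static void Main()
    {
        char letter = 'A';

        // Preferred: the Convert class.
        int viaConvert = Convert.ToInt32(letter);
        char next = Convert.ToChar(viaConvert + 1);

        // Also possible, but the explicit interface member is only
        // reachable through a cast to IConvertible.
        int viaInterface = ((IConvertible)letter).ToInt32(null);

        Console.WriteLine(viaConvert);    // Output: "65"
        Console.WriteLine(next);          // Output: "B"
        Console.WriteLine(viaInterface);  // Output: "65"
    }
}
```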

Examples

The following code example demonstrates some of the methods in Char.

Visual Basic
Imports System

Module CharStructure

    Public Sub Main()

        Dim chA As Char
        chA = "A"c
        Dim ch1 As Char
        ch1 = "1"c
        Dim str As String
        str = "test string"

        Console.WriteLine(chA.CompareTo("B"c))          ' Output: "-1" (meaning 'A' is 1 less than 'B')
        Console.WriteLine(chA.Equals("A"c))             ' Output: "True"
        Console.WriteLine(Char.GetNumericValue(ch1))    ' Output: "1"
        Console.WriteLine(Char.IsControl(Chr(9)))       ' Output: "True"
        Console.WriteLine(Char.IsDigit(ch1))            ' Output: "True"
        Console.WriteLine(Char.IsLetter(","c))          ' Output: "False"
        Console.WriteLine(Char.IsLower("u"c))           ' Output: "True"
        Console.WriteLine(Char.IsNumber(ch1))           ' Output: "True"
        Console.WriteLine(Char.IsPunctuation("."c))     ' Output: "True"
        Console.WriteLine(Char.IsSeparator(str, 4))     ' Output: "True"
        Console.WriteLine(Char.IsSymbol("+"c))          ' Output: "True"
        Console.WriteLine(Char.IsWhiteSpace(str, 4))    ' Output: "True"
        Console.WriteLine(Char.Parse("S"))              ' Output: "S"
        Console.WriteLine(Char.ToLower("M"c))           ' Output: "m"
        Console.WriteLine("x"c.ToString())              ' Output: "x"

    End Sub

End Module
C#
using System;

public class CharStructureSample {
    public static void Main() {
        char chA = 'A';
        char ch1 = '1';
        string str = "test string"; 

        Console.WriteLine(chA.CompareTo('B'));         // Output: "-1" (meaning 'A' is 1 less than 'B')
        Console.WriteLine(chA.Equals('A'));            // Output: "True"
        Console.WriteLine(Char.GetNumericValue(ch1));  // Output: "1"
        Console.WriteLine(Char.IsControl('\t'));       // Output: "True"
        Console.WriteLine(Char.IsDigit(ch1));          // Output: "True"
        Console.WriteLine(Char.IsLetter(','));         // Output: "False"
        Console.WriteLine(Char.IsLower('u'));          // Output: "True"
        Console.WriteLine(Char.IsNumber(ch1));         // Output: "True"
        Console.WriteLine(Char.IsPunctuation('.'));    // Output: "True"
        Console.WriteLine(Char.IsSeparator(str, 4));   // Output: "True"
        Console.WriteLine(Char.IsSymbol('+'));         // Output: "True"
        Console.WriteLine(Char.IsWhiteSpace(str, 4));  // Output: "True"
        Console.WriteLine(Char.Parse("S"));            // Output: "S"
        Console.WriteLine(Char.ToLower('M'));          // Output: "m"
        Console.WriteLine('x'.ToString());             // Output: "x"
    }
}
Visual C++
using namespace System;
int main()
{
   // In C++/CLI, wchar_t (not char) corresponds to System::Char.
   wchar_t chA = L'A';
   wchar_t ch1 = L'1';
   String^ str =  "test string";
   Console::WriteLine( chA.CompareTo( L'B' ) ); // Output: "-1" (meaning 'A' is 1 less than 'B')
   Console::WriteLine( chA.Equals( L'A' ) ); // Output: "True"
   Console::WriteLine( Char::GetNumericValue( ch1 ) ); // Output: "1"
   Console::WriteLine( Char::IsControl( L'\t' ) ); // Output: "True"
   Console::WriteLine( Char::IsDigit( ch1 ) ); // Output: "True"
   Console::WriteLine( Char::IsLetter( L',' ) ); // Output: "False"
   Console::WriteLine( Char::IsLower( L'u' ) ); // Output: "True"
   Console::WriteLine( Char::IsNumber( ch1 ) ); // Output: "True"
   Console::WriteLine( Char::IsPunctuation( L'.' ) ); // Output: "True"
   Console::WriteLine( Char::IsSeparator( str, 4 ) ); // Output: "True"
   Console::WriteLine( Char::IsSymbol( L'+' ) ); // Output: "True"
   Console::WriteLine( Char::IsWhiteSpace( str, 4 ) ); // Output: "True"
   Console::WriteLine( Char::Parse(  "S" ) ); // Output: "S"
   Console::WriteLine( Char::ToLower( L'M' ) ); // Output: "m"
   Console::WriteLine( L'x' ); // Output: "x"
}

Thread Safety

All members of this type are thread safe. Members that appear to modify instance state actually return a new instance initialized with the new value. As with any other type, reading and writing to a shared variable that contains an instance of this type must be protected by a lock to guarantee thread safety.
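One way to protect a shared Char variable, as described above, is sketched below. The SharedCharHolder class and its members are hypothetical names introduced only for this illustration; other synchronization strategies are equally valid.

```csharp
using System;
using System.Threading;

public class SharedCharHolder
{
    private readonly object sync = new object();
    private char current = 'a';

    // Read-modify-write on the shared field; the lock keeps the steps together.
    public char Advance()
    {
        lock (sync)
        {
            current = (char)(current + 1);
            return current;
        }
    }

    public char Current
    {
        get { lock (sync) { return current; } }
    }

    public static void Main()
    {
        SharedCharHolder holder = new SharedCharHolder();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.Length; i++)
        {
            // Each worker advances the shared value five times.
            workers[i] = new Thread(() => { for (int n = 0; n < 5; n++) holder.Advance(); });
            workers[i].Start();
        }
        foreach (Thread worker in workers)
        {
            worker.Join();
        }
        Console.WriteLine(holder.Current); // Output: "u" ('a' advanced 20 times)
    }
}
```

Without the lock, two threads could read the same value before either writes, and some increments would be lost.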

Platforms

Windows 7, Windows Vista, Windows XP SP2, Windows XP Media Center Edition, Windows XP Professional x64 Edition, Windows XP Starter Edition, Windows Server 2008 R2, Windows Server 2008, Windows Server 2003, Windows Server 2000 SP4, Windows Millennium Edition, Windows 98, Windows CE, Windows Mobile for Smartphone, Windows Mobile for Pocket PC, Xbox 360, Zune

The .NET Framework and .NET Compact Framework do not support all versions of every platform. For a list of the supported versions, see .NET Framework System Requirements.
Version Information

.NET Framework

Supported in: 3.5, 3.0, 2.0, 1.1, 1.0

.NET Compact Framework

Supported in: 3.5, 2.0, 1.0

XNA Framework

Supported in: 3.0, 2.0, 1.0
Community Content

Thomas Lee
WARNING: Chars don't make sense in many languages

It is worth mentioning that the "char" type represents a single 16-bit value. In UTF-16, some characters consist of two code units (a surrogate pair), so in that case a single "char" cannot represent a complete "character". This does not affect English text, but many Chinese and other characters lie outside the Basic Multilingual Plane (BMP); that is, they require two chars to represent one Unicode code point.
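A rough sketch of reading a string one code point at a time, rather than one char at a time, follows. The class and method names are made up for this example, and it assumes well-formed input: Char.ConvertToUtf32 throws an ArgumentException on an unpaired surrogate.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointReader
{
    // Hypothetical helper: returns the code points in a well-formed UTF-16 string.
    public static List<int> GetCodePoints(string text)
    {
        List<int> codePoints = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            codePoints.Add(Char.ConvertToUtf32(text, i));
            // A surrogate pair occupies two chars, so skip the low surrogate.
            if (Char.IsSurrogatePair(text, i))
            {
                i++;
            }
        }
        return codePoints;
    }

    public static void Main()
    {
        // "A" followed by U+20000, a CJK ideograph outside the BMP.
        string text = "A\U00020000";
        Console.WriteLine(text.Length);               // Output: "3"
        foreach (int codePoint in GetCodePoints(text))
        {
            Console.WriteLine("U+{0:X4}", codePoint); // Output: "U+0041", then "U+20000"
        }
    }
}
```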

Also note that the notion of a "character" is itself flexible. Many people think of characters as "glyphs", but many "glyphs" require multiple code points. For example, ä can be "a" + U+0308 (combining diaeresis) or the precomposed "ä" (U+00E4). In some languages, not all "letters/characters/glyphs" can be represented correctly by a single Unicode code point; they require multiple code points instead.
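The two spellings of ä can be compared with String.Normalize, as in the following sketch. The class name is an assumption for this example; the precomposed form used here is U+00E4.

```csharp
using System;
using System.Text;

public static class DiaeresisForms
{
    public static void Main()
    {
        string decomposed = "a\u0308";   // "a" + combining diaeresis
        string precomposed = "\u00E4";   // single precomposed code point

        // The == operator compares char values, so the two forms differ.
        Console.WriteLine(decomposed == precomposed);                                    // Output: "False"

        // Normalizing to form C composes the sequence into U+00E4.
        Console.WriteLine(decomposed.Normalize(NormalizationForm.FormC) == precomposed); // Output: "True"

        // Normalizing to form D decomposes it back into two chars.
        Console.WriteLine(precomposed.Normalize(NormalizationForm.FormD).Length);        // Output: "2"
    }
}
```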

Additionally, some concepts get confused by this behavior. For example, there is a ΰ (U+03B0, Greek small letter upsilon with dialytika and tonos), but there is no equivalent single capital letter. Calling ToUpper() returns the same value, although you could perhaps argue for Ϋ́ (U+03AB + U+0301: Greek capital letter upsilon with dialytika, followed by a combining tonos). Some other operating systems and environments choose that as the ToUpper() value for U+03B0, so a single "char" ends up with a two-"char" uppercase form.
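The behavior described here can be checked with a small sketch like the one below. The class and helper names are hypothetical. The one firm constraint is that Char.ToUpperInvariant returns a single char, so it cannot produce a two-char uppercase form; the printed results for U+03B0 are deliberately not stated, since they depend on the platform's casing data.

```csharp
using System;

public static class UpsilonUpperCase
{
    // Hypothetical helper: reports whether uppercasing changes the char.
    public static bool ChangesWhenUppercased(char c)
    {
        return Char.ToUpperInvariant(c) != c;
    }

    public static void Main()
    {
        char small = '\u03B0';
        char upper = Char.ToUpperInvariant(small);

        // Prints the code point actually returned on this platform.
        Console.WriteLine("U+{0:X4}", (int)upper);

        // For comparison, an ordinary letter with a single-char uppercase form.
        Console.WriteLine(ChangesWhenUppercased('a'));
        Console.WriteLine(ChangesWhenUppercased(small));
    }
}
```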

Another example is when combinations of characters cause their form to change. This isn't common in Latin script, but it is somewhat like æ (U+00E6) looking like a and e crammed together, or, in German, ß being the equivalent of ss. In some scripts the form changes a lot depending on the subsequent letters. An oversimplification would be to describe it as a kind of hyperactive cursive, where the letters connect in different ways depending on the letters that follow.

There are many other cases where the "character" concept breaks down, so use caution. Strings are generally preferable for representing linguistic content.
