GetStringTypeW function

Retrieves character type information for the characters in the specified Unicode source string. For each character in the string, the function sets one or more bits in the corresponding 16-bit element of the output array. Each bit identifies a given character type, for example, letter, digit, or neither.

Caution: Using the GetStringTypeW function incorrectly can compromise the security of your application. To avoid a buffer overflow, the application must set the output buffer size correctly. For more security information, see Security Considerations: Windows User Interface.

Syntax


BOOL GetStringTypeW(
  _In_   DWORD dwInfoType,
  _In_   LPCWSTR lpSrcStr,
  _In_   int cchSrc,
  _Out_  LPWORD lpCharType
);

Parameters

dwInfoType [in]

Flags specifying the character type information to retrieve. This parameter can have the following values. The character types are divided into different levels as described in the Remarks section.

Flag        Meaning
CT_CTYPE1   Retrieve character type information.
CT_CTYPE2   Retrieve bidirectional layout information.
CT_CTYPE3   Retrieve text processing information.

lpSrcStr [in]

Pointer to the Unicode string for which to retrieve the character types. The string is assumed to be null-terminated if cchSrc is set to any negative value.

cchSrc [in]

Size, in characters, of the string indicated by lpSrcStr. If the size includes a terminating null character, the function retrieves character type information for that character. If the application sets the size to any negative integer, the source string is assumed to be null-terminated and the function calculates the size automatically with an additional character for the null termination.

lpCharType [out]

Pointer to an array of 16-bit values. The array must be large enough to receive one 16-bit value for each character in the source string. If cchSrc is not negative, lpCharType should be an array of words with cchSrc elements. If cchSrc is negative, lpCharType should be an array of words with one element for each character in lpSrcStr plus one for the terminating null character. When the function returns, this array contains one word corresponding to each character in the source string.
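The sizing rule for lpCharType can be sketched in C as follows. The names ctype_elems_needed and classify_string are hypothetical helpers, not part of the Windows API. The first helper is portable and only computes the element count; the wrapper that actually calls GetStringTypeW is compiled only when _WIN32 is defined.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: number of 16-bit elements lpCharType must hold.
   A negative cchSrc means "null-terminated", so the terminator is counted too. */
static size_t ctype_elems_needed(const wchar_t *src, int cchSrc)
{
    if (cchSrc >= 0)
        return (size_t)cchSrc;
    return wcslen(src) + 1;
}

#ifdef _WIN32
/* Hypothetical wrapper: allocates the output array and retrieves Ctype 1
   information. Returns NULL on failure (GetLastError() gives the reason);
   on success the caller releases the result with free(). */
static WORD *classify_string(const wchar_t *src, int cchSrc, size_t *count)
{
    size_t n = ctype_elems_needed(src, cchSrc);
    WORD *types = (WORD *)calloc(n ? n : 1, sizeof(WORD));
    if (types == NULL)
        return NULL;
    if (!GetStringTypeW(CT_CTYPE1, src, cchSrc, types)) {
        DWORD err = GetLastError();   /* preserve the error across free() */
        free(types);
        SetLastError(err);
        return NULL;
    }
    *count = n;
    return types;
}
#endif
```

Allocating from the computed count, rather than from a fixed-size array, is one way to avoid the buffer overflow that the caution at the top of this page describes.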

Return value

Returns a nonzero value if successful, or 0 otherwise. To get extended error information, the application can call GetLastError, which can return one of the following error codes:

  • ERROR_INVALID_FLAGS. The values supplied for flags were not valid.
  • ERROR_INVALID_PARAMETER. Any of the parameter values was invalid.

Remarks

For an overview of the use of the string functions, see Strings.

The values of the lpSrcStr and lpCharType parameters must not be the same. If they are the same, the function fails with ERROR_INVALID_PARAMETER.

The Locale parameter used by the corresponding GetStringTypeA function is not used by this function. Because of the parameter difference, an application cannot automatically invoke the proper ANSI or Unicode version of a GetStringType* function through the use of the #define UNICODE switch. An application can circumvent this limitation by using GetStringTypeEx, which is the recommended function.

Supported Character Types

The character type bits are divided into several levels. The information for one level can be retrieved by a single call to this function. Each level is limited to 16 bits of information so that the other mapping functions, which are limited to 16 bits of representation per character, can also return character type information.

Ctype 1

These types support ANSI C and POSIX (LC_CTYPE) character typing functions. A bitwise-OR of these values is retrieved in the array in the output buffer when dwInfoType is set to CT_CTYPE1. For DBCS locales, the type attributes apply to both narrow characters and wide characters. The Japanese hiragana and katakana characters, and the kanji ideograph characters all have the C1_ALPHA attribute.

Name         Value    Meaning
C1_UPPER     0x0001   Uppercase
C1_LOWER     0x0002   Lowercase
C1_DIGIT     0x0004   Decimal digits
C1_SPACE     0x0008   Space characters
C1_PUNCT     0x0010   Punctuation
C1_CNTRL     0x0020   Control characters
C1_BLANK     0x0040   Blank characters
C1_XDIGIT    0x0080   Hexadecimal digits
C1_ALPHA     0x0100   Any linguistic character: alphabetical, syllabary, or ideographic
C1_DEFINED   0x0200   A defined character, but not one of the other C1_* types
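Because the Ctype 1 values are bit flags, one output word can carry several of them at once. The following C sketch decodes a word into flag names. The helper c1_describe is hypothetical, and the constants are copied from the table above and defined locally only when the Windows headers have not already defined them.

```c
#include <stddef.h>
#include <string.h>

/* Values from the Ctype 1 table; skipped if windows.h already defines them. */
#ifndef C1_UPPER
#define C1_UPPER   0x0001
#define C1_LOWER   0x0002
#define C1_DIGIT   0x0004
#define C1_SPACE   0x0008
#define C1_PUNCT   0x0010
#define C1_CNTRL   0x0020
#define C1_BLANK   0x0040
#define C1_XDIGIT  0x0080
#define C1_ALPHA   0x0100
#define C1_DEFINED 0x0200
#endif

/* Hypothetical helper: writes a '|'-separated list of the C1_* flags set in
   `type` into buf (names are shown without the C1_ prefix). buflen must be
   at least 1; output is truncated if it does not fit. */
static void c1_describe(unsigned short type, char *buf, size_t buflen)
{
    static const struct { unsigned short bit; const char *name; } names[] = {
        { C1_UPPER, "UPPER" },   { C1_LOWER, "LOWER" },
        { C1_DIGIT, "DIGIT" },   { C1_SPACE, "SPACE" },
        { C1_PUNCT, "PUNCT" },   { C1_CNTRL, "CNTRL" },
        { C1_BLANK, "BLANK" },   { C1_XDIGIT, "XDIGIT" },
        { C1_ALPHA, "ALPHA" },   { C1_DEFINED, "DEFINED" }
    };
    size_t i;

    buf[0] = '\0';
    for (i = 0; i < sizeof names / sizeof names[0]; ++i) {
        if (type & names[i].bit) {
            if (buf[0] != '\0')
                strncat(buf, "|", buflen - strlen(buf) - 1);
            strncat(buf, names[i].name, buflen - strlen(buf) - 1);
        }
    }
}
```

Each element of the array filled in by a CT_CTYPE1 call can be passed to a helper like this when logging or debugging classification results.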

The following character types are either constant or computable from basic types and do not need to be supported by this function.

Type           Description
Alphanumeric   Alphabetical characters and digits (C1_ALPHA and C1_DIGIT)
Printable      Graphic characters and blanks (all C1_* types except C1_CNTRL)
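The two derived types can be computed from a Ctype 1 word with simple mask tests. The sketch below is one reading of the table, not an official definition: it treats a character as alphanumeric when either C1_ALPHA or C1_DIGIT is set, and as printable when at least one C1_* bit is set and C1_CNTRL is not. The helper names are hypothetical.

```c
#include <stdbool.h>

/* Values from the Ctype 1 table; skipped if windows.h already defines them. */
#ifndef C1_UPPER
#define C1_UPPER   0x0001
#define C1_LOWER   0x0002
#define C1_DIGIT   0x0004
#define C1_SPACE   0x0008
#define C1_PUNCT   0x0010
#define C1_CNTRL   0x0020
#define C1_BLANK   0x0040
#define C1_XDIGIT  0x0080
#define C1_ALPHA   0x0100
#define C1_DEFINED 0x0200
#endif

/* Alphanumeric: the character is a letter or a decimal digit. */
static bool c1_is_alnum(unsigned short type)
{
    return (type & (C1_ALPHA | C1_DIGIT)) != 0;
}

/* Printable (assumed interpretation): some C1_* bit is set and the
   character is not flagged as a control character. */
static bool c1_is_printable(unsigned short type)
{
    return type != 0 && (type & C1_CNTRL) == 0;
}
```

Under this interpretation, a character that carries both C1_CNTRL and C1_SPACE is reported as not printable; an application that wants a different rule would adjust the second test.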

Ctype 2

These types support proper layout of Unicode text. For DBCS locales, the character type applies to both narrow and wide characters. The direction attributes are assigned so that the bidirectional layout algorithm standardized by Unicode produces accurate results. These types are mutually exclusive. For more information about the use of these attributes, see The Unicode Standard.

Name                  Value    Meaning
Strong
C2_LEFTTORIGHT        0x0001   Left to right
C2_RIGHTTOLEFT        0x0002   Right to left
Weak
C2_EUROPENUMBER       0x0003   European number, European digit
C2_EUROPESEPARATOR    0x0004   European numeric separator
C2_EUROPETERMINATOR   0x0005   European numeric terminator
C2_ARABICNUMBER       0x0006   Arabic number
C2_COMMONSEPARATOR    0x0007   Common numeric separator
Neutral
C2_BLOCKSEPARATOR     0x0008   Block separator
C2_SEGMENTSEPARATOR   0x0009   Segment separator
C2_WHITESPACE         0x000A   White space
C2_OTHERNEUTRAL       0x000B   Other neutrals
Not applicable
C2_NOTAPPLICABLE      0x0000   No implicit directionality (for example, control codes)
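Unlike the Ctype 1 values, the Ctype 2 values are mutually exclusive codes rather than bit flags, so they should be compared for equality instead of tested with a bitwise AND. For example, C2_EUROPENUMBER (0x0003) shares a bit with C2_LEFTTORIGHT (0x0001), so a mask test would misclassify it. The C sketch below groups a Ctype 2 value into the Strong, Weak, and Neutral categories of the table; the bidi_strength enum and c2_strength helper are hypothetical names.

```c
/* Values from the Ctype 2 table; skipped if windows.h already defines them. */
#ifndef C2_LEFTTORIGHT
#define C2_LEFTTORIGHT      0x0001
#define C2_RIGHTTOLEFT      0x0002
#define C2_EUROPENUMBER     0x0003
#define C2_EUROPESEPARATOR  0x0004
#define C2_EUROPETERMINATOR 0x0005
#define C2_ARABICNUMBER     0x0006
#define C2_COMMONSEPARATOR  0x0007
#define C2_BLOCKSEPARATOR   0x0008
#define C2_SEGMENTSEPARATOR 0x0009
#define C2_WHITESPACE       0x000A
#define C2_OTHERNEUTRAL     0x000B
#define C2_NOTAPPLICABLE    0x0000
#endif

/* Hypothetical grouping that mirrors the headings of the Ctype 2 table. */
typedef enum { BIDI_NONE, BIDI_STRONG, BIDI_WEAK, BIDI_NEUTRAL } bidi_strength;

static bidi_strength c2_strength(unsigned short type)
{
    switch (type) {               /* equality comparison, not a mask test */
    case C2_LEFTTORIGHT:
    case C2_RIGHTTOLEFT:
        return BIDI_STRONG;
    case C2_EUROPENUMBER:
    case C2_EUROPESEPARATOR:
    case C2_EUROPETERMINATOR:
    case C2_ARABICNUMBER:
    case C2_COMMONSEPARATOR:
        return BIDI_WEAK;
    case C2_BLOCKSEPARATOR:
    case C2_SEGMENTSEPARATOR:
    case C2_WHITESPACE:
    case C2_OTHERNEUTRAL:
        return BIDI_NEUTRAL;
    default:                      /* C2_NOTAPPLICABLE or an unknown value */
        return BIDI_NONE;
    }
}
```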

Ctype 3

These types are intended to be placeholders for extensions to the POSIX types required for general text processing or for the standard C library functions. A bitwise-OR of these values is retrieved when dwInfoType is set to CT_CTYPE3. For DBCS locales, the Ctype 3 attributes apply to both narrow characters and wide characters. The Japanese hiragana and katakana characters, and the kanji ideograph characters all have the C3_ALPHA attribute.

Name               Value    Meaning
C3_NONSPACING      0x0001   Nonspacing mark
C3_DIACRITIC       0x0002   Diacritic nonspacing mark
C3_VOWELMARK       0x0004   Vowel nonspacing mark
C3_SYMBOL          0x0008   Symbol
C3_KATAKANA        0x0010   Katakana character
C3_HIRAGANA        0x0020   Hiragana character
C3_HALFWIDTH       0x0040   Half-width (narrow) character
C3_FULLWIDTH       0x0080   Full-width (wide) character
C3_IDEOGRAPH       0x0100   Ideographic character
C3_KASHIDA         0x0200   Arabic kashida character
C3_LEXICAL         0x0400   Punctuation which is counted as part of the word (kashida, hyphen, feminine/masculine ordinal indicators, equal sign, and so forth)
C3_ALPHA           0x8000   All linguistic characters (alphabetical, syllabary, and ideographic)
C3_HIGHSURROGATE   0x0800   Windows Vista: High surrogate code unit
C3_LOWSURROGATE    0x1000   Windows Vista: Low surrogate code unit
Not applicable
C3_NOTAPPLICABLE   0x0000   Not applicable
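The Ctype 3 values are bit flags and can be combined, so a typical check masks the output word against one or more of them. The C sketch below shows three such tests on a word retrieved with CT_CTYPE3. The helper names are hypothetical, and only the constants the helpers need are copied from the table above.

```c
#include <stdbool.h>

/* Values from the Ctype 3 table; skipped if windows.h already defines them. */
#ifndef C3_KATAKANA
#define C3_KATAKANA      0x0010
#define C3_HIRAGANA      0x0020
#define C3_HALFWIDTH     0x0040
#define C3_FULLWIDTH     0x0080
#define C3_HIGHSURROGATE 0x0800
#define C3_LOWSURROGATE  0x1000
#endif

/* True if the character is hiragana or katakana (either flag is enough). */
static bool c3_is_kana(unsigned short type)
{
    return (type & (C3_KATAKANA | C3_HIRAGANA)) != 0;
}

/* True only if both flags are set: a half-width katakana character. */
static bool c3_is_halfwidth_katakana(unsigned short type)
{
    const unsigned short both = C3_KATAKANA | C3_HALFWIDTH;
    return (type & both) == both;
}

/* True if the code unit is flagged as either half of a surrogate pair. */
static bool c3_is_surrogate(unsigned short type)
{
    return (type & (C3_HIGHSURROGATE | C3_LOWSURROGATE)) != 0;
}
```

The difference between the first two helpers is the usual "any of" versus "all of" mask idiom: comparing the masked value against zero accepts any one flag, while comparing it against the mask itself requires every flag.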

C3_HIGHSURROGATE and C3_LOWSURROGATE are listed only for completeness, and should never be provided to this function. They are relevant only for Unicode.

Starting with Windows 8: GetStringTypeW is declared in Stringapiset.h. Before Windows 8, it was declared in Winnls.h.

Windows Phone 8: This API is supported.

Windows Phone 8.1: This API is supported.

Requirements

Minimum supported client

Windows 2000 Professional [desktop apps | Windows Store apps]

Minimum supported server

Windows 2000 Server [desktop apps | Windows Store apps]

Header

Stringapiset.h (include Windows.h)

Library

Kernel32.lib

DLL

Kernel32.dll

See also

National Language Support
National Language Support Functions
GetStringTypeA
GetStringTypeEx

© 2014 Microsoft