NormalizeString function
Normalizes characters of a text string according to Unicode 4.0 TR#15. For more information, see Using Unicode Normalization to Represent Strings.
Syntax
int NormalizeString(
  _In_      NORM_FORM NormForm,
  _In_      LPCWSTR   lpSrcString,
  _In_      int       cwSrcLength,
  _Out_opt_ LPWSTR    lpDstString,
  _In_      int       cwDstLength
);
Parameters
- NormForm [in]
Normalization form to use. NORM_FORM specifies the standard Unicode normalization forms.
- lpSrcString [in]
Pointer to the non-normalized source string.
- cwSrcLength [in]
Length, in characters, of the buffer containing the source string. The application can set this parameter to -1 if the function should assume the string to be null-terminated and calculate the length automatically.
- lpDstString [out, optional]
Pointer to a buffer that receives the normalized destination string. This parameter can be NULL if cwDstLength is set to 0.
Note The function does not null-terminate the output string if the input string length is explicitly specified and does not include a terminating null character. To null-terminate the output string, the application should specify -1 for cwSrcLength or explicitly count the terminating null character in the input string length.
- cwDstLength [in]
Length, in characters, of the buffer containing the destination string. Alternatively, the application can set this parameter to 0 to request the function to return the required size for the destination buffer.
Return value
Returns the length of the normalized string in the destination buffer. If cwDstLength is set to 0, the function returns the estimated buffer length required to do the actual conversion.
If the string in the input buffer is null-terminated or if cwSrcLength is -1, the string written to the destination buffer is null-terminated and the returned string length includes the terminating null character.
The function returns a value that is less than or equal to 0 if it does not succeed. To get extended error information, the application can call GetLastError, which can return one of the following error codes:
- ERROR_INSUFFICIENT_BUFFER. The supplied buffer was not large enough, or the buffer pointer was incorrectly set to NULL.
- ERROR_INVALID_PARAMETER. Any of the parameter values was invalid.
- ERROR_NO_UNICODE_TRANSLATION. Invalid Unicode was found in a string. The return value is the negative of the index of the location of the error in the input string.
- ERROR_SUCCESS. The action completed successfully but yielded no results.
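The return value and the GetLastError code have to be read together, because a value less than or equal to 0 can mean several different things. The following is a minimal sketch of one way to separate those cases. The NormalizeOutcome type and the ClassifyNormalizeResult helper are hypothetical names introduced for illustration and are not part of the API. The sketch assumes, based on the buffer length example in the Remarks section, that the negated return value accompanying ERROR_INSUFFICIENT_BUFFER can be used as a new size guess.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Values as defined in winerror.h, repeated only so the sketch compiles off Windows. */
#define ERROR_SUCCESS                0L
#define ERROR_INVALID_PARAMETER      87L
#define ERROR_INSUFFICIENT_BUFFER    122L
#define ERROR_NO_UNICODE_TRANSLATION 1113L
#endif

/* Hypothetical classification of a NormalizeString call; not part of the API. */
typedef enum {
    NORMALIZE_OK,         /* detail = length of the normalized string          */
    NORMALIZE_EMPTY,      /* succeeded but yielded no results                  */
    NORMALIZE_RETRY,      /* detail = new buffer size guess, in characters     */
    NORMALIZE_BAD_INPUT,  /* detail = index of the invalid Unicode in the input */
    NORMALIZE_FAILED      /* invalid parameter or any other error              */
} NormalizeOutcome;

/* ret is the value returned by NormalizeString; lastError is the value that
   GetLastError returned immediately after the call. */
static NormalizeOutcome ClassifyNormalizeResult(int ret, unsigned long lastError, int *detail)
{
    if (ret > 0) {
        *detail = ret;
        return NORMALIZE_OK;
    }

    switch (lastError) {
    case ERROR_SUCCESS:
        *detail = 0;
        return NORMALIZE_EMPTY;
    case ERROR_INSUFFICIENT_BUFFER:
        *detail = -ret;   /* assumption: negated return value is the new guess */
        return NORMALIZE_RETRY;
    case ERROR_NO_UNICODE_TRANSLATION:
        *detail = -ret;   /* negative of the error index in the input string */
        return NORMALIZE_BAD_INPUT;
    default:
        *detail = 0;
        return NORMALIZE_FAILED;
    }
}
```

A caller would typically capture GetLastError right after NormalizeString returns, before making any other API call that could overwrite the error code, and then pass both values to the helper.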
Remarks
Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. The Unicode standard defines a process called normalization that returns one binary representation when given any of the equivalent binary representations of a character. Normalization can be performed with several algorithms, called normalization forms, that obey different rules, as described in Using Unicode Normalization to Represent Strings. Win32 and the .NET Framework currently support normalization forms C, D, KC, and KD, as defined in Unicode Standard Annex #15: Unicode Normalization Forms. Normalized strings are typically evaluated with an ordinal comparison.
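As an illustration of equivalent binary representations, the letter "é" can be stored either as the single code point U+00E9 or as "e" (U+0065) followed by the combining acute accent (U+0301). The following sketch shows that an ordinal comparison treats the two forms as different until both are normalized to the same form. The names kComposed, kDecomposed, OrdinalEqual, and EqualAfterNormalizationC are hypothetical and exist only for this example. The fixed 64-character buffers are an assumption that holds for short strings; production code should size the buffers from the length estimate instead.

```c
#include <stddef.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Two representations of the same character. */
static const wchar_t kComposed[]   = { 0x00E9, 0 };          /* precomposed e-acute   */
static const wchar_t kDecomposed[] = { 0x0065, 0x0301, 0 };  /* e + combining acute   */

/* Ordinal comparison: compares code units only, with no linguistic rules. */
static int OrdinalEqual(const wchar_t *a, const wchar_t *b)
{
    return wcscmp(a, b) == 0;
}

#ifdef _WIN32
/* Hypothetical helper: normalizes both strings to form C, then compares them
   ordinally. Returns 1 if equal, 0 if different, -1 if normalization failed. */
static int EqualAfterNormalizationC(LPCWSTR a, LPCWSTR b)
{
    WCHAR na[64];  /* assumed large enough for short demonstration strings */
    WCHAR nb[64];

    /* Passing -1 as the source length makes the output null-terminated. */
    if (NormalizeString(NormalizationC, a, -1, na, 64) <= 0) return -1;
    if (NormalizeString(NormalizationC, b, -1, nb, 64) <= 0) return -1;

    return OrdinalEqual(na, nb);
}
#endif
```

Before normalization, OrdinalEqual(kComposed, kDecomposed) reports a mismatch because the code unit sequences differ. On Windows, EqualAfterNormalizationC(kComposed, kDecomposed) is expected to report a match, since form C composes the two-code-point sequence into U+00E9.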
The following code demonstrates the use of the buffer length estimate:
const int maxIterations = 10;
LPWSTR strResult = NULL;
HANDLE hHeap = GetProcessHeap();

int iSizeEstimated = NormalizeString(form, strInput, -1, NULL, 0);
for (int i = 0; i < maxIterations; i++)
{
    if (strResult)
        HeapFree(hHeap, 0, strResult);
    strResult = (LPWSTR)HeapAlloc(hHeap, 0, iSizeEstimated * sizeof (WCHAR));
    iSizeEstimated = NormalizeString(form, strInput, -1, strResult, iSizeEstimated);

    if (iSizeEstimated > 0)
        break; // success

    if (iSizeEstimated <= 0)
    {
        DWORD dwError = GetLastError();
        if (dwError != ERROR_INSUFFICIENT_BUFFER) break; // Real error, not buffer error

        // New guess is negative of the return value.
        iSizeEstimated = -iSizeEstimated;
    }
}
Windows XP, Windows Server 2003: The required header file and DLL are part of the "Microsoft Internationalized Domain Name (IDN) Mitigation APIs" download, available at the MSDN Download Center.
Examples
An example showing the use of this function can be found in NLS: Unicode Normalization Sample.
Requirements
| Minimum supported client | Windows Vista [desktop apps \| Windows Store apps] |
|---|---|
| Minimum supported server | Windows Server 2008 [desktop apps \| Windows Store apps] |
| Redistributable | Microsoft Internationalized Domain Name (IDN) Mitigation APIs on Windows XP with SP2 and later, or Windows Server 2003 with SP1 |
| Header | |
| DLL | |
See also
- National Language Support
- National Language Support Functions
- Using Unicode Normalization to Represent Strings
- IsNormalizedString
- NORM_FORM