Supplementary Characters

The nchar and nvarchar data types store each character as a 16-bit value in an encoding called UCS-2. This encoding, defined by versions of Unicode prior to 1996, supports characters in the range U+0000 to U+FFFF. Newer versions of Unicode define additional characters in the range U+10000 to U+10FFFF, called supplementary characters. In the UTF-16 encoding, each supplementary character is stored as a pair of 16-bit values called a surrogate pair. All new _100 level collations support linguistic sorting with supplementary characters.
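The surrogate-pair arithmetic defined by UTF-16 can be sketched in a few lines. The example below is written in Python purely as an illustration of the encoding; it is not SQL Server code, and the function name to_surrogate_pair is hypothetical.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point into its UTF-16 surrogate pair.

    Illustrative helper (hypothetical name), following the UTF-16 rules:
    subtract 0x10000, then spread the remaining 20 bits over two 16-bit units.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary character")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits -> low (trail) surrogate
    return high, low


# U+20000 is the first code point of the CJK Extension B block.
high, low = to_surrogate_pair(0x20000)
print(hex(high), hex(low))
```

Each half of the pair falls in a range (U+D800 to U+DBFF, U+DC00 to U+DFFF) that is reserved for surrogates, which is why software that treats every 16-bit value as a whole character sees two "characters" instead of one.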

If you use supplementary characters, consider the following limitations:

  • Supplementary characters can be used in ordering and comparison operations only in collation versions 90 or later.

  • Because supplementary characters are stored as two 16-bit values, the LEN() function returns the value 2 for each supplementary character that is contained in the argument string. Similarly, the CHARINDEX and PATINDEX functions count each 16-bit value separately, so the positions they report for text that follows a supplementary character do not match the number of characters a reader sees.

  • The LEFT, RIGHT, SUBSTRING, STUFF, and REVERSE functions may split a surrogate pair, leaving an unpaired surrogate in the result and leading to unexpected results.

  • Supplementary characters are not supported for use with the underscore (_), percent (%), and caret (^) wildcard characters.

  • Supplementary characters are not supported for use in metadata, such as in names of database objects.
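The length and splitting limitations above follow directly from operating on 16-bit units rather than on whole characters. The following Python sketch mimics that behavior on a list of UTF-16 code units; it is an illustration under that assumption, not a reproduction of SQL Server's functions, and the helper names utf16_units and has_unpaired_surrogate are hypothetical.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def has_unpaired_surrogate(units):
    """Return True if any surrogate in units lacks its partner."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed immediately by a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return True
        if 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate with no preceding high surrogate.
            return True
        i += 1
    return False


sample = "a\U00020000b"            # 'a', one supplementary character, 'b'
units = utf16_units(sample)

unit_count = len(units)            # counts 16-bit units, as a unit-based length does
left_two = units[:2]               # unit-based "left 2" cuts the pair in half
reversed_units = units[::-1]       # unit-based reversal swaps the pair's halves

print(unit_count, has_unpaired_surrogate(left_two), has_unpaired_surrogate(reversed_units))
```

The sample holds three characters but four 16-bit units, so any operation that counts, slices, or reverses by unit can report a longer length or leave half of a pair behind.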

© 2014 Microsoft. All rights reserved.