Managing Data Conversion Between a Unicode Server and a Non-Unicode Client
This topic describes how to preserve the integrity of character data when the server-side data storage is in Unicode, but the client-side application that interacts with the data uses a specific code page.
When non-Unicode data is sent from the client to be stored on the server in Unicode, data from any client with any code page can be stored correctly if one of the following conditions is true:
Character strings are sent to the server as parameters of a remote procedure call (RPC).
String constants are preceded with the capital letter N. This is required regardless of whether your client-side application is Unicode-aware. Without the N prefix, SQL Server will convert the string to the code page that corresponds to the default collation of the database. Any characters not found in this code page will be lost.
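The loss described for string constants without the N prefix can be illustrated with a minimal Python sketch. It only simulates the effect: Python's codecs stand in for the server-side conversion, the helper name `to_code_page` is hypothetical, and code page 1252 is assumed to be the code page of the database's default collation. The actual conversion in SQL Server may differ in detail, for example in which substitute character it chooses.

```python
# Simulation only: Python's codecs stand in for the server-side conversion.
# to_code_page is a hypothetical helper, not part of any SQL Server API.

def to_code_page(text: str, code_page: str) -> str:
    """Convert text to a code page and back, replacing unmappable characters with '?'."""
    return text.encode(code_page, errors="replace").decode(code_page)

literal = "café 日本語"

# Assumed here: the database default collation corresponds to code page 1252.
as_1252 = to_code_page(literal, "cp1252")

print(as_1252)  # café ???
```

The accented Latin character survives because it exists in code page 1252, while each Japanese character is replaced and cannot be recovered afterward.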
If the client application is not Unicode-enabled and retrieves the data into non-Unicode buffers, a client will only be able to retrieve or modify data that can be represented by the client machine's code page. This means that ASCII characters can always be retrieved, because the representation of ASCII characters is the same in all code pages, while any non-ASCII data depends on code-page-to-code-page conversion.
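The following Python sketch illustrates why ASCII data is always retrievable while other data depends on the client code page. It is an illustration under assumptions: the list of code pages is an arbitrary sample, the helper `survives` is hypothetical, and Python's codecs approximate the Windows code pages.

```python
# Illustration only: an arbitrary sample of code pages, approximated by Python codecs.
CODE_PAGES = ["cp1252", "cp932", "cp950", "cp1251"]

def survives(text: str, code_page: str) -> bool:
    """Return True if text can be represented in code_page without loss."""
    try:
        return text.encode(code_page).decode(code_page) == text
    except UnicodeEncodeError:
        return False

ascii_text = "Order 42"
japanese_text = "日本語"

# ASCII characters produce identical bytes in every sampled code page.
ascii_bytes = {cp: ascii_text.encode(cp) for cp in CODE_PAGES}
same_everywhere = len(set(ascii_bytes.values())) == 1

print(same_everywhere)                    # True
print(survives(japanese_text, "cp932"))   # True: Japanese code page
print(survives(japanese_text, "cp1252"))  # False: Western European code page
```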
For example, suppose you have an application that is currently running only in the United States (U.S.), but is deployed to Japan. Because the SQL Server database is Unicode-aware, both the English and Japanese text can be stored in the same tables, even though the application has not yet been modified to deal with text as Unicode. As long as the application complies with one of the two previous options, Japanese users can use the non-Unicode application to input and retrieve Japanese data, and U.S. users can input and retrieve English data. All data from both sets of users is stored intact in the same column of the database and represented as Unicode. In this situation, a Unicode-enabled reporting application that generates reports that span the complete data set can be deployed. However, English users cannot view the Japanese rows, because the application cannot display any characters that do not exist in their code page (1252).
This situation might be acceptable if the two groups of users do not have to view each other's records. If an application user must be able to view or modify records with text that cannot be represented by a single code page, there is no alternative but to modify the application so that it can use Unicode.
If the client-side program is Web-based or connects to an Active Server Pages (ASP) page, metadata must be specified on both the client-side HTML page and the server-side ASP page. This metadata determines how character strings are converted between the server, the ASP engine, and the client browser.
On the client-side HTML page, the META attribute must specify that the character set data should be converted to the encoding scheme of the client by specifying a CHARSET code. For example, the following HTML page instructs the client to convert character data to the 950 (Chinese Traditional) code page by specifying big5 as the CHARSET code. To see the character set codes for the META attribute, go to this Microsoft Web site.
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=big5">
<!-- -->
</HEAD>
<BODY>
<!-- body -->
</BODY>
</HTML>
On the server-side ASP page, you must instruct the ASP Web application what code page the client browser is using. You can specify the Session.CodePage property, or the @CodePage directive. These methods will handle the conversion of data from server to client and also both GET and POST client requests. In the following examples, both methods are used to specify conversion to and from the code page of the client, which is 950 (Chinese Traditional).
<%@ Language=VBScript codepage=950 %>
<% Session.CodePage=950 %>
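To show why the declared code page must match what the browser actually uses, here is a rough Python sketch of the conversion step. It is not ASP code and does not reproduce the ASP engine's behavior; the function `decode_request` is hypothetical, and Python's `cp950` codec is assumed to be a reasonable stand-in for code page 950.

```python
# Simulation only: decode_request is a hypothetical stand-in for the conversion
# that the ASP engine performs according to the declared code page.

def decode_request(raw: bytes, code_page: str) -> str:
    """Decode bytes received from the browser into Unicode text."""
    return raw.decode(code_page, errors="replace")

original = "中文資料"

# Bytes as a browser using code page 950 would send them (simulated).
posted = original.encode("cp950")

matched = decode_request(posted, "cp950")      # declared code page matches the client
mismatched = decode_request(posted, "cp1252")  # declared code page does not match

print(matched == original)     # True
print(mismatched == original)  # False
```

When the declared code page does not match, the bytes are still decoded, but into unrelated characters, so the text stored as Unicode on the server is already corrupted.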
Finally, remember to prefix any string literals with the capital letter N.