
2.2.1.1 Character Sequences

In all dialects prior to NT LAN Manager, all character sequences were encoded using the OEM character set (extended ASCII). The NT LAN Manager dialect introduced support for Unicode, which is negotiated during protocol negotiation and session setup. The use of Unicode characters is indicated on a per-message basis by setting the SMB_FLAGS2_UNICODE flag in the SMB_Header.Flags2 field. All Unicode characters MUST be in UTF-16LE encoding.

In CIFS, character sequences are transmitted over the wire as arrays of either UCHAR (for OEM characters) or WCHAR (for Unicode characters). Throughout this document, null-terminated character sequence fields that may be encoded in either Unicode or OEM characters (depending on the result of Unicode capability negotiation) are labeled as SMB_STRING fields.
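As an illustration of how a receiver might interpret an SMB_STRING field, the following Python sketch selects the decoding based on the SMB_FLAGS2_UNICODE bit of SMB_Header.Flags2. The function name decode_smb_string and the choice of code page 437 as the OEM character set are assumptions made for this example only; the actual OEM code page depends on the systems involved.

```python
SMB_FLAGS2_UNICODE = 0x8000  # Unicode bit in SMB_Header.Flags2


def decode_smb_string(data: bytes, flags2: int, oem_codec: str = "cp437") -> str:
    """Decode a null-terminated SMB_STRING that starts at data[0].

    Hypothetical helper: oem_codec="cp437" is an assumed OEM code page.
    """
    if flags2 & SMB_FLAGS2_UNICODE:
        # WCHAR array: scan 16-bit code units for the two-byte null terminator.
        end = len(data) - (len(data) % 2)
        for i in range(0, end, 2):
            if data[i:i + 2] == b"\x00\x00":
                end = i
                break
        return data[:end].decode("utf-16-le")

    # UCHAR array: one byte per character, single null byte terminator.
    end = data.find(b"\x00")
    if end == -1:
        end = len(data)
    return data[:end].decode(oem_codec)
```

The Unicode branch steps through the buffer two bytes at a time so that a zero byte inside a UTF-16LE code unit (such as the high byte of an ASCII-range character) is not mistaken for the terminator.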

Unless otherwise noted, when a Unicode string is passed it MUST be aligned to a 16-bit boundary with respect to the beginning of the SMB Header (section 2.2.3.1). In the case where the string does not naturally fall on a 16-bit boundary, a null padding byte MUST be inserted, and the string MUST begin at the next address. For Core Protocol messages in which a buffer format byte precedes a Unicode string, the padding byte is found after the buffer format byte.
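The alignment rule can be sketched in Python as follows. This is a minimal example, assuming the message buffer begins at the first byte of the SMB Header so that its current length equals the offset at which the next field would start; the helper name append_unicode_string is hypothetical.

```python
def append_unicode_string(message: bytearray, text: str) -> int:
    """Append a null-terminated UTF-16LE string to message.

    Assumes message[0] is the first byte of the SMB Header, so
    len(message) is the offset of the next field. Returns the offset
    at which the string begins.
    """
    if len(message) % 2:
        # Odd offset: insert one null padding byte to reach a 16-bit boundary.
        message.append(0)
    start = len(message)
    message += text.encode("utf-16-le") + b"\x00\x00"
    return start
```

For a Core Protocol message, a caller would append the buffer format byte to the message before calling this helper, so that any padding byte lands after the buffer format byte, as described above.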

String fields that restrict character encoding to OEM characters only, even if Unicode support has been negotiated, are labeled as OEM_STRING. Some examples of strings that are never passed in Unicode are:

 