Encode and Decode XML Element and Attribute Names and ID Values

Element names, attribute names, and ID values are limited to the range of characters permitted by the Extensible Markup Language (XML) 1.0 (Second Edition) Recommendation, located at www.w3.org/TR/2000/REC-xml-20001006.html. When a name contains characters outside that range, the encode and decode methods of the XmlConvert class translate it to and from a valid XML name.

Many languages and applications, such as Microsoft SQL Server and Microsoft Word, allow characters in their names that are not valid in XML names. For example, if Order Detail is a column heading in a database, the database allows the space between the words Order and Detail. In XML, however, a space is not a valid character in a name. The space must therefore be converted into an escaped hexadecimal encoding, which can be decoded later to recover the original name.

The XmlTextWriter class does not perform character checks on names by default. For example, the code writer.WriteElementString("Order Detail", "My order"); produces the invalid element <Order Detail>My order</Order Detail>.

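To produce valid XML, encode the element name before writing it: writer.WriteElementString(XmlConvert.EncodeName("Order Detail"), "My order"); produces the valid element <Order_x0020_Detail>My order</Order_x0020_Detail>. The space is replaced by the escape _x0020_, where 0020 is the hexadecimal Unicode code point of the space character.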
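
The round trip described above can be sketched as a small console program. This is a minimal illustration rather than a definitive implementation: the class name EncodeNameExample and the variable names are assumptions made for the example, while XmlConvert.EncodeName, XmlConvert.DecodeName, and XmlTextWriter are the framework members discussed in this topic.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical example class; the name is illustrative only.
class EncodeNameExample
{
    static void Main()
    {
        // A name that is legal as a database column heading but not as an XML name.
        string columnName = "Order Detail";

        // EncodeName replaces each character that is invalid in an XML name
        // with an escape of the form _xHHHH_.
        string encoded = XmlConvert.EncodeName(columnName);

        StringWriter output = new StringWriter();
        using (XmlTextWriter writer = new XmlTextWriter(output))
        {
            // Write the element using the encoded name instead of the raw name.
            writer.WriteElementString(encoded, "My order");
            writer.Flush();
            Console.WriteLine(output.ToString());
        }

        // DecodeName reverses the escaping and recovers the original name.
        string decoded = XmlConvert.DecodeName(encoded);
        Console.WriteLine(decoded);
    }
}
```

Under these assumptions, the first line written to the console is the element with the encoded name, and the second is the original column heading recovered by DecodeName. Encoding at the point of writing keeps the original name intact in the application while ensuring that only valid names reach the XML output.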

See Also

Conversion of XML Data Types