2.1.5 Character Set Standards

The HTML5 standard [HTML5] defines its character sets by reference to ISO/IEC 10646:2003, Information technology -- Universal Multiple-Octet Coded Character Set (UCS) (see [MS-ISO10646]). All versions of Windows Internet Explorer also support ISO/IEC 8859-1 and other parts of ISO/IEC 8859, Information technology -- 8-bit single-byte coded graphic character sets (see [MS-ISO8859]). In general, strings are handled internally as UTF-16.
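
The relationship between a single-byte character set and UTF-16 string handling can be illustrated with a minimal sketch. It uses Python's built-in codecs as a stand-in for a decoder and is not a description of how Internet Explorer is implemented; the sample text and variable names are chosen only for illustration.

```python
# Illustrative only: Python's codecs stand in for a browser's decoder.
raw = b"caf\xe9"                   # "café" encoded as ISO-8859-1, one byte per character
text = raw.decode("iso-8859-1")    # decode the bytes to Unicode code points
utf16 = text.encode("utf-16-le")   # re-encode as little-endian UTF-16 code units, no BOM
```

Each ISO-8859-1 byte maps to the Unicode code point with the same value, so every character in this sample becomes one 16-bit code unit.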

Character set values are supplied to HTML using either the HTTP Content-Type header or the META element. The following example uses the META element to specify the ISO-8859-1 character set (Latin alphabet No. 1):

 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

The following example specifies the same character set in an XML declaration, which uses the encoding pseudo-attribute:

 <?xml version="1.0" encoding="iso-8859-1"?>

For more information, see [MSDN-EncodeXMLData].
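
For the Content-Type header case, the following minimal sketch shows one way a charset value might be read from a header value. It relies on Python's standard email.message module. The helper name charset_from_content_type is hypothetical, and the sketch does not describe how Internet Explorer parses the header.

```python
from email.message import Message


def charset_from_content_type(value, default=None):
    """Return the lowercased charset parameter of a Content-Type value.

    Hypothetical helper for illustration; returns `default` when the
    header value carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() extracts the charset parameter and lowercases it.
    return msg.get_content_charset(default)


latin1 = charset_from_content_type("text/html; charset=ISO-8859-1")
missing = charset_from_content_type("text/html")
```

Here latin1 is "iso-8859-1" and missing is None, because the second header value supplies no charset parameter.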