Character Set Recognition
Internet Explorer uses the character set specified for a document to determine how to translate the bytes in the document into characters on the screen or on paper. By default, Internet Explorer uses the character set specified in the HTTP content type returned by the server to determine this translation. If the content type does not include a charset parameter, Internet Explorer uses the character set specified by the meta element in the document. If no meta element specifies a character set, it uses the user's preferences.
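The lookup order described above (HTTP content type, then meta element, then user preference) can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the regular expressions are a simplification, not the parser Internet Explorer actually uses.

```python
import re

# Hypothetical helpers for illustration; the patterns are simplified
# and do not reproduce Internet Explorer's real parsing rules.
_CHARSET_PARAM = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:+\-]+)', re.IGNORECASE)
_META_TAG = re.compile(rb'<meta\b[^>]*>', re.IGNORECASE)
_HTTP_EQUIV = re.compile(r'http-equiv\s*=\s*["\']?content-type', re.IGNORECASE)


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not value:
        return None
    match = _CHARSET_PARAM.search(value)
    return match.group(1) if match else None


def charset_from_meta(html_bytes):
    """Return the charset named by the first Content-Type meta element, or None."""
    for tag in _META_TAG.findall(html_bytes):
        text = tag.decode("ascii", errors="replace")
        if _HTTP_EQUIV.search(text):
            found = charset_from_content_type(text)
            if found:
                return found
    return None


def resolve_charset(http_content_type, html_bytes, user_default):
    """Apply the documented order: HTTP header, then meta element, then user preference."""
    return (charset_from_content_type(http_content_type)
            or charset_from_meta(html_bytes)
            or user_default)
```

For example, `resolve_charset("text/html", page_bytes, "windows-1252")` falls through to the meta element when the header carries no charset parameter, and to the third argument when the page has no such meta element.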
You can use the meta element to explicitly set the character set for a document. In this case, set the HTTP-EQUIV attribute to Content-Type and specify a character set identifier in the CONTENT attribute. For example, the following meta element identifies windows-1251 as the character set for the document.
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=windows-1251">
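To see why the declared character set matters, the following Python sketch decodes the same six bytes twice: once as windows-1251, as declared in the meta element above, and once as windows-1252. It assumes Python's built-in codecs, which accept both labels; it says nothing about how Internet Explorer performs the conversion internally.

```python
# The same six bytes, interpreted under two different character sets.
raw = bytes([0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2])

# Decoded with the charset declared in the meta element.
as_cyrillic = raw.decode("windows-1251")

# Decoded with a Western European charset instead (an assumed wrong default).
as_western = raw.decode("windows-1252")

print(as_cyrillic)
print(as_western)
```

The first result is readable Cyrillic text, while the second is a run of accented Latin letters. The bytes are identical in both cases; only the character set used to interpret them differs.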
To apply a character set to an entire document, you must insert the meta element before the body element. For clarity, it should appear as the first element after head, so that all browsers can translate the meta element before the document is parsed. The meta element applies to the document containing it. This means, for example, that a compound document (a document consisting of two or more documents in a set of frames) can use different character sets in different frames.
The following table lists the character sets supported by Microsoft Internet Explorer 5. It has the following columns.
- Charset Friendly Name. Name used to refer to the character set.
- Preferred Charset Label. Most common identifier used to set character sets in Internet Explorer. For example, in the previous code sample the Charset Label is windows-1251. These identifiers are used for outbound data.
- Aliases. Other identifiers that can be used to set character sets. These identifiers are used for inbound data.
- IE Ver. Versions of Internet Explorer that support the listed character sets.
- Min OS. Minimum operating system that supports the listed character sets.
- Code Page. Code page that supports the listed character sets.
- Family Code Page. Indicates a Windows code page that is used to represent all or most of the characters in a charset.
Charsets in Internet Explorer 5
| Charset Friendly Name | Preferred Charset Label | Aliases | IE Ver | Min OS | Code Page | Family Code Page |
|---|---|---|---|---|---|---|
| Arabic (ASMO 708) | ASMO-708 | | IE5 | Win95 | 708 | 1256 |
| Arabic (DOS) | DOS-720 | | IE5 | Win95 | 720 | 1256 |
| Arabic (ISO) | iso-8859-6 | arabic, csISOLatinArabic, ECMA-114, ISO_8859-6, ISO_8859-6:1987, iso-ir-127 | IE5, IE4 | Win95 | 28596 | 1256 |
| Arabic (Mac) | x-mac-arabic | | IE5 | Win2000 | 10004 | 1256 |
| Arabic (Windows) | windows-1256 | cp1256 | IE5 | Win95 | 1256 | 1256 |
| Baltic (DOS) | ibm775 | CP500 | IE5 | Win2000 | 775 | 1257 |
| Baltic (ISO) | iso-8859-4 | csISOLatin4, ISO_8859-4, ISO_8859-4:1988, iso-ir-110, l4, latin4 | IE5 | Win95 | 28594 | 1257 |
| Baltic (Windows) | windows-1257 | | IE5 | Win95 | 1257 | 1257 |
| Central European (DOS) | ibm852 | cp852 | IE5, IE4 | Win95 | 852 | 1250 |
| Central European (ISO) | iso-8859-2 | csISOLatin2, iso_8859-2, iso_8859-2:1987, iso8859-2, iso-ir-101, l2, latin2 | IE5, IE4 | Win95 | 28592 | 1250 |
| Central European (Mac) | x-mac-ce | | IE5 | Win2000 | 10029 | 1250 |
| Central European (Windows) | windows-1250 | x-cp1250 | IE5 | Win95 | 1250 | 1250 |
| Chinese Simplified (EUC) | EUC-CN | x-euc-cn | IE5 | Win2000 | 51936 | 936 |
| Chinese Simplified (GB2312) | gb2312 | chinese, CN-GB, csGB2312, csGB231280, csISO58GB231280, GB_2312-80, GB231280, GB2312-80, GBK, iso-ir-58 | IE5, IE4 | Win95 | 936 | 936 |
| Chinese Simplified (HZ) | hz-gb-2312 | | IE5, IE4 | Win95 | 52936 | 936 |
| Chinese Simplified (Mac) | x-mac-chinesesimp | | IE5 | Win2000 | 10008 | 936 |
| Chinese Traditional (Big5) | big5 | cn-big5, csbig5, x-x-big5 | IE5, IE4 | Win95 | 950 | 950 |
| Chinese Traditional (CNS) | x-Chinese-CNS | | IE5 | Win2000 | 20000 | 950 |
| Chinese Traditional (Eten) | x-Chinese-Eten | | IE5 | Win2000 | 20002 | 950 |
| Chinese Traditional (Mac) | x-mac-chinesetrad | | IE5 | Win2000 | 10002 | 950 |
| Cyrillic (DOS) | cp866 | ibm866 | IE5, IE4 | Win95 | 866 | 1251 |
| Cyrillic (ISO) | iso-8859-5 | csISOLatin5, csISOLatinCyrillic, cyrillic, ISO_8859-5, ISO_8859-5:1988, iso-ir-144, l5 | IE5, IE4 | Win95 | 28595 | 1251 |
| Cyrillic (KOI8-R) | koi8-r | csKOI8R, koi, koi8, koi8r | IE5, IE4 | Win95 | 20866 | 1251 |
| Cyrillic (KOI8-U) | koi8-u | koi8-ru | IE5 | Win95 | 21866 | 1251 |
| Cyrillic (Mac) | x-mac-cyrillic | | IE5 | Win2000 | 10007 | 1251 |
| Cyrillic (Windows) | windows-1251 | x-cp1251 | IE5 | Win95 | 1251 | 1251 |
| Europa | x-Europa | | IE5 | n.a. | 29001 | 1252 |
| German (IA5) | x-IA5-German | | IE5 | Win2000 | 20106 | 1252 |
| Greek (DOS) | ibm737 | | IE5 | Win2000 | 737 | 1253 |
| Greek (ISO) | iso-8859-7 | csISOLatinGreek, ECMA-118, ELOT_928, greek, greek8, ISO_8859-7, ISO_8859-7:1987, iso-ir-126 | IE5, IE4 | Win95 | 28597 | 1253 |
| Greek (Mac) | x-mac-greek | | IE5 | Win2000 | 10006 | 1253 |
| Greek (Windows) | windows-1253 | | IE5 | Win95 | 1253 | 1253 |
| Greek, Modern (DOS) | ibm869 | | IE5 | Win2000 | 869 | 1253 |
| Hebrew (DOS) | DOS-862 | | IE5 | Win95 | 862 | 1255 |
| Hebrew (ISO-Logical) | iso-8859-8-i | logical | IE5, IE4 | Win95 | 38598 | 1255 |
| Hebrew (ISO-Visual) | iso-8859-8 | csISOLatinHebrew, hebrew, ISO_8859-8, ISO_8859-8:1988, ISO-8859-8, iso-ir-138, visual | IE5, IE4 | Win95 | 28598 | 1255 |
| Hebrew (Mac) | x-mac-hebrew | | IE5 | Win2000 | 10005 | 1255 |
| Hebrew (Windows) | windows-1255 | ISO_8859-8-I, ISO-8859-8, visual | IE5 | Win95 | 1255 | 1255 |
| IBM EBCDIC (Arabic) | x-EBCDIC-Arabic | | IE5 | Win2000 | 20420 | 1256 |
| IBM EBCDIC (Cyrillic Russian) | x-EBCDIC-CyrillicRussian | | IE5 | Win2000 | 20880 | 1251 |
| IBM EBCDIC (Cyrillic Serbian-Bulgarian) | x-EBCDIC-CyrillicSerbianBulgarian | | IE5 | Win2000 | 21025 | 1251 |
| IBM EBCDIC (Denmark-Norway) | x-EBCDIC-DenmarkNorway | | IE5 | Win2000 | 20277 | 1252 |
| IBM EBCDIC (Denmark-Norway-Euro) | x-ebcdic-denmarknorway-euro | | IE5 | Win2000 | 1142 | 1252 |
| IBM EBCDIC (Finland-Sweden) | x-EBCDIC-FinlandSweden | | IE5 | Win2000 | 20278 | 1252 |
| IBM EBCDIC (Finland-Sweden-Euro) | x-ebcdic-finlandsweden-euro | | IE5 | Win2000 | 1143 | 1252 |
| IBM EBCDIC (Finland-Sweden-Euro) | x-ebcdic-finlandsweden-euro | X-EBCDIC-France | IE5 | Win2000 | 1143 | 1252 |
| IBM EBCDIC (France-Euro) | x-ebcdic-france-euro | | IE5 | Win2000 | 1147 | 1252 |
| IBM EBCDIC (Germany) | x-EBCDIC-Germany | | IE5 | Win2000 | 20273 | 1252 |
| IBM EBCDIC (Germany-Euro) | x-ebcdic-germany-euro | | IE5 | Win2000 | 1141 | 1252 |
| IBM EBCDIC (Greek Modern) | x-EBCDIC-GreekModern | | IE5 | Win2000 | 875 | 1253 |
| IBM EBCDIC (Greek) | x-EBCDIC-Greek | | IE5 | Win2000 | 20423 | 1253 |
| IBM EBCDIC (Hebrew) | x-EBCDIC-Hebrew | | IE5 | Win2000 | 20424 | 1255 |
| IBM EBCDIC (Icelandic) | x-EBCDIC-Icelandic | | IE5 | Win2000 | 20871 | 1252 |
| IBM EBCDIC (Icelandic-Euro) | x-ebcdic-icelandic-euro | | IE5 | Win2000 | 1149 | 1252 |
| IBM EBCDIC (International-Euro) | x-ebcdic-international-euro | | IE5 | Win2000 | 1148 | 1252 |
| IBM EBCDIC (Italy) | x-EBCDIC-Italy | | IE5 | Win2000 | 20280 | 1252 |
| IBM EBCDIC (Italy-Euro) | x-ebcdic-italy-euro | | IE5 | Win2000 | 1144 | 1252 |
| IBM EBCDIC (Japanese and Japanese Katakana) | x-EBCDIC-JapaneseAndKana | | IE5 | Win2000 | 50930 | 932 |
| IBM EBCDIC (Japanese and Japanese-Latin) | x-EBCDIC-JapaneseAndJapaneseLatin | | IE5 | Win2000 | 50939 | 932 |
| IBM EBCDIC (Japanese and US-Canada) | x-EBCDIC-JapaneseAndUSCanada | | IE5 | Win2000 | 50931 | 932 |
| IBM EBCDIC (Japanese katakana) | x-EBCDIC-JapaneseKatakana | | IE5 | Win2000 | 20290 | 932 |
| IBM EBCDIC (Korean and Korean Extended) | x-EBCDIC-KoreanAndKoreanExtended | | IE5 | Win2000 | 50933 | 949 |
| IBM EBCDIC (Korean Extended) | x-EBCDIC-KoreanExtended | | IE5 | Win2000 | 20833 | 949 |
| IBM EBCDIC (Multilingual Latin-2) | CP870 | | IE5 | Win2000 | 870 | 1250 |
| IBM EBCDIC (Simplified Chinese) | x-EBCDIC-SimplifiedChinese | | IE5 | Win2000 | 50935 | 936 |
| IBM EBCDIC (Spain) | X-EBCDIC-Spain | | IE5 | Win2000 | 20284 | 1252 |
| IBM EBCDIC (Spain-Euro) | x-ebcdic-spain-euro | | IE5 | Win2000 | 1145 | 1252 |
| IBM EBCDIC (Thai) | x-EBCDIC-Thai | | IE5 | Win2000 | 20838 | 874 |
| IBM EBCDIC (Traditional Chinese) | x-EBCDIC-TraditionalChinese | | IE5 | Win2000 | 50937 | 950 |
| IBM EBCDIC (Turkish Latin-5) | CP1026 | | IE5 | Win2000 | 1026 | 1254 |
| IBM EBCDIC (Turkish) | x-EBCDIC-Turkish | | IE5 | Win2000 | 20905 | 1254 |
| IBM EBCDIC (UK) | x-EBCDIC-UK | | IE5 | Win2000 | 20285 | 1252 |
| IBM EBCDIC (UK-Euro) | x-ebcdic-uk-euro | | IE5 | Win2000 | 1146 | 1252 |
| IBM EBCDIC (US-Canada) | ebcdic-cp-us | | IE5 | Win2000 | 37 | 1252 |
| IBM EBCDIC (US-Canada-Euro) | x-ebcdic-cp-us-euro | | IE5 | Win2000 | 1140 | 1252 |
| Icelandic (DOS) | ibm861 | | IE5 | Win2000 | 861 | 1252 |
| Icelandic (Mac) | x-mac-icelandic | | IE5 | Win2000 | 10079 | 1252 |
| ISCII Assamese | x-iscii-as | | IE5 | Win2000 | 57006 | 57006 |
| ISCII Bengali | x-iscii-be | | IE5 | Win2000 | 57003 | 57003 |
| ISCII Devanagari | x-iscii-de | | IE5 | Win2000 | 57002 | 57002 |
| ISCII Gujarathi | x-iscii-gu | | IE5 | Win2000 | 57010 | 57010 |
| ISCII Kannada | x-iscii-ka | | IE5 | Win2000 | 57008 | 57008 |
| ISCII Malayalam | x-iscii-ma | | IE5 | Win2000 | 57009 | 57009 |
| ISCII Oriya | x-iscii-or | | IE5 | Win2000 | 57007 | 57007 |
| ISCII Panjabi | x-iscii-pa | | IE5 | Win2000 | 57011 | 57011 |
| ISCII Tamil | x-iscii-ta | | IE5 | Win2000 | 57004 | 57004 |
| ISCII Telugu | x-iscii-te | | IE5 | Win2000 | 57005 | 57005 |
| Japanese (EUC) | euc-jp | csEUCPkdFmtJapanese, Extended_UNIX_Code_Packed_Format_for_Japanese, x-euc, x-euc-jp | IE5, IE4 | Win95 | 51932 | 932 |
| Japanese (JIS) | iso-2022-jp | | IE5, IE4 | Win95 | 50220 | 932 |
| Japanese (JIS-Allow 1 byte Kana - SO/SI) | iso-2022-jp | _iso-2022-jp$SIO | IE5 | Win95 | 50222 | 932 |
| Japanese (JIS-Allow 1 byte Kana) | csISO2022JP | _iso-2022-jp | IE5 | Win95 | 50221 | 932 |
| Japanese (Mac) | x-mac-japanese | | IE5 | Win2000 | 10001 | 932 |
| Japanese (Shift-JIS) | shift_jis | csShiftJIS, csWindows31J, ms_Kanji, shift-jis, x-ms-cp932, x-sjis | IE5, IE4 | Win95 | 932 | 932 |
| Korean | ks_c_5601-1987 | csKSC56011987, euc-kr, iso-ir-149, korean, ks_c_5601, ks_c_5601_1987, ks_c_5601-1989, KSC_5601, KSC5601 | IE5 | Win95 | 949 | 949 |
| Korean (EUC) | euc-kr | csEUCKR | IE5 | Win95 | 51949 | 949 |
| Korean (ISO) | iso-2022-kr | csISO2022KR | IE5 | Win95 | 50225 | 949 |
| Korean (Johab) | Johab | | IE5 | Win2000 | 1361 | 1361 |
| Korean (Mac) | x-mac-korean | | IE5 | Win2000 | 10003 | 949 |
| Latin 3 (ISO) | iso-8859-3 | csISOLatin3, ISO_8859-3, ISO_8859-3:1988, iso-ir-109, l3, latin3 | IE5, IE4 | Win95 | 28593 | 1254 |
| Latin 9 (ISO) | iso-8859-15 | csISOLatin9, ISO_8859-15, l9, latin9 | IE5 | Win95 | 28605 | 1252 |
| Norwegian (IA5) | x-IA5-Norwegian | | IE5 | Win2000 | 20108 | 1252 |
| OEM United States | IBM437 | 437, cp437, csPC8, CodePage437 | IE5 | Win2000 | 437 | 1252 |
| Swedish (IA5) | x-IA5-Swedish | | IE5 | Win2000 | 20107 | 1252 |
| Thai (Windows) | windows-874 | DOS-874, iso-8859-11, TIS-620 | IE5, IE4 | Win95 | 874 | 874 |
| Turkish (DOS) | ibm857 | | IE5 | Win2000 | 857 | 1254 |
| Turkish (ISO) | iso-8859-9 | csISOLatin5, ISO_8859-9, ISO_8859-9:1989, iso-ir-148, l5, latin5 | IE5 | Win95 | 28599 | 1254 |
| Turkish (Mac) | x-mac-turkish | | IE5 | Win2000 | 10081 | 1254 |
| Turkish (Windows) | windows-1254 | ISO_8859-9, ISO_8859-9:1989, iso-8859-9, iso-ir-148, latin5 | IE5 | Win95 | 1254 | 1254 |
| Unicode | unicode | utf-16 | IE5, IE4 | Win95 | 1200 | 1200 |
| Unicode (Big-Endian) | unicodeFFFE | | IE5, IE4 | Win95 | 1201 | 1200 |
| Unicode (UTF-7) | utf-7 | csUnicode11UTF7, unicode-1-1-utf-7, x-unicode-2-0-utf-7 | IE5, IE4 | Win95 | 65000 | 1200 |
| Unicode (UTF-8) | utf-8 | unicode-1-1-utf-8, unicode-2-0-utf-8, x-unicode-2-0-utf-8 | IE5, IE4 | Win95 | 65001 | 1200 |
| US-ASCII | us-ascii | ANSI_X3.4-1968, ANSI_X3.4-1986, ascii, cp367, csASCII, IBM367, ISO_646.irv:1991, ISO646-US, iso-ir-6us | IE5 | Win95 | 20127 | 1252 |
| Vietnamese (Windows) | windows-1258 | | IE5, IE4 | Win95 | 1258 | 1258 |
| Western European (DOS) | ibm850 | | IE5 | Win2000 | 850 | 1252 |
| Western European (IA5) | x-IA5 | | IE5 | Win2000 | 20105 | 1252 |
| Western European (ISO) | iso-8859-1 | cp819, csISOLatin1, ibm819, iso_8859-1, iso_8859-1:1987, iso8859-1, iso-ir-100, l1, latin1 | IE5 | Win95 | 28591 | 1252 |
| Western European (Mac) | macintosh | | IE5 | Win2000 | 10000 | 1252 |
| Western European (Windows) | Windows-1252 | ANSI_X3.4-1968, ANSI_X3.4-1986, ascii, cp367, cp819, csASCII, IBM367, ibm819, ISO_646.irv:1991, iso_8859-1, iso_8859-1:1987, ISO646-US, iso8859-1, iso-8859-1, iso-ir-100, iso-ir-6, latin1, us, us-ascii, x-ansi | IE5 | Win95 | 1252 | 1252 |
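As an illustration of how the label, alias, and code page columns relate, the following Python sketch builds a lookup index from a few rows copied from the table above. The names `ROWS`, `build_index`, and `lookup` are hypothetical, and case-insensitive matching is an assumption of this sketch. Some aliases appear in more than one row of the table (for example, `iso-8859-9` is listed under both Turkish (ISO) and Turkish (Windows)), so a complete index would need a tie-breaking rule that this sketch does not attempt.

```python
# A few rows copied from the table:
# (preferred label, aliases, code page, family code page)
ROWS = [
    ("iso-8859-6",
     ["arabic", "csISOLatinArabic", "ECMA-114", "ISO_8859-6",
      "ISO_8859-6:1987", "iso-ir-127"],
     28596, 1256),
    ("koi8-r", ["csKOI8R", "koi", "koi8", "koi8r"], 20866, 1251),
    ("windows-1251", ["x-cp1251"], 1251, 1251),
    ("utf-8",
     ["unicode-1-1-utf-8", "unicode-2-0-utf-8", "x-unicode-2-0-utf-8"],
     65001, 1200),
]


def build_index(rows):
    """Map every label and alias (lowercased) to its row's key data."""
    index = {}
    for label, aliases, code_page, family in rows:
        for name in [label, *aliases]:
            index[name.lower()] = (label, code_page, family)
    return index


INDEX = build_index(ROWS)


def lookup(name):
    """Return (preferred label, code page, family code page), or None if unknown."""
    return INDEX.get(name.strip().lower())
```

With this index, an inbound alias such as `koi8` resolves to the preferred label `koi8-r`, which is the identifier the table designates for outbound data.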
Internal Charsets Not for General Use
The following character sets are not for general use, so do not use them to label documents.
| Charset Friendly Name | Preferred Charset Label | Aliases | IE Ver | Min OS | Code Page | Family Code Page |
|---|---|---|---|---|---|---|
| User Defined | x-user-defined | | IE5, IE4 | Win95 | 50000 | 50000 |
| Japanese (Auto-Select) | | | IE5, IE4 | Win95 | 50932 | 932 |
| Auto-Select | | | IE5 | Win95 | 50001 | 50001 |
| Korean (Auto-Select) | | | IE5, IE4 | Win95 | 50949 | 949 |