Character Set Recognition

Note  As of December 2011, this topic has been archived and is no longer actively maintained. For more information, see Archived Content. For information, recommendations, and guidance regarding the current version of Windows Internet Explorer, see Internet Explorer Developer Center.
 

Internet Explorer uses the character set specified for a document to determine how to translate the bytes in the document into characters on the screen or on paper. By default, Internet Explorer takes the character set from the charset parameter of the HTTP content type returned by the server. If the server does not supply this parameter, Internet Explorer uses the character set specified by the meta element in the document. If no meta element is specified either, it falls back to the user's preferences.
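The precedence described above can be sketched in a few lines of Python. This is only an illustration of the lookup order (HTTP content type, then meta element, then user preference). The function names, the regular expressions, and the `user_preference` argument are assumptions made for this example and do not reflect how Internet Explorer is actually implemented.

```python
import re

# Hypothetical helper: pull the charset parameter out of a Content-Type value.
CONTENT_TYPE_RE = re.compile(r'charset\s*=\s*["\']?([^\s;"\']+)', re.IGNORECASE)

# Hypothetical helper pattern: find a charset declared inside a meta element.
# A regular expression is a simplification; a real browser parses the markup.
META_RE = re.compile(rb'<meta\b[^>]*?charset\s*=\s*["\']?([^\s;"\'>/]+)',
                     re.IGNORECASE)


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not value:
        return None
    match = CONTENT_TYPE_RE.search(value)
    return match.group(1) if match else None


def charset_from_meta(html_bytes):
    """Return the charset named by the first matching meta element, or None."""
    match = META_RE.search(html_bytes)
    return match.group(1).decode("ascii", "replace") if match else None


def resolve_charset(http_content_type, html_bytes, user_preference):
    """Apply the documented order: HTTP header, then meta, then user preference."""
    return (charset_from_content_type(http_content_type)
            or charset_from_meta(html_bytes)
            or user_preference)
```

Calling `resolve_charset("text/html", page_bytes, "iso-8859-1")` on a page that carries a meta element consults the meta element, because the header has no charset parameter.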

You can use the meta element to explicitly set the character set for a document. In this case, set the HTTP-EQUIV attribute to Content-Type and specify a character set identifier in the CONTENT attribute. For example, the following meta element identifies windows-1251 as the character set for the document.

    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=windows-1251">

To apply a character set to an entire document, you must insert the meta element before the body element. For clarity, it should appear as the first element after head, so that all browsers can translate the meta element before the document is parsed. The meta element applies to the document containing it. This means, for example, that a compound document (a document consisting of two or more documents in a set of frames) can use different character sets in different frames.
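As a rough way to check the placement rule above, the following Python sketch reports whether a Content-Type meta element appears before the body element. It uses the standard library `html.parser` module. The class and function names are hypothetical, and the check is a simplification that only compares the order of start tags.

```python
from html.parser import HTMLParser


class MetaPlacementChecker(HTMLParser):
    """Record the position of the first Content-Type meta and the first body tag."""

    def __init__(self):
        super().__init__()
        self.tag_count = 0      # number of start tags seen so far
        self.meta_index = None  # position of the first Content-Type meta element
        self.body_index = None  # position of the first body element

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        attributes = dict(attrs)
        http_equiv = (attributes.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "content-type":
            if self.meta_index is None:
                self.meta_index = self.tag_count
        elif tag == "body" and self.body_index is None:
            self.body_index = self.tag_count
        self.tag_count += 1


def meta_precedes_body(html_text):
    """Return True if a Content-Type meta element occurs before the body element."""
    checker = MetaPlacementChecker()
    checker.feed(html_text)
    if checker.meta_index is None:
        return False
    return checker.body_index is None or checker.meta_index < checker.body_index
```

In a compound document, a check like this would be run once per frame document, since each frame carries its own meta element.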

The following table lists the character sets supported by Microsoft Internet Explorer 5. Each row includes the following information.

  • Charset Friendly Name. Name used to refer to the character set.
  • Preferred Charset Label. Most common identifier used to set character sets in Internet Explorer. For example, in the previous code sample the Charset Label is windows-1251. These identifiers are used for outbound data.
  • Aliases. Other identifiers that can be used to set character sets. These identifiers are used for inbound data.
  • IE Ver. Versions of Internet Explorer that support the listed character sets.
  • Min OS. Minimum operating system that supports the listed character sets.
  • Code Page. Code page that supports the listed character sets.
  • Family Code Page. Indicates a Windows code page that is used to represent all or most of the characters in a charset.
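To illustrate how the preferred label, aliases, and code page columns relate, here is a minimal Python sketch that maps an inbound alias to its preferred charset label. The four entries are copied from rows of the table in this topic. The dictionary layout, the `preferred_label` function, and the case-insensitive matching are assumptions made for this example.

```python
# Preferred label -> (aliases, code page, family code page).
# Values are taken from the Cyrillic (Windows), Cyrillic (KOI8-R),
# Central European (ISO), and Japanese (Shift-JIS) rows of the table.
CHARSETS = {
    "windows-1251": (["x-cp1251"], 1251, 1251),
    "koi8-r": (["csKOI8R", "koi", "koi8", "koi8r"], 20866, 1251),
    "iso-8859-2": (["csISOLatin2", "iso_8859-2", "iso_8859-2:1987",
                    "iso8859-2", "iso-ir-101", "l2", "latin2"], 28592, 1250),
    "shift_jis": (["csShiftJIS", "csWindows31J", "ms_Kanji", "shift-jis",
                   "x-ms-cp932", "x-sjis"], 932, 932),
}

# Build a lookup from every known name (label or alias) to the preferred label.
# Lowercasing the keys is an assumption: it treats names as case-insensitive.
ALIAS_TO_LABEL = {}
for label, (aliases, _code_page, _family) in CHARSETS.items():
    ALIAS_TO_LABEL[label.lower()] = label
    for alias in aliases:
        ALIAS_TO_LABEL[alias.lower()] = label


def preferred_label(name):
    """Return the preferred label for a label or alias, or None if unknown."""
    return ALIAS_TO_LABEL.get(name.strip().lower())
```

With this lookup, an inbound name such as `latin2` resolves to `iso-8859-2`, and the code page and family code page for that row can then be read from `CHARSETS`.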

Charsets in Internet Explorer 5

| Charset Friendly Name | Preferred Charset Label | Aliases | IE Ver | Min OS | Code Page | Family Code Page |
|---|---|---|---|---|---|---|
| Arabic (ASMO 708) | ASMO-708 |  | IE5 | Win95 | 708 | 1256 |
| Arabic (DOS) | DOS-720 |  | IE5 | Win95 | 720 | 1256 |
| Arabic (ISO) | iso-8859-6 | arabic, csISOLatinArabic, ECMA-114, ISO_8859-6, ISO_8859-6:1987, iso-ir-127 | IE5, IE4 | Win95 | 28596 | 1256 |
| Arabic (Mac) | x-mac-arabic |  | IE5 | Win2000 | 10004 | 1256 |
| Arabic (Windows) | windows-1256 | cp1256 | IE5 | Win95 | 1256 | 1256 |
| Baltic (DOS) | ibm775 | CP500 | IE5 | Win2000 | 775 | 1257 |
| Baltic (ISO) | iso-8859-4 | csISOLatin4, ISO_8859-4, ISO_8859-4:1988, iso-ir-110, l4, latin4 | IE5 | Win95 | 28594 | 1257 |
| Baltic (Windows) | windows-1257 |  | IE5 | Win95 | 1257 | 1257 |
| Central European (DOS) | ibm852 | cp852 | IE5, IE4 | Win95 | 852 | 1250 |
| Central European (ISO) | iso-8859-2 | csISOLatin2, iso_8859-2, iso_8859-2:1987, iso8859-2, iso-ir-101, l2, latin2 | IE5, IE4 | Win95 | 28592 | 1250 |
| Central European (Mac) | x-mac-ce |  | IE5 | Win2000 | 10029 | 1250 |
| Central European (Windows) | windows-1250 | x-cp1250 | IE5 | Win95 | 1250 | 1250 |
| Chinese Simplified (EUC) | EUC-CN | x-euc-cn | IE5 | Win2000 | 51936 | 936 |
| Chinese Simplified (GB2312) | gb2312 | chinese, CN-GB, csGB2312, csGB231280, csISO58GB231280, GB_2312-80, GB231280, GB2312-80, GBK, iso-ir-58 | IE5, IE4 | Win95 | 936 | 936 |
| Chinese Simplified (HZ) | hz-gb-2312 |  | IE5, IE4 | Win95 | 52936 | 936 |
| Chinese Simplified (Mac) | x-mac-chinesesimp |  | IE5 | Win2000 | 10008 | 936 |
| Chinese Traditional (Big5) | big5 | cn-big5, csbig5, x-x-big5 | IE5, IE4 | Win95 | 950 | 950 |
| Chinese Traditional (CNS) | x-Chinese-CNS |  | IE5 | Win2000 | 20000 | 950 |
| Chinese Traditional (Eten) | x-Chinese-Eten |  | IE5 | Win2000 | 20002 | 950 |
| Chinese Traditional (Mac) | x-mac-chinesetrad |  | IE5 | Win2000 | 10002 | 950 |
| Cyrillic (DOS) | cp866 | ibm866 | IE5, IE4 | Win95 | 866 | 1251 |
| Cyrillic (ISO) | iso-8859-5 | csISOLatin5, csISOLatinCyrillic, cyrillic, ISO_8859-5, ISO_8859-5:1988, iso-ir-144, l5 | IE5, IE4 | Win95 | 28595 | 1251 |
| Cyrillic (KOI8-R) | koi8-r | csKOI8R, koi, koi8, koi8r | IE5, IE4 | Win95 | 20866 | 1251 |
| Cyrillic (KOI8-U) | koi8-u | koi8-ru | IE5 | Win95 | 21866 | 1251 |
| Cyrillic (Mac) | x-mac-cyrillic |  | IE5 | Win2000 | 10007 | 1251 |
| Cyrillic (Windows) | windows-1251 | x-cp1251 | IE5 | Win95 | 1251 | 1251 |
| Europa | x-Europa |  | IE5 | n.a. | 29001 | 1252 |
| German (IA5) | x-IA5-German |  | IE5 | Win2000 | 20106 | 1252 |
| Greek (DOS) | ibm737 |  | IE5 | Win2000 | 737 | 1253 |
| Greek (ISO) | iso-8859-7 | csISOLatinGreek, ECMA-118, ELOT_928, greek, greek8, ISO_8859-7, ISO_8859-7:1987, iso-ir-126 | IE5, IE4 | Win95 | 28597 | 1253 |
| Greek (Mac) | x-mac-greek |  | IE5 | Win2000 | 10006 | 1253 |
| Greek (Windows) | windows-1253 |  | IE5 | Win95 | 1253 | 1253 |
| Greek, Modern (DOS) | ibm869 |  | IE5 | Win2000 | 869 | 1253 |
| Hebrew (DOS) | DOS-862 |  | IE5 | Win95 | 862 | 1255 |
| Hebrew (ISO-Logical) | iso-8859-8-i | logical | IE5, IE4 | Win95 | 38598 | 1255 |
| Hebrew (ISO-Visual) | iso-8859-8 | csISOLatinHebrew, hebrew, ISO_8859-8, ISO_8859-8:1988, ISO-8859-8, iso-ir-138, visual | IE5, IE4 | Win95 | 28598 | 1255 |
| Hebrew (Mac) | x-mac-hebrew |  | IE5 | Win2000 | 10005 | 1255 |
| Hebrew (Windows) | windows-1255 | ISO_8859-8-I, ISO-8859-8, visual | IE5 | Win95 | 1255 | 1255 |
| IBM EBCDIC (Arabic) | x-EBCDIC-Arabic |  | IE5 | Win2000 | 20420 | 1256 |
| IBM EBCDIC (Cyrillic Russian) | x-EBCDIC-CyrillicRussian |  | IE5 | Win2000 | 20880 | 1251 |
| IBM EBCDIC (Cyrillic Serbian-Bulgarian) | x-EBCDIC-CyrillicSerbianBulgarian |  | IE5 | Win2000 | 21025 | 1251 |
| IBM EBCDIC (Denmark-Norway) | x-EBCDIC-DenmarkNorway |  | IE5 | Win2000 | 20277 | 1252 |
| IBM EBCDIC (Denmark-Norway-Euro) | x-ebcdic-denmarknorway-euro |  | IE5 | Win2000 | 1142 | 1252 |
| IBM EBCDIC (Finland-Sweden) | x-EBCDIC-FinlandSweden |  | IE5 | Win2000 | 20278 | 1252 |
| IBM EBCDIC (Finland-Sweden-Euro) | x-ebcdic-finlandsweden-euro |  | IE5 | Win2000 | 1143 | 1252 |
| IBM EBCDIC (Finland-Sweden-Euro) | x-ebcdic-finlandsweden-euro | X-EBCDIC-France | IE5 | Win2000 | 1143 | 1252 |
| IBM EBCDIC (France-Euro) | x-ebcdic-france-euro |  | IE5 | Win2000 | 1147 | 1252 |
| IBM EBCDIC (Germany) | x-EBCDIC-Germany |  | IE5 | Win2000 | 20273 | 1252 |
| IBM EBCDIC (Germany-Euro) | x-ebcdic-germany-euro |  | IE5 | Win2000 | 1141 | 1252 |
| IBM EBCDIC (Greek Modern) | x-EBCDIC-GreekModern |  | IE5 | Win2000 | 875 | 1253 |
| IBM EBCDIC (Greek) | x-EBCDIC-Greek |  | IE5 | Win2000 | 20423 | 1253 |
| IBM EBCDIC (Hebrew) | x-EBCDIC-Hebrew |  | IE5 | Win2000 | 20424 | 1255 |
| IBM EBCDIC (Icelandic) | x-EBCDIC-Icelandic |  | IE5 | Win2000 | 20871 | 1252 |
| IBM EBCDIC (Icelandic-Euro) | x-ebcdic-icelandic-euro |  | IE5 | Win2000 | 1149 | 1252 |
| IBM EBCDIC (International-Euro) | x-ebcdic-international-euro |  | IE5 | Win2000 | 1148 | 1252 |
| IBM EBCDIC (Italy) | x-EBCDIC-Italy |  | IE5 | Win2000 | 20280 | 1252 |
| IBM EBCDIC (Italy-Euro) | x-ebcdic-italy-euro |  | IE5 | Win2000 | 1144 | 1252 |
| IBM EBCDIC (Japanese and Japanese Katakana) | x-EBCDIC-JapaneseAndKana |  | IE5 | Win2000 | 50930 | 932 |
| IBM EBCDIC (Japanese and Japanese-Latin) | x-EBCDIC-JapaneseAndJapaneseLatin |  | IE5 | Win2000 | 50939 | 932 |
| IBM EBCDIC (Japanese and US-Canada) | x-EBCDIC-JapaneseAndUSCanada |  | IE5 | Win2000 | 50931 | 932 |
| IBM EBCDIC (Japanese katakana) | x-EBCDIC-JapaneseKatakana |  | IE5 | Win2000 | 20290 | 932 |
| IBM EBCDIC (Korean and Korean Extended) | x-EBCDIC-KoreanAndKoreanExtended |  | IE5 | Win2000 | 50933 | 949 |
| IBM EBCDIC (Korean Extended) | x-EBCDIC-KoreanExtended |  | IE5 | Win2000 | 20833 | 949 |
| IBM EBCDIC (Multilingual Latin-2) | CP870 |  | IE5 | Win2000 | 870 | 1250 |
| IBM EBCDIC (Simplified Chinese) | x-EBCDIC-SimplifiedChinese |  | IE5 | Win2000 | 50935 | 936 |
| IBM EBCDIC (Spain) | X-EBCDIC-Spain |  | IE5 | Win2000 | 20284 | 1252 |
| IBM EBCDIC (Spain-Euro) | x-ebcdic-spain-euro |  | IE5 | Win2000 | 1145 | 1252 |
| IBM EBCDIC (Thai) | x-EBCDIC-Thai |  | IE5 | Win2000 | 20838 | 874 |
| IBM EBCDIC (Traditional Chinese) | x-EBCDIC-TraditionalChinese |  | IE5 | Win2000 | 50937 | 950 |
| IBM EBCDIC (Turkish Latin-5) | CP1026 |  | IE5 | Win2000 | 1026 | 1254 |
| IBM EBCDIC (Turkish) | x-EBCDIC-Turkish |  | IE5 | Win2000 | 20905 | 1254 |
| IBM EBCDIC (UK) | x-EBCDIC-UK |  | IE5 | Win2000 | 20285 | 1252 |
| IBM EBCDIC (UK-Euro) | x-ebcdic-uk-euro |  | IE5 | Win2000 | 1146 | 1252 |
| IBM EBCDIC (US-Canada) | ebcdic-cp-us |  | IE5 | Win2000 | 37 | 1252 |
| IBM EBCDIC (US-Canada-Euro) | x-ebcdic-cp-us-euro |  | IE5 | Win2000 | 1140 | 1252 |
| Icelandic (DOS) | ibm861 |  | IE5 | Win2000 | 861 | 1252 |
| Icelandic (Mac) | x-mac-icelandic |  | IE5 | Win2000 | 10079 | 1252 |
| ISCII Assamese | x-iscii-as |  | IE5 | Win2000 | 57006 | 57006 |
| ISCII Bengali | x-iscii-be |  | IE5 | Win2000 | 57003 | 57003 |
| ISCII Devanagari | x-iscii-de |  | IE5 | Win2000 | 57002 | 57002 |
| ISCII Gujarathi | x-iscii-gu |  | IE5 | Win2000 | 57010 | 57010 |
| ISCII Kannada | x-iscii-ka |  | IE5 | Win2000 | 57008 | 57008 |
| ISCII Malayalam | x-iscii-ma |  | IE5 | Win2000 | 57009 | 57009 |
| ISCII Oriya | x-iscii-or |  | IE5 | Win2000 | 57007 | 57007 |
| ISCII Panjabi | x-iscii-pa |  | IE5 | Win2000 | 57011 | 57011 |
| ISCII Tamil | x-iscii-ta |  | IE5 | Win2000 | 57004 | 57004 |
| ISCII Telugu | x-iscii-te |  | IE5 | Win2000 | 57005 | 57005 |
| Japanese (EUC) | euc-jp | csEUCPkdFmtJapanese, Extended_UNIX_Code_Packed_Format_for_Japanese, x-euc, x-euc-jp | IE5, IE4 | Win95 | 51932 | 932 |
| Japanese (JIS) | iso-2022-jp |  | IE5, IE4 | Win95 | 50220 | 932 |
| Japanese (JIS-Allow 1 byte Kana - SO/SI) | iso-2022-jp | _iso-2022-jp$SIO | IE5 | Win95 | 50222 | 932 |
| Japanese (JIS-Allow 1 byte Kana) | csISO2022JP | _iso-2022-jp | IE5 | Win95 | 50221 | 932 |
| Japanese (Mac) | x-mac-japanese |  | IE5 | Win2000 | 10001 | 932 |
| Japanese (Shift-JIS) | shift_jis | csShiftJIS, csWindows31J, ms_Kanji, shift-jis, x-ms-cp932, x-sjis | IE5, IE4 | Win95 | 932 | 932 |
| Korean | ks_c_5601-1987 | csKSC56011987, euc-kr, iso-ir-149, korean, ks_c_5601, ks_c_5601_1987, ks_c_5601-1989, KSC_5601, KSC5601 | IE5 | Win95 | 949 | 949 |
| Korean (EUC) | euc-kr | csEUCKR | IE5 | Win95 | 51949 | 949 |
| Korean (ISO) | iso-2022-kr | csISO2022KR | IE5 | Win95 | 50225 | 949 |
| Korean (Johab) | Johab |  | IE5 | Win2000 | 1361 | 1361 |
| Korean (Mac) | x-mac-korean |  | IE5 | Win2000 | 10003 | 949 |
| Latin 3 (ISO) | iso-8859-3 | csISOLatin3, ISO_8859-3, ISO_8859-3:1988, iso-ir-109, l3, latin3 | IE5, IE4 | Win95 | 28593 | 1254 |
| Latin 9 (ISO) | iso-8859-15 | csISOLatin9, ISO_8859-15, l9, latin9 | IE5 | Win95 | 28605 | 1252 |
| Norwegian (IA5) | x-IA5-Norwegian |  | IE5 | Win2000 | 20108 | 1252 |
| OEM United States | IBM437 | 437, cp437, csPC8, CodePage437 | IE5 | Win2000 | 437 | 1252 |
| Swedish (IA5) | x-IA5-Swedish |  | IE5 | Win2000 | 20107 | 1252 |
| Thai (Windows) | windows-874 | DOS-874, iso-8859-11, TIS-620 | IE5, IE4 | Win95 | 874 | 874 |
| Turkish (DOS) | ibm857 |  | IE5 | Win2000 | 857 | 1254 |
| Turkish (ISO) | iso-8859-9 | csISOLatin5, ISO_8859-9, ISO_8859-9:1989, iso-ir-148, l5, latin5 | IE5 | Win95 | 28599 | 1254 |
| Turkish (Mac) | x-mac-turkish |  | IE5 | Win2000 | 10081 | 1254 |
| Turkish (Windows) | windows-1254 | ISO_8859-9, ISO_8859-9:1989, iso-8859-9, iso-ir-148, latin5 | IE5 | Win95 | 1254 | 1254 |
| Unicode | unicode | utf-16 | IE5, IE4 | Win95 | 1200 | 1200 |
| Unicode (Big-Endian) | unicodeFFFE |  | IE5, IE4 | Win95 | 1201 | 1200 |
| Unicode (UTF-7) | utf-7 | csUnicode11UTF7, unicode-1-1-utf-7, x-unicode-2-0-utf-7 | IE5, IE4 | Win95 | 65000 | 1200 |
| Unicode (UTF-8) | utf-8 | unicode-1-1-utf-8, unicode-2-0-utf-8, x-unicode-2-0-utf-8 | IE5, IE4 | Win95 | 65001 | 1200 |
| US-ASCII | us-ascii | ANSI_X3.4-1968, ANSI_X3.4-1986, ascii, cp367, csASCII, IBM367, ISO_646.irv:1991, ISO646-US, iso-ir-6us | IE5 | Win95 | 20127 | 1252 |
| Vietnamese (Windows) | windows-1258 |  | IE5, IE4 | Win95 | 1258 | 1258 |
| Western European (DOS) | ibm850 |  | IE5 | Win2000 | 850 | 1252 |
| Western European (IA5) | x-IA5 |  | IE5 | Win2000 | 20105 | 1252 |
| Western European (ISO) | iso-8859-1 | cp819, csISOLatin1, ibm819, iso_8859-1, iso_8859-1:1987, iso8859-1, iso-ir-100, l1, latin1 | IE5 | Win95 | 28591 | 1252 |
| Western European (Mac) | macintosh |  | IE5 | Win2000 | 10000 | 1252 |
| Western European (Windows) | Windows-1252 | ANSI_X3.4-1968, ANSI_X3.4-1986, ascii, cp367, cp819, csASCII, IBM367, ibm819, ISO_646.irv:1991, iso_8859-1, iso_8859-1:1987, ISO646-US, iso8859-1, iso-8859-1, iso-ir-100, iso-ir-6, latin1, us, us-ascii, x-ansi | IE5 | Win95 | 1252 | 1252 |

 

Internal Charsets Not for General Use

The following character sets are not for general use, so do not use them to label documents.

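| Charset Friendly Name | Preferred Charset Label | Aliases | IE Ver | Min OS | Code Page | Family Code Page |
|---|---|---|---|---|---|---|
| User Defined | x-user-defined |  | IE5, IE4 | Win95 | 50000 | 50000 |
| Japanese (Auto-Select) |  |  | IE5, IE4 | Win95 | 50932 | 932 |
| Auto-Select |  |  | IE5 | Win95 | 50001 | 50001 |
| Korean (Auto-Select) |  |  | IE5, IE4 | Win95 | 50949 | 949 |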

 

 

 
