Event 1058 - Codepage Sniffing
Windows Internet Explorer 8 prevents certain codepages from participating in its Codepage Sniffing heuristic. Pages that rely on this heuristic to be recognized as 7-bit Unicode Transformation Format (UTF-7) are no longer detected as UTF-7.
UTF-7 is a character encoding that represents Unicode text as a stream of ASCII characters. It was proposed for use in Internet e-mail messages: the Simple Mail Transfer Protocol (SMTP) standard for transmitting mail messages does not allow byte values above the ASCII range, so UTF-7 must include a provision for encoding these higher-value characters. The same mechanism can also encode some ASCII punctuation. For example, the text <script> is encoded in UTF-7 as +ADw-script+AD4-.
Note that the left angle bracket is encoded as +ADw- and the right angle bracket is encoded as +AD4-.
If Windows Internet Explorer renders a page whose character set is not explicitly specified, it uses a set of heuristics to "sniff" the page and determine the encoding. If characters belonging to the UTF-7 encoding are found early enough in the webpage, Internet Explorer may guess that the encoding is UTF-7. A string that is harmless in one character set could then be interpreted as a potentially malicious script if the encoding is assumed to be UTF-7. Internet Explorer 8 includes a feature that looks for UTF-7-encoded strings. If it finds them, it escapes the text so that any embedded script cannot execute.
This event is logged when Internet Explorer detects a page that is encoded using UTF-7.
Perform the following steps to see this event logged in the compatibility tool:
- Create a webpage with the following contents. For this example call it 1058.html. The file can be placed anywhere, but for this example, the file is located on the Desktop.
  <html>
  <body>
  if you see a pop up, this test fails. otherwise, it's a pass...
  +ADw-script+AD4-alert(document.location)+ADw-/script+AD4-
  </body>
  </html>
- Open Internet Explorer.
- Ensure Internet Explorer is set to auto-select the page encoding. You can do this by clicking the Page menu, then Encoding, and then Auto-Select.
- Navigate to the webpage in Internet Explorer.
The page contains script encoded in UTF-7. Internet Explorer detects this and forces the page's encoding to Windows-1252. As a result, the script is rendered as plain text instead of being executed. When Internet Explorer 8 detects the UTF-7-encoded script, the event is logged.
If you open the same page in Windows Internet Explorer 7, the UTF-7-encoded script is not detected and an alert box appears. (In other words, in Internet Explorer 7, the potentially malicious script is allowed to execute.)
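The difference between the two browser versions comes down to which character set the same bytes are decoded with. The sketch below illustrates this with Python's codecs, using the contents of the test page from the steps above. The function contains_script_tag is a hypothetical helper, and the check is only an illustration of the ambiguity, not the heuristic Internet Explorer actually uses.

```python
# The bytes of the test page: plain ASCII, no declared character set.
PAGE_BYTES = (
    b"<html> <body>if you see a pop up, this test fails. otherwise, it's a pass... "
    b"+ADw-script+AD4-alert(document.location)+ADw-/script+AD4- </body> </html>"
)


def contains_script_tag(page: bytes, charset: str) -> bool:
    """Decode the page with the given charset and look for a literal <script> tag."""
    return "<script>" in page.decode(charset).lower()


# Decoded as Windows-1252, the sequence stays inert text.
print(contains_script_tag(PAGE_BYTES, "windows-1252"))  # False

# Decoded as UTF-7, the same bytes turn into a real script element.
print(contains_script_tag(PAGE_BYTES, "utf-7"))  # True
```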
The best way to avoid this issue is to always specify the encoding of your webpage. You can do so with a meta tag as in the following example:
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
You can also set an HTTP header:
Content-Type: text/html; charset=UTF-8
When you specify a character set, make sure it is not UTF-7, because that character set is susceptible to cross-site scripting (XSS) attacks.
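As an illustration of the header approach, the following sketch serves a page with an explicit Content-Type header using Python's standard http.server module. The names content_type_header, ExplicitCharsetHandler, and serve, as well as the host and port, are assumptions made for this example. The UTF-7 check is a simple guard written for the sketch, not part of any library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def content_type_header(charset: str = "UTF-8") -> str:
    """Build a Content-Type value with an explicit charset, rejecting UTF-7."""
    name = charset.strip()
    if name.lower().replace("_", "-") in {"utf-7", "utf7"}:
        raise ValueError("UTF-7 must not be used as a page encoding")
    return f"text/html; charset={name}"


class ExplicitCharsetHandler(BaseHTTPRequestHandler):
    """Responds to every GET with a small page and a declared charset."""

    def do_GET(self):
        body = "<html><body>Hello</body></html>".encode("utf-8")
        self.send_response(200)
        # Declaring the charset means the browser does not have to sniff it.
        self.send_header("Content-Type", content_type_header("UTF-8"))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the example server until interrupted (hypothetical host and port)."""
    HTTPServer((host, port), ExplicitCharsetHandler).serve_forever()
```

Calling serve() starts the example server locally. Every response then carries the header Content-Type: text/html; charset=UTF-8.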
This feature can also be disabled by modifying the registry.
Security Warning: Disable the feature only as a temporary measure during troubleshooting, to compare the behavior of the application with the feature enabled and disabled. Leaving the feature disabled on an ongoing basis is not recommended.
You disable Codepage Sniffing with a security feature control registry key (FEATURE_DISABLE_UTF7_SNIFFIN). Internet Explorer (Iexplore.exe) must run under this feature control for codepage sniffing to be disabled, which you can achieve by adding the following registry key:
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_DISABLE_UTF7_SNIFFIN
iexplore.exe = 0x0000000