Localization Run-Time Breaking Changes

 
Short Description SortKey.GetHashCode() was incorrect in v1.1 and has been fixed in v2.0.
Affected APIs SortKey.GetHashCode()
Severity Medium
Compat Switch Available No

Description "A" and "a" are equal when ignoring case, but in v1.1 SortKey.GetHashCode() returns different hash codes for them. 1) With the v1.1 behavior, Equals and GetHashCode are not implemented consistently. 2) The v1.1 behavior has no real utility, whereas the v2.0 behavior (creating a hash code based on the string) does, since it can be used to optimize comparisons with a hash code index and a sort key index.

User Scenario See description

Work Around None. We do not provide a culturally sensitive hash code.
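The consistency described above can be checked with a short sketch. The class and method names (SortKeyHashDemo, KeyFor) are hypothetical, and the invariant culture is an assumption chosen only to keep the example independent of machine settings.

```csharp
using System;
using System.Globalization;

public static class SortKeyHashDemo
{
    // Builds a case-insensitive sort key under the invariant culture (an assumption).
    public static SortKey KeyFor(string text)
    {
        CompareInfo compareInfo = CultureInfo.InvariantCulture.CompareInfo;
        return compareInfo.GetSortKey(text, CompareOptions.IgnoreCase);
    }

    public static void Main()
    {
        SortKey upper = KeyFor("A");
        SortKey lower = KeyFor("a");

        // The two keys compare as equal because case is ignored.
        Console.WriteLine(SortKey.Compare(upper, lower) == 0);

        // Per the description, v2.0 is expected to return matching hash codes
        // for these keys, while v1.1 returned different ones.
        Console.WriteLine(upper.GetHashCode() == lower.GetHashCode());
    }
}
```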
 

 

 
Short Description 7-bit encoding behavior changed to remove the insecure mapping when the 8th bit is set.
Affected APIs System.Text.Encoding
Severity Very Low
Compat Switch Available No

Description In v1.0 and v1.1 we mapped unknown code points > 0x7f in the 7-bit code pages (ASCII & IA5) by ignoring the high bit. This allowed insecure spoofing of all legal ASCII characters by illegal characters, including \, / and :. In v2.0 we call the fallback for unknown ASCII characters. By default these characters will map to ?. Such bytes cannot be produced normally by using the Encoding, because all Unicode characters > 0x7f were already mapped to ? in v1.0 when encoding to ASCII.

User Scenario Anyone trying to get special behavior or using the wrong encoding to read non-ASCII files as ASCII can be affected by this.

Work Around Users can create a DecoderFallback that provides the RTM behavior.
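A minimal sketch of such a fallback follows. StripHighBitFallback and LegacyAscii are hypothetical names introduced for illustration, and the sketch assumes that clearing the high bit is the only v1.0/v1.1 behavior that needs to be reproduced. It also reintroduces the spoofing risk described above, so it should only be applied to trusted input.

```csharp
using System;
using System.Text;

// Hypothetical fallback: maps each undecodable byte to the character obtained
// by clearing its high bit, approximating the v1.0/v1.1 behavior.
public sealed class StripHighBitFallback : DecoderFallback
{
    public override int MaxCharCount { get { return 1; } }

    public override DecoderFallbackBuffer CreateFallbackBuffer()
    {
        return new StripHighBitBuffer();
    }

    private sealed class StripHighBitBuffer : DecoderFallbackBuffer
    {
        private char[] pending = new char[0];
        private int position;

        public override bool Fallback(byte[] bytesUnknown, int index)
        {
            pending = new char[bytesUnknown.Length];
            for (int i = 0; i < bytesUnknown.Length; i++)
            {
                pending[i] = (char)(bytesUnknown[i] & 0x7F);
            }
            position = 0;
            return pending.Length > 0;
        }

        // Returning '\0' signals "no more characters", so a byte of 0x80
        // (which maps to U+0000) is dropped by this sketch.
        public override char GetNextChar()
        {
            return position < pending.Length ? pending[position++] : '\0';
        }

        public override bool MovePrevious()
        {
            if (position == 0) return false;
            position--;
            return true;
        }

        public override int Remaining { get { return pending.Length - position; } }

        public override void Reset()
        {
            pending = new char[0];
            position = 0;
        }
    }
}

public static class LegacyAscii
{
    // Decodes bytes as ASCII, using the high-bit-stripping fallback.
    public static string Decode(byte[] bytes)
    {
        Encoding legacy = Encoding.GetEncoding(
            "us-ascii", EncoderFallback.ReplacementFallback, new StripHighBitFallback());
        return legacy.GetString(bytes);
    }

    public static void Main()
    {
        byte[] bytes = { 0x41, 0xC1 };                       // 'A', then 'A' with the 8th bit set
        Console.WriteLine(Encoding.ASCII.GetString(bytes));  // default v2.0 fallback
        Console.WriteLine(Decode(bytes));                    // legacy-style mapping
    }
}
```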
 

 

 
Short Description String comparison (and sorting) for sr-SP-Latn (Serbian) culture is incorrect.
Affected APIs All CompareInfo methods
Severity Medium
Compat Switch Available No

Description This is unfortunately a known issue with the sorting tables in Windows XP, Windows Server 2003, and the .NET Framework. It is fixed in Windows XP Service Pack 2 (which also adds additional locales that use Croatian, Serbian, and Bosnian). However, comparisons in the .NET Framework are always kept in sync with one version of Windows, and the version used here is that of Windows Server 2003.

User Scenario A Serbian user gets incorrect comparison and sorting results, while Croatian and Bosnian users see no problems.

Work Around Use the Croatian culture, but note that this is a huge GPS issue.
 

 

 
Short Description Misspelled culture day & month names were corrected. Culture data is unstable and should not be relied upon.
Affected APIs System.Globalization.CultureInfo; System.Globalization.DateTimeFormat; System.Globalization.DateTimeFormatInfo
Severity Medium
Compat Switch Available No

Description Four cultures had incorrect month names: ar-MA, nn-NO, kn-IN, and div-MV. Those were corrected, but applications relying on the misspellings could be broken. Applications should specify their own formats when storing dates/times or other culture information and not rely on culture-specific data. This goes for any culture data: cultural preferences can change, and government or business standards can change. With v2.0, users can also override culture data with their own values.

User Scenario Use a culture to format dates/times in v1.1 and try to parse them in v2.0.

Work Around Build a custom culture with the incorrect spellings.
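For stored data, the advice above (specify your own format rather than relying on culture data) might look like the following sketch. StableDateStorage and the chosen pattern are assumptions made for illustration; any fixed pattern used with the invariant culture would serve.

```csharp
using System;
using System.Globalization;

public static class StableDateStorage
{
    // Assumed storage format: a fixed, numeric pattern with no month or day names.
    private const string StorageFormat = "yyyy-MM-dd'T'HH:mm:ss";

    public static string Save(DateTime value)
    {
        return value.ToString(StorageFormat, CultureInfo.InvariantCulture);
    }

    public static DateTime Load(string text)
    {
        return DateTime.ParseExact(text, StorageFormat, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        DateTime original = new DateTime(2005, 11, 7, 14, 30, 0);
        string stored = Save(original);
        Console.WriteLine(stored);                    // 2005-11-07T14:30:00
        Console.WriteLine(Load(stored) == original);  // True
    }
}
```

Because the stored text contains no culture-specific names, a spelling correction in a later version cannot break parsing.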
 

 

 
Short Description Turkish currency updated to New Turkish Lira (YTL).
Affected APIs (with Turkish culture only, 0x041f) RegionInfo.CurrencyNativeName; RegionInfo.CurrencySymbol; RegionInfo.ISOCurrencySymbol; RegionInfo.CurrencyEnglishName
Severity Medium
Compat Switch Available No

Description The Turkish currency data (symbol, name, native name) has been changed by the Turkish government, and our tables have been updated to reflect the change.

User Scenario Anybody calling the affected APIs with the Turkish culture.

Work Around No work around
 

 

 
Short Description Encoding.Unicode.GetMaxCharCount returns different sizes than previous versions
Affected APIs Encoding.Unicode.GetMaxCharCount
Severity Medium
Compat Switch Available No

Description Previously GetMaxCharCount did not consider the possibility of leftover data from a previous call to a decoder. A decoder may have remembered a high surrogate or a lone byte from a previous call. If the decoder is then called again, it may have to output that previous character as well as the following character. Note also that if a fallback with a large character count is provided, the value returned by GetMaxCharCount() could be very large. As a result, GetChars() can return more than byteCount/2 chars for Unicode.

User Scenario Anybody calling Encoding.Unicode.GetMaxCharCount with leftover data from a previous call to a decoder.

Work Around No work around.
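The leftover-byte case can be illustrated with a small sketch. LeftoverByteDemo is a hypothetical name, and the byte values are chosen only for illustration.

```csharp
using System;
using System.Text;

public static class LeftoverByteDemo
{
    // Feeds a lone byte first, then three more bytes, and returns the number
    // of chars produced by the second call.
    public static int CharsFromSecondCall()
    {
        Decoder decoder = Encoding.Unicode.GetDecoder();
        char[] output = new char[8];

        // First call: one byte is only half of a UTF-16 code unit, so the
        // decoder remembers it and produces no chars.
        byte[] first = { 0x41 };
        int produced = decoder.GetChars(first, 0, first.Length, output, 0);

        // Second call: 0x00 completes 'A', and 0x42 0x00 is 'B'.
        byte[] second = { 0x00, 0x42, 0x00 };
        return decoder.GetChars(second, 0, second.Length, output, produced);
    }

    public static void Main()
    {
        int byteCount = 3;
        Console.WriteLine("byteCount / 2 = " + (byteCount / 2));
        Console.WriteLine("chars from second call = " + CharsFromSecondCall());
        Console.WriteLine("GetMaxCharCount = " + Encoding.Unicode.GetMaxCharCount(byteCount));
    }
}
```

Three bytes yield two chars on the second call, which is more than byteCount/2, so a buffer sized from the older estimate would be too small.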
 

 

 
Short Description 'U' format in DateTime.ToString() has different behavior for the Japanese Calendar between V1.0/V1.1 and V2.0.
Affected APIs DateTime.ToString() with "U" format for ja-JP culture.
Severity Medium
Compat Switch Available No

Description The 'U' format (Universal time in Gregorian format) is used to print a DateTime using GregorianCalendar (Gregorian localized calendar). However, if the current calendar setting from the OS is the Japanese Calendar for the ja-JP culture and it uses a Japanese Calendar specific format (such as "gg yy'年'MM'月'dd'日'"), V1.0/V1.1 formatted 'U' using the Japanese calendar format instead of the correct Gregorian localized format. This is fixed in V2.0.

User Scenario The user wants to print a DateTime using the "U" format (Universal Time using Gregorian localized format) when the current locale setting from the OS is ja-JP, the current calendar is the Japanese Calendar, and a Japanese calendar format is selected (such as "gg yy'年'MM'月'dd'日'").

Work Around If the user still needs the V1.0/V1.1 behavior, one can convert the DateTime instance into universal time and format the DateTime using the Long date format of the JapaneseCalendar.
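The workaround might be sketched as follows. JapaneseUniversalFormat is a hypothetical name, and using the "D" (long date) format specifier on a ja-JP culture whose calendar is set to JapaneseCalendar is an assumption about what "the Long date format of the JapaneseCalendar" means here. The exact output depends on the culture data of the platform.

```csharp
using System;
using System.Globalization;

public static class JapaneseUniversalFormat
{
    // Converts to universal time, then formats with the long date pattern
    // of a ja-JP culture that uses the Japanese calendar.
    public static string Format(DateTime value)
    {
        CultureInfo japanese = new CultureInfo("ja-JP");
        japanese.DateTimeFormat.Calendar = new JapaneseCalendar();
        return value.ToUniversalTime().ToString("D", japanese);
    }

    public static void Main()
    {
        Console.WriteLine(Format(DateTime.Now));
    }
}
```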
 

 

 
Short Description The culture identifier ky-KZ was changed to ky-KG to match international conventions.
Affected APIs Culture data, which shows up through APIs such as CultureInfo and ResourceManager.GetString.
Severity Low
Compat Switch Available No

Description The culture ky-KZ was wrong and we changed it to ky-KG; however, any app that sets the thread culture to ky-KZ will be broken. Apps that worked in V1.x will now throw "Unhandled Exception: System.ArgumentException: Culture name 'ky-KZ' is not supported." KG is the official ISO tag for Kyrgyzstan. KZ is the ISO tag for Kazakhstan, not Kyrgyzstan.

User Scenario Using the official ISO tag for Kyrgyzstan will now work. Using the incorrect tag will no longer work. If someone created resources, tagged them as ky-KZ, and then called ResourceManager.GetString("resid", new CultureInfo("ky-KZ")), we will fail when trying to create the CultureInfo for ky-KZ, and will therefore also fail to find the resources tagged ky-KZ.

Work Around None
 

 

 
Short Description UnicodeDecoder throws when handling surrogate characters
Affected APIs UnicodeEncoding.Decoder.GetChars()
Severity Medium
Compat Switch Available No

Description When the Decoder for the UnicodeEncoding is asked to decode a high surrogate, V1.1 returns the high surrogate character (1 Unicode character) without checking whether the next two bytes are a valid low surrogate. In V2.0, we return the full surrogate pair (2 Unicode characters) only after we have seen that all 4 bytes form a valid surrogate pair. If the caller decodes only the first 2 bytes, the state is remembered and the high surrogate is not output. The complete surrogate pair is output only when the decoder receives a valid low surrogate.

User Scenario The caller gets the Decoder for UnicodeEncoding, decodes two bytes at a time by calling something like Decoder.GetChars(inputBytes, inputBytePos, 2, outputChars, outputCharPos), expects that one Unicode character will always be returned, and so always increments outputCharPos by 1 after every call.

Work Around The previous scenario ignores the value returned from the GetChars() calls and assumes that one Unicode character will always be returned. This is a wrong assumption, because we are not supposed to return a high surrogate until a low surrogate follows it. Version 1.1 has a bug here. We changed this behavior to be compatible with the Unicode standard, so that an invalid surrogate will never be generated by our APIs. To work around this, the caller should use the value returned from GetChars() to update outputCharPos, instead of always incrementing it by 1.
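The corrected usage can be sketched like this. SurrogateDecodeDemo and DecodeInPairs are hypothetical names; the variable names follow the user scenario above. The sketch assumes little-endian UTF-16 input and ignores a trailing odd byte.

```csharp
using System;
using System.Text;

public static class SurrogateDecodeDemo
{
    // Decodes UTF-16LE input two bytes at a time and returns the number of
    // chars written to the output buffer.
    public static int DecodeInPairs(byte[] inputBytes, char[] outputChars)
    {
        Decoder decoder = Encoding.Unicode.GetDecoder();
        int outputCharPos = 0;

        for (int inputBytePos = 0; inputBytePos + 2 <= inputBytes.Length; inputBytePos += 2)
        {
            // The result may be 0 (high surrogate held back), 1 (ordinary
            // character), or 2 (surrogate pair completed by this call).
            int written = decoder.GetChars(inputBytes, inputBytePos, 2, outputChars, outputCharPos);
            outputCharPos += written;
        }

        return outputCharPos;
    }

    public static void Main()
    {
        // 'A', followed by U+10000 encoded as the surrogate pair D800 DC00.
        byte[] inputBytes = { 0x41, 0x00, 0x00, 0xD8, 0x00, 0xDC };
        char[] outputChars = new char[inputBytes.Length / 2];

        int count = DecodeInPairs(inputBytes, outputChars);
        Console.WriteLine(count);
    }
}
```

Advancing by the returned count keeps the output position correct whether a call yields zero, one, or two chars.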