Locale Names (Windows)

Locale Names

A locale name is based on the language tagging conventions of RFC 4646 (Windows Vista and later), and is represented by LOCALE_SNAME. Generally, the pattern <language>-<REGION> is used. Here, language is a lowercase ISO 639 language code. The codes from ISO 639-1 are used when available. Otherwise, codes from ISO 639-2/T are used. REGION specifies an uppercase ISO 3166-1 country/region identifier. For example, the locale name for English (United States) is "en-US" and the locale name for Divehi (Maldives) is "dv-MV".
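As a rough illustration of the <language>-<REGION> pattern, the following Python sketch checks a name against it. The helper name split_language_region is hypothetical, and the sketch assumes a two-letter region code only; it is not how Windows itself validates names.

```python
import re

# Lowercase 2- or 3-letter language code, a hyphen, then an uppercase
# 2-letter region code. Assumption: only alpha-2 region codes are handled.
_LANG_REGION = re.compile(r"([a-z]{2,3})-([A-Z]{2})")


def split_language_region(name):
    """Return (language, region) for a name such as "en-US", else None."""
    match = _LANG_REGION.fullmatch(name)
    return match.groups() if match else None


print(split_language_region("en-US"))
print(split_language_region("dv-MV"))
print(split_language_region("EN-us"))
```

The last call returns None because the sketch expects the letter case described above.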

Note  The constant LOCALE_NAME_MAX_LENGTH gives the maximum length of a locale name. It includes space for a terminating null character.

If the locale is a neutral locale (no region), the LOCALE_SNAME value follows the pattern <language>. If it is a neutral locale for which the script is significant, the pattern is <language>-<Script>.

If a locale must be distinguished from another locale for the same language and region using a different script, the LOCALE_SNAME value follows the pattern <language>-<Script>-<REGION>, where Script is an initial-uppercase ISO 15924 script code. For example, the LOCALE_SNAME value for the specific locale Uzbek (Latin, Uzbekistan) is "uz-Latn-UZ". The script component is not included in cases where a language is commonly written in only one script.
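The neutral, script-qualified, and specific patterns described above can be told apart with one regular expression. This is a minimal sketch under the assumption that the name carries no sort order or custom suffix; parse_locale_name and is_neutral are illustrative names, not part of any Windows API.

```python
import re

# language, optional initial-uppercase 4-letter script, optional 2-letter region.
_NAME = re.compile(
    r"(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[A-Z][a-z]{3}))?"
    r"(?:-(?P<region>[A-Z]{2}))?"
)


def parse_locale_name(name):
    """Split a locale name into language, script, and region (None if absent)."""
    match = _NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a recognized locale name pattern: {name!r}")
    return match.groupdict()


def is_neutral(name):
    """A neutral locale has no region component."""
    return parse_locale_name(name)["region"] is None


print(parse_locale_name("uz-Latn-UZ"))
print(is_neutral("uz-Latn"))
print(is_neutral("en-US"))
```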

Sort orders for locales are designated using sort order identifiers, for example, SORT_DEFAULT. To distinguish two or more sort orders for the same language and region, the locale name follows the pattern <language>-<REGION>_<sort order>. If you must distinguish both script and sort order, the name follows the pattern <language>-<Script>-<REGION>_<sort order>. The default sort order is never explicitly specified, only the alternative sort order. For example, Hungarian (Hungary) with either SORT_DEFAULT or the numerically equivalent SORT_HUNGARIAN_DEFAULT is designated "hu-HU". Hungarian (Hungary) with sort order SORT_HUNGARIAN_TECHNICAL is designated "hu-HU_technl".
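Because the sort order is the only component introduced by an underscore, it can be separated with a single split. The sketch below assumes at most one underscore in the name; split_sort_order is a hypothetical helper.

```python
def split_sort_order(name):
    """Return (base_name, sort_order); sort_order is None for the default sort."""
    base, separator, sort_order = name.partition("_")
    return base, (sort_order if separator else None)


print(split_sort_order("hu-HU"))
print(split_sort_order("hu-HU_technl"))
```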

For a replacement locale, the locale name must be the same as the name for the locale being replaced. For a supplemental locale, the locale name should follow the pattern of <language>-<REGION>-x-<custom> or <language>-<Script>-<REGION>-x-<custom>, where <custom> is an alphanumeric string specific to the supplemental locale. For example, a supplemental locale specific to a company called Fabricam might be called "en-US-x-fabricam".
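A supplemental locale name can be assembled and taken apart around the "-x-" marker. This sketch only enforces the alphanumeric requirement stated above; the function names are illustrative, and any further limits Windows may impose on the custom string are not modeled.

```python
import re

_ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def supplemental_locale_name(base, custom):
    """Build "<base>-x-<custom>" from a base name and an alphanumeric string."""
    if not _ALPHANUMERIC.fullmatch(custom):
        raise ValueError("custom part must be a non-empty alphanumeric string")
    return f"{base}-x-{custom}"


def split_custom(name):
    """Return (base, custom); custom is None when the name has no "-x-" part."""
    base, separator, custom = name.partition("-x-")
    return base, (custom if separator else None)


print(supplemental_locale_name("en-US", "fabricam"))
print(split_custom("uz-Latn-UZ-x-fabricam"))
```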

An application can retrieve the current locale names by using the GetSystemDefaultLocaleName and GetUserDefaultLocaleName functions. While each thread can retrieve its own locale identifier with GetThreadLocale and set it with SetThreadLocale, there are no analogous functions to get and set the thread locale by name.
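For readers who script against the API, one possible way to call these functions from Python is through ctypes. This is a sketch, not a reference implementation: the wrapper names are hypothetical, the buffer size uses the value 85 that winnls.h defines for LOCALE_NAME_MAX_LENGTH, and on platforms other than Windows the wrappers simply return None.

```python
import ctypes
import sys

# Value of LOCALE_NAME_MAX_LENGTH in winnls.h; includes the terminating null.
LOCALE_NAME_MAX_LENGTH = 85


def _query_locale_name(function_name):
    """Call a kernel32 function of the form int f(LPWSTR name, int cch)."""
    if sys.platform != "win32":
        return None  # assumption: callers treat None as "not available here"
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    buffer = ctypes.create_unicode_buffer(LOCALE_NAME_MAX_LENGTH)
    written = getattr(kernel32, function_name)(buffer, LOCALE_NAME_MAX_LENGTH)
    if written == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return buffer.value


def user_default_locale_name():
    return _query_locale_name("GetUserDefaultLocaleName")


def system_default_locale_name():
    return _query_locale_name("GetSystemDefaultLocaleName")


print(user_default_locale_name())
print(system_default_locale_name())
```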

Related topics

Locales and Languages
Custom Locales
Locale Identifiers
Sort Order Identifiers

 

 


Build date: 3/6/2012

Community Content

verdy.p
Windows basic locale names and preferred BCP 47 language identifiers
Windows almost always appends a region code to its locale names. This is not the case for BCP 47 identifiers, which most often use a region-neutral language identifier. Also, standard BCP 47 identifiers are not always script-neutral: zh-CN is *implicitly* tied to zh-Hans-CN, which is itself tied to zh-Hans (the recommended form of the language identifier). Windows always tries to append a region code because it does not directly attempt to identify a language; it identifies a locale, and the region identifier is used to attach locale preferences that are unrelated to the identification of the language itself. This is in contrast to BCP 47, where the standard defines only the language; other locale preferences are not specified in BCP 47 itself but through an extension mechanism, such as the Unicode/CLDR extension introduced by the "u" singleton subtag. Finally, Windows locale names leave out many sort orders, and their sort subtags differ from those used in the Unicode/CLDR 'u' extension defined for use with BCP 47 language tags. All Windows locale names containing an underscore are invalid for BCP 47 usage.
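To make the underscore point concrete, here is a small sketch of converting a Windows locale name into a tag without an underscore. The function name to_bcp47 and the idea of a caller-supplied collation table are assumptions for illustration; the single table entry shown (Windows "phoneb" paired with the CLDR collation type "phonebk") is an assumed pairing that should be verified before use, and sort orders missing from the table are simply dropped.

```python
def to_bcp47(windows_name, collation_map):
    """Rewrite "<name>_<sort>" as "<name>-u-co-<collation>" when a mapping exists.

    collation_map is a caller-supplied dict from Windows sort suffixes to
    Unicode/CLDR collation types. Unknown sort suffixes are dropped.
    """
    name, _, sort_order = windows_name.partition("_")
    main, has_private, private = name.partition("-x-")
    tag = main
    collation = collation_map.get(sort_order) if sort_order else None
    if collation:
        tag += f"-u-co-{collation}"  # the 'u' extension goes before any 'x' part
    if has_private:
        tag += f"-x-{private}"
    return tag


# Illustrative table only; the pairing is an assumption, not taken from this page.
example_map = {"phoneb": "phonebk"}

print(to_bcp47("hu-HU", example_map))
print(to_bcp47("hu-HU_technl", example_map))
print(to_bcp47("de-DE_phoneb", example_map))
```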

verdy.p
conformance to RFC 4646 (in fact BCP 47)
The locale "names" described here and used in Windows still do not conform to the language "tags" defined in RFC 4646 or later (in fact BCP 47):
- Separators differ between the conforming part (which uses '-' between the language, script, and region subtags, and possibly before an extension starting with the singleton subtag 'x' for custom locales) and the non-conforming part (which uses '_' before the sort order subtag). This mix is not standard in BCP 47, whose syntax uses only '-' between subtags.
- Letter case in subtags is not significant in BCP 47. This is not always true of locale names used in the Windows API, which may fail to find a locale if its name uses a different case than the canonical form.
- The value of the sort order subtag in Windows locale names is specific to Windows. It should be specified using an extension prefix subtag (such as the 'u' subtag for the Unicode/CLDR locale extension). In addition, the sort order should appear before the extension subtag used for custom locale names.
- There is no support for the locale matching algorithm specified in BCP 47.
For these reasons, the strict BCP 47 conformance normally required by internet protocols and description languages needs an adaptation layer that maps these "Windows locale names" to and from standard BCP 47 language tags (or locale identifiers with the addition of the standard Unicode/CLDR extension 'u').
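One piece of such an adaptation layer could be a case normalizer applied before a name is handed to the Windows API. The sketch below rewrites a name into the letter case used in the article (lowercase language, initial-uppercase script, uppercase region); canonical_case is a hypothetical helper, and lowercasing the sort order and custom parts is an assumption based on the examples "hu-HU_technl" and "en-US-x-fabricam".

```python
def canonical_case(name):
    """Normalize letter case: language lower, Script capitalized, REGION upper."""
    base, separator, sort_order = name.partition("_")
    normalized = []
    after_singleton = False
    for index, part in enumerate(base.split("-")):
        if index == 0 or after_singleton:
            normalized.append(part.lower())
        elif len(part) == 1:
            # A singleton such as 'x' starts an extension; lowercase the rest.
            after_singleton = True
            normalized.append(part.lower())
        elif len(part) == 4 and part.isalpha():
            normalized.append(part.capitalize())  # script subtag
        elif len(part) == 2:
            normalized.append(part.upper())  # region subtag
        else:
            normalized.append(part.lower())
    result = "-".join(normalized)
    if separator:
        result += "_" + sort_order.lower()
    return result


print(canonical_case("UZ-latn-uz"))
print(canonical_case("HU-hu_TECHNL"))
print(canonical_case("en-us-X-Fabricam"))
```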