Security in Software Localization
Most software products have international versions that run on multinational networks around the world. One vulnerable computer on a corporate network, whether it runs an English version or an international version, threatens the security of the entire corporation.
NOTICE: This article is not a complete security threat model for international software. You must develop specific threat models and follow the necessary security precautions, including design review, code review, and functional testing, to meet the security requirements. International software threat models will vary depending on the technologies and methods used.
Basic software security requirements include protecting software against:
The Security Developer Center explains these concepts in detail. This article presents some of the commonly known problems found in international software products and raises the awareness of potential problems that may impact security when globalizing and localizing software products.
A Defensive Strategy
Both development and localization teams are responsible for securing international software so that software companies deliver trustworthy products. Here is how they are involved:
Developers must pay special attention to software resources as a potential threat to their software applications, and they should use defense in depth techniques.
Software defense in depth means writing source code that meets both globalization and localizability requirements. Developers must keep the use of string resources simple and limit them to display purposes. Such simplification will avoid many problems and will increase the software product’s robustness.
Localization teams modify software resources to make the product suitable for local markets. By including poorly localized resources, localization vendors may introduce security threats into products. The application may crash if the string resources do not adhere to certain guidelines. Localization teams should use resource commenting and localization verification procedures to deliver high-quality localized resources.
Improving localization quality means validating the localized string resources based on predefined instructions and ensuring that the localized resources meet high localization standards.
Sample Globalization Security Checklist
Development and localization teams must develop a threat model for their software to enable their products’ expansion into international markets. The following lists offer a few examples of the issues that teams must include in their threat model.
Developers must create memory buffers dynamically for the string resources. Localized string resources may have different lengths. Localized strings, for example in Finnish, are often longer than the English strings. When processing string resources, developers must ensure that buffers are large enough to hold the localized strings.
Converting strings between different character encodings (such as SBCS, MBCS, UTF-8, and UTF-16) may produce a buffer size mismatch. Developers must be aware of the difference between the length of a string in bytes and its length in characters. In the case of Thai, converting UTF-16 to UTF-8 grows the buffer from two bytes to three bytes per character (a factor of 1.5), because UTF-8 represents each Thai code point in three bytes.
Processing supplementary and precomposed characters may lead to string length miscalculation; for example, normalizing a composite character sequence results in a single precomposed character (for example, a followed by a combining grave accent becomes à).
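The difference between character count and byte count can be checked directly. The following is a minimal sketch; the sample Thai word is an assumption chosen for illustration and does not come from any product.

```python
# Byte length versus character length for the same Thai text.
text = "สวัสดี"  # sample word: six Thai code points, all in the Basic Multilingual Plane

char_count = len(text)                       # length in code points
utf16_bytes = len(text.encode("utf-16-le"))  # two bytes per code point here
utf8_bytes = len(text.encode("utf-8"))       # three bytes per Thai code point

print(char_count, utf16_bytes, utf8_bytes)
```

A buffer sized from the character count, or from the UTF-16 byte count, would be too small for the UTF-8 form of the same string.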
Reference: Unicode Normalization Forms
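As an illustration of how normalization and supplementary characters change string lengths, here is a short sketch using Python's standard `unicodedata` module. The sample characters are assumptions picked for the example.

```python
import unicodedata

# 'a' followed by COMBINING GRAVE ACCENT (U+0300): two code points.
decomposed = "a\u0300"
# Normalization Form C composes the pair into the single code point U+00E0.
composed = unicodedata.normalize("NFC", decomposed)

# A supplementary character (outside the Basic Multilingual Plane).
supplementary = "\U00020000"
utf16_units = len(supplementary.encode("utf-16-le")) // 2  # surrogate pair
utf8_bytes = len(supplementary.encode("utf-8"))

print(len(decomposed), len(composed), utf16_units, utf8_bytes)
```

Code that measures a string before normalization and copies it afterward, or that assumes one UTF-16 code unit per character, can miscalculate the required buffer size.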
Some of the string encoding conversion APIs depend on system settings. Changing these settings through the regional settings may cause these functions to fail to return the correct string resource, especially for non-Unicode languages. Developers must use the Unicode functions to avoid problems related to encoding conversion.
Reference: Globalization Step-by-Step: Unicode Enabled
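The effect of a settings-dependent conversion can be sketched as follows. The example assumes that the same bytes are written under a Thai code page (cp874) and read back under a Western European one (cp1252); the code pages and sample text are illustrative choices, not a description of any particular API.

```python
text = "ไทย"  # sample Thai text

# Bytes written while the system code page is Thai.
thai_bytes = text.encode("cp874")
# The same bytes interpreted after the code page changed to Western European.
misread = thai_bytes.decode("cp1252")

# The Western European code page cannot represent Thai at all.
try:
    text.encode("cp1252")
    representable = True
except UnicodeEncodeError:
    representable = False

# A Unicode encoding round-trips regardless of regional settings.
round_trip = text.encode("utf-8").decode("utf-8")

print(misread == text, representable, round_trip == text)
```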
Changing the user locale will change the regional settings. Regional settings include the formatting of date, time, and currency. Each locale has its own settings. For example, changing the locale may change the date format from "m/d/yy", which displays "6/29/04", to "dddd, dd MMMM, yyyy", which displays "Tuesday, 29 June, 2004". This change increases the length of any string that includes a date variable.
A component that passes the LCID (locale identifier) back and forth between the client and the server must watch for malicious modification of the LCID in transit. Such modification may cause the server to expose data that is either corrupt or inappropriate.
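To make the length difference concrete, the sketch below formats the same date in both styles. The helper functions and the hard-coded English names are assumptions made so that the example does not depend on the locale of the machine running it.

```python
from datetime import date

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")
MONTH_NAMES = ("January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December")

def format_short(d):
    # Corresponds to the "m/d/yy" pattern.
    return f"{d.month}/{d.day}/{d.year % 100:02d}"

def format_long(d):
    # Corresponds to the "dddd, dd MMMM, yyyy" pattern.
    return f"{DAY_NAMES[d.weekday()]}, {d.day:02d} {MONTH_NAMES[d.month - 1]}, {d.year}"

d = date(2004, 6, 29)
short_text = format_short(d)
long_text = format_long(d)
print(short_text, len(short_text))
print(long_text, len(long_text))
```

A fixed buffer or a fixed-width field sized for the short form would not hold the long form.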
Reference: Globalization Step-by-Step: Use Locale Model
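One possible defense is for the server to treat an incoming LCID as untrusted input and check it against an allow-list. In this sketch, `SUPPORTED_LCIDS`, `DEFAULT_LCID`, and `resolve_lcid` are hypothetical names, and the set of supported locales is an assumption.

```python
# Locales this hypothetical server is prepared to serve.
SUPPORTED_LCIDS = {1033: "en-US", 1031: "de-DE", 1041: "ja-JP", 1055: "tr-TR"}
DEFAULT_LCID = 1033

def resolve_lcid(raw_value):
    """Return a supported LCID, falling back to the default for anything else."""
    try:
        lcid = int(raw_value)
    except (TypeError, ValueError):
        return DEFAULT_LCID
    return lcid if lcid in SUPPORTED_LCIDS else DEFAULT_LCID

print(resolve_lcid("1055"))
print(resolve_lcid("1055; DROP TABLE users"))
print(resolve_lcid(99999))
```

Falling back to a known default, rather than passing the received value through, keeps a tampered LCID from selecting data the server never intended to serve.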
Processing some risky characters may lead to unexpected results. For example, Turkish has four letters for the character "I": dotted lowercase i (U+0069), dotted uppercase İ (U+0130), dotless lowercase ı (U+0131), and dotless uppercase I (U+0049).
Converting an English string such as "file.txt" to uppercase using the Turkish locale results in "FİLE.TXT", which is a completely different string from the expected "FILE.TXT". In some locales, the meaning of a word is case sensitive, which results in invalid comparisons for these particular locales.
When using regular expressions and patterns, developers must account for the validation of international text. For example, incorrect RegExp handling may result in inappropriate denial or granting of access during validation of non-ASCII input (for example, text, symbols, and non-Latin digit classes such as Bengali and Thai).
Some string handling APIs are considered dangerous because they do not check the size of the destination buffer and do not check for the null terminator in the source buffer. Examples of these functions are strcpy, strcat, strncpy, strncat, the printf family, strlen, gets, wcstombs, and scanf.
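The mismatch can be reproduced with a small sketch. Python's built-in case conversion is not locale-sensitive, so `turkish_upper` below is a hypothetical helper that only approximates the Turkish rule for the letter i; it is not a library function.

```python
def turkish_upper(text):
    # Apply the Turkish mapping explicitly (i -> İ, ı -> I), then uppercase the rest.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()

name = "file.txt"
invariant = name.upper()        # locale-independent result
turkish = turkish_upper(name)   # result under the Turkish rule

# Lowercasing İ without Turkish rules yields 'i' plus a combining dot above,
# so the round trip does not return the original string either.
lowered = turkish.lower()

print(invariant, turkish, invariant == turkish, len(name), len(lowered))
```

A comparison against a hard-coded "FILE.TXT" therefore fails for the Turkish result. For identifiers such as file names, a locale-independent comparison is the safer choice.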
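As a small illustration with Python's `re` module, the `\d` class matches any Unicode decimal digit by default, so a pattern intended for ASCII digits also accepts Bengali and Thai digits. The sample inputs and pattern names are assumptions for the example.

```python
import re

bengali = "১২৩"  # Bengali digits one, two, three
thai = "๑๒๓"     # Thai digits one, two, three

UNICODE_DIGITS = re.compile(r"\d+")    # matches any Unicode decimal digit
ASCII_DIGITS = re.compile(r"[0-9]+")   # matches ASCII digits only

for sample in (bengali, thai, "123"):
    print(sample,
          UNICODE_DIGITS.fullmatch(sample) is not None,
          ASCII_DIGITS.fullmatch(sample) is not None)
```

Which behavior is correct depends on the field: a quantity entered by a user may legitimately use local digits, while a protocol token usually should not. The point is to choose deliberately and to test with non-ASCII input.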
Reference: Book: Writing Secure Code, Appendix A
Local governments may have their own specific security requirements. For example, any product that either uses or implements cryptography for confidentiality must obtain necessary approvals from the French government prior to shipping to France.
Sample Localizability Security Checklist
Developers must not store any critical data, such as authentication, authorization, encryption, privilege elevation, log-in, or sign-in data, in the resources. For example, avoid storing permission settings for application users in the resources.
In addition, developers must not save any numerical, Boolean, or similar values in the string resources. Localization can change such values, which results in changing the default behavior of the software application.
To avoid updating Web links during localization, developers should use automatic redirection of Web links. A specific Web location stored in the string resources allows localization vendors to point the link to any Web site, including a dangerous one.
HTML, scripts, SQL statements, markup tags, or any other code in the string resources can be changed during localization, which changes the behavior of the software and can open the door to attacks such as cross-site scripting (XSS). Developers must remove any code, such as markup language code, from the string resources.
Some technologies use non-orderable string variables. Swapping non-orderable string variables may cause the software to crash. Developers must use re-orderable variables that allow localizers to change the position of the variables in a string without causing a crash.
Software logic must be independent of the content of the string resources; developers should remove any code dependency on that content. Removing or mistranslating a string resource must not be able to cause the application to crash or misbehave.
Functional strings such as file names, font names, registry keys, or command line arguments can be easily mistranslated, which might lead to application crashes. Developers must remove such strings completely from the resources to reduce the possibility of application failure.
Some applications compare string resources with hard-coded values or with other string resources, assuming that these string resources will have a matching localization. Localizers might forget to localize some of these strings, which leads to application failure.
Another example of string dependency is using delimiters, separators, and terminators to parse text. This practice may fail if localizers remove or change these characters.
If developers cannot fulfill any of the above requirements, they must inform localizers about any restrictions on the string resources. The localization team must use this information during localization validation to ensure that the string resources remain secure after localization. Sharing localization information, also called resource commenting, is a good practice for improving localization quality.
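Where string resources end up in a Web page, one defensive option is to treat every resource as plain text and escape it at the point of output. The sketch below assumes HTML output; `render_label` is a hypothetical helper, and `html.escape` comes from the Python standard library.

```python
import html

def render_label(resource_text):
    # The markup lives in code; the localized resource is escaped as plain text.
    return '<span class="label">' + html.escape(resource_text) + "</span>"

safe = render_label("Save & close")
tampered = render_label("<script>alert(1)</script>")

print(safe)
print(tampered)
```

With this arrangement, a resource that was altered to contain markup is displayed as text instead of being executed.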
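The difference between the two kinds of variables can be sketched with Python's two formatting styles. The "translated" strings here are hypothetical reorderings written for illustration, not real translations.

```python
args = ("report.txt", 2048)

# Non-orderable: each variable is bound by its position in the string.
printf_style_source = "Copied %s (%d bytes)"
printf_style_swapped = "%d bytes: %s"   # hypothetical translation that swaps the order

try:
    printf_style_swapped % args
    positional_ok = True
except TypeError:
    # %d now receives the file name, so formatting fails at run time.
    positional_ok = False

# Re-orderable: each variable carries an index, so it can move freely.
indexed_source = "Copied {0} ({1} bytes)"
indexed_swapped = "{1} bytes: {0}"      # the same hypothetical reordering
reordered = indexed_swapped.format(*args)

print(positional_ok, reordered)
```

In languages without run-time type checks, the same swap in a printf-style string can read memory incorrectly instead of raising an error, which is why indexed or named variables are preferred.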
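Part of this validation can be automated. The following sketch checks that a localized string keeps exactly the same set of variables as its source string; the `{name}` placeholder syntax, the regular expression, and the function name are all assumptions for the example, and a real check would follow the placeholder format of the product's resource files.

```python
import re
from collections import Counter

# Assumed placeholder syntax: {0}, {1}, {name}, ...
PLACEHOLDER = re.compile(r"\{[A-Za-z0-9_]*\}")

def placeholders_match(source, localized):
    """True if both strings contain the same placeholders the same number of times."""
    return Counter(PLACEHOLDER.findall(source)) == Counter(PLACEHOLDER.findall(localized))

source = "Copied {0} ({1} bytes)"
print(placeholders_match(source, "{1} bytes: {0}"))     # reordered: acceptable
print(placeholders_match(source, "Copied {0}"))         # variable dropped
print(placeholders_match(source, "Copied {0} ({2})"))   # variable changed
```

Order is deliberately ignored so that legitimate reordering passes, while dropped or altered variables are flagged for review.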
Sample Localization Validation Checklist
Localization teams must ensure that binaries created by local companies and shipped as part of the localized products meet the security requirements. Examples of market specific binaries are third party drivers and language support.
Product companies must hire trusted localization vendors. Well-crafted malicious string resources can be very harmful to the software applications. In addition, misleading or offensive translation in the software or the user assistance files can be harmful to the company.
Software applications use some string resources for functional purposes. Functional string resources vary from one product to another. The localization team must identify these functional strings with the help of the development team and validate their proper localization. Here are some examples of these string resources:
If developers could not remove the embedded code, or left it in on purpose, the localization team must ensure that any changes to this code, such as HTML or XML, are correct and secure, and do not introduce vulnerabilities such as cross-site scripting (XSS). Localizers should ask developers to globalize the code so that it works for all languages instead of customizing it for different markets.