Matching Code Pages
If your converted code contains strange characters, the most likely cause is a missing or mismatched code page for character encoding. Java Language Conversion Assistant (JLCA) handles file encodings as follows:
- Files that are encoded in the current system ANSI code page keep this encoding after conversion.
- Any other encoding is converted to UTF-8 encoding.
Unicode files that start with a Unicode byte-order mark are identified automatically as Unicode. If a file does not start with a byte-order mark and the encoding switch is not specified, the file is assumed to be in the current ANSI code page of the system.
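The detection rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not JLCA's actual implementation. The function names, the `switch_encoding` and `ansi_encoding` parameters, and the choice to let a byte-order mark take precedence over the encoding switch are all assumptions. `locale.getpreferredencoding(False)` stands in for the system ANSI code page, which is what it reports on Windows.

```python
import codecs
import locale

# Longest marks first: the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_source_encoding(data, switch_encoding=None, ansi_encoding=None):
    """Return the encoding to assume for the raw bytes of one source file."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name  # byte-order mark found: treat the file as Unicode
    if switch_encoding is not None:
        return switch_encoding  # value supplied through the encoding switch
    # No mark and no switch: assume the current system default (assumption:
    # this approximates the ANSI code page).
    return ansi_encoding or locale.getpreferredencoding(False)


def output_encoding(source_encoding, ansi_encoding):
    """ANSI-encoded files keep their encoding; everything else becomes UTF-8."""
    # codecs.lookup normalizes aliases such as "windows-1252" and "cp1252".
    same = codecs.lookup(source_encoding).name == codecs.lookup(ansi_encoding).name
    return ansi_encoding if same else "utf-8"
```

For example, `guess_source_encoding(open("Foo.java", "rb").read())` would report a Unicode encoding for a file saved with a byte-order mark and the system default otherwise (`Foo.java` is a placeholder file name).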
To change the current system ANSI code page:
- Go to Control Panel, and double-click Regional Options (in Windows 2000) or Regional and Language Options (in Windows XP).
- Click Advanced, and select the desired code page.
If your source files are in a non-ANSI character encoding or you have Unicode source files that do not start with a byte-order mark, you must use the encoding switch.
The following character encodings can be specified in the encoding switch:
|Description|Encoding name|
|Latin alphabet No. 2 (Central European)|ISO-8859-2|
|Latin alphabet No. 3|ISO-8859-3|
|Latin alphabet No. 4 (Baltic States)|ISO-8859-4|
|Latin alphabet No. 9|ISO-8859-15|
|Chinese National Standard|GB18030|
|UTF-8 without byte-order mark|UTF-8|
To use a specific encoding, you must have support for it installed on your system. See your system documentation for details. When using the encoding switch, all files in the project must be in the same encoding.
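Before running a conversion, it can help to confirm that every file in the project really is in the encoding you plan to specify. The following Python sketch is one way to do that outside of JLCA. Both function names and the `files` mapping are hypothetical. The codec lookup only tells you what the Python runtime supports, which is not the same as the encoding support installed in Windows.

```python
import codecs


def encoding_is_installed(name):
    """True if this Python runtime has a codec registered under `name`."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def files_failing_decode(files, encoding):
    """Return the names of files whose bytes are not valid in `encoding`.

    `files` maps a file name to its raw bytes (hypothetical input shape).
    """
    failing = []
    for name, data in files.items():
        try:
            data.decode(encoding, errors="strict")
        except UnicodeDecodeError:
            failing.append(name)
    return failing
```

A strict decode is most informative for multi-byte encodings such as UTF-8 and GB18030. Single-byte encodings such as ISO-8859-2 accept almost any byte sequence, so a clean decode does not prove that the file was written in that encoding.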
If you are using an encoding that JLCA does not support, characters that cannot be converted might be lost. In some cases, a file might be lost.
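To see what character loss looks like in general, the Python sketch below lists the characters of a string that a target encoding cannot represent, and then shows the effect of a lossy conversion. It illustrates the general mechanism only. How JLCA itself treats unconvertible characters is not specified here, and the function name and sample string are made up for the example.

```python
def unconvertible_characters(text, target_encoding):
    """Return the distinct characters of `text` that `target_encoding` lacks."""
    missing = set()
    for ch in text:
        try:
            ch.encode(target_encoding)
        except UnicodeEncodeError:
            missing.add(ch)
    return sorted(missing)


sample = 'String s = "zażółć";'

# Latin-1 has no slot for three of the Polish letters; Latin-2 has all of them.
print(unconvertible_characters(sample, "iso-8859-1"))  # ['ć', 'ł', 'ż']
print(unconvertible_characters(sample, "iso-8859-2"))  # []

# A lossy conversion silently replaces the missing characters.
lossy = sample.encode("iso-8859-1", errors="replace").decode("iso-8859-1")
print(lossy)  # String s = "za?ó??";
```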