3.1.5.2.3 Accessing the Windows Sorting Weight Table

Windows gets its sorting data from a data table (see section 3.1.5.2.3.1). Code points are identified by their UTF-16 values. The file is arranged in sections, and each section consists of records made up of tab-delimited fields. Optional comments begin with a semicolon. Each section has a label and can also have a subsection label.<1>

Note that a label is any field that does not begin with a numeric (0xNNNN) value. Blank lines are ignored, as are a ";" and any characters that follow it on the same line.
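
The line-level rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names parse_line and is_label are hypothetical, the sample text is invented for the example and is not taken from the real data file, and stripping whitespace around each field is an assumption that this section does not state.

```python
import re
from typing import List, Optional

# A data record starts with a 0xNNNN value; anything else is treated as a label.
_NUMERIC_FIELD = re.compile(r"0x[0-9A-Fa-f]+")


def parse_line(line: str) -> Optional[List[str]]:
    """Return the tab-delimited fields of one line, or None if nothing remains."""
    content = line.split(";", 1)[0].strip()  # drop the ";" and everything after it
    if not content:
        return None  # blank line or comment-only line
    # Assumption: whitespace around each field is not significant.
    return [field.strip() for field in content.split("\t")]


def is_label(fields: List[str]) -> bool:
    """A label is a field that does not begin with a numeric (0xNNNN) value."""
    return _NUMERIC_FIELD.match(fields[0]) is None


# Invented sample text; the layout and weight values are placeholders.
sample = (
    "SORTKEY\n"
    "DEFAULT\n"
    "\n"
    "0x0041\t14\t2\t2\t18\t;Latin Capital Letter A\n"
    "; a comment-only line\n"
)

labels: List[List[str]] = []
records: List[List[str]] = []
for raw_line in sample.splitlines():
    fields = parse_line(raw_line)
    if fields is None:
        continue
    (labels if is_label(fields) else records).append(fields)
```

The sketch only classifies lines. How label lines nest into a section and subsection path is not derived here.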

This document uses the following notation to specify the processing of the file.

OPEN indicates that queries are made for records in a specific section. To open the section that has the SORTKEY label and the DEFAULT sublabel, the following syntax is used. The opened section is then accessible by using the name DefaultTable.

  
 OPEN SECTION DefaultTable where name is
      SORTKEY\DEFAULT from unisort.txt

SELECT assigns a record (one line of the data file) to a variable name. To select the record whose first field is the code point 0x0041, this document uses the following notation. The selected record is then accessible by using the name CharacterRow.

  
 SET UnicodeChar to 0x0041
 SELECT RECORD CharacterRow FROM DefaultTable
      WHERE field 1 matches UnicodeChar
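
As a rough Python analogue of the OPEN and SELECT notation, the sketch below assumes that the data file has already been parsed into a dictionary keyed by section path. The names SECTIONS, open_section, and select_record are hypothetical, the weight values are placeholders rather than values from the real table, and returning None when no record matches is a choice made for the sketch.

```python
from typing import Dict, List, Optional

Record = List[str]
Section = List[Record]

# Hypothetical, already-parsed view of the data file: section path -> records.
# The weight values are placeholders, not taken from the real table.
SECTIONS: Dict[str, Section] = {
    "SORTKEY\\DEFAULT": [
        ["0x0041", "14", "2", "2", "18"],
        ["0x0042", "14", "16", "2", "18"],
    ],
}


def open_section(sections: Dict[str, Section], path: str) -> Section:
    """OPEN SECTION: return the records stored under a label\\sublabel path."""
    return sections[path]


def select_record(table: Section, criteria: Dict[int, int]) -> Optional[Record]:
    """SELECT RECORD: first record whose 1-based fields equal the given values."""
    for record in table:
        if all(
            len(record) >= number and int(record[number - 1], 16) == value
            for number, value in criteria.items()
        ):
            return record
    return None  # assumption: no match is reported as None


unicode_char = 0x0041
default_table = open_section(SECTIONS, "SORTKEY\\DEFAULT")
character_row = select_record(default_table, {1: unicode_char})
```

Field numbers are kept 1-based in the criteria dictionary so that the call reads like the WHERE clause of the notation.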

Values from selected records are referenced by field number, where field 1 is the first field of the record. The following pseudocode reads the individual data fields from the selected row.

  
 SET CharacterWeight.ScriptMember to CharacterRow.Field2
 SET CharacterWeight.PrimaryWeight to CharacterRow.Field3
 SET CharacterWeight.DiacriticWeight to CharacterRow.Field4
 SET CharacterWeight.CaseWeight to CharacterRow.Field5
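
The field assignments above could be mirrored in Python roughly as follows. The CharacterWeight dataclass and the weight_from_row helper are hypothetical names, the sample row holds placeholder values, and parsing each field with int(text, 0) (which accepts decimal or 0x-prefixed text) is an assumption about how the weights are written.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharacterWeight:
    script_member: int
    primary_weight: int
    diacritic_weight: int
    case_weight: int


def weight_from_row(row: List[str], first_field: int = 2) -> CharacterWeight:
    """Read four consecutive weight fields, starting at a 1-based field number.

    With first_field=2 this reads Field2..Field5, as in the pseudocode above.
    """
    start = first_field - 1  # convert the 1-based field number to a list index
    # Assumption: each weight is decimal or 0x-prefixed text.
    values = [int(text, 0) for text in row[start:start + 4]]
    if len(values) != 4:
        raise ValueError("record has too few fields for a full weight")
    return CharacterWeight(*values)


# Placeholder record; the weights are not taken from the real table.
character_row = ["0x0041", "14", "2", "2", "18"]
character_weight = weight_from_row(character_row)
```

Passing a different first_field lets the same helper read records whose weights start at a later field.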

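To select the record for the two-character sequence 0x0043 0x0068 ("Ch") with LCID 0x0405 (Czech), the following notation is used. In this section, fields 1 and 2 hold the two code points, so the weights start at field 3.<2>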

  
 SET Character1 to 0x0043
 SET Character2 to 0x0068
 SET SortLocale to 0x0405
  
 OPEN SECTION ContractionTable where name is
      SORTTABLES\COMPRESSION\LCID[SortLocale]\TWO from unisort.txt
 SELECT RECORD ContractionRow FROM ContractionTable WHERE field 1
      matches Character1 and field 2 matches Character2
 SET CharacterWeight.ScriptMember to ContractionRow.Field3
 SET CharacterWeight.PrimaryWeight to ContractionRow.Field4
 SET CharacterWeight.DiacriticWeight to ContractionRow.Field5
 SET CharacterWeight.CaseWeight to ContractionRow.Field6
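
A two-character lookup of this kind might be sketched in Python as shown below. The sketch assumes the compression sections have already been parsed and are keyed by a tuple that includes the LCID as an integer, because the textual form of the LCID label is not described here. The names CONTRACTIONS and select_contraction are hypothetical, and the weight values are placeholders.

```python
from typing import Dict, List, Optional, Tuple

Record = List[str]
SectionKey = Tuple[str, str, int, str]

# Hypothetical parsed sections, keyed by (label, sublabel, LCID, sublabel).
# The weight values are placeholders, not taken from the real table.
CONTRACTIONS: Dict[SectionKey, List[Record]] = {
    ("SORTTABLES", "COMPRESSION", 0x0405, "TWO"): [
        ["0x0043", "0x0068", "14", "46", "2", "18"],
        ["0x0063", "0x0068", "14", "46", "2", "2"],
    ],
}


def select_contraction(
    sections: Dict[SectionKey, List[Record]],
    sort_locale: int,
    character1: int,
    character2: int,
) -> Optional[Record]:
    """Return the record whose fields 1 and 2 match the two code points."""
    table = sections.get(("SORTTABLES", "COMPRESSION", sort_locale, "TWO"), [])
    for record in table:
        if int(record[0], 16) == character1 and int(record[1], 16) == character2:
            return record
    return None  # assumption: a missing locale or pair is reported as None


contraction_row = select_contraction(CONTRACTIONS, 0x0405, 0x0043, 0x0068)
if contraction_row is not None:
    # Fields 3..6 carry the weights because fields 1 and 2 hold the code points.
    script_member, primary_weight, diacritic_weight, case_weight = contraction_row[2:6]
```

What a caller does when the function returns None is outside the scope of this sketch.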