3.1.5.2.12 SortkeyContractionHandler
This algorithm checks whether the characters starting at the specified index in the specified string form an 8-character, 7-character, 6-character, 5-character, 4-character, 3-character, or 2-character contraction sequence. If they do, the whole sequence is given a single character weight. This algorithm also handles the Hungarian special character sequence.
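As an illustration of what "given just one character weight" means, the following is a minimal Python sketch, not the specified algorithm. The table `CONTRACTIONS`, its weight values, and the function `find_contraction` are hypothetical and invented for this example. Trying the longest sequence first is also an assumption, and the sketch uses a plain length check rather than the bounds test in the procedure.

```python
from typing import Dict, Optional, Tuple

# Hypothetical contraction table: character sequence -> one weight.
# The weight values are invented for illustration only.
CONTRACTIONS: Dict[str, int] = {
    "cs": 0x0E21,
    "dzs": 0x0E35,
}


def find_contraction(text: str, index: int) -> Optional[Tuple[int, int]]:
    """Return (length, weight) of the longest contraction at index, or None."""
    for length in range(8, 1, -1):  # try 8-character down to 2-character
        candidate = text[index:index + length]
        # A short slice means fewer than `length` characters remain.
        if len(candidate) == length and candidate in CONTRACTIONS:
            return length, CONTRACTIONS[candidate]
    return None
```

With this table, `find_contraction("dzsem", 0)` reports a 3-character match carrying one weight, so a caller would advance three characters after appending that weight.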
COMMENT SortkeyContractionHandler
COMMENT
COMMENT On Entry: SourceString     - Source Unicode String
COMMENT           SourceIndex      - Current index within source string
COMMENT           HasHungarianSpecialCharacterSequence: Is the character that the current
COMMENT                              index points to
COMMENT                              the starting of the Hungarian special character sequence
COMMENT           ContractionType:   The contraction type, from 2-character to 8-character
COMMENT                              contraction, to be checked against
COMMENT           UnicodeWeights   - String of UnicodeWeightType to
COMMENT                              append additional weight(s) to
COMMENT           DiacriticWeights - String of Diacritic Weight to
COMMENT                              append extra weight(s) to if
COMMENT                              needed
COMMENT           CaseWeights      - String of Case Weight to
COMMENT                              append special weight(s) to
COMMENT                              if needed
COMMENT
COMMENT On Exit:  Result: a string to indicate the type of contraction from the specified
COMMENT                              string
COMMENT           UnicodeWeights   - The UnicodeWeight of the
COMMENT                              processed character(s) is
COMMENT                              appended to this string.
COMMENT           DiacriticWeights - The Diacritic weight, if any, of
COMMENT                              the processed character(s) is
COMMENT                              appended to this string.
COMMENT           CaseWeights      - The Case Weight, if any,
COMMENT                              of the processed character(s)
COMMENT                              is appended to this string.
COMMENT

PROCEDURE SortkeyContractionHandler (IN SortLocale: LCID,
                                     IN SourceString: Unicode String,
                                     IN SourceIndex: 32-bit integer,
                                     IN HasHungarianSpecialCharacterSequence: boolean,
                                     IN ContractionType: integer number from 2 to 8,
                                     INOUT UnicodeWeights: string of UnicodeWeightType,
                                     INOUT DiacriticWeights: string of BYTE,
                                     INOUT CaseWeights: string of BYTE)
                                     Result: CharacterWeightType

    IF HasHungarianSpecialCharacterSequence is true THEN
        COMMENT The beginning of Hungarian special character sequence,
        COMMENT advance one character before starting to check for contraction sequence
        SET SourceIndex to SourceIndex + 1
    ENDIF

    IF SourceIndex + ContractionType is greater than or equal to SourceString.Length THEN
        SET Result to null
        RETURN false
    ENDIF

    COMMENT Search for the character in the character contraction table
    COMMENT Search for contraction section based on ContractionType
    CASE ContractionType
        "8": OPEN SECTION ContractionTable where name is SORTTABLES\COMPRESSION\LCID[SortLocale]\EIGHT from unisort.txt
        "7": OPEN SECTION ContractionTable where name is SORTTABLES\COMPRESSION\LCID[SortLocale]\SEVEN from unisort.txt
        "6": OPEN SECTION ContractionTable where name is SORTTABLES\COMPRESSION\LCID[SortLocale]\SIX from unisort.txt
        "5": OPEN SECTION ContractionTable where name is SORTTABLES\COMPRESSION\LCID[SortLocale]\FIVE from unisort.txt
        "4": OPEN SECTION ContractionTable where name is SORTTABLES\COMPRESSION\LCID[SortLocale]\FOUR from unisort.txt
        "3": OPEN SECTION ContractionTable where name is SORTTABLES\COMPRESSION\LCID[SortLocale]\THREE from unisort.txt
        "2": OPEN SECTION ContractionTable where name is SORTTABLES\COMPRESSION\LCID[SortLocale]\TWO from unisort.txt
    ENDCASE

    COMMENT Contraction table might not be found if locale doesn't have them
    IF ContractionTable is null THEN
        SET Result to null
        RETURN false
    ENDIF

    CASE ContractionType
        "8": SELECT RECORD ContractionRow FROM ContractionTable
                 WHERE field 1 matches SourceString[SourceIndex] and
                 WHERE field 2 matches SourceString[SourceIndex + 1] and
                 WHERE field 3 matches SourceString[SourceIndex + 2] and
                 WHERE field 4 matches SourceString[SourceIndex + 3] and
                 WHERE field 5 matches SourceString[SourceIndex + 4] and
                 WHERE field 6 matches SourceString[SourceIndex + 5] and
                 WHERE field 7 matches SourceString[SourceIndex + 6] and
                 WHERE field 8 matches SourceString[SourceIndex + 7]
             COMMENT If this sequence isn't a contraction then one will not be found
             IF ContractionRow is null THEN
                 SET Result to null
                 RETURN false
             ENDIF
             COMMENT Found a contraction, get its weights
             SET Result.ScriptMember to ContractionRow.Field9
             SET Result.PrimaryWeight to ContractionRow.Field10
             SET Result.DiacriticWeight to ContractionRow.Field11
             SET Result.CaseWeight to ContractionRow.Field12

        "7": SELECT RECORD ContractionRow FROM ContractionTable
                 WHERE field 1 matches SourceString[SourceIndex] and
                 WHERE field 2 matches SourceString[SourceIndex + 1] and
                 WHERE field 3 matches SourceString[SourceIndex + 2] and
                 WHERE field 4 matches SourceString[SourceIndex + 3] and
                 WHERE field 5 matches SourceString[SourceIndex + 4] and
                 WHERE field 6 matches SourceString[SourceIndex + 5] and
                 WHERE field 7 matches SourceString[SourceIndex + 6]
             COMMENT If this sequence isn't a contraction then one will not be found
             IF ContractionRow is null THEN
                 SET Result to null
                 RETURN false
             ENDIF
             COMMENT Found a contraction, get its weights
             SET Result.ScriptMember to ContractionRow.Field8
             SET Result.PrimaryWeight to ContractionRow.Field9
             SET Result.DiacriticWeight to ContractionRow.Field10
             SET Result.CaseWeight to ContractionRow.Field11

        "6": SELECT RECORD ContractionRow FROM ContractionTable
                 WHERE field 1 matches SourceString[SourceIndex] and
                 WHERE field 2 matches SourceString[SourceIndex + 1] and
                 WHERE field 3 matches SourceString[SourceIndex + 2] and
                 WHERE field 4 matches SourceString[SourceIndex + 3] and
                 WHERE field 5 matches SourceString[SourceIndex + 4] and
                 WHERE field 6 matches SourceString[SourceIndex + 5]
             COMMENT If this sequence isn't a contraction then one will not be found
             IF ContractionRow is null THEN
                 SET Result to null
                 RETURN false
             ENDIF
             COMMENT Found a contraction, get its weights
             SET Result.ScriptMember to ContractionRow.Field7
             SET Result.PrimaryWeight to ContractionRow.Field8
             SET Result.DiacriticWeight to ContractionRow.Field9
             SET Result.CaseWeight to ContractionRow.Field10

        "5": SELECT RECORD ContractionRow FROM ContractionTable
                 WHERE field 1 matches SourceString[SourceIndex] and
                 WHERE field 2 matches SourceString[SourceIndex + 1] and
                 WHERE field 3 matches SourceString[SourceIndex + 2] and
                 WHERE field 4 matches SourceString[SourceIndex + 3] and
                 WHERE field 5 matches SourceString[SourceIndex + 4]
             COMMENT If this sequence isn't a contraction then one will not be found
             IF ContractionRow is null THEN
                 SET Result to null
                 RETURN false
             ENDIF
             COMMENT Found a contraction, get its weights
             SET Result.ScriptMember to ContractionRow.Field6
             SET Result.PrimaryWeight to ContractionRow.Field7
             SET Result.DiacriticWeight to ContractionRow.Field8
             SET Result.CaseWeight to ContractionRow.Field9

        "4": SELECT RECORD ContractionRow FROM ContractionTable
                 WHERE field 1 matches SourceString[SourceIndex] and
                 WHERE field 2 matches SourceString[SourceIndex + 1] and
                 WHERE field 3 matches SourceString[SourceIndex + 2] and
                 WHERE field 4 matches SourceString[SourceIndex + 3]
             COMMENT If this sequence isn't a contraction then one will not be found
             IF ContractionRow is null THEN
                 SET Result to null
                 RETURN false
             ENDIF
             COMMENT Found a contraction, get its weights
             SET Result.ScriptMember to ContractionRow.Field5
             SET Result.PrimaryWeight to ContractionRow.Field6
             SET Result.DiacriticWeight to ContractionRow.Field7
             SET Result.CaseWeight to ContractionRow.Field8

        "3": SELECT RECORD ContractionRow FROM ContractionTable
                 WHERE field 1 matches SourceString[SourceIndex] and
                 WHERE field 2 matches SourceString[SourceIndex + 1] and
                 WHERE field 3 matches SourceString[SourceIndex + 2]
             COMMENT If this sequence isn't a contraction then one will not be found
             IF ContractionRow is null THEN
                 SET Result to null
                 RETURN false
             ENDIF
             COMMENT Found a contraction, get its weights
             SET Result.ScriptMember to ContractionRow.Field4
             SET Result.PrimaryWeight to ContractionRow.Field5
             SET Result.DiacriticWeight to ContractionRow.Field6
             SET Result.CaseWeight to ContractionRow.Field7

        "2": SELECT RECORD ContractionRow FROM ContractionTable
                 WHERE field 1 matches SourceString[SourceIndex] and
                 WHERE field 2 matches SourceString[SourceIndex + 1]
             COMMENT If this sequence isn't a contraction then one will not be found
             IF ContractionRow is null THEN
                 SET Result to null
                 RETURN false
             ENDIF
             COMMENT Found a contraction, get its weights
             SET Result.ScriptMember to ContractionRow.Field3
             SET Result.PrimaryWeight to ContractionRow.Field4
             SET Result.DiacriticWeight to ContractionRow.Field5
             SET Result.CaseWeight to ContractionRow.Field6
    ENDCASE

    SET UnicodeWeight to CorrectUnicodeWeight(Result, IsKoreanLocale)
    APPEND UnicodeWeight to UnicodeWeights
    APPEND Result.DiacriticWeight to DiacriticWeights as a BYTE
    APPEND Result.CaseWeight to CaseWeights as a BYTE

    COMMENT Advance the source index
    SET SourceIndex to SourceIndex + ContractionType
    RETURN true
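
The seven CASE branches in the procedure follow one pattern: for an N-character contraction, fields 1 through N of the row hold the characters, and fields N+1 through N+4 hold ScriptMember, PrimaryWeight, DiacriticWeight, and CaseWeight. The following Python sketch shows that pattern for a single ContractionType. It is an illustration under stated assumptions, not an implementation of the procedure: the names `CharacterWeight`, `lookup_contraction`, and `SAMPLE_TWO` are hypothetical, the table is an in-memory list rather than a section of unisort.txt, the sample weights are invented, and the CorrectUnicodeWeight step is omitted.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class CharacterWeight(NamedTuple):
    script_member: int
    primary_weight: int
    diacritic_weight: int
    case_weight: int


# Hypothetical 2-character table: two character fields, then four weights.
# The weight values are invented for illustration only.
SAMPLE_TWO: Sequence[Tuple] = [
    ("c", "s", 14, 33, 2, 2),
]


def lookup_contraction(rows: Sequence[Tuple], text: str, index: int,
                       n: int) -> Optional[CharacterWeight]:
    """Find the row whose first n fields match text[index:index + n]."""
    # Bounds test copied from the pseudocode as written (>=).
    if index + n >= len(text):
        return None
    key = tuple(text[index:index + n])
    for row in rows:
        if tuple(row[:n]) == key:
            # Fields N+1..N+4 (1-based) are the four weights.
            return CharacterWeight(*row[n:n + 4])
    return None
```

Because the sketch copies the bounds test exactly as the pseudocode states it, a sequence that ends at the last character of the string is rejected before the table is consulted.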