2.15 Index Lexicon File
The index lexicon file is a Unicode-encoded text file that lists the most frequent tokens appearing in the content index file of a master full-text index component of the current full-text index catalog. The query server uses it to determine alternative spelling variants for the tokens encountered in received queries.
In a binary representation, the format of the file is as follows.
| Bits 0-15 | Bits 16-31 |
|---|---|
| Unicode marker | ListOfTokens (variable) |
| ... | |
Unicode marker (2 bytes): A 2-byte field specific to text files that use the Unicode encoding. The values of the bytes MUST be 0xFF followed by 0xFE.
ListOfTokens (variable): An array of Unicode characters representing the list of the most frequent tokens in the catalog. The tokens are separated by newline characters, and each token is composed of 1 to 64 non-space characters.
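The layout above can be illustrated with a minimal reader sketch. It is not part of the specification and rests on several assumptions: that the 0xFF 0xFE marker indicates UTF-16 little-endian text, that either LF or CR LF may serve as the newline separator, that empty lines (such as one produced by a trailing newline) can be skipped, and that the 64-character limit is counted in decoded characters. The function name `parse_index_lexicon` is hypothetical.

```python
from typing import List

UNICODE_MARKER = b"\xff\xfe"  # required first two bytes of the file
MAX_TOKEN_LENGTH = 64         # each token has 1 to 64 non-space characters


def parse_index_lexicon(data: bytes) -> List[str]:
    """Return the tokens stored in an index lexicon file image.

    Hypothetical helper: decoding as UTF-16 little-endian is an
    assumption based on the 0xFF 0xFE marker.
    """
    if data[:2] != UNICODE_MARKER:
        raise ValueError("missing Unicode marker 0xFF 0xFE")

    # Everything after the marker is the ListOfTokens field.
    text = data[2:].decode("utf-16-le")

    tokens = []
    for line in text.splitlines():
        if not line:
            # Assumption: tolerate empty lines, e.g. after a trailing newline.
            continue
        if len(line) > MAX_TOKEN_LENGTH:
            raise ValueError("token longer than 64 characters: %r" % line)
        if any(ch.isspace() for ch in line):
            raise ValueError("token contains a space character: %r" % line)
        tokens.append(line)
    return tokens
```

For example, passing the marker bytes followed by the UTF-16 little-endian encoding of two newline-separated tokens yields a list of those two tokens, while input that lacks the marker raises `ValueError`.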