2.4 Huffman Trees

LZXD compression uses canonical Huffman tree structures to represent elements. Huffman trees, as specified in [Cormen], are well known in data compression and are not described here. Because an LZXD decoder reconstructs each tree from the path lengths alone, the following constraints are imposed on the tree structure so that the reconstructed tree is identical to the encoder's.

For any two elements with the same path length, the lower-numbered element MUST be farther left on the tree than the higher-numbered element. An alternative way of stating this constraint is that lower-numbered elements MUST have lower path traversal values; for example, 0010 (left-left-right-left) is lower than 0011 (left-left-right-right).

For each level, starting at the deepest level of the tree and then moving upward, leaf nodes MUST start as far left as possible. An alternative way of stating this constraint is that if any tree node has children, all tree nodes to the right of it with the same path length MUST also have children.
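
Together, the two ordering constraints above determine every code once the path lengths are known. The following Python sketch illustrates one way a decoder might rebuild the codes, using the counting approach common to canonical Huffman formats; it is not taken from this specification. The function name assign_canonical_codes and the convention that a path length of 0 marks an unused element are assumptions made for illustration.

```python
def assign_canonical_codes(path_lengths):
    """Return {element: code string} for the given per-element path lengths.

    Assumption: a path length of 0 means the element is not in the tree.
    """
    max_len = max(path_lengths, default=0)

    # Count how many elements use each path length (length 0 is ignored).
    count = [0] * (max_len + 1)
    for length in path_lengths:
        if length > 0:
            count[length] += 1

    # First code value at each length: shorter codes sit to the left,
    # so each level starts just past the leaves of the level above it.
    next_code = [0] * (max_len + 2)
    code = 0
    for length in range(1, max_len + 1):
        code = (code + count[length - 1]) << 1
        next_code[length] = code

    # Within one length, lower-numbered elements get lower code values.
    codes = {}
    for element, length in enumerate(path_lengths):
        if length > 0:
            codes[element] = format(next_code[length], "0{}b".format(length))
            next_code[length] += 1
    return codes


# Elements 0..3 with path lengths 2, 1, 3, 3:
print(assign_canonical_codes([2, 1, 3, 3]))
# {0: '10', 1: '0', 2: '110', 3: '111'}
```

In this example, elements 2 and 3 share a path length of 3, and the lower-numbered element 2 receives the lower traversal value (110 versus 111), as the first constraint requires.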

A non-empty Huffman tree MUST contain at least two elements. In the case where all but one of the tree elements have zero frequency, the resulting tree MUST minimally consist of two Huffman codes, "0" and "1".
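
As a rough illustration of the minimum-size rule, the Python sketch below checks a list of path lengths against it. The function name meets_minimum_size is hypothetical, and the sketch assumes that a path length of 0 marks an element that is absent from the tree.

```python
def meets_minimum_size(path_lengths):
    """Return True if the tree is empty or contains at least two elements.

    Assumption: a path length of 0 means the element is not in the tree.
    """
    used = sum(1 for length in path_lengths if length > 0)
    return used == 0 or used >= 2


print(meets_minimum_size([0, 0, 0]))  # True: empty tree
print(meets_minimum_size([1, 0, 0]))  # False: a lone element is not allowed
print(meets_minimum_size([1, 0, 1]))  # True: the two codes "0" and "1"
```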

LZXD compression uses several Huffman tree structures. The main tree comprises 256 elements that correspond to all possible 8-bit characters, plus 8 * NUM_POSITION_SLOTS elements that correspond to matches. NUM_POSITION_SLOTS is the number of position slots required; its value depends on the specified window size, as described in section 2.1.6. The length tree comprises 249 elements. Other trees, such as the aligned offset tree (comprising 8 elements) and the pretrees (comprising 20 elements each), have a smaller role.
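
The tree sizes stated above can be collected into a short Python sketch. The constant and function names are assumptions chosen for illustration, and the position-slot count is left as a parameter because its value comes from the window-size table in section 2.1.6; the value 30 used below is only an example input.

```python
# Element counts stated in this section (names are illustrative).
NUM_CHARS = 256                 # all possible 8-bit characters
LENGTH_TREE_SIZE = 249
ALIGNED_OFFSET_TREE_SIZE = 8
PRETREE_SIZE = 20


def main_tree_size(num_position_slots):
    """Return the main tree element count: 256 + 8 * NUM_POSITION_SLOTS."""
    return NUM_CHARS + 8 * num_position_slots


# Example input only; real values are defined in section 2.1.6.
print(main_tree_size(30))  # 496
```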