Export (0) Print
Expand All

2.6.4 Red-Black Tree

Each set of sibling objects in one level of the containment hierarchy (all child objects under a storage object) is represented as a red-black tree. The parent object of this set of siblings will have a pointer to the top of this tree.

A red-black tree is a special type of binary search tree where each node has a color attribute of red or black. It allows efficient searching in the list of child objects under a storage object. The constraints on a red-black allow the binary tree to be roughly balanced, so that insertion, deletion, and searching operations are efficient.

The red-black tree MUST maintain the following constraints in order for it to be valid.

  1. The root storage object MUST always be black. Because the root directory does not have siblings, its color is irrelevant and can therefore be either red or black.

  2. Two consecutive nodes MUST NOT both be red.

  3. The left sibling MUST always be less than the right sibling. This sorting relationship is defined as follows.

    • A node with a shorter name is less than a node with a longer name (compare the length of the names from the Directory Entry Name Length field).

    • For nodes with the same name length from Directory Entry Name Length, iterate through each UTF-16 code point, one at a time, from the beginning of the Unicode string.

    • For each UTF-16 code point, convert to upper-case with the Unicode Default Case Conversion Algorithm, simple case conversion variant (simple case foldings), with the following notes.<2> Compare each upper-cased UTF-16 code point binary value.

    • Unicode surrogate characters are never upper-cased, because they are represented by two UTF-16 code points, while the sorting relationship upper-cases a single UTF-16 code point at a time.

    • Lowercase characters defined in a newer, later version of the Unicode standard can be upper-cased by an implementation that conforms to that later Unicode standard.

The simplest implementation of the above invariants would be to mark every node as black, in which case the tree is simply a binary tree. However, keeping the red-black tree balanced will typically result in better read performance.

All sibling objects within a storage object (all immediate child objects in one level of hierarchy) MUST have unique names in the Directory Entry Name field, where uniqueness is determined by the sorting relationship above.

 
Show:
© 2014 Microsoft