
2 Structures

This document references commonly used data types as defined in [MS-DTYP].

Unless otherwise qualified, instances of GUID in this section refer to [MS-DTYP] section 2.3.4.


Figure 6: Sectors of a compound file with FAT array at sector #0

The main structure used to manage sector allocation and sector chains is the file allocation table (FAT). The FAT contains an array of 32-bit sector numbers. Each index into the array is a sector number, and the value stored at that index is either the number of the next sector in the chain or a special value.

  • FAT[0] contains sector #0's next sector in chain

  • FAT[1] contains sector #1's next sector in chain

  • ...

  • FAT[N] contains sector #N's next sector in chain

This allows a compound file to contain many sector chains in a single file. Many compound file structures, including user-defined data, are implemented as sector chains represented in the FAT.
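
The chain lookup described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `follow_chain` is hypothetical, and the end-of-chain marker 0xFFFFFFFE is one of the special values defined elsewhere in the specification rather than in this overview.

```python
ENDOFCHAIN = 0xFFFFFFFE  # assumed special value marking the last sector of a chain


def follow_chain(fat, start):
    """Return the sector numbers of the chain that begins at `start`."""
    chain = []
    seen = set()
    sector = start
    while sector != ENDOFCHAIN:
        # Guard against out-of-range entries and loops in a damaged file.
        if sector >= len(fat) or sector in seen:
            raise ValueError(f"corrupt chain at sector {sector}")
        seen.add(sector)
        chain.append(sector)
        sector = fat[sector]  # FAT[n] holds sector #n's next sector in chain
    return chain


# Two interleaved chains in one FAT: 0 -> 2 -> 3 and 1 -> 4.
fat = [2, 4, 3, ENDOFCHAIN, ENDOFCHAIN]
chain_a = follow_chain(fat, 0)
chain_b = follow_chain(fat, 1)
```

Because each entry only names its successor, several independent chains can share one array, which is what lets a single file hold many streams.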

Even the FAT array itself is represented as a sector chain. Sector chains hold both internal and user-defined data streams. Because the FAT array is stored in a sector chain, a second array, the DIFAT, is used to find the FAT sector locations. Each DIFAT array entry contains a 32-bit sector number.

  • DIFAT[0] contains FAT sector #0's location

  • DIFAT[1] contains FAT sector #1's location

  • ...

  • DIFAT[N] contains FAT sector #N's location
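
As a rough sketch of how the DIFAT is used, the following assembles the FAT array from the sectors that the DIFAT lists. Several details are assumptions drawn from the wider specification rather than from this overview: that the header occupies the first sector-sized block of the file (so sector #n starts at byte offset (n + 1) * sector size), and that unused DIFAT entries hold the free-sector value 0xFFFFFFFF. The name `load_fat` is hypothetical.

```python
import struct

FREESECT = 0xFFFFFFFF  # assumed special value for an unused DIFAT entry


def load_fat(data, difat, sector_size=512):
    """Concatenate the FAT sectors listed in `difat` into one list of entries."""
    fat = []
    for sector in difat:
        if sector == FREESECT:
            continue  # unused DIFAT slot
        # Assumption: the header fills the first sector-sized block.
        offset = (sector + 1) * sector_size
        raw = data[offset:offset + sector_size]
        # Each FAT entry is a 32-bit little-endian sector number.
        fat.extend(struct.unpack(f"<{sector_size // 4}I", raw))
    return fat


# Synthetic file: a zeroed 512-byte header followed by one FAT sector.
entries = [0xFFFFFFFD, 2, 0xFFFFFFFE] + [FREESECT] * 125
data = bytes(512) + struct.pack("<128I", *entries)
fat = load_fat(data, [0, FREESECT])
```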

Because space for streams is always allocated in sector-sized blocks, there can be considerable waste when storing objects much smaller than the normal sector size (either 512 or 4096 bytes). As a solution to this problem, the concept of the mini FAT is introduced.


Figure 7: Mini sectors of a mini stream

The mini FAT is structurally equivalent to the FAT, but is used in a different way. The sector size for objects represented in the mini FAT is 64 bytes, instead of the 512 bytes or 4096 bytes used for normal sectors. The space for these objects comes from a special stream called the mini stream. The mini stream is an internal stream object divided into equal-length mini sectors. Each mini FAT array entry contains a 32-bit sector number that indexes into the mini stream, not the file.

  • MiniFAT[0] contains mini stream sector #0's next sector in chain

  • MiniFAT[1] contains mini stream sector #1's next sector in chain

  • ...

  • MiniFAT[N] contains mini stream sector #N's next sector in chain
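
The following sketch shows how a small object might be read through the mini FAT. It assumes the mini stream has already been read into memory as one contiguous byte string, so that mini sector #n starts at byte offset n * 64 within it. The function name and the end-of-chain value 0xFFFFFFFE are assumptions not stated in this overview.

```python
MINI_SECTOR_SIZE = 64
ENDOFCHAIN = 0xFFFFFFFE  # assumed special value marking the end of a chain


def read_mini_object(mini_stream, minifat, start, size):
    """Collect `size` bytes by walking a mini FAT chain from `start`."""
    out = bytearray()
    sector = start
    steps = 0
    while sector != ENDOFCHAIN:
        # A valid chain can never visit more sectors than the table has entries.
        if sector >= len(minifat) or steps >= len(minifat):
            raise ValueError(f"corrupt mini chain at sector {sector}")
        offset = sector * MINI_SECTOR_SIZE  # offset in the mini stream, not the file
        out += mini_stream[offset:offset + MINI_SECTOR_SIZE]
        sector = minifat[sector]
        steps += 1
    return bytes(out[:size])  # drop the padding in the last mini sector


# Three mini sectors; the object occupies mini sectors 0 and 2.
mini_stream = b"A" * 64 + b"B" * 64 + b"C" * 64
minifat = [2, ENDOFCHAIN, ENDOFCHAIN]
payload = read_mini_object(mini_stream, minifat, 0, 70)
```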

Stream objects with a user-defined data length less than a cutoff (4096 bytes) are allocated with the mini FAT from the mini stream. Larger stream objects are allocated with the FAT from unallocated free sectors in the file.
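
To make the cutoff rule and the space saving concrete, here is a small sketch that picks the allocator for a given stream length and reports the unused bytes in the last sector. The function name `allocation_for` and its return shape are illustrative only.

```python
MINI_STREAM_CUTOFF = 4096  # streams shorter than this use the mini FAT
MINI_SECTOR_SIZE = 64


def allocation_for(size, sector_size=512):
    """Return (table, sector_count, wasted_bytes) for a stream of `size` bytes."""
    if size < MINI_STREAM_CUTOFF:
        table, unit = "mini FAT", MINI_SECTOR_SIZE
    else:
        table, unit = "FAT", sector_size
    count = -(-size // unit)  # ceiling division
    return table, count, count * unit - size


small = allocation_for(100)         # two 64-byte mini sectors
exact = allocation_for(4096)        # at the cutoff, so regular sectors
large = allocation_for(5000, 4096)  # 4096-byte sectors
```

A 100-byte stream would leave 412 bytes unused in a single 512-byte sector; in mini sectors it leaves only 28.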

The names of all storage objects and stream objects, along with other object metadata like stream size and storage CLSIDs, are found in the directory entry array. The space for the directory entry array is allocated with the FAT like other sector chains.

  • DirectoryEntry[0] contains information about the root storage object.

  • DirectoryEntry[1] contains information about a storage object, stream object, or unallocated object.

  • ...

  • DirectoryEntry[N] contains information about a storage object, stream object, or unallocated object.
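
For illustration, a directory entry could be decoded roughly as follows. The layout used here is an assumption taken from the detailed directory entry definition elsewhere in the specification, not from this overview: 128-byte entries, a 64-byte UTF-16 name field, a 2-byte name length (in bytes, including the terminator) at offset 64, a 1-byte object type at offset 66, a 4-byte starting sector at offset 116, and an 8-byte stream size at offset 120. The class and function names are hypothetical, and the remaining fields are skipped.

```python
import struct
from dataclasses import dataclass

ENTRY_SIZE = 128  # assumed size of one directory entry


@dataclass
class DirEntry:
    name: str
    object_type: int   # assumed: 0 unallocated, 1 storage, 2 stream, 5 root storage
    start_sector: int
    size: int


def parse_entry(raw):
    """Decode the fields of interest from one 128-byte directory entry."""
    name_len, object_type = struct.unpack_from("<HB", raw, 64)
    start_sector, size = struct.unpack_from("<IQ", raw, 116)
    # The name length counts the two-byte terminating null; strip it.
    name = raw[:max(name_len - 2, 0)].decode("utf-16-le")
    return DirEntry(name, object_type, start_sector, size)


# Build a synthetic root entry and parse it back.
name_bytes = "Root Entry".encode("utf-16-le") + b"\x00\x00"
raw = (name_bytes.ljust(64, b"\x00")
       + struct.pack("<HB", len(name_bytes), 5)
       + bytes(49)                      # fields not decoded in this sketch
       + struct.pack("<IQ", 3, 128))
entry = parse_entry(raw)
```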


Figure 8: Entries of a directory entry array


Figure 9: Summary of compound file internal streams and connections to user-defined data streams

Figure 9 summarizes the main internal streams of a compound file and how they are linked to user-defined data streams. The DIFAT, FAT, mini FAT, directory entry arrays, and mini stream are internal streams, while the user-defined data streams link directly to their stream objects.

In a compound file, all integer fields, including Unicode characters encoded in UTF-16, MUST be stored in little-endian byte order. The only exception is in user-defined data streams, where the compound file structure does not impose any restrictions.
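
A short sketch of what the byte-order rule means in practice, using Python's standard `struct` module and the `utf-16-le` codec. The sample values are made up for illustration.

```python
import struct

# A 32-bit integer field stored as FE FF FF FF reads back as 0xFFFFFFFE.
raw_int = bytes([0xFE, 0xFF, 0xFF, 0xFF])
(value,) = struct.unpack("<I", raw_int)  # "<" selects little-endian

# UTF-16 code units are little-endian too: "S" (U+0053) is stored as 53 00.
raw_name = "Stream1".encode("utf-16-le")
first_unit = raw_name[:2]
decoded = raw_name.decode("utf-16-le")
```

Reading the same four bytes as big-endian would yield 0xFEFFFFFF instead, so a parser has to apply the little-endian format consistently to every integer field.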

© 2014 Microsoft