A storage object is analogous to a file system directory. Just as a directory can contain other directories and files, a storage object can contain other storage objects and stream objects. Also like a directory, a storage object tracks the locations and sizes of the child storage objects and stream objects nested beneath it.
A stream object is analogous to the traditional notion of a file. Like a file, a stream contains user-defined data stored as a consecutive sequence of bytes.
The hierarchy is defined by a parent object/child object relationship. Stream objects cannot contain child objects. Storage objects can contain stream objects and/or other storage objects, each of which has a name that uniquely identifies it among the children of its parent storage object.
The root storage object has no parent object. The root storage object also has no name; because names are used to identify child objects, a name for the root storage object is unnecessary and the file format does not provide a representation for it.
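The parent/child rules above can be illustrated with a small in-memory model. This is only a sketch: the class names `Storage` and `Stream`, the `add` and `walk` helpers, and the exact-match name comparison are assumptions made for illustration and are not part of the file format.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Stream:
    """Leaf object: holds user-defined bytes and cannot have children."""
    data: bytes = b""


@dataclass
class Storage:
    """Container object: maps each child's name to a Storage or a Stream."""
    children: Dict[str, Union["Storage", Stream]] = field(default_factory=dict)

    def add(self, name, child):
        # A name must be unique among the children of one parent storage.
        if name in self.children:
            raise ValueError(f"duplicate child name: {name!r}")
        self.children[name] = child
        return child


def walk(storage, prefix=""):
    """Yield a (path, kind) pair for every object below the given storage."""
    for name, child in storage.children.items():
        path = f"{prefix}/{name}"
        if isinstance(child, Storage):
            yield path, "storage"
            yield from walk(child, path)
        else:
            yield path, "stream"


# The root storage has no name and no parent, so it is just a bare Storage.
root = Storage()
docs = root.add("Docs", Storage())
docs.add("Readme", Stream(b"hello"))
root.add("Summary", Stream(b"\x00\x01"))

for path, kind in walk(root):
    print(kind, path)
```

Because `Stream` has no `children` field, the model cannot express a stream that contains child objects, which mirrors the rule stated above.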
Figure 3: Example of a structured storage compound file
A compound file consists of the root storage object with optional child storage objects and stream objects in a nested hierarchy. Stream objects can contain user-defined data stored as an array of bytes. Storage objects can contain an object class GUID called a CLSID, which can identify an application that can read/write stream objects under that storage object.
The benefits of compound files include the following:
Because the compound file implementation provides a file system-like abstraction within a file, independent of the details of the underlying file system, compound files can be accessed by different applications on different platform operating systems. The compound file can be a generic container file format that holds data for multiple applications.
Because the separate objects in a compound file are saved in a standard format, any browser utility reading the standard format can list the storage objects and stream objects in the compound file, even though data within a given object can be in a proprietary format.
There exist standardized data structures for writing certain types of stream objects—for example, summary information property-sets (for more information see [MS-OLEPS])—that applications can read using parsers for these data structures, even when the rest of the stream objects cannot be understood.
The compound file implementation constructs a level of indirection by supporting a file system within a file. A single flat file requires a large contiguous sequence of bytes on the disk. By contrast, compound files define how to treat a single file as a structured collection of storage objects and stream objects that act as file system directories and files, respectively.
Figure 4: Example of a compound file showing equal-length sector divisions
A compound file is divided into equal-length sectors. The first sector contains the compound file header. Subsequent sectors are identified by a 32-bit non-negative integer number, called the sector number.
A group of sectors can form a sector chain, which is a linked list of sectors forming a logical byte array, even though the sectors can be in non-consecutive locations in the compound file. For example, the following figure shows two sector chains. One sector chain starts at sector #0, continues to sector #2, and ends at sector #4. The other sector chain starts at sector #1 and ends at sector #3.
Figure 5: Example of a compound file sector chain
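The two chains in this example can be followed with a few lines of code. The sketch below is illustrative only: it keeps the "next sector" links in a plain dictionary and uses `None` as a hypothetical end-of-chain marker, and the two-byte sector contents are made up so that the resulting logical byte array is easy to read.

```python
END = None  # Hypothetical end-of-chain marker used only in this sketch.

# next_sector[n] gives the sector that follows sector n in its chain.
next_sector = {0: 2, 2: 4, 4: END, 1: 3, 3: END}


def chain(start, next_sector):
    """Return the sector numbers of a chain in logical order."""
    sectors = []
    seen = set()
    current = start
    while current is not END:
        if current in seen:
            # A well-formed chain never revisits a sector.
            raise ValueError("cycle in sector chain")
        seen.add(current)
        sectors.append(current)
        current = next_sector[current]
    return sectors


def read_chain(start, next_sector, sectors_data):
    """Concatenate sector contents into the chain's logical byte array."""
    return b"".join(sectors_data[n] for n in chain(start, next_sector))


# Made-up sector contents, two bytes per sector.
sectors_data = {0: b"AA", 1: b"bb", 2: b"CC", 3: b"dd", 4: b"EE"}

print(chain(0, next_sector))                     # [0, 2, 4]
print(chain(1, next_sector))                     # [1, 3]
print(read_chain(0, next_sector, sectors_data))  # b'AACCEE'
```

The cycle check is a defensive choice for reading untrusted files; a well-formed chain always reaches its end marker.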
A sector can be unallocated or free, in which case it is not part of a sector chain. A sector number is used for several purposes:
A sector number is used to identify the file offset of that sector in a compound file.
In a sector chain, it is used to identify the next sector in the chain.
Special sector numbers are used to represent chain termination and free sectors.
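These uses can be sketched as follows. The details are not stated in this section and are taken here as assumptions based on how the format is commonly documented: the header occupies the first sector, so sector #0 begins one sector size into the file; 0xFFFFFFFE marks the end of a chain; 0xFFFFFFFF marks a free sector; and 0xFFFFFFFA is the largest regular sector number. The function names and the 512-byte default sector size are illustrative.

```python
ENDOFCHAIN = 0xFFFFFFFE  # Assumed end-of-chain marker.
FREESECT = 0xFFFFFFFF    # Assumed free-sector marker.
MAXREGSECT = 0xFFFFFFFA  # Assumed largest regular sector number.


def sector_offset(sector_number, sector_size=512):
    """Return the byte offset of a regular sector within the file.

    The header fills the first sector-sized region of the file, so
    sector #0 starts at offset sector_size rather than at offset 0.
    """
    if not 0 <= sector_number <= MAXREGSECT:
        raise ValueError(f"not a regular sector number: {sector_number:#x}")
    return (sector_number + 1) * sector_size


def classify(value):
    """Describe what a 32-bit sector number field represents."""
    if value == ENDOFCHAIN:
        return "end of chain"
    if value == FREESECT:
        return "free sector"
    if 0 <= value <= MAXREGSECT:
        return "regular sector"
    return "other reserved value"


print(sector_offset(0))      # 512
print(sector_offset(2))      # 1536
print(classify(4))           # regular sector
print(classify(0xFFFFFFFE))  # end of chain
print(classify(0xFFFFFFFF))  # free sector
```

When a chain is followed, each "next sector" value would be passed through a check like `classify` before being converted to a file offset, so that a marker value is never treated as a location.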