Cube data and aggregations can be stored with different techniques and in a variety of modes.
Cubes can require substantial storage to contain the data and aggregations in multidimensional structures. One factor that affects storage requirements is sparsity. For example, if one dimension contains sales representatives and another dimension contains regions, cells at the intersection of the Northeast sales representative and the Southwest region will probably be empty.
Microsoft® SQL Server™ 2000 Analysis Services uses several techniques for minimizing cube storage requirements:
- Storage is not allocated for empty cells, which compensates for performance issues that can be caused by sparsity.
- Data compression is employed.
- A sophisticated algorithm designs efficient summary aggregations to minimize storage without sacrificing speed.
Storage options enable you to create an OLAP storage strategy tailored to your needs by selecting appropriate storage modes and locations for cube data.
Cubes can be divided into partitions, and each partition can be stored using a different storage mode. For example, you can create a cube containing data for several years of transactions and partition it at year boundaries. You can store the partition for the current year in a multidimensional OLAP (MOLAP) structure with a high percentage of aggregations for quick response to users. You can use hybrid OLAP (HOLAP) to store the partition for the previous year, providing good response to summary queries with reduced storage needs. You can store data for years prior to the previous year in one or more relational OLAP (ROLAP) partitions with a smaller percentage of aggregations, saving on storage space with a tradeoff in query response.
Caution Partitioning cubes and merging partitions are advanced techniques. It is possible to create partitioned cubes that contain incorrect data. For more information about specific precautions, see Fact Table Considerations When Merging Partitions and Merging Partitions That Have Data Slices.
The partitions of a cube are not visible to the end user. In the preceding example, any query valid for the entire cube will execute, but queries that return older data will take more time than those that request newer information.
Analysis Services provides the Partition Wizard to assist in creating partitions. However, it is important that partitions be defined to contain mutually exclusive data. A cube may return incorrect results for some queries if a portion of the cube's data is included in more than one of its partitions.
Partitions can be stored on different Analysis servers, providing a clustered approach to cube storage and distributing workload across Analysis servers. For more information, see Distributed Partitioned Cubes and Remote Partitions.
Two partitions of a cube can be merged into a single partition, which can then be merged into another partition, and so on until only a single partition remains. For example, four partitions, each containing data for a quarter, can be merged into a single partition that contains the data for the entire year. There are precautions that need to be considered when merging partitions to ensure the resulting partition contains correct data.
The storage requirements of multiple copies of a cube on different Analysis servers can be greatly reduced by replacing the copies with linked cubes. A linked cube is based on a cube on another Analysis server, referred to as the source cube. A linked cube uses the aggregations of its source cube and has no data storage requirements of its own. Therefore, by maintaining a single source cube on one Analysis server and creating linked cubes on the other Analysis servers that require the cube, you can save a large amount of storage resources. For more information, see Linked Cubes.
Real-time cubes offer the opportunity of using relational OLAP (ROLAP) dimensions and partitions enabled for real-time OLAP. A ROLAP partition can be used to handle rapidly changing fact table data, eliminating the need for frequent reprocessing of a cube and the resulting inconvenience and unavailability. In this case, a ROLAP partition enabled for real-time OLAP requires no additional storage space, because all aggregations are performed as needed. Unlike ROLAP dimensions, ROLAP partitions must meet special requirements if they are to be used with real-time OLAP. ROLAP partitions must either have no stored aggregations or be based on an indexed view. For more information about these requirements, see Real-Time Cubes.