
New Partition-Aware Seek Operation
In SQL Server 2008, the internal representation of a partitioned table is changed so that the table appears to the query processor to be a multicolumn index with PartitionID as the leading column. PartitionID is a hidden computed column used internally to represent the ID of the partition containing a specific row. For example, assume the table T, defined as T(a, b, c), is partitioned on column a, and has a clustered index on column b. In SQL Server 2008, this partitioned table is treated internally as a nonpartitioned table with the schema T(PartitionID, a, b, c) and a clustered index on the composite key (PartitionID, b). This allows the query optimizer to perform seek operations based on PartitionID on any partitioned table or index.
Partition elimination is now done in this seek operation.
In addition, the query optimizer is extended so that a seek or scan operation with one condition can be done on PartitionID (as the logical leading column) and possibly other index key columns, and then a second-level seek, with a different condition, can be done on one or more additional columns, for each distinct value that meets the qualification for the first-level seek operation. That is, this operation, called a skip scan, allows the query optimizer to perform a seek or scan operation based on one condition to determine the partitions to be accessed and a second-level index seek operation within that operator to return rows from these partitions that meet a different condition. For example, consider the following query.
SELECT * FROM T WHERE a < 10 and b = 2;
For this example, assume that table T, defined as T(a, b, c), is partitioned on column a, and has a clustered index on column b. The partition boundaries for table T are defined by the following partition function:
CREATE PARTITION FUNCTION myRangePF1 (int) AS RANGE LEFT FOR VALUES (3, 7, 10);
To solve the query, the query processor performs a first-level seek operation to find every partition that contains rows that meet the condition T.a < 10. This identifies the partitions to be accessed. Within each partition identified, the processor then performs a second-level seek into the clustered index on column b to find the rows that meet the condition T.b = 2 and T.a < 10.
The following illustration is a logical representation of the skip scan operation. It shows table T with data in columns a and b. The partitions are numbered 1 through 4 with the partition boundaries shown by dashed vertical lines. A first-level seek operation to the partitions (not shown in the illustration) has determined that partitions 1, 2, and 3 meet the seek condition implied by the partitioning defined for the table and the predicate on column a. That is, T.a < 10. The path traversed by the second-level seek portion of the skip scan operation is illustrated by the curved line. Essentially, the skip scan operation seeks into each of these partitions for rows that meet the condition b = 2. The total cost of the skip scan operation is the same as that of three separate index seeks.