SQL Server Technical Article
Writers:
Eric N. Hanson, Kevin Farlee, Stefano Stefani, Shu Scott, Gopal Ashok,
Torsten Grabs, Sara Tahir, Joachim Hammer, Sunil Agarwal, T.K. Anand, Richard
Tkachuk, Catherine Chang, and Edward Melomed, Microsoft Corp.
Technical Reviewer: Eric N. Hanson, Microsoft Corp.
Published: December 2007
Applies To: SQL Server 2008
Summary: With the 2008 release, SQL Server makes a
major advance in scalability for data warehousing. It meets the data warehouse
needs of the largest enterprises more easily than ever. SQL Server 2008
provides a range of integrated products that enable you to build your data
warehouse, and query and analyze its data. These include the SQL Server
relational database system, Analysis Services, Integration Services, and
Reporting Services. This paper introduces the new performance and manageability
features for data warehousing across all these components. All these features
contribute to improved scalability.
Introduction
Microsoft®
SQL Server™
2008 provides a comprehensive data warehouse platform. It enables you to build
and manage your data warehouse, and deliver insight to your users, with a
single, integrated product suite. It scales to meet the needs of the largest
enterprises, in a way that empowers both your end users and your IT staff.
The number one focus of development in the SQL Server 2008
release was to improve scalability across the entire product suite to
comfortably meet the needs of large enterprises. Here, we’ll introduce the
features and enhancements we’ve added to improve your data warehouse
experience. Build. Manage. Deliver. SQL Server 2008 lets you do it all,
with ease.
Map of New Data Warehousing
Features
The following table shows the new scalability features in SQL
Server 2008, and where they help with the activities that surround your
data warehouse (DW).
|
|
Build
|
Manage
|
Deliver Insight
|
|
SQL Server Relational DBMS
|
MERGE statement
Change data capture
(CDC)
Minimally logged INSERT
|
Backup compression
|
Star join performance
Faster parallel query on
partitioned tables
GROUPING SETS
|
|
Resource governor
|
|
Data compression
Partition-aligned
indexed views
|
|
Integration Services
|
Lookup performance
Pipeline performance
|
|
|
|
Analysis Services
|
|
Backup
|
MDX Query Performance:
Block Computation
Query and Writeback
Performance
|
|
Scalable Shared Database
|
|
Reporting Services
|
|
Reporting scalability
Server scalability
|
Table 1: SQL Server 2008 improvements
This white paper briefly describes the data warehousing
enhancements in each of the different components of SQL Server 2008, and
how they help you get the most from your data warehouse. For full details on
how to use these features, see SQL Server 2008 Books Online (BOL).
SQL Server Relational DBMS
DW Improvements
The SQL Server 2008 relational DBMS contains significant advances
over earlier releases, so that it performs better when you create, manage, and
query large data warehouses. This section elaborates on the relational DBMS
data warehouse improvements listed in Table 1.
Star Join
With dimensionally modeled data warehouses, a big part of your
workload typically consists of what are known as star join queries. These
queries follow a common pattern that joins the fact table with one or several
dimension tables. In addition, star join queries usually express filter
conditions against the non-key columns of the dimension tables and perform an
aggregation (typically SUM) on a column of the fact table (called a measure
column). With SQL Server 2008, you will experience significant performance
improvements for many star join queries that process a significant fraction of
fact table rows.
The new technology employed is based on bitmap filters, also
known as Bloom filters (see Bloom filter, Wikipedia 2007, http://en.wikipedia.org/wiki/Bloom_filter).
It allows SQL Server to eliminate non-qualifying fact table rows from further
processing early during query evaluation. This saves a considerable amount of
CPU time compared to query processing technologies used by competing products.
While your results may vary, we’ve typically seen entire relational data
warehouse query workloads experience performance improvements of 15-25% when
using the new star join query processing capability. Some individual queries
speed up by a factor of seven or more.
.gif)
Figure 1: Star join query plan
with join reduction processing for efficient DW
The new star join optimization uses a series of hash joins,
building a hash table for each dimension table that participates. As a
byproduct of building this hash table, additional information, called a bitmap
filter, is built. Bitmap filters are represented as boxes in Figure 1,
labeled “Join Reduction Info.” These filters are pushed down into the scan on
the fact table, and effectively eliminate almost all the rows that would be
eliminated later by the joins. This eliminates the need to spend CPU time later
copying the eliminated rows and probing the hash tables for them. The
illustration shows the effect of this filtering within the fact table scan. The
SQL Server 2008 query executor also re-orders the bitmaps during
execution, putting the most selective one first, then the next most selective
one, and so forth. This saves more CPU time, because once a fact table row
fails a check against a bitmap, the row is skipped.
The new star join optimization is available in Microsoft SQL
Server 2008 Enterprise Edition. The query processor in SQL Server
applies the optimization automatically to queries following the star join
pattern when this is attractive in terms of estimated query cost. You do not
need to make any changes to your application to benefit from this significant
performance improvement.
Improved Parallelism
Wouldn’t you like to get
the most power you can out of the hardware you own? The partitioned table
parallelism (PTP) and the few outer rows parallelism features in SQL
Server 2008 help you do that.
Partitioned
Table Parallelism
Data warehouse applications typically collect large amounts of
historical data in fact tables, which are often partitioned by date. In SQL
Server 2005, queries that touch more than one partition use one thread
(and thus one processor core) per partition. This sometimes limits the
performance of queries that involve partitioned tables, especially when running
on parallel shared memory multiprocessor (SMP) computers with many processor
cores. Partitioned table parallelism improves the performance of parallel query
plans against partitioned tables by better utilizing the processing power of
the existing hardware, regardless of how many partitions a query touches. The
feature works by default without the need for manual tuning or configuration.
The following figure illustrates the impact of partitioned table parallelism in
a typical data warehouse scenario.
.jpg)
Figure 2: Partitioned table parallelism.
Assume that we have a fact table representing sales data
organized by sales date across four partitions, each containing seven days of
data, as shown in the top portion of the figure. Query Q summarizes sales over seven days. The query can touch
different partitions depending on when it is executed. This is illustrated by
query Q1, which touches a single partition P2 and by Q2, which touches two
partitions since the relevant data at the time of execution spans P3 and P4.
Executing
Q1 and Q2 in SQL Server 2005 may generate some unexpected behavior.
Because there is special-case logic that can allocate all threads to a
single-partition query, Q1 results in a parallel plan involving P3 that is
processed by all available threads (execution not shown in the figure). In the
case of Q2, however, the executor assigns a single thread each to
partitions P3 and P4 (see Old Allocation in the figure) even if the
underlying hardware has additional threads available. Hence on an 8-way
computer, Q2 utilizes only 2/8 (25%) of the available CPU power and very
probably executes much slower than Q1.
Executing
Q1 and Q2 in SQL Server 2008 results in better utilization of the
available hardware, and thus in better performance as well as more predictable
behavior. In the case of Q1, the executor again assigns all available threads
to process data in P2 (not shown). Q2 results in a parallel plan in which the
executor assigns all available threads to both P3 and P4 round-robin style,
producing the effect illustrated in the figure under New Allocation. The CPU
remains fully utilized and the performance of Q1 and Q2 are comparable. Under
this new round-robin allocation of threads, the performance boost provided by partitioned
table parallelism
becomes more pronounced the more processor cores there are compared to the
number of partitions affected by a query. When all data accessed by a query is
in the main memory buffer pool, which is typical for the most recent partitions,
we’ve observed speedups of 16 times or more in internal tests for queries
that touch two partitions. Actual results depend on the query, data
organization, and hardware configuration.
For details on the thread allocation strategy as well as on the
manageability features of partitioned table parallelism, see SQL
Server 2008 Books Online.
Few Outer Rows Parallelism
SQL Server 2008 enables
parallelism for nested loop joins even when the outer side of the join has only
a few rows. In SQL Server 2005, if
multiple threads are available, each thread is allocated a page of rows from
the outer side of the join. If there are
only a few rows, they are likely to be on the same page. In such cases, only one thread is employed
and the potential benefits of parallelism are lost. SQL Server 2008 will recognize such cases and
introduce an exchange operator that allocates one row per thread so that all
available CPUs are employed. The
increased parallelism means that CPU consumption will increase temporarily as
compared to SQL Server 2005, but query execution will be faster. This new behavior is only seen if the number
of outer rows is small and if the cost of the query is estimated to be large
enough to benefit from the additional parallelism. If the query cost is estimated to be small,
or if the cardinality estimate for the outer side is > 1000, SQL Server will
allocate one page per thread as in SQL Server 2005. Additional detail about exchange operators
and parallel query processing is available in Books On Line at Database Engine
> Planning and Architecture > Query Processing Architecture > Parallel
Query Processing.
Partition-Aligned Indexed
Views
Partition-aligned indexed views enable you to create and manage
summary aggregates in your relational data warehouse more efficiently, and use
them in scenarios where you couldn’t effectively use them before, improving
query performance. In a typical scenario, you have a
fact table that is partitioned by date. Indexed views (summary aggregates) are
defined on this table to help speed up queries. When you switch in a new table
partition, the matching partitions of the partition-aligned indexed views
defined on the partitioned table switch too, and do so automatically.
This is a significant improvement over SQL Server 2005,
where you must drop any indexed views defined on a partitioned table before
using the ALTER TABLE SWITCH operation to switch a partition in or out of it.
The partition-aligned indexed views feature in SQL Server 2008 gives you
the benefits of indexed views on large partitioned tables, while avoiding the
cost of rebuilding aggregates on an entire partitioned table. These benefits
include automatic maintenance of aggregates, and indexed view matching
(automatic query rewrite to use the aggregates to solve queries that refer only
to the base tables, not the aggregates). For more details on indexed views, see
Improving
Performance with SQL Server 2005 Indexed Views on Microsoft TechNet.
The following figure shows how aggregates move with base table
partitions when switching in a partition.
.jpg)
Figure 3: Partition-aligned indexed views
GROUPING SETS
GROUPING SETS allow you to write one query that produces multiple
groupings and returns a single result set. The result set is equivalent to a
UNION ALL of differently grouped rows. By using GROUPING SETS, you can focus on
the different levels of information (groupings) your business needs, rather
than the mechanics of how to combine several query results. GROUPING SETS
enables you to write reports with multiple groupings easily, with improved
query performance.
In this simple but typical example, using the AdventureWorksDW
sample database, you may want to see the following aggregates for a specific
reporting period:
·
Total sales
amount by quarter and country
·
Total sales
amount by quarter for all countries
·
The grand total
To get this result without GROUPING SETS, you must either run
multiple queries or if one result set is desired, use UNION ALL to combine
these queries. With GROUPING SETS, your query can be expressed like this:
SELECT D.CalendarYear, D.CalendarQuarter,
T.SalesTerritoryCountry
, SUM(F.SalesAmount)
AS SalesAmount
FROM dbo.FactResellerSales
F
INNER JOIN dbo.DimTime D ON
F.OrderDateKey = D.TimeKey
INNER JOIN
dbo.DimSalesTerritory T ON
F.SalesTerritoryKey
= T.SalesTerritoryKey
WHERE D.CalendarYear IN (2003,2004)
GROUP BY GROUPING SETS (
(CalendarYear,
CalendarQuarter, SalesTerritoryCountry)
, (CalendarYear,
CalendarQuarter)
, () )
ORDER BY D.CalendarYear, D.CalendarQuarter,
T.SalesTerritoryCountry
Typically, you display the result of this query as a type of
pivot table, like this:
|
|
Country
|
Period Totals
|
|
Canada
|
Mexico
|
USA
|
|
Year, Quarter
|
2007
|
Q1
|
1000
|
1000
|
5000
|
7000
|
|
2007
|
Q2
|
1000
|
1000
|
6000
|
8000
|
|
2007
|
Q3
|
1200
|
1300
|
7000
|
9500
|
|
2007
|
Q4
|
1300
|
1400
|
8000
|
10700
|
|
Grand Total:
|
35200
|
Table 2: Output from a GROUPING SETS query, formatted as a pivot
table
As the number of possible groupings increases, the simplicity and
performance benefits provided by GROUPING SETS become even greater.
MERGE
The MERGE statement allows you to perform multiple Database
Manipulation Language (DML) operations (INSERT, UPDATE, and DELETE) on a table
or view in a single Transact-SQL statement. The target table or view is joined
with a data source and the DML operations are performed on the results of the
join. The MERGE statement has three WHEN clauses, each of which allows you to
perform a specific DML action on a given row in the result set:
· For
every row that exists in both the target and the source, the WHEN MATCHED
clause allows you to UPDATE or DELETE the given row in the target table.
· For
every row that exists in the source but not in the target, the WHEN NOT MATCHED
[BY TARGET] clause allows you to INSERT a row into the target.
· For
every row that exists in the target but not in the source, the WHEN NOT MATCHED
BY SOURCE clause allows you to UPDATE or DELETE the given row in the target
table.
You can also specify a search condition with each of the WHEN
clauses to choose which type of DML operation should be performed on the row.
The OUTPUT clause for the MERGE statement includes a new virtual column called $action,
which you can use to identify the DML action that was performed on each row.
In the context of data warehousing, the MERGE statement is used
to perform efficient INSERT and UPDATE operations for Slowly Changing
Dimensions (SCD) and to maintain the fact table in various common scenarios.
The MERGE statement has better performance characteristics than running
separate INSERT, UPDATE, and DELETE statements since it only requires a single
pass over the data.
SQL Server 2008 also includes a powerful extension to the
INSERT statement that allows it to consume rows returned by the OUTPUT clause
of a nested INSERT, UPDATE, DELETE, or MERGE statements.
Suppose you have a DimBook table (ISBN, Price,
IsCurrent) that tracks the price history and current price for each book
in a bookstore. Price changes and new book additions are made on a weekly
basis. Every week a source table WeeklyChanges (ISBN, Price)
is generated and these changes are applied to the DimBook table. A row
is inserted for each new book. Existing books whose prices have changed during
the week are updated with IsCurrent=0 and a new row is inserted to reflect the
new price. The following single Transact-SQL statement performs these
operations using the new MERGE and INSERT capabilities.
INSERT INTO
DimBook(ISBN, Price, IsCurrent)
SELECT ISBN,
Price, 1
FROM
(
MERGE DimBook as book
USING WeeklyChanges AS src
ON (book.ISBN = src.ISBN and book.IsCurrent = 1)
WHEN MATCHED THEN
UPDATE SET book.IsCurrent = 0
WHEN NOT MATCHED THEN
INSERT VALUES (src.ISBN, src.Price, 1)
OUTPUT $action, src.ISBN, src.Price
) AS
Changes(action, ISBN, Price)
WHERE action
= 'UPDATE';
Change Data Capture
Change Data Capture (CDC) is one of the new data tracking
features introduces in SQL Server 2008. Mainly designed for data
warehousing scenarios, Change Data Capture provides an efficient mechanism to
track and gather data changes made to user tables and gives you access to
change data in an easily consumable relational format. Typically, you use CDC
in an operational database to capture changes for later movement into your data
warehouse. The availability of CDC in SQL Server eliminates the need for
using intrusive methods such as user triggers, timestamp columns, and expensive
queries to determine what changed in the operational system.
The auxiliary information gathered along with the change data
allows CDC to provide answers to a variety of questions. For example, here is a
set of questions for which CDC can provide the answers efficiently:
· I
want all rows that changed between 12:00 A.M. and 12:00 P.M.
· I
need to know whether a change is an insert, update, or delete.
· For
an updated row, I want to know which column(s) changed.
One of the scenarios in which CDC can be extremely beneficial is
Extract, Transform and Load (ETL). With the explosion of data volume and
shrinking maintenance windows due to global operations, it is critical to
optimize the ETL process. Change Data Capture provides you with a very
efficient way to extract changes on an incremental basis, reducing overall ETL
processing time.
The following diagram provides an overview of the components that
make up Change Data Capture.
.jpg)
Figure 4: Change Data Capture
CDC uses a capture job to extract changes from the
SQL Server transaction log, and populate change tables. The CDC API allows
you to write an application to get information from change tables. You can use
this in your ETL packages. The CDC cleanup job removes information that is no
longer needed from change tables.
Minimally Logged INSERT
In general, when you write data to a user database, you must
write it to disk twice: once to the log, and once to the database itself. This
is because the database system uses an undo/redo log so it can rollback or redo
transactions when needed. But it’s possible to write the data to disk only once
in some important cases that involve inserting data into existing tables,
thereby speeding up your ETL processes significantly. This is what the new
minimally logged INSERT feature does in SQL Server 2008. A factor of two
or more speedup is common with minimal logging compared with full logging. Your
results will depend on your application and hardware.
Minimal logging consists of logging only the information that is
required to rollback the transaction without supporting point-in-time recovery.
Minimal logging is only available under the bulk logged and simple recovery
models. When a minimally logged transaction is committed, the log records and
the data pages containing the newly inserted rows are flushed to the disk to
guarantee the durability of the transaction. Minimal logging greatly improves
the performance of large scale INSERT operations by reducing the number of log
records to be written and the amount of log space required to complete the
operation. For a discussion of table requirements for minimal logging, see
SQL Server Books Online. In particular, you must use table locking
(TABLOCK) on the target table.
Operations that can be minimally logged in SQL 2005 include bulk
import operations, SELECT INTO, and index creation and rebuild. SQL 2008
extends the optimization to INSERT INTO…SELECT FROM T-SQL operations that
insert a large number of rows into an existing target table when that table is
a heap that has no nonclustered indexes, and the TABLOCK hint is used on the
target. The optimization works whether the target table is empty or contains
data.
A key scenario for using minimally logged INSERT is this: you
create an empty table on specific file groups, so you can control where the
data is physically placed. Then you use INSERT INTO…SELECT FROM to populate it,
in a minimally logged fashion. This puts the data where you want it, and only
writes it to disk once. Once the data is loaded, you can then create the
required indexes. It is important to note that indexes themselves can be
created with minimal logging.
Data Compression
The new data compression feature in SQL Server 2008 reduces
the size of tables, indexes or a subset of their partitions by storing
fixed-length data types in variable length storage format and by reducing the
redundant data. The space savings achieved depends on the schema and the data
distribution. Based on our testing with various data warehouse databases, we
have seen a reduction in the size of real user databases up to 87%
(a 7 to 1 compression ratio) but more commonly you should expect a
reduction in the range of 50-70% (a compression ratio between roughly 2
to 1 and 3 to 1).
SQL Server provides two types of
compression as follows:
·
ROW compression enables storing fixed length types in variable
length storage format. So for example, if you have a column of data type BIGINT
which takes 8 bytes of storage in fixed format, when compressed it takes a
variable number of bytes—anywhere from 0 bytes to up to 8 bytes.
Since column values are stored as variable length, an additional 4‑bit
length code is stored for each field within the row. Additionally, zero and
NULL values don’t take any storage except for the 4‑bit code.
·
PAGE compression is built on top of ROW compression. It minimizes
storage of redundant data on the page by storing commonly occurring byte
patterns on the page once and then referencing these values for respective
columns. The byte pattern recognition is type-independent. Under PAGE
compression, SQL Server optimizes space on a page using two techniques.
The first technique is column
prefix. In this case, the system looks for a common byte pattern as a
prefix for all values of a specific column across rows on the page. This
process is repeated for all the columns in the table or index. The column
prefix values that are computed are stored as an anchor record on the page and
the data or index rows refer to the anchor record for the common prefix, if
available, for each column.
The second technique is page
level dictionary. This dictionary stores common values across columns and
rows and stores them in a dictionary. The columns are then modified to refer to
the dictionary entry.
Compression comes with additional
CPU cost. This overhead is paid when you query or execute DML operations on
compressed data. The relative CPU overhead with ROW is less than for PAGE, but
PAGE compression can provide better compression. Since there are many kinds of
workloads and data patterns, SQL Server exposes compression granularity at
a partition level. You can choose to compress the whole table or index or a
subset of partitions. For example, in a DW workload, if CPU is the dominant
cost in your workload but you want to save some disk space, you may want to
enable PAGE compression on partitions that are not accessed frequently while
not compressing the current partition(s) that are accessed and manipulated more
frequently. This reduces the total CPU cost, at a small increase in disk space
requirements. If I/O cost is dominant for your workload, or you need to reduce disk
space costs, compressing all data using PAGE compression may be the best
choice. Compression can give many-fold speedups if it causes your working set
of frequently touched pages to be cached in the main memory buffer pool, when
it does not otherwise fit in memory. Preliminary performance results on one
large-scale internal DW query performance benchmark used to test SQL
Server 2008 show a 58% disk savings, an average 15% reduction in
query runtime, and an average 20% increase in CPU cost. Some queries
speeded up by a factor of up to seven. Your results depend on your workload,
database, and hardware.
The commands to compress data are
exposed as options in CREATE/ALTER DDL statements and support both ONLINE and
OFFLINE mode. Additionally, a stored procedure is provided to help you estimate
the space savings prior to actual compression.
Backup Compression
Backup compression helps you to save in multiple ways.
By reducing the size of your SQL backups, you save significantly on
disk media for your SQL backups. While all compression results depend on the
nature of the data being compressed, results of 50% are not uncommon, and
greater compression is possible. This enables you to use less storage for
keeping your backups online, or to keep more cycles of backups online using the
same storage.
Backup compression also saves you time. Traditional SQL backups
are almost entirely limited by I/O performance. By reducing the I/O load of the
backup process, we actually speed up both backups and restores.
Of course, nothing is entirely free, and this reduction in space
and time come at the expense of using CPU cycles. The good news here is that
the savings in I/O time offsets the increased use of CPU time, and you can
control how much CPU is used by your backups at the expense of the rest of your
workload by taking advantage of the Resource Governor.
Resource Governor
The new Resource Governor in SQL Server 2008 enables you to
control the amount of CPU and memory resources allocated to different parts of
your relational database workload. It can be used to prevent runaway queries
(that deny resources to others) and to reserve resources for important parts of
your workload. SQL Server 2005 resource allocation policies treat all
workloads equally, and allocate shared resources (for example, CPU bandwidth,
and memory) as they are requested. This sometimes causes a disproportionate
distribution of resources, which in turn results in uneven performance or
unexpected slowdowns.
The primary goals of the Resource Governor are as follows:
1. Monitoring:
Enable monitoring of resource consumption per group of requests (workload
group).
2. Predictability:
Provide predictable execution of workloads in an environment where there is
resource contention. This is achieved by explicitly specifying resource
boundaries between workloads (via resource pool controls). The implementation
of resource boundaries also prevents, or reduces the probability of run-away
queries. The monitoring capability provided with the Resource Governor
facilitates easier detection of run-away queries.
3. Prioritization:
Enable the prioritization of workloads.
There are three new concepts which are important to understanding
of the resource governor: workload groups, resource pools, classification (and
classifier user-defined functions).
· Group:
A workload group, or group, is a user-specified category of
requests that are similar according to the classification rules that are
applied to each request. The value of a group is in the aggregate monitoring of
resource consumption and a uniform policy that is applied to all the requests
in a group. A group defines the policies for its members.
· Pool:
A resource pool, or pool, represents a portion of the physical
resources of the server. Depending on its settings, a pool may have a fixed
size (its minimum and maximum resource usage settings are equal to each other)
or have a part which is shared between multiple pools (its minimum is less than
its effective maximum). “Shared” in this case simply means that resources go to
the pool that requests the resources first. In the default configuration all
resources are shared, thus maintaining backward compatibility with SQL
Server 2005 policies.
· Classification:
Classification is a set of user-written rules that enable Resource
Governor to classify requests into the groups described previously. It is
implemented through a scalar Transact-SQL user-defined function (UDF) which is
designated as a “classifier UDF” for the Resource Governor.
These concepts are illustrated in the following figure.
.gif)
Figure 5: Resource Governor example: requests, classification,
groups, and pools
The Resource Governor may be used without any application
changes.
Integration Services
Improvements
Doing ETL to move data from your operational systems into your
data warehouse can be a time-intensive task. To make this process faster, SQL
Server 2008 Integration Services (SSIS) introduces two important
scalability features: improved lookup performance and improved transformation
pipeline performance.
Lookup Performance
The Lookup component in SSIS runs faster, and is even easier to
program than in SQL Server 2005. A lookup tests whether each row in a
stream of rows has a matching row in another dataset. A lookup is like a
database join operation. Typically you use lookup within an integration
process, such as the ETL layer that populates a data warehouse from source
systems.
A lookup builds a cache of retrieved rows pulled from the dataset
being probed. In SQL Server 2005, the Lookup component could only get data
from specific OleDb connections, and the cache could be populated only by using
a SQL query. In SQL Server 2008, the new version of Lookup allows you to
populate the cache using a separate pipeline in the same package or a different
package. You can use source data from just about anywhere.
SQL Server 2005 reloads the cache every time it is used. For
example, if you have two pipelines in the same package that each require the
same reference dataset, each Lookup component would cache its own copy. In SQL
Server 2008, you can save the cache to virtual memory or permanent file
storage. This means that within the same package, multiple Lookup components
can share the same cache. You can save the cache to a file and share it with
other packages. The cache file format is optimized for speed, and access to it
can be orders of magnitude faster than reloading the reference dataset from the
original relational source.
In SQL Server 2008, the Lookup component introduces the
miss-cache feature. When the component is configured to perform lookups
directly against the database, the miss-cache feature saves time by optionally
loading into cache the key values that have no matching entries in the
reference dataset. For example, if the component receives the value 123 in
the incoming pipeline, but the Lookup component already knows that there are no
matching entries in the reference dataset, the component will not try again to
find 123 in the reference dataset. This reduces a redundant and expensive
trip to the database. The miss-cache feature alone can contribute up to a
40% performance improvement in some scenarios.
Other enhancements to the Lookup component include:
· Optimized
I/O routines leading to faster cache loading and lookup operations.
· More
intuitive user interface that simplifies the configuration of the Lookup
component, in particular the caching options.
· Rows
in the input that do not match at least one entry in the reference dataset are
now sent to the No Match output. The Error output only handles errors such as
truncations.
· Query
statements in lookup transformations can be changed at runtime, making
programming transformations more flexible.
· Informational
and error messages are improved to help with troubleshooting and performance
analysis.
The following figure illustrates a scenario that uses the new
Lookup.
.gif)
Figure 6: Lookup Scenario
Dataflow 1 populates a Cache Connection Manager (CCM) from a
custom source, and then Dataflow 2 uses the same CCM to populate the lookup's
cache. The figure also shows use of 3 outputs from the Lookup component.
Pipeline Performance
In SQL Server 2008 SSIS, several threads can work together
to do the work that a single thread is forced to do by itself in SQL
Server 2005 SSIS. This can give you a several-fold speedup in ETL
performance.
In SQL Server 2005 SSIS, pipeline parallelism is more
coarse-grained. When users have a simple package with one or two execution
trees, there are only one or two processors used, and the package might not
benefit from a multiprocessor machine with more than a few processors. Even if
users logically split the data flow by using multicast, all output paths of a
multicast belong to the same execution tree, and they are executed serially by
the SQL Server 2005 SSIS data flow task.
To achieve a high level of parallelism, pipelines in SQL
Server 2008 SSIS allow more parallel processing, which means that for any
multiprocessor machine this should result in faster performance.
By using a shared thread pool, multiple outputs of a multicast
can be executed simultaneously. In short, the multicast gives an ability to
have an active buffer on each output and not just have one buffer (and one
active thread), which is passed to each output. You do not need to use the
“Union All” trick as a workaround to introduce more parallelism.
For example, suppose you have a flow that contains a multicast
with four outputs. Each output flows into an aggregate. In SQL Server 2005
SSIS, only one of the aggregates is processed at a time. In SQL
Server 2008 SSIS, all four aggregates can be processed in parallel.
The following figure shows how the enhanced SQL Server 2008
pipeline parallelism works.
.gif)
Figure 7: Improved pipeline parallelism in Integration Services
Analysis Services
Improvements
SQL Server 2008 Analysis Services (SSAS) dramatically
improves query speed with the new block computation, writeback, and scalable
shared database performance features. Manageability also improves with the
ability to back up much larger databases.
MDX Query Performance:
Block Computation
Improved block computation in SQL Server 2008 SSAS speeds up
MDX query processing primarily by doing work just for the non-null values in a
cube space. No time is wasted evaluating null cells. The key idea behind
subspace computation is best introduced by contrasting it with a “naïve”
cell-by-cell evaluation of a calculation. Consider a calculation RollingSum
that sums the sales for the previous year and the current year, and a query
that requests the RollingSum for 2005 for all products.
RollingSum = (Year.PrevMember,
Sales) + Sales
SELECT 2005 on columns,
Product.Members on rows WHERE RollingSum
A cell-by-cell evaluation of this calculation proceeds as shown
in the following figure.
.gif)
Figure 8: Cell-by-cell evaluation example
The 10 cells for [2005, all products] are evaluated in turn. For
each, we navigate to the previous year, obtain the sales value, and add it to
the sales for the current year. There are two significant performance issues
with this approach.
First, if the data is sparse, cells are calculated even though
they are bound to return a null value. In the example, calculating the cells
for anything but Product3 and Product6 is a waste of effort. The impact of this
can be extreme—in a sparsely populated cube, the difference can be several
orders of magnitude in the number of cells evaluated.
Second, even if the data is totally dense, meaning that every
cell has a value and there is no wasted effort visiting empty cells,
there is much repeated effort. The same work (such as getting the
previous Year member, setting up the new context for the previous Year cell,
checking for recursion) is re-done for each Product. It would be much more
efficient to move this work out of the inner loop of evaluating each cell.
Now consider the same example performed using a subspace
computation approach. First, we work our way down an execution tree,
determining which spaces need to be filled. Given the query, we need to compute
the space:
[Product.*, 2005, RollingSum]
Given the calculation, this means we must first compute the
space:
[Product.*, 2004, Sales]
followed by the space:
[Product.*, 2005, Sales]
and then apply the ‘+’ operator to those two spaces.
Sales is a base measure, so we simply obtain the storage engine
data to fill the two spaces at the leaves, and then work up the tree, applying
the operator to fill the space at the root. Hence the one row (Product3,
2004, 3) and the two rows { (Product3, 2005, 20), (Product6,
2005, 5)} are retrieved, and the + operator applied to them to yield the
result.
.gif)
Figure 9: Block computation example that avoids doing work for
NULL cells
The + operator operates on spaces, not simply scalar
values. It is responsible for combining the two given spaces to produce
a space that contains each product that appears in either space, with the
summed value.
We only operate on data that could contribute to the result.
There is no notion of the complete space over which we must perform the
calculation
Query and Writeback
Performance
The performance of writeback operations, and queries on writeback
data, is improved in SQL Server 2008 Analysis Services. Cell writeback in
Analysis Services is the ability for end users to update cell values at the
leaf or aggregate levels. Cell writeback uses a special writeback partition per
measure group which stores the difference (delta) between the updated cell
value and the original value. When an MDX query requests cell data from this
measure group, the storage engine accesses all partitions including the
writeback partition and aggregates the results to produce the right cell value.
In SQL Server 2005 and earlier releases, Analysis Services
required writeback partitions to have ROLAP storage. This is a common cause for
performance issues in cell writeback since ROLAP partitions query the
relational data source on demand to retrieve their data. In SQL
Server 2008, we allow writeback partitions with MOLAP storage. Retrieving
writeback data from the compressed MOLAP format is much faster than querying
the relational data source. Hence, MOLAP writeback partitions have better query
performance than ROLAP. The extent of the performance improvement varies and
depends on a number of factors including the volume of writeback data and the
nature of the query.
MOLAP writeback partitions should also improve cell writeback
performance since the server internally sends queries to compute the writeback
deltas and these queries probably access the writeback partition. Note that the
writeback transaction commit can be a bit slower since the server must update
the MOLAP partition data in addition to the writeback table, but this should be
insignificant compared with the other performance gains.
Analysis Services Enhanced
Backup
In SQL Server 2008 Analysis Services one of the performance
enhancements you’ll find is a new backup storage subsystem. The backup storage
subsystem now has been rewritten to allow for greater performance and
scalability. The changes are transparent to your application – no code change
is necessary to take advantage of them.
The new backup storage subsystem had introduced a new format to
the Analysis Services backup files. The file name extension hasn’t changed.
However, the internal format is different, so backup can scale well to handle
databases over a terabyte in size.
SQL Server 2008 Analysis Services backup is fully backward
compatible with SQL Server 2005 Analysis Services. It allows you to
restore databases backed up in SQL Server 2005 Analysis Services. SQL
Server 2008 Analysis Services does not have ability to store backups in
the old format for use in SQL Server 2005 Analysis Services.
The new highly performing backup storage subsystem allows
customer to implement new backup scenarios. Where previously you had to rely on
the raw file system copy utilities to back up large databases, now you have the
option to use the built-in backup subsystem that is integrated with the
transactional system, and allows running backup in parallel to other
operations.
Scalable Shared Database
for AS
Now you can scale out your OLAP query workload across many small
servers with just a single copy of the database. SQL Server 2008 Analysis
Services supports this with a feature called scalable shared database
(SSD).
Scale out can apply to various scenarios and workloads such as
processing, queries, data and cache management. The most common scale-out
scenario for Analysis Services is to spread the query load across multiple
servers in response to increasing number of concurrent users. This has been
achieved in the past by placing a load balancing solution such as Microsoft
Network Load Balancing (NLB) capability in front of multiple servers and
replicating the physical data between the servers. Managing such an environment
poses various challenges and the data replication is a major one. The scalable
shared database feature enables DBAs to mark a database as read only and
share it across multiple server instances from a Storage Area Network (SAN),
thereby eliminating the need to replicate the data. This saves disk space, and
the time that otherwise would have been spent copying data.
The following figure illustrates a typical SSD configuration.
.gif)
Figure 10: Scalable shared database for Analysis Services
An alternate solution for improving performance is scale up,
where a single large server is used instead of multiple small servers. The
advantage of scale up is that a single query can typically be processed faster
on a larger machine. But using scale out via SSD can save you money on hardware
(given the lower cost per processor) and still satisfy your needs for many
multi-user workloads. Furthermore, SSD allows you to scale to more processors
than can be accommodated in a single large server.
The scalable shared database feature consists of three logical
parts:
· Read-only
database: Allows marking a database as read-only
· Database
storage location: Allows a database to reside outside the server Data folder
· Attach/detach
database: Allows attaching or detaching a database from any UNC path
These features together enable the query scale-out scenario.
However, each feature is independent and has usages outside of query scale out
as well.
The SSD for AS feature works in a similar way as the SSD feature
introduced in the SQL Server 2005 relational database.
Reporting Services
Improvements
SQL Server 2008 Reporting Services (SSRS) provides performance,
scale, and design improvements that make it a great choice for your enterprise
reporting needs. We highlight the two major scalability improvements here.
Reporting Scalability
The SQL Server 2008 Reporting Services reporting engine has
had a major upgrade from the prior release, so that it can render much larger
reports than it could before. Although this is not specifically a data
warehousing improvement (it is useful in operational reporting too), it is
useful in some data warehousing scenarios. If you create reports with hundreds
or thousands of pages, SQL Server 2008 Reporting Services helps you to
render the reports faster. Moreover, the size of the largest report that can be
rendered has been increased dramatically, given the same hardware
configuration.
Server Scalability
SQL Server 2008 Reporting Services does not run inside
Internet Information Server (IIS). It can manage its own memory, and has its
own memory limits. This allows you to configure the memory settings so SSRS can
run on the same computer more effectively with other services, such as
SQL Server.
Conclusion
SQL Server gives you everything you need for data warehousing.
With the 2008 release, it scales to meet the needs of the largest enterprises
more readily than ever. As illustrated by the many data warehouse scale
enhancements introduced in this paper, it is a major advance over previous
releases. The number one change you’ll see is improved scalability across the
board, for data warehouse construction, relational query processing, reporting,
and analysis.
For more information:
·
SQL Server Web site
·
SQL
Server TechCenter
·
SQL
Server Developer Center
References
Bloom filter, Wikipedia 2007, http://en.wikipedia.org/wiki/Bloom_filter.
Hanson, Eric., Improving Performance
with SQL Server 2005 Indexed Views, http://www.microsoft.com/technet/prodtechnol/sql/2005/impprfiv.mspx.