by Kalen Delaney, a SQL Server MVP since
1993 who has been working with SQL Server for over 20 years. Her major areas of
specialization are internals and tuning. Find out more about Kalen’s activities
through her web site at www.SQLServerInternals.com.
Introduction
SQL Server
2008 introduces the Management Data Warehouse (MDW) to SQL Server Management
Studio for streamlined performance troubleshooting. MDW is a set of components
that enable a database developer or administrator to quickly track down
problems that could be causing performance degradation. In this paper, we’ll
look at the following components:
- The
Data Collector
- The
MDW database
- The
MDW reports
Many of the
reports included in MDW will provide suggestions for improving performance,
based on information gathered by the Data Collector in the following areas:
- Query
performance statistics and use of indexes
- Resource
locking and blocking
- Disk
usage
Sometimes the
reports can lead directly to a solution to a performance problem. For example,
if the query performance statistics report shows an extremely slow running
query, you can drill down to the query plan, which may indicate a missing
index. In many cases, creating that index can make a major difference in the
query performance. In another case, the query performance statistics report
might show the same query performing sporadically, sometimes very fast and
sometimes slow. You might then determine that the slower executions are using a
different execution plan. This could lead you to capturing the plan for the
fast running execution and creating a plan guide. A third example might be that
the resource locking and blocking report indicates unacceptable lock wait times
when running queries in a particular database. This might suggest that you
explore the possibility of using one of the snapshot based isolation levels, to
alleviate the blocking problems.
Understanding
exactly how to interpret all of the details in the reports, and knowing how to
correct the poor performance shown in the reports requires knowledge of SQL
Server that is beyond the scope of this paper.
Overview of Management Data Warehouse
Installation
Setting up the MDW is a two-step process. First, you must
run the “Configure Management Data Warehouse” Wizard to create the MDW
database. The second step is to set up data collection by specifying the
database that you want to upload the data to. The wizard then configures the
data collection using the SQL Server Integration Services runtime. This second
step requires that you run the “Configure Management Data Warehouse” Wizard a
second time. SQL Server Agent jobs are created to start and stop the packages
that perform the data collection and the uploading of the data. It is strongly
recommended that you not modify the jobs directly. Although the jobs behave
like any other SQL Server Agent jobs, the function is only to control the
execution of the SSIS package.
Database Creation
Although the purpose of the paper is not to provide an
installation tutorial, it will be useful to refer to Figure 1, which is the
first dialogue of the “Configure Management Data Warehouse” Wizard. The wizard
is accessed from SQL Server Management Studio by right-clicking on the “Data
Collection” icon in the Management folder. When you select the option to
“Create or upgrade a management data warehouse” and click “Next”, you are given
a dialog to create a database in which to store the collected performance data.
The default database name is MDW, and it is recommended that you use that name
as most of the documentation assumes that is the name of the database
containing the collected data. In particular, you should not change the
database name after creation, because all of the jobs created to manage the
database collection refer to the database by the original name and will
generate errors if the name is changed. However, you can create a new database,
and reconfigure the data collector to store its collected data to use it by
rerunning the “Configure Management Data Warehouse” wizard again.
.jpg)
Figure 1: Starting
the MDW configuration process
After creating the database, the wizard will ask you to map
logins and users to management data warehouse administrator, reader and writer
roles. The default is not to allow anyone but the database owner to have access
to the MDW database.
Best practice tip:
You should not change the database name after creation, because all of the jobs
created to manage the database collection refer to the database by the original
name and will generate errors if the name is changed.
Job Creation
After creating the MDW database, you will need to run the
“Configure Management Data Warehouse” Wizard a second time to configure the
data collection and create the jobs. The wizard will also ask you for a
directory for caching collected data prior to uploading to the MDW database.
(Caching will be discussed in more detail in the next section, “Collection”).
If you leave the directory specification blank, the cache will be in the TEMP
directory for the Service Account that your SQL Server Agent Service is using.
(If your SQL Server Agent service is running under LocalSystem, the cache will
be in the TEMP directory for the user that is running the wizard. For example,
my cache is stored in “C:\Users\Kalen\AppData\Local\Temp”. When choosing a
location for the cache, you should pick one that will be least disruptive to
the target SQL Server instance when lots of writes are being performed. You
also need to make sure that the SQL Server Agent account has privileges to the
location. You also need to make sure that the SQL Server Agent account will
have access to the MDW database if it has been configured on a remote server.
To ensure this access, you can create a SQL Server Agent proxy account and add
it to the dc_proxy database role for the msdb database.
Figure 2 shows the last screen of the wizard, indicating the
various steps the “Configure Management Data Warehouse” Wizard will perform
this time.
.jpg)
Figure 2: Configure
Management Data Warehouse Wizard, final step
The Data Collection is performed primarily through SSIS
packages that control the collection frequency on the target and performs the
upload of data to the MDW. The SQL Agent jobs control the triggering of the
upload action for SSIS.
Figure 3 shows the existing jobs after completing the
configuration. The first five jobs seen in the figure are the jobs that trigger
the data collection and upload packages. The sixth job (mdw_purge_data_[MDW])
was created when the MDW database was created, and the last job shown is
associated with Policy Based Management (PBM) and is not in the scope of this
paper.
.jpg)
Figure 3: Existing
jobs in SQL Server Agent after configuration
The next section on “Data Collection Architecture” provides
more details on collection and upload. For now, you should be aware of the
following recommendations:
- Do
not change any of the job specifications for the data collection and upload
jobs.
- You
can change properties of the mdw_purge_data_[MDW] job, which will remove data
from the MDW database on a regular basis, subject to a retention policy. This
job will also purge the job history from the data collection and upload jobs.
In general, the retention setting is controlled via property pages for specific
collection sets. The mdw_purge_data job triggers when the stored procedure that
performs the purge operation starts. The only property that you should change
is when to perform the purge operation along with any alerting that you would
like to add. You should not modify the job because any changes that you might
add would get overwritten with subsequent software upgrades of SQL Server. If
you want to perform archiving before the mdw_purge_data job runs, you can
create a separate job that runs prior to the mdw_purge_data job.
Best practice tip:
You should not change any of the job specifications for the data collection and
upload jobs within SQL Server Agent.
Collection
To understand the data collection processes, the following
terms will be useful:
- Target
A target is an instance of SQL Server from which data is being collected. Note
that SQL Server Express instances cannot be targets. - Target
Type
A target type is an aspect of the target from which data is being collected.
For example, a target type could be an entire SQL Server instance, or a single
database in that instance. Each target type has different characteristics and
different performance data that can be collected. - Data
Provider
A data provider is a data source, which is specific to a target type. For
example, system views can be data providers. Most system views provide data for
a SQL Server instance target, but some provide data for a SQL Server database
target. - Collector
Type
A collector type is a logical wrapper around an SSIS package that provides the
actual mechanism for collecting and uploading data to the MDW database.
Collector types include: TSQL Query collector type, SQL Trace Collector type,
Performance Counter collector type and query activity collector type - Collection
Item
A collection item is data gathered from a single data provider, such as a
specific query against a system view, or the value of a specific performance
monitor counter. A collection item is an instance of a collector type. - Collection
Set
A collection set is a group of related collection items. There are three
collection sets defined when you configure your MDW: one that contains items
pertaining to disk usage, one containing items pertaining to query execution
and performance statistics, and one containing items pertaining to server
activity, such as CPU and memory usage, as well as waittype and blocking data. - Collection
Mode
The collection mode is the manner in which data is collected and stored.
After running the “Configure Management Data Warehouse”
Wizard, you will have three default collection sets in two collection modes.
The Disk Usage collection set is in non-cached mode and the Server Activity and
Query Statistics collection sets are in cached mode. The caching mode
determines the relationship between the collection of data and the uploading of
data to the MDW database.
- Non-Cached
In non-cached mode, collection and upload are on the same schedule. The
packages start, collect, and upload data at their configured frequency, and run
until they are finished. There is only one job created for collection sets in
non-cached mode. Non-cached mode allows for the collection and uploading of
data to occur on demand, as well as at the intervals specified in the job. - Cached
Mode
In cached mode, the data collection and uploading of jobs are on different
schedules, so two jobs are created for each collection set. Cached mode
supports continuous collection of data, but less frequent uploads. For example,
the default Server Activity collection set uploads data by default every 15
minutes. However, the collection items are collected every 60 seconds.
Table 1 highlights the key features of each of the three
default collection sets. You can see the mode (cached or non-cached), the
default upload frequency, the default data retention period, and a summary of
the collection items in each of the collection sets.
.jpg)
Table 1 Summary of
the default collection sets
Description of Technology
The data collection process consists of scheduled jobs to
collect and upload performance data into the configured database.
Data Collector Architecture
Figure 4, found in SQL Server Books Online, shows the
components that make up the data collection process. The collection system is
grouped into four categories:
- Storage components are the databases that
contain configuration information and jobs (msdb) as well as the collected
performance data (MDW).
- Execution components manage the data collection
and storage operations.
- API components enable interaction between the
user interfaces and the data collector and the user interfaces. These
components include SQL Server metadata, such as the dynamic management views.
- Client components are the user interfaces for
managing the data collection and presenting the reports.
.jpg)
Figure 4: Components
of MDW Data Collection
Job Details
The default configuration creates 5 jobs to manage the three
default collection sets. In addition, a job to purge data, both from the MDW
database and from the job history, is created. The names of five data
collection jobs all start with collection_set. The Disk Usage Collection set
only has a single job, since it is non-cached, and the name of the job reflects
this: collection_set_1_noncached_collect_and_upload. The Query Statistics
collection set has two jobs: collection_set_2_collection and collection_set_2_upload.
Server Activity also has two jobs: collection_set_3_collection and
collection_set_3_upload.
Scheduling the Collections
Both of the upload jobs are scheduled to run every 15
minutes starting at midnight. For a single instance (or even just a few
instances), this schedule should be fine. If you have many servers trying to
upload data to the MDW database at the same time, you can consider staggering
the start times so that the uploads are not running concurrently. For example,
you could have the collection_set_2_upload job start at 12:01 and the
collection_set_3_upload job start at 12:06.
Best practice tip:
Consider staggering start times for upload jobs so they are not running
concurrently to help avoid a tax on performance.
Storage Components
Although the storage component comprises both the msdb
database, where jobs, history, SSIS packages, and configuration information are
stored, as well as the MDW database, we will only be looking at the MDW
database in this section. If you have worked with scheduling jobs using SQL
Server Agent, the jobs and history component of MDW should not present any
special challenges. If you have worked with SSIS, the packages for managing the
data collection and uploads should look like normal jobs.
The MDW database contains the data collected by the default
collection sets and any custom collection sets you define. There are two
schemas in the database: core and snapshots. The core schema contains the
‘dimension’ tables that may be relatively small. For example, one of the tables
in the core schema is core.wait_categories that is used by the Server Activity
collection set and contains only 17 rows, one for each category of wait,
including Backup, Latch, Lock, Logging, etc. The tables in the snapshots schema
are the ‘fact’ tables that contain the actual collected data. These tables can
contain millions or billions of rows, depending on how busy your SQL Server is
and how often your purge jobs run.
There is one additional aspect of storage that does not show
up in Figure 4, and that is the cache location, where the data from the jobs
running in cached mode is stored prior to being uploaded to the MDW database.
Although the setup process will use a default value for this directory, it is
recommended that you carefully consider the placement of this caching directory
so as not to interfere with the disk activity writing to the MDW database
itself.
Configuration of Collections
As mentioned, after the jobs are created, you should not
change any of their properties as they are used mainly to trigger the SSIS
packages. However, you can configure some of the properties of the of the
collection sets, by going to the data Collection node in the Management folder
of SQL Server Management Studio. Right click on any of the collection sets, and
you can control the upload schedule, whether the job runs as cached or
non-cached, and the data retention period. You can also disable the entire data
collection process by right clicking on the Data Collection node and selecting
“Disable Data Collection”. Doing this will stop all the data collection and
upload jobs. Individual collection sets can be stopped, and this would usually
be done to support a specific troubleshooting scenario. You may only want to
start a specific collection set for troubleshooting a known problem or test
scenario.
Metadata to Support Collections
SQL Server 2008 provides system stored procedures and system
metadata views to help you support your MDW collection sets.
System Procedures
The system procedures provided enable you to
programmatically create and manage the data collector and data collection.
There are also procedures to allow you to create your own custom collector
sets. The procedures fall into the following categories:
- Data
Collector stored procedures
- Collection
sets stored procedures
- Collection
items stored procedures
- Collector
types stored procedures
- MDW
configuration stored procedures
- Package
upload stored procedures
- Data
collection execution log stored procedures
Best practice tip:
User custom collections for collecting more queries than the default setting;
the top 3 worst performing queries.
Views and Functions
The views and
functions allow you to programmatically retrieve information about your data
collector and data collection. The views provide about collection sets,
collection items, and execution. The functions provide SSIS package execution
and SQL Trace information.
For complete
details of all the procedures, view and functions, please refer to Books Online. Most of the operations invoked by
the system procedures and most of the information available through the views
and functions are also available through the graphical interface. The main
exception are the mechanisms to create your own data collection sets, data
collector types, and data provider types, which can only be created using
Transact-SQL. Describing the creation of your own collector sets, collector
types, and provider types is beyond the scope of this paper.
Custom Collection Sets
Creating a
custom collection set is relatively straightforward, using the provided stored
procedures. To create a collection set based on performance counters, DMVs, or
TSQL queries, you should take a look at the definition of the default Server
Activity collection set. You can easily see this definition by right clicking
on Server Activity (in the System Data Collection Sets folder in SQL Server
Management Studio) and selecting “Script Data Collection”.
There is also
GUI support for creating a collection set based on SQL Trace data. Once you
have defined your trace using the SQL Server Profiler, select File > Export
> Script Trace Definition > For SQL Trace Collection Set. That will generate
a script that creates a custom trace data collector set based on a trace
definition.
One of the
reasons you might create a custom trace data collection set is that the default
system data collector for Query Statistics does not store all of the statements
– only the worst performing according to the algorithms chosen in the
collection set, as documented in the Books Online. You may want to collect more queries
than the top 3 worst performing ones, and if so, you could disable the default system
collector and set up your own trace collection set.
There is also
a mechanism for creating your own custom data collection provider, which is
beyond the scope of this paper.
Details of Data Collected
We saw an overview of the information collection through
each of the default collection sets in Table 1. In this Table, we saw that the
non-cached collection (Disk Usage) was uploaded every 6 hours and retained for
730 days. The cached data, which is expected to be much more voluminous, was
uploaded every 15 minutes and only retained for 14 days.
Server Activity Information
The server activity information falls into four general
categories: waiting processes, memory usage, CPU/scheduler usage, and disk I/O.
Use the information from this collection set to help troubleshoot problems with
your hardware configuration, your operating system configuration, and your SQL
Server’s ability to handle the workload you are submitting to it. If there are
issues with concurrent access to resources, whether data resources, memory
resources, or scheduler resources, the information available can indicate where
the blockages are. The server activity data is gathered by running seven
different queries against dynamic management views, and by collecting
performance counters across the platform.
Best practice tip:
User Server Activity Information to troubleshoot hardware and operating system
configurations. Using Query Statistics Information for customer collection sets
is not recommended as it simply duplicates the functionality of the QS
collection set.
Query Statistics Information
The query
statistics information collection set contains a single collection item that
uses the Query Activity collector type. This special collector type was created
solely to support the query statistics collection set. It’s not externally
configurable and it’s not really intended for reuse in custom collection sets.
Disk Usage Information
The disk
usage information gathered in a non-cached mode provides information about the
space allocated and used by each of your database files and by the transaction
log. By default, this information is collected and uploaded every six hours,
which should be sufficient to allow you to predict your database growth, and
plan for your storage needs.
Viewing the Data
The third major component of the Management Data Warehouse
is the graphical reports available to view the collected data. At a high level,
you can look at the collection of reports across multiple servers, by right
clicking on your MDW database, as shown in Figure 5.
.jpg)
Figure 5: Accessing
Management Data Warehouse reports
Figure 6 shows you an example of an overview report spanning
13 separate SQL Server instances. The Management Data Warehouse Overview report
displays the last upload time so that you can get a quick status of all the
upload activity to the central MDW. The reports are also accessible via the
right click menu on the Data Collection node for the target instance.
.jpg)
Figure 6: Management
Data Warehouse reports, spanning SQL Server instances
Using SQL Server Management Studio Drilldown Reports
The most powerful feature of the visual reports is the
ability to drill down into the details. Here are some examples.
Query Statistics
You can click on the last Query Statistics upload time for
any of the server instances in Figure 6 and you’ll see a report like that shown
in Figure 7.
.jpg)
Figure 7: Query
Statistics
The timeline along the top lets you choose the time period
in which you are interested. The middle graph shows the top 10 queries for the
selected time period. By default, the top queries by CPU are shown, but you can
click on one of the other measures instead and see the top queries, by
duration, I/O, physical reads or logical writes. The bottom section the report
reports the TSQL text for the top 10 queries. You can click on any of the
queries and see a drilldown report like the one in Figure 8.
.jpg)
Figure 8: Query
Statistics Drilldown Report
If you scroll down on the screen above, you see an
additional report that provides details of the query plans for the collected
queries. This report gives you the ability to drill into the graphical
execution plan, where you’ll be able to determine which indexes are being
scanned, and when the optimizer determines that there is a missing index that
could improve performance if it were created.
Server Activity
The Server Activity report shows you system resource usage
information and SQL Server wait statistics, as shown in Figure 9.
.jpg)
Figure 9: Server
Activity Report
Again, you can choose the time frame from the timeline at
the top. The next section allows you to select one of the system resource
reports and drill down into details of CPU, Memory, Disk I/O, and Network
Usage. The next section shows SQL Server waits for the most common waittype
categories. Clicking on one of the bars drills down into the details of wait
types for that time period as shown in Figure 10.
.jpg)
Figure 10: Server
Activity Drilldown
There are two more very useful reports that you can generate
from this point. First, if you click on the Wait Category “Lock”, you’ll see
the Blocking Report that indicates what processes are being blocked, what
processes are causing the blocking, and what resources the blocked processes
are waiting to access. This report can help you track down blocking chains, as
well as help you detect which resources are causing the most contention.
Second, if you click on any of the other Wait Category values, a report showing
General Waits will be shown.
Disk Space Usage
You can click on the last Disk Usage upload time for any of
the server instances in Figure 6, and see a report similar to Figure 11, which
indicates the data file size and log size for each database on the instance at
the time the last upload was made.
.gif)
Figure 11: Disk Space
Usage, data file size and log size
All of these reports can also be accessed from the Data
Collection Node of the Management folder in the SQL Server Management Studio.
Right click on Data Collection, select Reports, Management Data Warehouse, and
then select one of the Collection Sets. You’ll access reports similar to Figure
7, 9 or 11.
It is also possible to create your own custom reports, using
SQL Server Reporting Services and querying the MDW database directly. If you
want your report to work inside of Management Studio, you’ll need to use the
report designer that was released with BIDS with SQL Server 2005. This is
because SQL Server Management Studio uses the client side reporting control
from Reporting Services 2005.
Viewing the Collection Set Logs
There is also data available in the data collection logs.
You can right click the Data Collection Node or any of the nodes under it, and
select View Logs. From the dialog that opens, you can choose which of the
collection sets you want to see the logs for, and also choose to see the entire
available history of the jobs performing the collection and upload for that
collection set.
Best Practices and Caveats for Management Data Warehouse
Here are some best practices to keep in mind:
- Use
a centralized server for the MDW database. This allows you to use a single
point for viewing reports for multiple instances. You’ll need to be within the
same domain for this to work.
- When
creating a trace collection set, be sure to set filters to collect just the
data you need to diagnose the problem. For example, create a trace for a set of
stored procedures that only collect data if the duration of the query exceeds a
threshold. This way you don’t flood the MDW database with meaningless data.
- Before
adding a collection item with performance counters, make sure you thoroughly
understand what the default Server Activity collection set is already
collecting, so that you do not end up collecting the same data through multiple
collection sets.
Performance Considerations
The costs associated with the MDW are mainly the costs of
data collection and each collection set has its own process (an instance of
DCEXEC.exe) performing the collection. To minimize the overhead of the
collection processes, keep the following points in mind:
- If
you are collecting the results from multiple queries that have the same
collection frequency, combine the queries into a single collection item. The
XML parameters for a single T-SQL collection item can have multiple
<Query> elements. Doing this will minimize the amount of memory that
DCEXEC.exe uses during collection. Similarly, combine multiple performance
counter collection items into a single collection item wherever possible.
- Combine
collection items into a single collection set whenever possible. The only
reason for separate collection sets is if you need separate data retention
periods or different collection schedules. For example, suppose you wanted to
capture the output of 6 DMV queries. You want to keep the data from the first
two queries for 6 months, but it is acceptable to purge the data from the
remaining queries after just two weeks. You should not create a separate
collection set for each query; two collection sets will capture this data with
much less overhead.
- A
collection set running in cached collection mode will always have a collection
process running. If you collect data very frequently, this is more efficient
than starting and stopping a new process every time new data must be collected.
In contrast, a collection set in non-cached mode will not have a collection process
running most of the time. A new process will be started at the scheduled
collection and upload time; it will collect and upload data in a single step,
then exit. If you are collecting data infrequently, non-cached mode is more
efficient than leaving a generally-idle process running all of the time. As a
general rule of thumb, if you will be capturing data once every 5 minutes or
more frequently than this, consider using a collection set that runs in cached
collection mode. If it will suffice to collect data less frequently than every
five minutes, consider using non-cached collection mode, instead.
- Collection
frequency for cached collection sets can be as high as one execution every five
seconds. You should be aware that more frequent collection has correspondingly
high overhead. Always choose the lowest collection frequency that will meet
your needs.
- One
DCEXEC.exe instance can have a memory working set of 30 – 100 MB, depending on
the items in the collection set.
- Starting
a new DCEXEC.exe process for non-cached collection or for data upload can take
10 seconds or more of wall clock time, much of it CPU-bound. In most scenarios
this wouldn’t be noticed, but it could be have a more measureable impact if you
have very little RAM (1GB or less), if you have many collection sets, or if you
don’t follow the best practices listed above.
Caveats and Limitations in the Current Version
In the current version, removing data collection once it has
been configured is not supported. In addition, you should not drop or change
the name of the MDW database, as all of the jobs are based on the original
database name.
Summary and Conclusion
The Management Data Warehouse in SQL Server 2008 is a
powerful collection of components which provide an easy way to create a
repository of performance-related information. We encourage you to start
exploring the data collected using MDW, even if you are not an experienced SQL
Server tuner.
The Management Data Warehouse consists of three components:
- A
set of custom data collectors, which are based on SQL Server Agent jobs that
collect and store performance data
- A
database (called MDW, by default) for holding the data repository
- A
set of graphical drilldown reports for viewing the data collected
Custom collectors can be added and custom reports can be
developed and displayed to capture and present the data that is important to
you and your troubleshooting efforts.
You are welcome submit community comments about how you are
using the data collected and what you would like to see in the product, as this
can drive future development.
For more information
SQL Server 2008 Internals, Microsoft Press, 2009
About the author. Kalen Delaney, a SQL Server MVP since 1993, has been working with SQL Server for
over 20 years. Her major areas of specialization are internals and tuning.
Kalen teaches her advanced SQL Server curriculum to clients around the world,
including at many regional Microsoft offices. Kalen writes regularly for SQL
Server Magazine and is the primary author for the Inside SQL Server series of
books, for SQL Server 7.0, SQL Server 2000, and SQL Server 2005. Her latest
book, from Microsoft Press, is SQL Server 2008 Internals. You can find out more
about Kalen’s activities through her web site at www.SQLServerInternals.com.