Exercise 1: Configuring Columnstore Indexes
In this exercise, you will review the performance of a typical data warehouse query against a fact table and its associated dimension tables. You will measure the performance across different aspects (CPU, I/O, execution time) and compare query performance both without and with a columnstore index.
At the end of this exercise, you will understand:
- How to measure the performance impact of a columnstore index
- How to verify if a columnstore index is being used
- How to verify if the batch execution mode is being used
- How to create a columnstore index
- Go to Start | All Programs | SQL Server 2012, and click SQL Server Management Studio.
- In the Connect to Server dialog, type the network name of the instance where the AdventureWorksDW2012 database exists, and click Connect.
- Click New Query.
Copy and paste the following code into the new query window:
USE AdventureWorksDW2012 GO SET STATISTICS TIME ON -- ** Execute this query twice SELECT p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear, AVG(fpi.UnitCost), SUM(fpi.UnitsOut) FROM dbo.FactProductInventory as fpi INNER JOIN dbo.DimProduct as p ON fpi.ProductKey = p.ProductKey INNER JOIN dbo.DimDate as d ON fpi.DateKey = d.FullDateAlternateKey GROUP BY p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear ORDER BY p.EnglishProductName, d.CalendarYear, d.WeekNumberOfYear FakePre-3cb0e4dbac2941b3b134656b5bc1e1fc-d3ef647f4b914cd9bc3f23bac06e3c72FakePre-7c5ac38da5b144bc963f91ec9354809f-9f625c684f144fbd84127ccff218cd8f
Press F5 twice in order to run the query two times. Be sure to execute the data warehouse query twice – in order to eliminate caching as a factor.
The second execution of the query returned the following results (please note that your results will vary based on the specifications of your lab environment):
(93744 row(s) affected) SQL Server Execution Times: CPU time = 2047 ms, elapsed time = 4009 ms.
Now turn off the STATISTICS TIME, copy and paste the following code into the New Query area of the SQL Management Studio and press F5 to run it:
SET STATISTICS TIME OFF
Next, copy and paste the following query into the New Query area of the SQL Management Studio and press F5 to run it in order to measure the I/O impact of the it:
SET STATISTICS IO ON SELECT p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear, AVG(fpi.UnitCost), SUM(fpi.UnitsOut) FROM dbo.FactProductInventory as fpi INNER JOIN dbo.DimProduct as p ON fpi.ProductKey = p.ProductKey INNER JOIN dbo.DimDate as d ON fpi.DateKey = d.FullDateAlternateKey WHERE d.CalendarYear >= 2000 GROUP BY p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear ORDER BY p.EnglishProductName, d.CalendarYear, d.WeekNumberOfYear
If you click the messages tab, it should return the results similar to the following:
(94248 row(s) affected) Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'FactProductInventory'. Scan count 1, logical reads 46472, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'DimProduct'. Scan count 1, logical reads 477, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'DimDate'. Scan count 1, logical reads 45, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
- Take note of the “logical reads 46472” for the FactProductInventory table.
Now turn off the STATISTICS IO, copy and paste the following code into the New Query area of the SQL Management Studio and press F5 to run it:
SET STATISTICS IO OFF
- Next we’ll look at the actual graphical execution plan of the query. Enable this option by going to Query | Include Actual Execution Plan.
Copy and paste the following query into the New Query area of the SQL Management Studio and press F5 to run it:
SELECT p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear, AVG(fpi.UnitCost), SUM(fpi.UnitsOut) FROM dbo.FactProductInventory as fpi INNER JOIN dbo.DimProduct as p ON fpi.ProductKey = p.ProductKey INNER JOIN dbo.DimDate as d ON fpi.DateKey = d.FullDateAlternateKey WHERE d.CalendarYear >= 2000 GROUP BY p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear ORDER BY p.EnglishProductName, d.CalendarYear, d.WeekNumberOfYear
- In the query results pane, you will see a Results tab, a Messages tab and an Execution plan tab. Click the Execution plan tab.
Scroll to the far right of the execution plan tree and take note of the Table Scan for the FactProductInventory table:
Figure 1
Table Scan for the Fact ProductInventory table
Hover over the Table Scan and take note of the Actual Execution Mode value, which should be Row (this will be an important point of contrast in the next task):
Figure 2
Actual Execution Mode
- Disable viewing of the actual execution plan by going to Query | Include Actual Execution Plan.
Task 2 – Creating the Columnstore Index
Copy and paste the following code into the New Query area of the SQL Management Studio and press F5 to run it in order to create the new columnstore index on the FactProductInventory table:
CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_CS_FactProductInventory] ON dbo.FactProductInventory ( ProductKey, DateKey, UnitCost, UnitsIn, UnitsOut, UnitsBalance )
You should see the following message:
Command(s) completed successfully.
Task 3 – Creating the Columnstore Index
Copy and paste the following code into the New Query area of the SQL Management Studio SQL
-- Let's view CPU and Elapsed time for a sample DW query SET STATISTICS TIME ON SELECT p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear, AVG(fpi.UnitCost), SUM(fpi.UnitsOut) FROM dbo.FactProductInventory as fpi INNER JOIN dbo.DimProduct as p ON fpi.ProductKey = p.ProductKey INNER JOIN dbo.DimDate as d ON fpi.DateKey = d.FullDateAlternateKey GROUP BY p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear ORDER BY p.EnglishProductName, d.CalendarYear, d.WeekNumberOfYear
Press F5 twice in order to view the CPU and elapsed time against a table. Be sure to execute the data warehouse query twice – in order to eliminate caching as a factor:
The second execution of the query returned the following results as compared to the original data collected in the first task of this exercise (please note that your results will vary based on the specifications of your lab environment – and these results are annotated with the before/after results):
BEFORE the columnstore index SQL Server Execution Times: CPU time = 2047 ms, elapsed time = 4009 ms. AFTER the columnstore index SQL Server Execution Times: CPU time = 2015 ms, elapsed time = 3380 ms.
Now turn off the STATISTICS TIME, copy and paste the following code into the New Query area of the SQL Management Studio and press F5 to run it:
SET STATISTICS TIME OFF
Next, copy and paste the following SQL code into the New Query area of the SQL Management Studio and press F5 to execute it in order to measure the I/O impact of the query:
SET STATISTICS IO ON SELECT p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear, AVG(fpi.UnitCost), SUM(fpi.UnitsOut) FROM dbo.FactProductInventory as fpi INNER JOIN dbo.DimProduct as p ON fpi.ProductKey = p.ProductKey INNER JOIN dbo.DimDate as d ON fpi.DateKey = d.FullDateAlternateKey WHERE d.CalendarYear >= 2000 GROUP BY p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear ORDER BY p.EnglishProductName, d.CalendarYear, d.WeekNumberOfYear
This should return the results similar to the following (note the logical reads of 726 for FactProductInventory for the AFTER versus the original 294682 in the annotated results below):
(93744 row(s) affected) Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'FactProductInventory'. Scan count 1, logical reads 726, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'DimProduct'. Scan count 1, logical reads 477, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'DimDate'. Scan count 1, logical reads 45, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Now turn off the STATISTICS IO, copy and paste the following SQL code into the New Query area of the SQL Management Studio and press F5 to run it:
SET STATISTICS IO OFF
- Next we’ll look at the actual graphical execution plan of the query. Enable this option by going to Query | Include Actual Execution Plan.
- Copy and paste the following code into the New Query area of the SQL Management Studio and press F5 to execute it:
SELECT p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear, AVG(fpi.UnitCost), SUM(fpi.UnitsOut) FROM dbo.FactProductInventory as fpi INNER JOIN dbo.DimProduct as p ON fpi.ProductKey = p.ProductKey INNER JOIN dbo.DimDate as d ON fpi.DateKey = d.FullDateAlternateKey WHERE d.CalendarYear >= 2000 GROUP BY p.EnglishProductName, d.WeekNumberOfYear, d.CalendarYear ORDER BY p.EnglishProductName, d.CalendarYear, d.WeekNumberOfYear
-
- In the query results pane, you will see a Results tab, a Messages tab and an Execution plan tab. Click the Execution plan tab.
Scroll to the far right of the execution plan tree and take note of the Columnstore Index Scan operator for the FactProductInventory table:
Figure 3
Columnstore Index Scan
Hover over the Columnstore Index Scan and take note of the Actual Execution Mode value, which should be Batch (notice the change from Row when no columnstore index existed):
Figure 4
Actual Execution Mode value
Please notice that If parallelism is disabled or you only are using 1 logical processor, you will not be able to see the batch execution mode.
- Disable viewing the actual execution plan by going to Query | Include Actual Execution Plan.
|
|