T-SQL in Yukon
Powerful New T-SQL Syntax Gives SQL Server a Programmability Boost
Itzik Ben-Gan
This article was based on Beta 1 of Microsoft SQL Server Code Name "Yukon" and all information contained herein is subject to change.
Note: This document was developed prior to the product's release to manufacturing and, as such, we cannot guarantee that any details included herein will be exactly the same as those found in the shipping product. The information represents the product at the time this document was printed and should be used for planning purposes only. Information subject to change at any time without prior notice.
SUMMARY
The T-SQL language in the upcoming version of SQL Server will provide more power and flexibility than previous versions. Additions and enhancements include error handling with the TRY/CATCH construct, SNAPSHOT isolation, and WAITFOR enhancements. Also important are the BULK rowset provider, common table expressions, recursive queries, PIVOT and UNPIVOT operators, and much more. This article introduces these features so readers will be ready for the next version of SQL Server.

Contents
The Beta 1 release of the next version of SQL Server™ code-named "Yukon" introduces an array of T-SQL enhancements and new features that can improve your expressive powers, error management, and performance. In this article I will discuss some of the important enhancements and new features including error handling, recursive queries, and isolation. I will focus on error management and relational improvements and briefly describe the other enhancements.
In addition to the features I will describe here, there are many other important enhancements to T-SQL that I will not cover in depth because some are not new conceptually while others deserve independent coverage. These include messaging and the Service Broker platform, partitioning, and XML enhancements. (For more information about XML enhancements in Yukon, see "XML in Yukon: New Version Showcases Native XML Type and Advanced Data Handling" by Bob Beauchemin in this issue.
Error Handling
SQL Server Yukon Beta 1 introduces the new TRY/CATCH construct for error handling in T-SQL. This construct allows you to trap transaction abort errors, even those that would have caused a batch to terminate in previous versions of SQL Server (conversion errors, deadlocks, and so on). The types of errors that the new construct cannot handle are those that cause the termination of your session (usually errors with severity 21 and above, such as hardware errors). Typically, your error-handling code looks like Figure 1.

Figure 1 Error Handling Code
SET XACT_ABORT ON
BEGIN TRY
BEGIN TRAN
-- your code goes here
COMMIT TRAN
END TRY
BEGIN CATCH TRAN_ABORT
DECLARE @err AS INT
SET @err = @@error
IF @err = <error id 1>
BEGIN
-- Handle error 1
END
ELSE IF @err = <error id 2>
BEGIN
-- Handle error 2
END
-- ... other errors ...
ELSE
BEGIN
-- Handle unexpected error
END
IF @@trancount > 0 ROLLBACK
END CATCH
The XACT_ABORT setting is turned on so that SQL Server will treat any error as a transaction abort error, thus allowing it to be trapped and handled. Within the TRY block any error that occurs within an explicit transaction causes control to be passed to the CATCH block that immediately follows the TRY block. If no error occurs, the CATCH block is skipped. If you want to investigate the type of error that was raised and react accordingly, you must save the return value of @@error to a variable at the beginning of the CATCH block and then start investigating it. Otherwise the value returned by @@error may not be correct since any statement besides a DECLARE would change it.
When a transaction abort error occurs within a transaction located in the TRY block and control is passed to the CATCH block, the transaction enters a doomed state. Locks are not released, and persisted work is not reversed until you explicitly issue a ROLLBACK command. Until you issue a ROLLBACK, you are not allowed to initiate any activity that requires opening an implicit or explicit transaction. You can examine the content of resources that were changed in the transaction that caused the error, and by doing so you see the changes, but you have to issue a ROLLBACK in order to take any corrective measures that require a transaction to occur. Note that in order to trap errors that take place within the CATCH block, you must write your code within a nested TRY/CATCH construct. To see a more detailed example, first create an ErrorLog table in which the error-handling code audits comments, and then create the T1 and T2 tables against which queries will be issued, as I've done with the code in Figure 2.

Figure 2 ErrorLog Table
CREATE TABLE ErrorLog
(
lsn INT NOT NULL IDENTITY PRIMARY KEY,
dt DATETIME NOT NULL DEFAULT(GETDATE()),
errid INT NOT NULL,
comments VARCHAR(100) NOT NULL
)
CREATE TABLE T1(col1 INT NOT NULL PRIMARY KEY)
CREATE TABLE T2(col1 INT NOT NULL PRIMARY KEY)
INSERT INTO T1 VALUES(1)
INSERT INTO T2 VALUES(1)
GO
Next, run the script in Figure 3 (call it Script 1) in a new connection (call it Connection 1). Script 1 sets the lock timeout to 30 seconds and sets the deadlock priority to low, volunteering to be the deadlock victim in a deadlock scenario with a process that runs with a normal priority. The code in the TRY block updates T1, waits 10 seconds, then selects from T2. If the transaction completes with no errors, a row is inserted into the ErrorLog table with a comment saying that it completed successfully.

Figure 3 Script 1
SET XACT_ABORT ON
SET LOCK_TIMEOUT 30000
SET DEADLOCK_PRIORITY LOW
DECLARE
@err AS INT,
@comments AS VARCHAR(100),
@retry AS INT,
@numretries AS INT
SELECT @retry = 0, @numretries = 2
RETRY:
BEGIN TRY
BEGIN TRAN
UPDATE T1 SET col1 = 2
WAITFOR DELAY '00:00:10'
SELECT * FROM T2
COMMIT
IF @@error = 0
INSERT INTO ErrorLog(errid, comments)
VALUES(0, 'Transaction completed successfully')
END TRY
BEGIN CATCH TRAN_ABORT
SET @err = @@error
IF @err = 2627
BEGIN
SELECT @retry = 0, @comments = 'PK violation'
END
ELSE IF @err = 1222
BEGIN
SELECT @retry = 0, @comments = 'Lock timeout expired'
END
ELSE IF @err = 1205
BEGIN
SELECT
@retry = CASE WHEN @retry < @numretries THEN @retry + 1
ELSE 0 END,
@comments = 'Deadlock;'
+ CASE WHEN @retry > 0
THEN ' attempting retry #' +
CAST(@retry AS VARCHAR(10))
ELSE ' aborting'
END
END
ELSE
BEGIN
SELECT @retry = 0, @comments = 'Unhandled error'
END
IF @@trancount > 0 ROLLBACK
INSERT INTO ErrorLog(errid, comments)
VALUES(@err, @comments)
IF @retry > 0
BEGIN
WAITFOR DELAY '00:00:05'
GOTO RETRY
END
END CATCH
The CATCH block is designed to trap primary key violation errors, lock timeout expiration, and deadlock errors with retry logic. You can set the desired number of retries by changing the value assigned to the variable @retry at the beginning of the code; it is currently set to 2.
After running the code in
Figure 3 for the first time, examine the contents of ErrorLog. Note that the transaction completed successfully. To test a primary key violation error, open a new connection (call it Connection 2) and run the following code:
Go back to Connection 1 and run Script 1 again. If you examine the contents of ErrorLog, you should see that a primary key violation error was logged. Go to Connection 2 and delete the row you just inserted by running this command:
DELETE FROM T1 WHERE col1 = 3
To test a lock expiration timeout, run the following in Connection 2:
BEGIN TRAN
UPDATE T1 SET col1 = 1
Go back to Connection 1 and run Script 1 again. After about 30 seconds, you should get an error. Examine the contents of ErrorLog and notice that a lock timeout expiration was logged. Go to Connection 2 and issue a ROLLBACK command to roll back the transaction.
To test a deadlock, go to Connection 2 and paste in the following code, but don't run it yet:
DECLARE @i AS INT
BEGIN TRAN
SET @i = 1
WHILE @i <= 2
BEGIN
UPDATE T2 SET col1 = 2
WAITFOR DELAY '00:00:10'
SELECT * FROM T1
WAITFOR DELAY '00:00:05'
SET @i = @i + 1
END
ROLLBACK
Go to Connection 1, run the code in Script 1, and immediately afterward run the code in Connection 2. After about a minute you should see the error in Connection 1. Examine the contents of ErrorLog and notice that there were two retry attempts after deadlock errors and a third successful attempt that raised no error. Query the ErrorLog table and examine its contents.
Finally, if you want to raise your own transaction abort error within the TRY block, you can invoke the RAISERROR command with the TRAN_ABORT option.
SNAPSHOT Isolation
Yukon introduces a new isolation level called SNAPSHOT which allows you to work in a mode where writers do not block readers, providing readers with a committed version of the data they request. SQL Server Yukon maintains a linked list in tempdb that tracks changes to rows and constructs an older, committed version of data for readers. This isolation is useful for optimistic locking, where UPDATE conflicts are not common. If Process 1 retrieves data and later attempts to modify it, and if Process 2 has modified the same data between the retrieval and modification, then SQL Server produces an error upon Process 1's modification attempt because of the conflict. Then, Process 1 can try to reissue the transaction. This mode can be efficient in situations where update conflicts are not common.
In order to allow working in a SNAPSHOT isolation level, you must turn on the database option ALLOW_SNAPSHOT_ISOLATION, as you'll see shortly. To simulate a scenario in which writers do not block readers, create the testdb database, turn on the appropriate database option, and create the T1 table with the value "Version1" in the datacol by running the following code:
CREATE DATABASE testdb
GO
USE testdb
ALTER DATABASE testdb SET ALLOW_SNAPSHOT_ISOLATION ON
CREATE TABLE T1
(
keycol INT NOT NULL PRIMARY KEY,
datacol VARCHAR(10) NOT NULL
)
INSERT INTO T1 VALUES(1, 'Version1')
Issue the following code from Connection 1, which opens a transaction and changes the value in the datacol to "Version2":
USE testdb
BEGIN TRAN
UPDATE T1 SET datacol = 'Version2' WHERE keycol = 1
SELECT * FROM T1
Go to Connection 2 and run the following code, which sets the session's isolation to SNAPSHOT and retrieves the contents of T1:
USE testdb
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
SELECT * FROM T1
Notice that you get "Version1" back, even though Connection 2 changed it to "Version2" (but had not committed that change yet).
Now go to Connection 1 and commit the transaction and close all connections with a COMMIT command. To try optimistic locking, open two new connections, go to Connection 1, and run the following code, which sets the session's isolation to SNAPSHOT, opens a transaction, and retrieves data from T1:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRAN
SELECT * FROM T1
Go to Connection 2 and issue an UPDATE:
UPDATE T1 SET datacol = 'Version3' WHERE keycol = 1
Go back to Connection 1 and try to update the same data you previously retrieved, which was modified by Connection 2:
UPDATE T1 SET datacol = 'Version4' WHERE keycol = 1
You should get an error notifying you that SQL Server cannot use snapshot isolation to access table T1 in database testdb and that you should retry your transaction.
The WAITFOR Command
The WAITFOR command has been enhanced in several ways in Yukon. Besides waiting for a specified duration or until a certain datetime value, you can now request to wait for a T-SQL statement to affect at least one row. You can specify that the command wait on one of the following statements: SELECT, INSERT, UPDATE, DELETE, or RECEIVE. The first four are self-explanatory; RECEIVE refers to receiving a message from a queue. You can optionally specify a timeout if you want to stop waiting after your specified number of milliseconds. The syntax of the WAITFOR command is:
WAITFOR(<statement>) [,TIMEOUT <timeout_value>]
Another T-SQL enhancement in Yukon allows you to return output from Data Manipulation Language (DML) statements other than SELECT (INSERT, UPDATE, DELETE). A new OUTPUT clause allows you to request that the old/new images of the columns be returned by referring to the INSERTED and DELETED tables, similar to the way you refer to them in triggers. You can even specify an INTO clause and direct the output to a table variable. Another enhancement lets you specify the READPAST hint with modifying statements, allowing you to skip locked rows.
An example of using the aforementioned enhancements is to have several processes waiting for a DELETE statement to delete at least one row from a table, directing the output to a table variable, with each processing different portions of the data in parallel. To witness this, create the following MsgQueue table:
USE tempdb
CREATE TABLE MsgQueue
(
msgid INT NOT NULL IDENTITY PRIMARY KEY,
msgdata VARCHAR(15) NOT NULL
)
Open one or more connections and run the following code in each to periodically insert new messages into the table:
SET NOCOUNT ON
USE tempdb
WHILE 1 = 1
BEGIN
INSERT INTO MsgQueue VALUES('Msg' +
CAST(CAST(RAND()*1000000000 AS INT) AS VARCHAR(10)))
WAITFOR DELAY '00:00:01'
END
Next you need to open several other new connections and run the code in
Figure 4 in each connection to simulate processing of the newly arriving messages.

Figure 4 Processing Queued Messages
SET NOCOUNT ON
USE tempdb
DECLARE @MyQueue TABLE
(
msgid INT NOT NULL PRIMARY KEY,
msgdata VARCHAR(15) NOT NULL
)
WHILE 1 = 1
BEGIN
BEGIN TRAN
WAITFOR
(
DELETE FROM MsgQueue WITH (READPAST)
OUTPUT DELETED.* INTO @MyQueue
)
PRINT 'Processing messages ...'
SELECT * FROM @MyQueue
-- Message processing goes here
DELETE FROM @MyQueue
COMMIT TRAN
END
BULK Rowset Provider
Yukon introduces the new BULK rowset provider, which you can specify in the OPENROWSET function to efficiently access files in a relational fashion. You can use the BULK provider in a manner similar to the way you use a BULK INSERT statement, but you don't have to send the output to a table. You do have to specify a format file, which is the same format file you use with bcp.exe or the BULK INSERT statement. The following code shows how to access a file called c:\temp\textfile1.txt using the format file c:\temp\textfile1.fmt, providing the alias C to the result table, and the aliases col1, col2, and col3 to the result columns:
SELECT col1, col2, col3
FROM OPENROWSET(BULK 'c:\temp\textfile1.txt',
FORMATFILE = 'c:\temp\textfile1.fmt') AS C(col1, col2, col3)
In addition to the FORMATFILE option, you can also specify the following options within the parentheses of the OPENROWSET function: CODEPAGE, DATAFILETYPE, FIELDTERMINATOR, FIRSTROW, LASTROW, and ROWTERMINATOR. You can also use an INSERT SELECT to load the data into a table efficiently and can specify table hints for load options:
INSERT INTO MyTable WITH (BULK_CHECK_CONSTRAINTS)
SELECT col1, col2, col3
FROM OPENROWSET(BULK 'c:\temp\textfile1.txt',
FORMATFILE = 'c:\temp\textfile1.fmt') AS C(col1, col2, col3)
Other options you can specify as a table hint include: BULK_BATCHSIZE, BULK_FIRE_TRIGGERS, BULK_KEEPIDENTITY, BULK_KEEPNULLS, BULK_KILOBYTES_PER_BATCH, BULK_MAXERRORS, and ROWS_PER_BATCH.
Using the BULK rowset provider you can load a file into a table's column more easily than ever before using regular DML. Now, instead of being limited to the TEXT, NTEXT, and IMAGE datatypes for large objects, you can also use the VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) datatypes. The new MAX option allows you to manipulate large objects the same way you manipulate regular datatypes. For example, the following UPDATE statement stores a text file in a table column that is defined as VARCHAR(MAX):
UPDATE LOBs
SET clob_col = (SELECT clob_data
FROM OPENROWSET(BULK 'c:\temp\textfile1.txt',
SINGLE_CLOB) AS C(clob_data))
WHERE keycol = 1
The SINGLE_NCLOB option tells SQL Server that the large object is in character format. Similarly, SINGLE_CLOB specifies a large object with regular character format, and SINGLE_BLOB specifies a binary format. The name of the returned column is BulkColumn, but as the previous code snippet shows, you can specify your own alias for this.
The TOP Option
The T-SQL TOP option in Yukon has two significant enhancements. You can now specify as an argument to TOP an expression that can contain variables or even a self-contained query. You can also use the TOP option with modifying DML (INSERT, UPDATE, DELETE).
To specify an expression, you must enclose it in parentheses. The expression should be of the BIGINT datatype when not using the PERCENT option, and a float value in the range 0 through 100 when using the PERCENT option. The following code shows how to use an expression with a variable to return the requested number of earliest orders from the SalesOrderHeader in the AdventureWorks database:
USE AdventureWorks
DECLARE @n AS BIGINT
SET @n = 5
SELECT TOP (@n) *
FROM SalesOrderHeader AS SOH
ORDER BY OrderDate, SalesOrderID
The SalesOrderID is used as a tiebreaker. Similarly, the following example demonstrates how to use the PERCENT option to return the requested percentage of earliest orders:
DECLARE @p AS FLOAT
SET @p = 0.01
SELECT TOP (@p) PERCENT *
FROM SalesOrderHeader AS SOH
ORDER BY OrderDate, SalesOrderID
You should get 4 rows as a result since the SalesOrderHeader table contains 31,519 rows, and 31,519 x .0001 rounded up equals 4.
The main motivation for allowing TOP with modifying DML was to substitute the SET ROWCOUNT option, which SQL Server could not optimize very well. The SET ROWCOUNT option was often used with modifications to process large numbers of rows in batches, preventing the transaction log from exploding and keeping the individual locks from escalating to a full table lock. To see how to delete rows in batches using the new TOP functionality, first copy the contents of the SalesOrderHeader table to MySalesOrderHeader, and create an index on the OrderDate, and SalesOrderID columns by running the following code:
SELECT *
INTO MySalesOrderHeader
FROM SalesOrderHeader
CREATE UNIQUE CLUSTERED INDEX idx_uc_OrderDate_SalesOrderID
ON MySalesOrderHeader(OrderDate, SalesOrderID)
To delete all rows with an order year earlier than 2003 in batches of 1,000, use the following code:
WHILE 1 = 1
BEGIN
DELETE TOP (1000)
FROM MySalesOrderHeader WHERE OrderDate < '20030101'
IF @@rowcount < 1000 BREAK
END
SQL Server optimizes such code much more efficiently than it can with the SET ROWCOUNT option. You can now drop the MySalesOrderHeader table:
DROP TABLE MySalesOrderHeader
The APPLY Operator
APPLY is a new relation operator that you specify in the FROM clause of a query. It allows you to invoke a table-valued function for each row of an outer table, optionally using the outer table's columns as the function's arguments. The APPLY operator comes in two forms: CROSS APPLY and OUTER APPLY. The former doesn't return the outer table's row if the table-valued function returns an empty set for it, while the latter returns a row with NULLs instead of the function's columns. To use the APPLY operator, first create the following Arrays table, which stores arrays of comma-separated values:
CREATE TABLE Arrays
(
arrid INT NOT NULL IDENTITY PRIMARY KEY,
array VARCHAR(7999) NOT NULL
)
INSERT INTO Arrays VALUES('')
INSERT INTO Arrays VALUES('10')
INSERT INTO Arrays VALUES('20,40,30')
INSERT INTO Arrays VALUES('-1,-3,-5')
Next, create the fn_splitarr table-valued function, which accepts an array as an argument and returns a table containing the individual elements and their positions (see
Figure 5). To test the function, run this code:
SELECT * FROM fn_splitarr('20,40,30')
The output should look like the following lines:
pos value
--- -----
1 20
2 40
3 30
Now use the CROSS APPLY operator to invoke the function for each row in Arrays:
SELECT A.arrid, F.*
FROM Arrays AS A
CROSS APPLY fn_splitarr(array) AS F
Then check the values in your output against these:
arrid pos value
----- --- -----
2 1 10
3 1 20
3 2 40
3 3 30
4 1 -1
4 2 -3
4 3 -5
Note that the row with arrid 1 from Arrays was not returned, as the function returned an empty set for it. To return all rows from Arrays regardless of whether the function returns rows for them or not, use OUTER APPLY.

Figure 5 fn_splitarr Table-valued Function
CREATE FUNCTION fn_splitarr(@arr AS VARCHAR(7999))
RETURNS @t TABLE(pos INT NOT NULL, value INT NOT NULL)
AS
BEGIN
DECLARE @end AS INT, @start AS INT, @pos AS INT
SELECT @arr = @arr + ',', @pos = 1,
@start = 1, @end = CHARINDEX(',', @arr, @start)
WHILE @end > 1
BEGIN
INSERT INTO @t VALUES(@pos, SUBSTRING(@arr, @start, @end - @start))
SELECT @pos = @pos + 1,
@start = @end + 1, @end = CHARINDEX(',', @arr, @start)
END
RETURN
END
GO
A bonus feature of support for the APPLY operator is that you can now refer to table-valued functions and specify the outer table's columns as arguments in your subqueries. For example, the following code returns all of the Arrays whose sum of its elements is less than or equal to 10:
SELECT *
FROM Arrays
WHERE (SELECT SUM(value) FROM fn_splitarr(array)) <= 10
Common Table Expressions and Recursive Queries
Common table expressions (CTEs) allow you to write named table expressions that persist for only the duration of a query. In their simple form, they provide a mixed functionality of views and derived tables. Like views, the CTE can be referred to more than once in the outer query, and like a derived table, it persists for only the duration of the query. In their more complex form, you can write recursive CTEs, allowing you to manipulate trees and graphs more easily and efficiently.
You define a CTE using a WITH clause followed by the CTE's name and optionally provide a list of result column aliases in parentheses. You follow with the AS clause and parentheses containing the CTE's query expression. You finish by providing an outer query which refers to the CTE's result. Within the CTE's query expression, you can refer to variables if you like.
The code in Figure 6 shows a simple example for writing a non-recursive CTE that returns yearly customer's sales order values. Obviously you could achieve the same results without using a CTE, but suppose with each row you also want to return the previous year's total value, and the difference from the current year. If you choose to use derived tables, you have to specify the current year's query in one derived table, and the previous year's query in another, and have the outer query join the two. With CTEs you can write a single query returning yearly totals, and have the outer query refer to it twice (see Figure 7).

Figure 7 Single Query Returning Yearly Totals
DECLARE @CustomerID AS INT
SET @CustomerID = 3
WITH YearlyCustOrders(OrderYear, TotalValue)
AS
(
SELECT YEAR(OrderDate), SUM(OrderQty*UnitPrice)
FROM SalesOrderHeader AS H
JOIN SalesOrderDetail AS D
ON H.SalesOrderID = D.SalesOrderID
WHERE CustomerID = @CustomerID
GROUP BY YEAR(OrderDate)
)
SELECT Cur.*, Prev.TotalValue AS PrevTotalValue,
(Cur.TotalValue/Prev.TotalValue - 1) * 100 AS Diff
FROM YearlyCustOrders AS Cur
LEFT OUTER JOIN YearlyCustOrders AS Prev
ON Cur.OrderYear = Prev.OrderYear + 1;

Figure 6 A Non-recursive CTE
USE AdventureWorks
DECLARE @CustomerID AS INT
SET @CustomerID = 3
WITH YearlyCustOrders(OrderYear, TotalValue)
AS
(
SELECT YEAR(OrderDate), SUM(OrderQty*UnitPrice)
FROM SalesOrderHeader AS H
JOIN SalesOrderDetail AS D
ON H.SalesOrderID = D.SalesOrderID
WHERE CustomerID = @CustomerID
GROUP BY YEAR(OrderDate)
)
SELECT * FROM YearlyCustOrders;
The true power of CTEs, though, is in their recursive form. Within the CTE's parentheses you define queries that are either independent or refer back to the CTE. The independent queries (those without a reference to the CTE's name) are called anchor members and are invoked only once. The queries that refer back to the CTE's name are called recursive members and are invoked repeatedly until the query returns no rows. Anchor members can be appended to each other using either a UNION or a UNION ALL operator, depending on whether you want to eliminate duplicates or not. You must append recursive members using a UNION ALL operator.
As an exmaple of a scenario in which recursive CTEs are useful, consider the BillOfMaterials table in the AdventureWorks database. This table represents a typical bill of materials scenario with product assemblies that form an acyclic directed graph. Each product is assembled from other products, which might in turn be assembled from other products as well, and there are no cyclic relationships. The assembly product-contained product relationship is expressed by the AssemblyID and ComponentID columns. The PerAssemblyQty contains the quantity of the product expressed in ComponentID for each individual product expressed in AssemblyID. Relationships that become obsolete have a date specified in the ObsoleteDate column. If you're interested only in non-obsolete data, you should test this column for NULLs. There is other useful information in the table including the unit measure, but for the sake of this illustration I'll ignore all other columns.
The code in Figure 8 yields the data for an exploded view of ProductID 210. Figure 9 shows an excerpt of this view; and depicts the containment relationships between the products. Within the CTE's body, the first query doesn't have a reference to the CTE's name, so it is an anchor member and as such is invoked only once. Notice that the query looks for a row where the component ID is 210, with an assembly ID of NULL, meaning that it's a top-level product. The query makes sure that the relationship is not obsolete, and returns the component ID and quantity. The recursive member returns the products contained within the assemblies returned from the previous step by issuing a join between the CTE's name and the BillOfMaterials table. The first time the recursive member is invoked, the previous step is the result returned by the anchor member. The second time it is invoked, the previous step is the result returned by the first invocation of the recursive member, and so on until the recursive member returns an empty set.

Figure 9 Exploded View of ProductID 210

Figure 8 Details on ProductID 210
WITH BOMCTE(ProductID, Qty)
AS
(
SELECT ComponentID, PerAssemblyQty
FROM BillOfMaterials
WHERE ComponentID = 210
AND AssemblyID IS NULL
AND ObsoleteDate IS NULL
UNION ALL
SELECT BOM.ComponentID, BOMCTE.Qty * BOM.PerAssemblyQty
FROM BOMCTE
JOIN BillOfMaterials AS BOM
ON BOM.ObsoleteDate IS NULL
AND BOM.AssemblyID = BOMCTE.ProductID
)
SELECT B.ProductID, P.Name, B.Qty
FROM BOMCTE AS B
JOIN Product AS P
ON P.ProductID = B.ProductID
ORDER BY P.ProductID;
The recursive member calculates the cumulative quantity of the component by multiplying the previous step's quantity by the component's quantity. The reference of the outer query to the CTE's name gets the unified results of all of the invocations of the anchor and recursive members. The outer query joins the CTE to the Products table to get the product names and produces the 90 rows in
Figure 10 (abbreviated). Each component can appear more than once in the output, for example product 835, since it might participate in different assemblies. You can revise the outer query to group the results by product ID and name to get the total quantities per product. The code looks like
Figure 8, but the outer query will look like this:
SELECT B.ProductID, P.Name,
SUM(B.Qty) AS TotalQty
FROM BOMCTE AS B
JOIN Product AS P
ON P.ProductID = B.ProductID
GROUP BY B.ProductID, P.Name
ORDER BY B.ProductID;

Figure 10 CTE Joined to the Products Table
ProductID Name Qty
---------- ------------------------------ ------------------
1 Adjustable Race 1
2 Bearing Ball 100
3 LL Bottom Bracket 1
...
835 Metal Sheet 5 2
835 Metal Sheet 5 1
835 Metal Sheet 5 2
...
(90 row(s) affected)
If you suspect cycles and want to limit the number of recursive invocations, you can specify the MAXRECURSION option right after the outer query:
WITH...
outer_query
OPTION(MAXRECURSION 30)
This option causes SQL Server to raise an error when the CTE exceeds the specified limit. The default in SQL Server is 100 when not specifying this option. If you don't want to have a limit, you must specify 0. Note that you can write custom code that detects cyclic relationships, but that is beyond the scope of this article.
The PIVOT Operator
The new PIVOT operator in SQL Server Yukon allows you to write crosstab queries that rotate rows into columns. The UNPIVOT operator does the opposite—working on pivoted data to rotate columns into rows. Figure 11 shows the result of using the PIVOT operator in the database when you want to return yearly total sales order values for each salesperson, with each year's value appearing in a different column.

Figure 11 Result of a PIVOT
SalesPersonID 2001 2002 2003 2004
------------ -------------- -------------------- --------------
20 NULL NULL 197303.2486 31680.0348
33 569966.4019 1475303.3042 1693905.5061 685729.9303
29 1665446.9384 3104537.8510 2720479.1674 1311261.2956
•••
(18 row(s) affected)
The important thing to note when using the PIVOT operator is that you need to provide it with a query expression that returns only the columns of interest by using a view, derived table, or a CTE. The reason is that PIVOT makes an implicit GROUP BY operation behind the scenes on all columns that are not explicitly referenced by the operator. In this case, you need the salesperson ID, the order year, and the order value:
USE AdventureWorks
SELECT
SOH.SalesPersonID,
YEAR(SOH.OrderDate) AS OrderYear,
SOD.OrderQty * SOD.UnitPrice AS OrderValue
FROM SalesOrderHeader AS SOH
JOIN SalesOrderDetail AS SOD
ON SOD.SalesOrderID = SOH.SalesOrderID
SQL Server figures out that the "GROUP BY" column list should be the list of columns in the input table that were not referred to explicitly by the PIVOT operator in the aggregate function or in the IN clause. So that you don't get undesired columns in the implicit GROUP BY column list, you need to provide the PIVOT operator with an input table containing only the columns of interest for the aggregate function, IN clause, and implicit GROUP BY. This can be achieved by using a CTE or a derived table containing the previous query that returns only the columns of interest.
The code in Figure 12 shows how you use this query within a CTE, and have the outer query issue a PIVOT operation on the CTE's results. SUM(OrderValue) tells PIVOT which aggregation to calculate to populate the cells of the pivoted column. The FOR clause tells PIVOT which source column contains the values to rotate into result columns. The IN clause contains the list of values that are going to appear as result column names.

Figure 12 PIVOT on CTE
WITH OrdersCTE(SalesPersonID, OrderYear, OrderValue)
AS
(
SELECT
SOH.SalesPersonID,
YEAR(SOH.OrderDate),
SOD.OrderQty * SOD.UnitPrice
FROM SalesOrderHeader AS SOH
JOIN SalesOrderDetail AS SOD
ON SOD.SalesOrderID = SOH.SalesOrderID
)
SELECT *
FROM OrdersCTE
PIVOT(SUM(OrderValue)
FOR OrderYear IN([2001], [2002], [2003], [2004])) AS P
SQL Server requires you to specify explicitly the list of values in the IN clause to rotate to result columns. You can't use a static query and have SQL Server figure out all distinct values in OrderYear. To achieve this, you have to use dynamic execution to construct the query string dynamically, as the code in Figure 13 shows.

Figure 13 Constructing a Dynamic Query String
CREATE INDEX idx_nc_OrderDate ON SalesOrderHeader(OrderDate)
DECLARE @sql AS NVARCHAR(4000), @dt AS DATETIME
SET @sql =
N'WITH OrdersCTE(SalesPersonID, OrderYear, OrderValue)
AS
(
SELECT
SOH.SalesPersonID,
YEAR(SOH.OrderDate),
SOD.OrderQty * SOD.UnitPrice
FROM SalesOrderHeader AS SOH
JOIN SalesOrderDetail AS SOD
ON SOD.SalesOrderID = SOH.SalesOrderID
)
SELECT *
FROM OrdersCTE
PIVOT(SUM(OrderValue) FOR OrderYear IN('
SET @dt =
(SELECT CONVERT(CHAR(4), YEAR(MIN(OrderDate))) + '0101'
FROM SalesOrderHeader)
WHILE @dt IS NOT NULL
BEGIN
SET @sql = @sql + N'[' + CAST(YEAR(@dt) AS NCHAR(4)) + N']'
SET @dt =
(SELECT CAST(YEAR(MIN(OrderDate)) AS CHAR(4)) + '0101'
FROM SalesOrderHeader
WHERE OrderDate >= DATEADD(year, 1, @dt))
IF @dt IS NOT NULL SET @sql = @sql + N','
END
SET @sql = @sql + N')) AS P'
EXEC sp_executesql @sql
To see the UNPIVOT operator in action, first create the SalesPivoted table by running the query in
Figure 12 with the addition of "SELECT INTO SalesPivoted" before the FROM clause (see
Figure 14). The UNPIVOT operator's arguments are very similar to PIVOT's arguments. But this time you specify a name of a result column that will contain the pivoted cell's values all in one column. Following the FOR clause, you specify the name of the result column that will store the names of the pivoted columns as column values. In the parentheses following the IN clause, specify the list of pivoted columns you want to unpivot:
SELECT *
FROM SalesPivoted
UNPIVOT(OrderValue
FOR OrderYear IN([2001], [2002], [2003], [2004])) AS U

Figure 14 Setting up for UNPIVOT
WITH OrdersCTE(SalesPersonID, OrderYear, OrderValue)
AS
(
SELECT
SOH.SalesPersonID,
YEAR(SOH.OrderDate),
SOD.OrderQty * SOD.UnitPrice
FROM SalesOrderHeader AS SOH
JOIN SalesOrderDetail AS SOD
ON SOD.SalesOrderID = SOH.SalesOrderID
)
SELECT *
INTO SalesPivoted
FROM OrdersCTE
PIVOT(SUM(OrderValue) FOR OrderYear
IN([2001], [2002], [2003], [2004])) AS P
UNPIVOT does not return rows for cells that contained NULL values. To clean up the extra table and index I've built in the database, run the following code:
DROP INDEX SalesOrderHeader.idx_nc_OrderDate
DROP TABLE SalesPivoted
Triggers and Notifications
SQL Server Yukon Beta 1 introduces support for Data Definition Language (DDL) triggers, allowing you to trap DDL operations and react to them, optionally rolling back the activity. DDL triggers work synchronously, right after the triggering event, similar to the way triggers from previous versions of SQL Server worked. SQL Server also supports an asynchronous event consumption mechanism using notifications, which lets you subscribe to be notified when certain events occur.
The following trigger is created at the database level and captures DROP TABLE attempts:
CREATE TRIGGER prevent_drop_table ON DATABASE FOR DROP_TABLE
AS
RAISERROR('Not allowed to drop tables.', 10, 1)
ROLLBACK
-- For debug
PRINT 'DROP TABLE attempt in database ' + DB_NAME() + '.'
PRINT EventData()
GO
You can define the trigger to fire for specific DDL events such as CREATE_TABLE, DROP_TABLE, ALTER_TABLE, CREATE_VIEW, and so on, or you can specify DDL_DATABASE_LEVEL_EVENTS if you want the trigger to fire for all DDL events in the database. Within the trigger you can invoke the EventData function to return information about the process and activity that fired the trigger. You can investigate the XML returned by the function and react accordingly.
To test the trigger, first create the table TestDrop and insert a row into it by running the following code:
CREATE TABLE TestDROP(col1 INT)
INSERT INTO TestDROP VALUES(1)
Next, attempt to drop the table:
The DROP attempt is trapped and a message is printed indicating that it is not allowed to drop the table. Also, the return value of the EventData function is printed in XML format for debug purposes. (In practice, within the trigger you can examine the XML data, which contains a lot of useful information, in order to determine which course of action best suits your needs. For example, you can prevent dropping certain tables at specific times of the day.) The trigger rolled back the activity, so the table is not dropped from the database. To drop the trigger, issue the following code statement:
DROP TRIGGER prevent_drop_table ON DATABASE
You can also create a trigger that captures server-level events. For example, the following trigger captures login manipulation events such as creating, altering, or dropping a login:
CREATE TRIGGER audit_ddl_logins ON ALL SERVER
FOR CREATE_LOGIN, ALTER_LOGIN, DROP_LOGIN
AS
PRINT 'DDL LOGIN took place.'
PRINT EventData()
GO
In this case, the trigger merely prints a notice that the event took place along with details of the event. But, of course, you can investigate the event details and react accordingly. To test the trigger, run the following code and then take a look at the result:
CREATE LOGIN login1 WITH PASSWORD = '123'
ALTER LOGIN login1 WITH PASSWORD = 'xyz'
DROP LOGIN login1
A DDL login event was identified and the event data was produced in XML format. You could examine the event data and audit the information that is important to you, if you'd like.
If you want to drop the trigger, run the following code:
DROP TRIGGER audit_ddl_logins ON ALL SERVER
Wrapping It Up
The T-SQL enhancements and new features provided with Yukon allow you to manipulate data more efficiently, develop applications more easily, and increase your error handling capabilities. When dealing with data manipulation, T-SQL is still the best development choice in SQL Server, and now you have a richer environment for your development. To make it easier for you to experiment with this new feature set, all of the samples described in this article are available for download from the link at the top of this article.
For background information see:
SQL Server
http://www.sqlmag.com
Itzik Ben-Gan is a Principal Mentor with Solid Quality Learning. He teaches, lectures, writes, and consults about SQL Server internationally. Itzik coauthored the book
Advanced Transact-SQL for SQL Server 2000 (APress, 2000). He manages the Israeli SQL Server Users Group and is a SQL Server MVP. Contact him at
itzik@solidqualitylearning.com.