Breaking the Spell on Stored Procedures and Functions
This is the first of three articles by Herts Chen on overcoming the inability of calling stored procedures in UDFs and SQL statements by using OPENQUERY. It discusses one of OPENQUERY's limitations and shows how to make OPENQUERY work around one of them.
One of the strengths of RDBMSs is their underlying set-based query language—SQL. A query is always more concise and better optimized if you can express it in a single SQL statement than if you have to implement it in a batch of sequential T-SQL scripts. For example, suppose you have a 200MB table that contains a half million records. If you update an un-indexed column of this entire table in a single UPDATE statement, it may take about two minutes. However, if you do the same modification by issuing a half million UPDATE statements in a WHILE loop, it will cost you nearly three hours. There are, however, situations or "use cases" in which it's impossible to derive the results in a single SQL statement. One alternative is to encapsulate sequential details in these use cases into a stored procedure or a user-defined function (UDF).
UDFs, of course, can be directly embedded within SQL statements. In a sense, functions increase the expressive power of SQL because they help SQL to modularize as many sequential details as necessary. Stored procedures, however, can't be expressed in a SQL statement, so from the perspective of coding a SQL statement, functions seem to be a more useful tool.
Unfortunately, functions can't completely replace procedures. In fact, SQL Server poses much tighter restrictions on functions than on procedures. For example, in functions, you can't change the global database state—for example, update a table. Also, SQL Server packages more of its system information into stored procedures—sp_who, for instance—than into system functions, such as user_name(). To put procedures and functions in perspective, in SQL 2000 master database alone, there are 973 stored procedures that help you SELECT, UPDATE, INSERT, or DELETE any imaginable system information. However, there are only 101 functions that let you SELECT but not change configuration, metadata, security, system, and system statistical information.
Sometimes you might wish you could encapsulate any system information in a single SELECT statement. For example, it would be nice to see just the blocking information of user processes by "SELECT spid, login, BlkBy FROM sp_who2 WHERE spid > 50". Other times, you might wish you could perform complex changes to the database state in a single INSERT, UPDATE, or DELETE statement. Unfortunately, intellectual fantasies like these will never come true—at least not until we can call stored procedures in SQL or within a function that in turn embeds in SQL. SQL Server currently prohibits calling any stored procedure or updating any table from within a function—a restriction that prevents us from using system information or doing some complex changes in a single SQL statement. This article is the first of a series that intends to break the "spell" on functions and procedures so that you can bring the best of them together in a SQL statement.
Set-based queriesThere are many DBCC commands or stored procedures (sp's) that return an output that looks like a result set, and it would be ideal if these outputs could be joined with other result sets directly. This would be handy, for example, when you're accessing these result sets programmatically, either to automate system administration tasks or to integrate data from multiple sources for an application. Without being able to join multiple result sets of DBCC or sp in a SQL statement, and without being able to declare a cursor over a DBCC or an sp, your only option is to use multiple temporary tables. Worse yet, this last resort is only possible if you know the data type of every column of a DBCC or sp result set so that you can create a temporary table before inserting the result set of the DBCC or sp into the temporary table. (This is because you can't use SELECT INTO to "blindly" create the temporary table from a DBCC or sp result set.) If the result set of a DBCC or sp is undocumented—for instance, sp_who2—and you create its temporary table solely based on one of its snapshots, this temporary table is likely to break when a later snapshot returns a longer string or a different data type. In Listing 1, I first show a much needed, but unsupported, set-based SQL statement that joins result sets of two sp's. Then I turn to the "last resort" alternative that relies on temporary tables.
Listing 1. Comparison of set-based query and procedural scripts.
--set-based query: --a much needed, but unsupported, SQL select * from (Exec sp_who) w inner join (Exec sp_lock) l on w.spid = l.spid where w.spid > 50 and w.blk <> 0 --a batch of scripts: --temporary tables must be created first before insert create table #who(spid smallint,ecid smallint,status nchar(30),loginame nchar(128), hostname nchar(128),blk char(5), dbname nchar(128), cmd nchar(16)) create table #lock(spid smallint,dbid smallint, ObjId int,IndId smallint,Type nchar(4), Resource nchar(16),Mode nvarchar(8),Status nvarchar(5)) insert #who Exec sp_who insert #lock Exec sp_lock select * from #who w inner join #lock l on w.spid = l.spid where w.spid > 50 and w.blk <> 0 drop table #who drop table #lock
As you can see, the SQL statement is far more concise and efficient than the batch of scripts. There are, however, system limitations that may force a query to depend on intermediate results of other queries and be broken down to sequential steps. One of the limitations is the inability of directly calling DBCC commands or stored procedures anywhere in a SQL statement. The following sections explore some very common scenarios.
Use case 1: Blocking processes stored procs and DBCC
That said, wouldn't it be better if you could interactively punch in a SQL statement in Query Analyzer and quickly retrieve just the blocking processes, the blocking application hosts, the objects they've locked, and the details of their last SQL batches? In an emergency, this result set helps you make an informed decision on which process to notify or kill. From a long-term perspective, the result set helps in reshaping the application's locking behavior. The following would be such an ideal SQL statement. Note that the BlkBy column of sp_who2 records the spid of a blocking process. If a process is not blocked, its BlkBy column will have a value of " . " instead of an spid. The head of a blocking chain thus is a process whose BlkBy is " . ", but whose spid shows up in at least one BlkBy.
select blocking.*, l.*, i.* from (exec sp_who2) blocking inner join (Exec sp_lock) l on l.spid = blocking.spid inner join (DBCC INPUTBUFFER(all)) i on l.spid = i.spid where blocking.BlkBy like '%.%' and blocking.spid in (select CAST(blocked.BlkBy as int) from (exec sp_who2) blocked where blocked.BlkBy not like '%.%')
Use case 2: Extended stored procedures
SELECT t.taxlot_id, i.*, v.* From (Exec xp_intersects('polygon_100_year_flood', 'taxlots', 50)) t inner join insurance i on t.taxlot_id = i.taxlot_id inner join market_value v on t.taxlot_id = v.taxlot_id
Use case 3: XML data sources
-- parameter @XMLDoc varchar(8000) contains XML string DECLARE @iDoc int Exec sp_xml_preparedocument @iDoc OUTPUT, @XMLDoc SELECT * FROM OpenXML(@iDoc, '/ROOT/Order',0) WITH (ProductID int, Quantity int) o inner join catalog c on o.ProductID = c.ProductID inner join warehouse w on c.WarehouseID = w.WarehouseID Exec sp_xml_removedocument @iDoc
sp_xml_preparedocument and sp_xml_removedocument are xp's for parsing the XML string in memory and de-allocating the memory, respectively.
Don't you think it would be handy if these steps could be coded in a single SQL statement by grouping the XML steps in a procedure such as sp_OpenXML?
-- parameter @XMLDoc varchar(8000) SELECT * FROM (Exec sp_OpenXML(@XMLDoc) o inner join catalog c on o.ProductID = c.ProductID inner join warehouse w on c.WarehouseID = w.WarehouseID)
The set-based approachIn order to fully utilize the power of set-based processing, we must look for some way or some tool that can help us create solutions for the use cases I've just described. The tool must be able to both participate in a SQL statement and return a result set directly from an sp, xp, OPENXML, and DBCC command. In the remainder of the article, I'll examine two such potential tools and identify the one that satisfies these two requirements.
UDFs might seem at first glance to be the logical choice. After all, they're the only user extension that can participate in SQL statements. Furthermore, UDFs can return either a table variable or a scalar value. For example, to satisfy the third use case, you might be tempted to try this:
--Function that encapsulates xp's and OPENXML --creation of function udf_xml succeeds Create function udf_xml(@XMLDoc varchar(8000)) Returns @t table(ProductID int, Quantity int) as begin declare @iDoc int Exec sp_xml_preparedocument @iDoc OUTPUT, @XMLDoc INSERT @t SELECT * FROM OPENXML(@idoc, '/ROOT/Order',0) WITH (ProductID int, Quantity int) Exec sp_xml_removedocument @iDoc return end GO --now join result set of udf_xml() in a SQL statement --execution of function udf_xml fails DECLARE @XMLDoc varchar(8000) set @XMLDoc = '<ROOT> <Order ProductID="11" Quantity="12"/> <Order ProductID="42" Quantity="10"/> <Order ProductID="72" Quantity="3"/> </ROOT>' select * from udf_xml(@XMLDoc) o inner join catalog c on o.ProductID = c.ProductID inner join warehouse w on c.WarehouseID = w.WarehouseID GO -- Error message output Server: Msg 557, Level 16, State 2, Procedure udf_xml, Line 6 Only functions and extended stored procedures can be executed from within a function.
Although udf_xml encapsulates the three steps of returning an XML result set successfully—and the join SQL statement is indeed as simple as it gets—the execution of udf_xml fails. In fact, functions alone can't solve any of the use cases presented here. This is simply due to the following limitations on functions:
- Functions can't execute stored procedures. A function can only execute functions and some xp's, but not sp's. This limitation immediately disqualifies function as a candidate for solving use cases 1 and 3.
- Functions can't create temporary or any tables. In order to return a result set of an xp, a function must be able to store the result set somewhere. Tables are a natural choice. Unfortunately, functions can't change the global database state. This includes the restriction of creating any temporary or permanent tables that will indeed change the global database state.
- Functions can't access temporary tables. Even if there were a pre-created global temporary table, SQL Server still disallows inserting data to this or any existing temporary table from within a function.
- Functions can't insert into an existing table. Another even more compromising idea is to create a permanent table outside of a function. However, inserting into an existing permanent table from within a function is considered changing global database state as well.
- The result set of executing DBCC, sp, or xp can't be inserted into a table variable. Well, what about using a table variable to store the result set? Unfortunately, you can't use EXECUTE to insert data into a table variable. However, EXECUTE is the only way within a function to issue a SQL string that may include DBCC, sp, and xp. Under this last limitation, we've just exhausted our last resort for functions, and have to conclude that UDFs simply aren't the answer to any of these use cases.
Which leaves us with OPENQUERYBased on my research, it seems to me that OPENQUERY is the only way to return the result set of sp, xp, and DBCC in a SQL statement. Still, OPENQUERY isn't perfect. For example, with OPENQUERY, you can easily return the result set of sp_who, but not sp_who2, as the following example shows:
--openquery returns the result set of sp_who just fine select * from openquery(csherts, 'Exec sp_who') go (14 row(s) affected) --openquery finds no result set from sp_who2 select * from openquery(csherts, 'Exec sp_who2') go Server: Msg 7357, Level 16, State 2, Line 1 Could not process object 'Exec sp_who2'. The OLE DB provider 'SQLOLEDB' indicates that the object has no columns.
OPENQUERY satisfies the first requirement since it can participate in the FROM clause of a SELECT statement. By using OPENQUERY, UDFs can indirectly call an sp or xp, OPENXML, or DBCC. In addition to the FROM clause, functions can appear anywhere in SQL. As to the second requirement, it's clear that OPENQUERY works for only some sp's, but not with xp's or DBCC. In fact, if OPENQUERY can't figure out the column names and types of the output result set based on the input SQL string, it will fail and simply say, "The object has no columns." For OPENXML, OPENQUERY gives you the first impression that everything's working just fine, as the first example in the following code block shows. However, a closer look at the second example reveals that OPENQUERY is unrealistic for OPENXML. That's because, in practice, an XML string is likely to be composed in a variable. Unfortunately, OPENQUERY doesn't take variables for its arguments.
Nevertheless, the combination of OPENQUERY and UDFs has the potential of supporting all these use cases. In other words, OPENQUERY is the only tool for breaking the spells on sp's and functions. In the next (the second) article of this series, I'll characterize the behavior of OPENQUERY in detail, summarize its limitations, and show how to overcome these limitations. I'll then illustrate the use of OPENQUERY in the implementation of the use cases discussed this month.
-- Example 1 OPENQUERY works for OPENXML only if XML -- is a constant string -- XML specified in the constant argument to OPENQUERY select * from openquery(csherts, ' declare @iDoc int Exec sp_xml_preparedocument @iDoc OUTPUT, ''<ROOT> <Order ProductID="11" Quantity="12"/> <Order ProductID="42" Quantity="10"/> <Order ProductID="72" Quantity="3"/> </ROOT>'' SELECT * FROM OPENXML(@idoc, ''/ROOT/Order'',0) WITH (ProductID int, Quantity int) Exec sp_xml_removedocument @iDoc ') go -- Output (3 row(s) affected) --Example 2 XML concatenates with OPENXML steps in a --variable argument to OPENQUERY. Illustrates that --OPENQUERY does not accept variable arguments DECLARE @XMLDoc varchar(8000) set @XMLDoc = '<ROOT> <Order ProductID="11" Quantity="12"/> <Order ProductID="42" Quantity="10"/> <Order ProductID="72" Quantity="3"/> </ROOT>' set @XMLDoc = ' declare @iDoc int Exec sp_xml_preparedocument @iDoc OUTPUT, ''' + @XMLDoc + ''' SELECT * FROM OPENXML(@idoc, ''/ROOT/Order'',0) WITH (ProductID int, Quantity int) Exec sp_xml_removedocument @iDoc ' select * from openquery(csherts,@XMLDoc) go -- Output Server: Msg 170, Level 15, State 1, Line 15 Line 15: Incorrect syntax near '@XMLDoc'.
[Related articles in past issues include Andrew Zanevsky's "Granting Users' Wishes with UDFs" (Sept. 2000), "Inline Table-Valued Functions" (Oct. 2000), "Multi-Statement Table-Valued Functions" (Nov. 2000), and "UDF Performance… or Lack of It" (with Anton Jiline, Oct. 2001); Tom Moreau's "Dynamic DTS Tasks and OPENROWSET" (Jan. 2001) and "How Do You Feed an Array to a Stored Procedure?" (April 2002); and Scott Whigham's "How to Write your Own System Functions" (Dec. 2001).—Ed.]
To find out more about SQL Server Professional and Pinnacle Publishing, visit their website at http://www.pinpub.com/html/main.isx?sub=57
Note: This is not a Microsoft Corporation website. Microsoft is not responsible for its content.
This article is reproduced from the May 2003 issue of SQL Server Professional. Copyright 2003, by Pinnacle Publishing, Inc., unless otherwise noted. All rights are reserved. SQL Server Professional is an independently produced publication of Pinnacle Publishing, Inc. No part of this article may be used or reproduced in any fashion (except in brief quotations used in critical articles and reviews) without prior consent of Pinnacle Publishing, Inc. To contact Pinnacle Publishing, Inc., please call 1-800-493-4867 x4209.