Click to Rate and Give Feedback
Related Articles

There's always been disagreement about whether large blobs, such as document and multimedia items, should be stored in the database or file system. In SQL Server 2008 you don't have to choose; filestream storage provides the best of both approaches.

Bob Beauchemin

MSDN Magazine May 2009

...

Read more!

Here we examine data persistence patterns to help you determine which best suits your needs. We look at a number of patterns, including the Active Record, the Data Mapper, the Repository, the Identity Map, the Lazy Loading, and the Virtual Proxy.

Jeremy Miller

MSDN Magazine April 2009

...

Read more!

Here Charles Petzold explains several techniques for improving the performance of ItemsControls.

Charles Petzold

MSDN Magazine March 2009

...

Read more!

Writing your JavaScript code in C# and having it converted, Eric Lippert’s programming blog, and reading up on T-SQL 2008.

Scott Mitchell

MSDN Magazine January 2009

...

Read more!

In this article the author looks at two versions of the same application--one consuming an on-premise Data Service, and one consuming an Azure Table Data Service to illustrate data access in the cloud.

Elisa Flasko

MSDN Magazine May 2009

...

Read more!

Also by this Author

WPF is one of the most important new technologies in the .NET Framework 3.0. This month John Papa introduces its data binding capabilities.

John Papa

MSDN Magazine December 2007

...

Read more!

Here we build a syndicated news reader application to illustrate the use of isolated storage and data syndication in Silverlight.

John Papa

MSDN Magazine February 2009

...

Read more!

The System.Transactions namespace of the Microsoft .NET Framework makes handling transactions much simpler than previous techniques. Read all about it this month.

John Papa

MSDN Magazine November 2006

...

Read more!

With the Entity Framework, developers are given more flexibility by being able to design around a conceptual model rather than a relational data model. To get started, you need to know how to design an Entity Data Model. John Papa walks you through that process.

John Papa

MSDN Magazine February 2008

...

Read more!

SQL Server Management Objects offer developers a robust toolset for backing up and restoring databases, and issuing DDL commands, as John Papa explains.

John Papa

MSDN Magazine June 2007

...

Read more!

Popular Articles

Jason Clark

MSDN Magazine July 2003

...

Read more!

Kenny Kerr sings the praises of the new Visual C++ 2008 Feature Pack, which brings modern conveniences to Visual C++.

Kenny Kerr

MSDN Magazine May 2008

...

Read more!

One-time passwords offer solutions to dictionary attacks, phishing, interception, and lots of other security breaches. Here's how it all works.

Dan Griffin

MSDN Magazine May 2008

...

Read more!

Writing a Web application with ASP.NET is unbelievably easy. So many developers don't take the time to structure their applications for great performance. In this article, the author presents 10 tips for writing high-performance Web apps. The discussion is not limited to ASP.NET applications because they are just one subset of Web applications.

Rob Howard

MSDN Magazine January 2005

...

Read more!

We introduce you to the benefits of building composite applications with the Composite Application Guidance for WPF from Microsoft patterns & practices.

Glenn Block

MSDN Magazine September 2008

...

Read more!

New information has been added to this article since publication.
Refer to the Editor's Update below.

Data Points
Using XQuery, New Large DataTypes, and More
John Papa

Code download available at: DataPoints0603.exe (116 KB)
Browse the Code Online

SQL Server 2005 introduces a lot of new features, but it also enhances the popular and oft-used Transact-SQL (T-SQL) language. Changes include the introduction of new datatypes to store large values using the MAX indicator, the integration of enhanced XML querying and data modification with XQuery, and the new XML datatype.
In this month's column I will address these and other features based on some of my most frequently received questions regarding SQL Server 2005 and its T-SQL enhancements. All sample code, which runs on the AdventureWorks database that ships with SQL Server 2005, is included in the download file. I also included some extra examples in the downloadable code for your extended enjoyment.

Q I know I can store XML in the new XML datatype in SQL Server 2005, but how can I query parts of the XML without pulling all of it out of the database using an XmlDataReader? For example, how do I filter or pull a subset of XML nodes?
Q I know I can store XML in the new XML datatype in SQL Server 2005, but how can I query parts of the XML without pulling all of it out of the database using an XmlDataReader? For example, how do I filter or pull a subset of XML nodes?

A The XML datatype is one of the brand new datatypes introduced to SQL Server 2005. It not only makes storing XML documents and fragments easier and cleaner than in previous versions of SQL Server, but it also provides mechanisms to interact with the XML data. SQL Server 2005 accomplishes the querying and modification of XML data by allowing the XML datatype's methods to use XQuery. XQuery is to XML what SQL is to relational data. (The XQuery language specification is located at www.w3.org/TR/xquery.)
A The XML datatype is one of the brand new datatypes introduced to SQL Server 2005. It not only makes storing XML documents and fragments easier and cleaner than in previous versions of SQL Server, but it also provides mechanisms to interact with the XML data. SQL Server 2005 accomplishes the querying and modification of XML data by allowing the XML datatype's methods to use XQuery. XQuery is to XML what SQL is to relational data. (The XQuery language specification is located at www.w3.org/TR/xquery.)
The XML datatype has at its disposal a subset of XQuery already built in. The XQuery methods accept XPath expressions that can be used to navigate the XML. There are five built-in functions (shown in Figure 1) that interact with the XML datatype. In fact, the five functions are accessible right from an XML variable using the following sample format, where @myXml is an XML variable:
@myXml.Query(@myXPathExpression)

Function Description
Query Gets a set of nodes from an XML document/fragment (returns XML)
Value Gets a single value from an element or attribute of an XML document/fragment (returns scalar value)
Exist Returns a Boolean value indicating if the XQuery expression returns values
Modify Changes values in an XML document/fragment
Nodes Gets a context to a node based upon the XQuery expression
Some examples would help illuminate the use of these new features. In the next few examples I will use a few of the AdventureWorks database's tables, which contain XML columns, for that purpose. First, let's assume that you want to grab all of the résumés from a set of prospective job candidates who have experience with computers. You can retrieve values from an XML document (resumes) stored in the XML datatype using the query method, as shown here:
SELECT 
    Resume.query('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
    Resume";
    data(/Resume/Name/Name.First)') FirstName
FROM   
    HumanResources.JobCandidate
WHERE    
    Resume.exist('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
        Resume";
    /Resume/Employment/Emp.OrgName[contains(.,"World")]') = 1
This example accesses the HumanResources.JobCandidate table's Resume column, which contains the candidate's résumé in XML format. Notice that the SELECT clause refers to the Resume column and issues the query method directly from it. The query method accepts an XPath expression that in this case evaluates to use a specific XML Schema Definition (XSD) as its default namespace. (This XSD happens to be included in the SQL Server 2005 AdventureWorks database.) By using this XSD as the default namespace there is no need to prefix all of the nodes in the XPath expression with the namespace prefix.
This XPath is referring to the first name node and selecting it to be returned. By using the data clause, I am telling the statement to return only the contents of the selected nodes. The WHERE clause in this example uses the exist method to evaluate if the company name has the word "World" in its title. The exist method returns either a 0 or a 1 to represent false or true, respectively. This statement will return a list of all of the first names where the person previously worked at a company that had the word "World" in its title:
   FirstName
   ----------------
   Stephen
   Peng
Removing the word "data" from this example would change the results to return the entire node and not just its contents:
FirstName
----------------
<Name.First xmlns="http://schemas.microsoft.com/sqlserver/2004/07/
adventure-works/Resume">Stephen</Name.First>
<Name.First xmlns="http://schemas.microsoft.com/sqlserver/2004/07/
adventure-works/Resume">Peng</Name.First>
So far I have touched on ways to query XML using the query and exist methods. You also can use the value method if all you are looking to retrieve is a single node's contents. The following is an example where I retrieve the list of e-mail addresses from the résumé pool:
SELECT 
    Resume.value(        'declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
        Resume";
    (/Resume/EMail)[1]', 'VARCHAR(100)') as email 
FROM 
    HumanResources.JobCandidate
WHERE
    LEN(Resume.value('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
        Resume";
    (/Resume/EMail)[1]', 'VARCHAR(100)')) > 1
Notice the use of the value method in this example. I could have used the query method, but since I only want a single value and only its contents, the value function offers a bit cleaner and more efficient option in this case. For example, notice that the data keyword is not needed when using value since you are already referring to the contents of the nodes. The value method expects a singleton to be returned. In other words, it expects a single value to be returned so you need to specify which instance of the e-mail node you want to grab. If you have only a single row, it is most common to just code [1] to indicate the first instance. The value method also expects to be told to what datatype to convert the value. In this case I convert the e-mail address to a varchar(100). The query just shown returns the following results:
Shai@Example.com
Max@Wingtiptoys.com
Krishna@TreyResearch.net
Stephen@example.com
Tai@Example.com
You can also use the for, where, return, and order by statements to retrieve data. Figure 2 shows a command that embeds a for/return statement inside the query method's expression. In this case, the for statement iterates through the /Resume/Employment nodes. For each iteration, the current node is referenced by the variable $employer. Within the for statement you can add a where clause to filter the nodes you want to return data from as well. In this example I simply iterate through each Employment node and return the contents of its Emp.OrgName child node.
SELECT 
    JobCandidateID, 
    Resume.query('declare default element namespace
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
    for $employer in /Resume/Employment
    return data($employer/Emp.OrgName)
    ') Company
FROM 
    HumanResources.JobCandidate
WHERE
    Resume.exist('declare default element namespace
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
    /Resume/Employment/Emp.OrgName[contains(.,"World")]') = 1
Taking a closer look at the WHERE clause of the SQL statement in Figure 2, you will notice that I use the exist method to limit the rows. Here I use the contains statement to look for any /Resume/Employment/Emp.OrgName node whose content contains the word "World".
There are several variations of these commands that you can use in combination. The SQL Server Books Online has a complete list of the functions and statements available to the new XQuery capabilities of SQL Server 2005.

Q How can I modify parts of the XML in an XML datatype without overwriting it?
Q How can I modify parts of the XML in an XML datatype without overwriting it?

A SQL Server 2005 exposes a modify method on the XML datatype which provides the ability to insert, update, or delete XML data within an XML datatype. There are three different statements that can be used with the modify method: Insert, which is used to insert new nodes into the XML, Replace Value Of, which is used to change the value of one or more nodes in the XML, and Delete, which is used to delete nodes from the XML.
A SQL Server 2005 exposes a modify method on the XML datatype which provides the ability to insert, update, or delete XML data within an XML datatype. There are three different statements that can be used with the modify method: Insert, which is used to insert new nodes into the XML, Replace Value Of, which is used to change the value of one or more nodes in the XML, and Delete, which is used to delete nodes from the XML.
Keep in mind that all of these statements are issued in a SQL UPDATE statement. For example, when inserting data, you are actually inserting more XML into an existing row's XML datatype column in the database. All three of these methods manipulate data within an XML column. Their usage is quite similar to the querying methods (nodes, exist, value, and query) in that they accept an XPath expression to locate the nodes to manipulate. The insert syntax is as follows:
xmlColumn.modify(
'insert your-content 
{as first | as last} 
into | before | after 
your-xpath-expression')
The insert keyword is followed by the content or XML nodes that you want to insert. Then, optionally you can specify whether the content should be inserted first or last with respect to the XPath expression. Next, specify whether the content will be inserted into, before, or after the node referred to by the XPath expression. Finally, specify the XPath expression that locates the node that you want to refer to for the insert.
It's time for an example again. In this case let's examine some code that inserts a telephone extension node and value for the job candidate named Shai Bassli. Figure 3 shows the code that will insert a telephone extension node and value of 555 within the first Telephone node for Shai Bassli.
DECLARE @ext int
SET @ext = 555

UPDATE    HumanResources.JobCandidate
SET    Resume.modify('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
        Resume";
    insert <Tel.Extension>{sql:variable("@ext")}</Tel.Extension> 
into(/Resume/Address/Addr.Telephone/Telephone)[1]')
WHERE
    Resume.exist('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works
        Resume";
    /Resume/Name/Name.Last[. = "Bassli"]') = 1
The syntax to modify a node is as follow:

xmlColumn.modify(
'replace value of 
your-xpath-expression
with
your-new_value 
')
If you look closely at the code in Figure 3 you will see that the extension's value is set using a variable like sql:variable("@ext"). This is a way to embed a local variable or parameter defined in the T-SQL within the XQuery statement. In this case I am passing the value of the extension. This feature makes XQuery much more flexible than if it didn't allow parameters.
The replace value of keywords are followed by an XPath expression that is evaluated and resolved to a node whose value you want to replace. For example, I could locate and replace the e-mail address of a person. Following the XPath expression is the with statement and then the new value. Figure 4 shows an example of how to locate Shai Bassli's telephone record and replace the extension of 555 with 777.
DECLARE @ext int
SET @ext = 555
DECLARE @newExt int
SET @newExt = 777

UPDATE    HumanResources.JobCandidate
SET    Resume.modify('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
        Resume";
    replace value of 
(/Resume/Address/Addr.Telephone/Telephone/Tel.Extension[.=sql:variable("@ext")])[1] 
    with xs:int(sql:variable("@newExt"))')
WHERE
    Resume.exist('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
        Resume";
    /Resume/Name/Name.Last[. = "Bassli"]') = 1
So far I have shown how to insert a telephone extension and modify its value. Now I will demonstrate how to delete the telephone extension from the XML without affecting the rest of the XML document, effectively returning the XML to its original state. The Delete syntax is as follows:
xmlColumn.modify('delete your-xpath-expression')
The syntax is pretty straightforward here; all you have to specify is the delete keyword and the XPath expression that resolves to the node that you want to remove. Be careful that you specify the correct node to remove because the delete command will remove the node you specify and all of its children. For example, if you specify /Resume as your XPath expression you will be deleting the entire résumé. OK, maybe that is obvious to you, but it is always good to be extra careful with deletes anyway.
The code in Figure 5 shows an example of how to delete (or remove) a node from an XML document (or fragment). Notice that I am deleting the node that has a telephone extension value of 777 and the last name of Bassli.
DECLARE @newExt int
SET @newExt = 777

UPDATE    HumanResources.JobCandidate
SET    Resume.modify('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
        Resume";
    delete 
(/Resume/Address/Addr.Telephone/Telephone/Tel.Extension[.=sql:variable("@newExt")])')
WHERE
    Resume.exist('declare default element namespace
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/
        Resume";
    /Resume/Name/Name.Last[. = "Bassli"]') = 1

Q I've heard that in SQL Server 2005 I might not have to store my large string data fields in columns using the text datatype because of the new large datatypes. Where have these new datatypes been all my life?
Q I've heard that in SQL Server 2005 I might not have to store my large string data fields in columns using the text datatype because of the new large datatypes. Where have these new datatypes been all my life?

A When SQL Server 7.0 was introduced, it raised the limit on VARCHAR fields to 8000 bytes. There were still issues, however, as many large string-based fields exceeded 8000 characters and had to either be broken into multiple rows, stored in a binary field (for example TEXT, NTEXT, or IMAGE) in the database, or even stored in a document in a file system. SQL Server 2005 blows the doors off the variable length fields' limits! The new MAX specifier can be used with the VARCHAR, NVARCHAR, and VARBINARY datatypes to declare variables that can store vast quantities of data with fewer limitations than the binary large object datatypes.
A When SQL Server 7.0 was introduced, it raised the limit on VARCHAR fields to 8000 bytes. There were still issues, however, as many large string-based fields exceeded 8000 characters and had to either be broken into multiple rows, stored in a binary field (for example TEXT, NTEXT, or IMAGE) in the database, or even stored in a document in a file system. SQL Server 2005 blows the doors off the variable length fields' limits! The new MAX specifier can be used with the VARCHAR, NVARCHAR, and VARBINARY datatypes to declare variables that can store vast quantities of data with fewer limitations than the binary large object datatypes.
A variable declared as VARCHAR(MAX) or VARBINARY(MAX) can store 231 (or about 2 billion) characters. A variable declared as NVARCHAR(MAX) can store 230or about 1 billion) characters. Unlike the binary large object datatypes, these new MAX datatypes can be passed as parameters to stored procedures, just like the new XML datatype, and they can be indexed or even parsed by string functions. Now, of course, I won't go crazy and define all of my string columns as VARCHAR(MAX) in my tables. However, these new fields can be great for note-based columns. For example, assuming your application needs to store a diary of daily events where the entries are all text and can be a quick sentence or seven pages of text. In the past you could solve this problem several ways, such as by storing this data in a TEXT column, limiting the notes to 8000 characters in a VARCHAR(8000), or even splitting the text up into multiple rows behind the scenes—like the text nvarchar(4000) column in the syscomments table. None of these solutions were great. With SQL Server 2005 you can now define a column to be a VARCHAR(MAX) and it can store the entire text in it, and even be indexed.
Ah, but you might remember that in SQL Server 2000 the reason for the limitation was page size. This meant that table rows could not exceed 8060 bytes (in SQL Server 7.0 and SQL Server 2000). You could define multiple columns on a table that exceeded this limit, but then you would get a warning indicating that if the data later exceeded the limit, you would get an error. Often, this meant DBAs would try to split up tables so they would not hit the limit. In SQL Server 2005, rows can span multiple pages, so the 8060 limit is no longer applicable, and you can define several columns on a single table as MAX types, if needed.
[ Editor's Update - 11/21/2006: Rows cannot span pages in SQL Server 2005, however portions of the row may be moved off the row's page so that the row can be very large. See msdn2.microsoft.com/ms190969.aspx for more information.]

Send your questions and comments for John to  mmdata@microsoft.com.


John Papa is a Senior .NET Consultant with ASPSOFT and a baseball fanatic who spends most of his summer nights rooting for the Yankees with his family and his faithful dog, Kadi. John, a C# MVP, has authored books on ADO, XML, and SQL Server, and can often be found speaking at industry conferences such as VSLive or blogging at codebetter.com/blogs/john.papa.

Page view tracker