A Survey of Microsoft SQL Server 2000 XML Features

Article
06/30/2006

Andrew Conrad
Microsoft Corporation

July 26, 2001 (Revised March 2004)

Greetings from Redmond! I hope that you're enjoying your summer. I've been quite busy with a jam-packed conference schedule. If you were fortunate enough to attend Tech Ed this year, I'm sure you had an exciting time. I know I did. Living by the adage that all good things take time, I'm posting an article written by an associate of mine who had a bit more time this past month to craft an extreme article. Keeping in good form, Andy tried to beat my previous article in the length category. This is a great article for anyone that hasn't taken a close look at the XML features in SQL Server 2000. This article is intended to provide you with a roadmap of SQL Server XML features, focusing heavily on recently released features shipped through our Web release program. I would like to thank Andy for his time and energy on this article.

– Eric Schmidt

Remember those good old 100 level survey classes from college? Basically, take a fairly broad topic and cover it at light speed, highlighting only the important facts and events. It seemed these classes were best suited for the humanities. For example, I took The History of Western Civilization: 482 – present. Ten minutes into the first lecture, the Roman Empire had fallen. I skipped a class or two and missed the Renaissance and Napoleonic Wars entirely. For those of us from the Math/ Computer Science department, it was a good way to satisfy some graduation requirements, and learn just enough to answer a few Jeopardy questions.

Well, welcome to XML for SQL Server 101. I'm Professor SQLXML, and for the remainder of this column, I am going to teach a survey of the XML features provided by SQL Server 2000. As with other survey classes, the point here is not to get into any great detail on any one topic, but to give you an idea of the XML features provided by SQL Server 2000, and how they can be used to solve some common application development problems and scenarios. At the end of the column, I will also discuss where you can get more information about the XML support provided by SQL Server 2000.

Note This column is based on the SQL Server XML functionality which is available as of the XML for Microsoft SQL Server 2000 Web Release 2 Beta 1. Since SQL Server 2000 was released in fall 2000, the WebData group at Microsoft has provided three fully supported upgrades to enhance and improve the XML features. These frequent releases are a result of customer demand and the fast-changing world of XML standards. For example, SqlXml 2.0 added support for XSD, based on the W3C recommendation. To download the latest Web release or to find more information on the new features offered in the SQLXML Web Releases, see https://msdn.microsoft.com/sqlxml.

Data as XML

You probably wouldn't be reading this column if you didn't already understand the benefits of representing your data as XML. However, I will give a quick overview for those of you that decided to sleep through XML 101.

For application developers, XML is really about the following benefits:

XML can represent semi-structured data in a much more natural way than the traditional data structures, such as ADO recordsets. For example, every Customer element only contains a CellPhoneNumber sub-element as needed.
XML is hierarchical. For most of us, our business object relationships are hierarchical and not flat. But in the relational world, we are forced to persist our data in to flat relation tables and expose relationships between objects as key relationships. In XML, these relationships can be expressed through hierarchical containment. For example, a Customer element has an address sub-element.
XML is a ubiquitous, platform-independent, human-readable format. Okay, this is a lot of the XML hype and marketing lingo, but there is some truth here. If we can generate XML from our relational data, then we are no longer tied to a particular platform or programming language. In fact, this would allow us to expose and consume relation data to and from any client application.

Storing Data in Relational Database

Okay, nothing earth shattering here, but if you have data that needs to be accessible and updateable by multiple users, by far the best place to store the data is in a DBMS like SQL Server 2000. In fact, most of you are quite comfortable querying and modifying the data using traditional client data access mechanisms like ADO and DAO. This means issuing TSQL queries and accessing/modifying the data through an ADO Recordset object.

Exposing Relation Data as XML

So we now all agree that XML is the best way to expose hierarchical, semi-structured business data, and we also agree that the best place to store data is an a relational database, but the real question is "How do I expose relational data as XML?" This is where XML for SQL Server comes in to play. The entire point of the product is to map relation data to and from XML. This is accomplished by a number of features that we will discuss shortly. However, before we even get stared talking about the features, I'm going to show you the final exam for this class. Now I know that is a bit strange, but stick with me. I do have a PhD! (Never mind that I ordered it off the Internet for $59.95.)

The Problem

Suppose that I need to develop a middle tier application over an existing, legacy customer information database. (For this article, we will assume that my legacy data source is the Northwind database included with SQL Server 2000). The point of this application is to expose/consume customer data to and from a client that only has HTTP access. As we discussed earlier, the obvious choice to format the data is XML.

Further, after reading Eric Schmidt's Using Schema and Serialization to Leverage Business Logic column from last month, the client application developer and I decide that it is probably a good idea to start off by developing an XSD schema that defines the content and format of our Customer instance documents:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >

    <xsd:element name="CustomerRecord" >
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Contact" type="xsd:string" /> 
                <xsd:element name="Title" type="xsd:string" /> 
                <xsd:element name="Company" type="xsd:string" />
                <xsd:element ref="Order" />
            </xsd:sequence>
                <xsd:attribute name="ID" type="xsd:string" />   
        </xsd:complexType>
    </xsd:element>

    <xsd:element name="Order" >
        <xsd:complexType>
            <xsd:sequence>    
                <xsd:element ref="Items" />
            </xsd:sequence>
            <xsd:attribute name="OrderID" type="xsd:int"/>     
            <xsd:attribute name="Ordered" type="xsd:dateTime"/>
            <xsd:attribute name="Shipped" type="xsd:dateTime"/>
        </xsd:complexType>
    </xsd:element>

    <xsd:element name="Items" >
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="ItemPurchased" />    
            </xsd:sequence>
            <xsd:attribute name="Price" type="xsd:int" />     
            <xsd:attribute name="Quantity"  type="xsd:int" />
        </xsd:complexType>   
    </xsd:element>

    <xsd:element name="ItemPurchased" type="Products" />

    <xsd:complexType name="Products">
        <xsd:sequence>
            <xsd:element name="Name" type="xsd:string" />    
            <xsd:element name="NumberInStock" type="xsd:int" />    
        </xsd:sequence>
    </xsd:complexType>   
   
</xsd:schema>

What we are really doing here is defining our customer business object. So now the problem, my customer and order data is currently residing in several relational tables and I want to expose that data as XML. I could write some traditional ADO code to select records from the Customers, Orders, Order Details, and Products table, and then build the XML myself by cursoring through the recordsets, nesting elements based on the key relationships, and adding all of the appropriate XML markup. I could also build a hierarchical recordset using the ADO shape provider and persist the recordset to XML; however, the format of XML supported by ADO is quite different from what my schema specifies. I could then write an XSLT script to translate the XML to format specified by the customer XSD schema, but that really seems like a lot of work. There has to be a better way.

The For XML Query Extension

In SQL Server 2000, the user can now execute SQL queries against relational database objects and return results as XML documents, rather than as standard rowsets. To retrieve the results of a query as XML, the user adds the For XML clause to the end of a standard rowset producing query and includes one of three modes: RAW, AUTO, or EXPLICIT. For example, I could run the following query either through ADO or Query analyzer:

Select firstname, lastname, employeeid from Northwind.._
Employess where employeeid = 1 for XML RAW

And the following XML is generated:

<row firstname="Nancy" lastname="Davolio" employeeid="1"/>

Wow! My application also needs to generate XML from relational data, so the For XML SQL extension seems to be exactly the tool I need. So let's try each of the modes, and see if we can generate XML that fits the structure specified by our Customer XSD schema.

I like simple, and the simplest For XML mode is RAW mode. Let's give that one a try first. RAW mode transforms each row in the rowset into an XML element with the generic identifier "row." Each column value that is not NULL is mapped to an attribute of the XML element in which the attribute name is the same as the column name.

So, let's give RAW mode a whirl. For the sample (and for the remainder of the code samples in this column), I'm going to issue queries using the standard ADO mechanism. In fact, I will use practically the same code that I would use for a recordset producing ADO command with one exception—I will ask for the results to be written to an output stream instead of a recordset. If you think about this, it makes a lot of sense. A semi-structured, hierarchical XML document really doesn't fit nicely into a traditional, structured recordset. And who knows what we will want to do with the XML after the query executes—perhaps pass it to the DOM, apply an XSLT style sheet, or even just publish the document over HTTP—so returning the XML document in a stream gives the user some flexibility.

For XML queries requires that all the results are in a single rowset, and I accomplish this by joining the tables I am interested in using a standard SQL join query:

Public Function ForXMLRawQuery()

    Dim adoConn     As New ADODB.Connection
    Dim adoCmd      As New ADODB.Command
    Dim adoStream   As New ADODB.Stream

    adoConn.Open "provider=sqlxmloledb;data " & _
         provider=sqloledb;Server=(local);Database=northwind;Integrated Security='SSPI';"
    Set adoCmd.ActiveConnection = adoConn
    adoCmd.CommandType = adCmdText
    adoCmd.CommandText = "select Customers.ContactName, " & _
       "Customers.ContactTitle, " & _
       "Customers.CompanyName, Customers.CustomerID," & _
       "Orders.OrderID, Orders.OrderDate, Orders.ShippedDate, " & _
       "[Order Details].UnitPrice, [Order Details].Quantity, " & _
       "Products.ProductName , Products.UnitsInStock " & _
       "From Customers, Orders, [Order Details], Products " & _
       "Where Customers.CustomerID = Orders.CustomerID " & _
       "and Orders.OrderID = [Order Details].OrderID " & _
       "and [Order Details].ProductID = Products.ProductID " & _
       "and Customers.CustomerID='GROSR' for XML RAW"
    
    adoStream.Open
    adoCmd.Properties("Output Stream").Value = adoStream
    adoCmd.Execute , , adExecuteStream
   
    Debug.Print adoStream.ReadText

End Function

Generates the following XML:

<row ContactName="Manuel Pereira" ContactTitle="Owner"  
   CompanyName="GROSELLA-Restaurante" CustomerID="GROSR" OrderID="10268" 
   OrderDate="1996-07-30T00:00:00" ShippedDate="1996-08-02T00:00:00" 
   UnitPrice="99" Quantity="10" ProductName="Thüringer Rostbratwurst" 
   UnitsInStock="0"/>
<row ContactName="Manuel Pereira" ContactTitle="Owner" 
   CompanyName="GROSELLA-Restaurante" CustomerID="GROSR" OrderID="10268" 
   OrderDate="1996-07-30T00:00:00" ShippedDate="1996-08-02T00:00:00" 
   UnitPrice="27.8" Quantity="4" ProductName="Mozzarella di Giovanni" 
   UnitsInStock="14"/>
<row ContactName="Manuel Pereira" ContactTitle="Owner" 
   CompanyName="GROSELLA-Restaurante" CustomerID="GROSR" OrderID="10785" 
   OrderDate="1997-12-18T00:00:00" ShippedDate="1997-12-24T00:00:00" 
   UnitPrice="31" Quantity="10" ProductName="Ikura" 
   UnitsInStock="31"/>
<row ContactName="Manuel Pereira" ContactTitle="Owner" 
   CompanyName="GROSELLA-Restaurante" CustomerID="GROSR" OrderID="10785" 
   OrderDate="1997-12-18T00:00:00" ShippedDate="1997-12-24T00:00:00" 
   UnitPrice="7.75" Quantity="10" ProductName="Rhönbräu Klosterbier" 
   UnitsInStock="125"/>

Well, our results are XML, so our first goal is accomplished. However, the resulting XML fragment has several problems. First of all, it is not exactly the shape of XML that we are interested in. In other words, the XML is not valid for the XSD Customer schema that we have defined, and hence we have broken our agreement with our partner. Second, there is some duplicate information in the resulting XML document. Specifically, the document contains duplicated customer and order information. Since RAW mode is really just an XML representation of a rowset, this is a result of the data being duplicated in the rowset generated by the join. Bottom line, although it is nice that we now have an XML document instead of a rowset, this isn't going to satisfy our needs.

So let's next try AUTO mode. AUTO mode returns query results as nested XML elements. Each table in the from clause is represented as an XML element. The columns listed in the select clause become attributes or sub-elements, depending on whether the ELEMENTS option is specified.

Public Function ForXMLAutoQuery()

    Dim adoConn     As New ADODB.Connection
    Dim adoCmd      As New ADODB.Command
    Dim adoStream   As New ADODB.Stream

adoConn.Open "provider=sqlxmloledb;data " & _
       "provider=sqloledb;Server=(local);Database=northwind;Integrated Security='SSPI';"
    Set adoCmd.ActiveConnection = adoConn
    adoCmd.CommandType = adCmdText
    adoCmd.CommandText = "select Customers.ContactName, " & _
       "Customers.ContactTitle, " & _
       "Customers.CompanyName, Customers.CustomerID," & _
       "Orders.OrderID, Orders.OrderDate, Orders.ShippedDate, " & _
       "[Order Details].UnitPrice, [Order Details].Quantity, " & _
       "Products.ProductName , Products.UnitsInStock " & _
       "From Customers, Orders, [Order Details], Products " & _
       "Where Customers.CustomerID = Orders.CustomerID " & _
       "and Orders.OrderID = [Order Details].OrderID " & _
       "and [Order Details].ProductID = Products.ProductID " & _
       "and Customers.CustomerID='GROSR' for XML AUTO"
    
    adoStream.Open
    adoCmd.Properties("Output Stream").Value = adoStream
    adoCmd.Execute , , adExecuteStream
   
    Debug.Print adoStream.ReadText

End Function

Generates the following XML:

<Customers ContactName="Manuel Pereira" ContactTitle="Owner" 
CompanyName="GROSELLA-Restaurante" CustomerID="GROSR">
     <Orders OrderID="10268" OrderDate="1996-07-30T00:00:00" 
     ShippedDate="1996-08-02T00:00:00">
       <Order_x0020_Details UnitPrice="99" Quantity="10">
         <Products ProductName="Thüringer Rostbratwurst"   
          UnitsInStock="0"/>
       </Order_x0020_Details>
       <Order_x0020_Details UnitPrice="27.8" Quantity="4">
         <Products ProductName="Mozzarella di Giovanni" 
          UnitsInStock="14"/>
       </Order_x0020_Details>
     </Orders>
     <Orders OrderID="10785" OrderDate="1997-12-18T00:00:00" 
      ShippedDate="1997-12-24T00:00:00">
       <Order_x0020_Details UnitPrice="31" Quantity="10">
         <Products ProductName="Ikura" UnitsInStock="31"/>
       </Order_x0020_Details>
       <Order_x0020_Details UnitPrice="7.75" Quantity="10">
         <Products ProductName="Rhönbräu Klosterbier" 
          UnitsInStock="125"/>
       </Order_x0020_Details>
       </Orders>
</Customers>

Now that seems much better then the results we got from RAW mode. In this case, we no longer have the duplicate customer and order data, and the shape of the XML seems much closer to what is specified by our customer XSD schema; however there are still a few problems. For example, there exist some inconsistencies in the element and attribute names between the XSD schema and the instance document. We can actually solve this problem by using aliases as part of the query. For example, AUTO mode is smart enough to recognize that you want CustomerID column to become an ID attribute in the resulting XML as long as you correctly alias the column in the select clause.

However, there is another problem. Look at the Customer element. In our XSD schema, the Customer element contains Contact, Title, and Company as sub-elements and not attributes as returned in the XML. As discussed above, we could modify our query to specify the ELEMENTS option, but the option maps all the columns to sub-elements, and our XSD schema specifies that only some of the content should be sub-elements. We could probably easily solve this problem by applying an XSLT transformation to resulting XML, but there is a way to avoid this problem. For the answer will take a look at the For XML EXPLICT mode.

In an EXPLICIT mode, the query writer controls the shape of the XML document returned by the execution of the query. The query must be written in a specific way so that the additional information about the expected nesting is explicitly specified as part of the query. You can also specify additional configurations at the column level using the directives. For example, the user can specify whether a given column appears as element or attribute content.

There is a good tutorial of I EXPLICIT mode and how to write the queries correctly in the SQL Server books online, so I not going to dive into any of those details at this point. I will however, give you a glimpse of an EXPLICIT mode query that will generate valid XML for our Customer XSD schema:

select 1 as TAG,0 as parent,
   ContactName as [CustomerRecord!1!Contact!element],
   ContactTitle as [CustomerRecord!1!Title!element],
   CompanyName as [CustomerRecord!1!Company!element],
   CustomerID as [CustomerRecord!1!ID],
   NULL as [Order!2!Shipped],
   NULL as [Order!2!Ordered],
   NULL as [Order!2!OrderID],
   NULL as [Items!3!ProductID!hide],
   NULL as [Items!3!Quantity],
   NULL as [Items!3!Price],
   NULL as [ItemPurchased!4!Name!element],
   NULL as [ItemPurchased!4!NumberInStock!element]
 from Customers  WHERE CustomerID = 'GROSR'
 UNION ALL
 select 2,1, NULL, NULL, NULL, NULL,
   Orders.ShippedDate, Orders.OrderDate, Orders.OrderID,
   NULL, NULL, NULL, NULL, NULL
 from Orders, Customers  WHERE Orders.CustomerID = Customers.CustomerID 
AND Customers.CustomerID = 'GROSR'
 UNION ALL
 select 3,2, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
   [Order Details].productid,
   [Order Details].Quantity,
   [Order Details].UnitPrice,
   NULL, NULL
 from Orders, Customers, [Order Details]  WHERE Orders.CustomerID = 
Customers.CustomerID AND [Order Details].Orderid = Orders.OrderID
               AND Customers.CustomerID = 'GROSR'   
 UNION ALL
 select 4,3, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
   NULL,
   Products.ProductName,
   Products.UnitsInStock
 from Orders, Customers, [Order Details], Products  WHERE 
Orders.CustomerID = Customers.CustomerID AND [Order Details].Orderid = 
Orders.OrderID
               AND [Order Details].ProductID = 
Products.ProductID AND Customers.CustomerID = 'GROSR'   
 FOR XML EXPLICIT

I don't think I need to convince anyone that this process does not seem like a very simple and maintainable query. However, if you are really comfortable with the SQL query language and want to be able to extensively shape relational data as XML, then EXPLICIT queries may be the way to go. However, for the rest of us mere mortals, there is a much easier way.

Annotated Mapping Schemas

So let's revisit our Customer XSD schema. The end goal was to generate XML that validated against the schema. Instead of writing SQL queries to generate the document, let's try mapping the schema directly to our database structures.

To accomplish this, SQL Server has introduced a number of annotations to the XSD schema language that specify the XML-to-relational mapping. The basics of these annotations includes mapping between elements and attributes in the XSD schema to tables/views and columns in the databases. By default, an element name in an annotated schema maps to a table (view) name in the specified database, and the attribute name maps to the column name. These annotations can also be used to specify the hierarchical relationships in XML (thus, representing the relationships in the database).

Here is our Customer XSD schema, but now with the appropriate mapping annotations:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
       xmlns:sql="urn:schemas-microsoft-com:mapping-schema">

    <xsd:element name="CustomerRecord" sql:relation="Customers" >
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Contact"  sql:field="ContactName" type="xsd:string" /> 
                <xsd:element name="Title" sql:field="ContactTitle" type="xsd:string" /> 
                <xsd:element name="Company"  sql:field="CompanyName" type="xsd:string" />
                <xsd:element ref="Order" sql:relationship="CustomerOrder" />
            </xsd:sequence>
                <xsd:attribute name="ID" sql:field="CustomerID" type="xsd:string" />   
        </xsd:complexType>
    </xsd:element>

    <xsd:element name="Order" sql:relation="Orders" >
        <xsd:complexType>
            <xsd:sequence>    
                <xsd:element ref="Items" sql:relationship="OrderOrderDetails" />
            </xsd:sequence>
            <xsd:attribute name="OrderID" type="xsd:int"/>     
            <xsd:attribute name="Ordered" sql:field="OrderDate" type="xsd:dateTime"/>
            <xsd:attribute name="Shipped" sql:field="ShippedDate" type="xsd:dateTime"/>
        </xsd:complexType>
    </xsd:element>

    <xsd:element name="Items" sql:relation="[Order Details]" >
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="ItemPurchased" 
sql:relationship="OrderDetailsProducts" />    
            </xsd:sequence>
            <xsd:attribute name="Price" sql:field="UnitPrice" type="xsd:int"/>     
            <xsd:attribute name="Quantity"  type="xsd:int"/>
        </xsd:complexType>   
    </xsd:element>

    <xsd:element name="ItemPurchased" type="Products" sql:relation="Products" />

    <xsd:complexType name="Products">
        <xsd:sequence>
            <xsd:element name="Name"  sql:field="ProductName" type="xsd:string" /> 
            <xsd:element name="NumberInStock"  sql:field="UnitsInStock" type="xsd:int" /> 
        </xsd:sequence>
    </xsd:complexType>

    <xsd:annotation>
        <xsd:appinfo>
            <sql:relationship name="CustomerOrder"                
                              parent="Customers"
                              parent-key="CustomerID"
                              child="Orders"
                              child-key="CustomerID" />

            <sql:relationship name="OrderOrderDetails"                
                              parent="Orders"
                              parent-key="OrderID"
                              child="[Order Details]"
                              child-key="OrderID" />

            <sql:relationship name="OrderDetailsProducts"              
                              parent="[Order Details]"
                              parent-key="ProductID"
                              child="Products"
                              child-key="ProductID" />
        </xsd:appinfo>
    </xsd:annotation>
   
</xsd:schema>

Basically, all that I've done here is provide an XML-to-relation table mapping. In other words, this is very similar to a traditional relation view accept the result is XML instead of a rowset. In fact, by building this mapping, I have developed what we call the XML View. The XML View can be thought of as an abstraction layer over the relational database. By providing this XML View, the user is able to expose traditional relational data in a hierarchical, semi-structured manner, which in reality, is often a much more natural way to expose business objects.

So let's take a closer look at the additions I made to the schema. First of all, I added the XML for SQL Server namespace and gave it the sql alias. For the mapping annotations to be properly recognized, you must provide this namespace and use it for all the mapping specific annotations. Next, I mapped all the attributes and elements that I knew mapped to columns in the database using the sql:field annotation. Note that a few of the attributes and elements have the same name as the column in the database to which I want to map. For these cases, no annotation was needed because the default mapping is adequate. I then went through the remaining elements in the schema and mapped those to tables in the database using the sql:relation annotation. The sql:relation annotation, because of scoping rules, also applies to all the column mapped attributes and sub-elements. Finally, I needed to specify how the elements mapped to tables would be joined. In other words, create the hierarchical shape of the XML View. If you look at the bottom of the schema, in the xsd:annotation section, I have defined three join relationships that specify exactly the same join information as the joins in our I sql queries, except they are specified using the sql:relationship annotation. To use these relationships elsewhere in the document, I have given each an alias. I then use the sql:relationship alias to specify the relational join information when I "join" elements. For example, I want to nest Order information inside of Customer information, so I use the CustomerOrder sql:relationship that specifies the join between the underlying tables in the database.

We have now completely defined an XML View for Customer information in our database. All it took was a couple of quick annotations to an XSD schema. As you can see, this seems much simpler and maintainable than the proceeding explicit query. In fact, truth be told, the XML for SQL Server code is really translating the mapping information you have provided into a For XML explicit mode query.

For the example above, we only used a few of the mappings annotations available. There are several other annotations which all provide different ways to expose the relational data as XML. For example, there are annotations to add CDATA sections, to provide constant elements, and even to expose open XML content that does not map to a specific column in the database. For more information, see the XML for SQL Server books online.

So, we've mapped XML to our relation database structures, but we still haven't produced an instance document. What our XML View needs now is a query mechanism.

XPath Queries

Traditional SQL queries are good tools for filtering and joining relational tables and views. However, our XML View doesn't look anything like relational tables and views. So instead of SQL, the query mechanism for the XML View is the XPath query language. XPath (XML Path Language) is a graph navigation language. XPath is used to select a set of nodes from an XML document. Each XPath operator selects a node-set based on a node-set selected by a previous XPath operator. XML for SQL Server supports a subset of the XPath language that allows the user to execute queries against the virtual XML View provided by the annotated mapping schemas. So let's try querying against our annotated XSD Schema.

Since ADO now supports executing XPath queries, we can use our same code examples with a couple of changes:

 Public Function XPathQuery()

    Dim adoConn     As New ADODB.Connection
    Dim adoCmd      As New ADODB.Command
    Dim adoStream   As New ADODB.Stream

    adoConn.Open "provider=sqlxmloledb;data " & _
         provider=sqloledb;Server=(local);Database=northwind;Integrated Security='SSPI';"
    Set adoCmd.ActiveConnection = adoConn
    adoCmd.CommandType = adCmdText
    adoCmd.Dialect = "{EC2A4293-E898-11D2-B1B7-00C04F680C56}"
    adoCmd.Properties("Mapping Schema") = "customers.xsd"
    adoCmd.CommandText = "/CustomerRecord[@ID='GROSR']"
    
    adoStream.Open
    adoCmd.Properties("Output Stream").Value = adoStream
    adoCmd.Execute , , adExecuteStream
   
    Debug.Print adoStream.ReadText

End Function

First, we have specified a different command dialect. (The SQL dialect is the default). In this case, we are specifying that the command text is an XPATH query. This is done by setting the command dialect property to the globally unique identifier (GUID) that represents the XPath dialect. Second, we are setting the mapping schema property for the command, giving it the file path to the schema we have annotated. And lastly, we are setting the command text to an XPATH query that will retrieve all the CustomerRecord elements where the attribute ID is equal to GROSR. So here is the XML instance document that is returned:

<CustomerRecord ID="GROSR">
   <Contact>Manuel Pereira</Contact>
   <Title>Owner</Title>
   <Company>GROSELLA-Restaurante</Company>
   <Order OrderID="10268" Ordered="1996-07-30T00:00:00" 
      Shipped="1996-08-02T00:00:00">
      <Items Price="99" Quantity="10">
         <ItemPurchased>
            <Name>Thüringer Rostbratwurst</Name>
            <NumberInStock>0</NumberInStock>
         </ItemPurchased>
      </Items>
      <Items Price="27.8" Quantity="4">
         <ItemPurchased>
            <Name>Mozzarella di Giovanni</Name>
            <NumberInStock>14</NumberInStock>
         </ItemPurchased>
      </Items>
   </Order>
   <Order OrderID="10785" Ordered="1997-12-18T00:00:00" 
      Shipped="1997-12-24T00:00:00">
      <Items Price="31" Quantity="10">
         <ItemPurchased>   
            <Name>Ikura</Name>
            <NumberInStock>31</NumberInStock>
         </ItemPurchased>
      </Items>
      <Items Price="7.75" Quantity="10">
         <ItemPurchased>
            <Name>Rhönbräu Klosterbier</Name>
            <NumberInStock>125</NumberInStock>
         </ItemPurchased>
      </Items>
   </Order>
</CustomerRecord>

Wow! We have solved the problem. That document validates against our original customer XSD schema and is exactly the structure that was required. Further, after the initial effort of mapping to the XSD schema to the database structures, it actually became quite easy to query the XML View. However, the query we specified is actually quite simple, and doesn't even really display the true power of performing queries over hierarchical XML with XPATH. For example, what happens if I want to use our same XML View, but just return a list of items purchased by a given customer. The solution is actually quite simple with XPATH, I just modify my query to be:

/CustomerRecord[@ID='GROSR']/Order/Items/ItemPurchased

Which generates the XML:

<ItemPurchased>
   <Name>Thüringer Rostbratwurst</Name>
   <NumberInStock>0</NumberInStock>
</ItemPurchased>
<ItemPurchased>
   <Name>Mozzarella di Giovanni</Name>
   <NumberInStock>14</NumberInStock>
</ItemPurchased>
<ItemPurchased>
   <Name>Ikura</Name>
   <NumberInStock>31</NumberInStock>
</ItemPurchased>
<ItemPurchased>
   <Name>Rhönbräu Klosterbier</Name>
   <NumberInStock>125</NumberInStock>
</ItemPurchased>

The new query is basically selecting all the ItemPurchased elements where the ID is equal to GROSR. The new document generated is actually a sub-section of the virtual XML View defined by the mapping schema. Further, the For XML explicit query being generated under the covers is smart enough to only retrieve the relational data needed to construct the section of the XML View being returned as a result of the query, in this case, a few rows from the product table. In other words, all the node selection and filtering is happening on the server.

So the only problem with the results is that it is not well-formed XML because there is more then one root element. This is often the case because the result of an XPATH query can be a set of elements. However, we can solve this in one of two ways. We can use the XML Root ADO command property, which simply adds a root element with the specified name around the query results, or we can use an XML template.

An XML template is an XML file that can contain multiple XML producing commands and static XML text. As the template is processed, each command node in the template is executed and the results are inlined into the resulting XML. The template can also specify a post processing XSLT script. By using XML templates, the user can batch together several XML producing commands (XPATH and/ or For XML) and create complex XML documents. Let's execute a template that contains all the ItemsPurchase for a given customer, and also a list of employees that have worked with that customer:

 Public Function TemplateQuery()

    Dim adoConn     As New ADODB.Connection
    Dim adoCmd      As New ADODB.Command
    Dim adoStream   As New ADODB.Stream

    adoConn.Open "provider=sqlxmloledb;data _
    provider=sqloledb;Server=(local);Database=northwind;Integrated Security='SSPI';"
    Set adoCmd.ActiveConnection = adoConn
    adoCmd.CommandType = adCmdText
    adoCmd.Dialect = "{5D531CB2-E6Ed-11D2-B252-00C04F681B71}"
    adoCmd.CommandText = "<ROOT xmlns:sql=""urn:schemas-microsoft-com:xml-sql"" >" & _
    "<sql:query>" & _
    " select employees.Firstname, employees.Lastname  from employees, orders" & _
    " where employees.employeeid = orders.employeeid and orders.Customerid = 'GROSR'" & _
    " for XML auto" & _
  "</sql:query>" & _
  "<sql:xpath-query mapping-schema=""Customers.xsd"">" & _
   " /CustomerRecord[@ID='GROSR']/Order/Items/ItemPurchased" & _
  "</sql:xpath-query>" & _
" </ROOT>"
    
    adoStream.Open
    adoCmd.Properties("Output Stream").Value = adoStream
    adoCmd.Execute , , adExecuteStream
   
    Debug.Print adoStream.ReadText

End Function

Generates the following XML:

<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql">
   <employees Firstname="Laura" Lastname="Callahan"/>
   <employees Firstname="Nancy" Lastname="Davolio"/>
   <ItemPurchased>
      <Name>Thüringer Rostbratwurst</Name>
      <NumberInStock>0</NumberInStock>
   </ItemPurchased>
   <ItemPurchased>
      <Name>Mozzarella di Giovanni</Name>
      <NumberInStock>14</NumberInStock>
   </ItemPurchased>
   <ItemPurchased>
      <Name>Ikura</Name>
      <NumberInStock>31</NumberInStock>
   </ItemPurchased>
   <ItemPurchased>
      <Name>Rhönbräu Klosterbier</Name>
      <NumberInStock>125</NumberInStock>
   </ItemPurchased> 
</ROOT>

Okay, we now have valid XML. In addition, we have seen how we can group several different XML producing queries to produce a new, more extensible XML View. In fact, templates can also be parameter driven, so we could have made the CustomerID a parameter and passed the same value to both queries, in effect creating XML that produces a stored procedure. So, by using annotated mapping schemas and templates, we have created a simple yet powerful XML View query mechanism. However, querying is only half the story. What happens if I want to insert or change the XML View?

Updategrams and XML Bulkload

XML for SQL Server provides two mechanisms for persisting changes to the XML View: Updategrams and XML Bulkload. As with XPath, both use the annotated mapping schema to translate XML to relational tables. If you need to bulk insert large amounts of XML, then XML Bulkload is the tool for you. It is a scriptable COM component that is similar to traditional BCP, except the source is an XML file, which depending on your mapping schema, can include hierarchical, complex data.

For more DML-like changes to the XML View, updategrams are provided. Updategrams are executed through XML templates, and are just another template command node type. In the updategram, the user specifies the before and after XML instances. Then, based on the mapping schema, the changes are translated to appropriate sql insert/ update/delete statements. For example, if a CustomerRecord element appears in the before block, but not in the after block, then a delete sql statement against the Customer table is generated. Here is a sample updategram that updates a customer and inserts an order:

<ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
<updg:sync mapping-schema="Customers.XSD" >
  <updg:before>
      <CustomerRecord ID="GROSR">
<Contact>Manuel Pereira</Contact>
<Title>Owner</Title>
<Company>GROSELLA-Restaurante</Company>
             </CustomerRecord>   
  </updg:before>

  <updg:after>
      <CustomerRecord ID="GROSR">
<Contact>Manuel Pereira</Contact>
<Title>Co-Owner</Title>
<Company>GROSELLA-Restaurante</Company>
<Order OrderID="10268" Ordered="2000-07-30T00:00:00" 
Shipped="2000-08-02T00:00:00"/>
   </CustomerRecord>   
  </updg:after>
</updg:sync>
</ROOT>

There are a couple of things to note here. First of all, updategrams must have the appropriate namespace (urn:schemas-microsoft-com:xml-updategram) included in the template. Next, the before and after blocks are included in a sync block, which translates into a single relation transaction. In other words, if you have several changes to the XML View that you are trying to persist to the database, and one of the sql statements fails, the entire transaction will be rolled back. Further, updategrams use optimistic concurrency that ensures that the updated data hasn't been altered by another user application since it was read from the database. This is done in updategrams by including these values in the <before> block of the updategrams. Before updating the database, the updategram checks the values specified in the <before> block against the values currently in the database to ensure that the update is valid. The reason for using this type of concurrency control is that most clients who are using the XML View are disconnected clients. In fact, once you have queried against the XML View, you have basically taken a snapshot of the data, and have no way of knowing if it has changed. For most Internet application scenarios, this is a design reality.

Conclusion

So there you have it, XML for SQL Server 101. Hopefully you have learned a thing or two about the XML features provided by SQL Server, and have some idea of how the XML View can be used in your application domain. This was really just a survey class and perhaps some of you are interested in learning more, or perhaps, even majoring in XML for SQL Server. Some of you are probably wondering where to sign up for the 200 level course. The place to go is the MSDN XML site: http://www.msdn.microsoft.com/xml. This is where you will find information on the XML for SQL Server Web Releases and links to the latest product updates. New release are coming out every few months, so check back often for updates to the product. Note, these are not service pack like releases, but complete, supported new versions of XML for SQL Server and contain whole new feature sets. Well, this professor has to run—got to figure out how this tenure thing works. Good luck and good mapping.

A Survey of Microsoft SQL Server 2000 XML Features

Data as XML

Storing Data in Relational Database

Exposing Relation Data as XML

The Problem

The For XML Query Extension

Annotated Mapping Schemas

XPath Queries

Updategrams and XML Bulkload

Conclusion

Additional resources