Printer Friendly Version      Send     
Click to Rate and Give Feedback
Related Articles

Building a data access layer using LINQ to SQL and the ADO.NET Entity Framework allows you to decouple your application from the persistence technology you're using.

Anthony Sneed

MSDN Magazine December 2008

...

Read more!

This month's column explains how to use Windows HTTP Services, or WinHTTP, the new, powerful API for implementing HTTP clients.

Kenny Kerr

MSDN Magazine August 2008

...

Read more!

Here the author introduces SQL Server Data Services, which exposes its functionality over standard Web service interfaces.

David Robinson

MSDN Magazine July 2008

...

Read more!

We introduce you to the benefits of building composite applications with the Composite Application Guidance for WPF from Microsoft patterns & practices.

Glenn Block

MSDN Magazine September 2008

...

Read more!

Here is an ASP.NET AJAX data-driven Web application that takes the best features from server- and client-side programming to deliver an efficient, user-friendly experience.

Bertrand Le Roy

MSDN Magazine October 2008

...

Read more!

Also by this Author

This article describes a collection of new programming frameworks that are part of "Longhorn," the upcoming version of Windows. "Indigo," the code name for this framework, provides rich support for service-oriented design that is complementary to traditional object-oriented approaches. Indigo marries the best features of .NET Remoting, ASMX, and .NET Enterprise Services into a unified programming and administration model. Indigo's deep support for standard protocols, including HTTP, XML, and SOAP, makes it easier to integrate applications and ...

Read more!

Don Box

MSDN Magazine November 2001

...

Read more!

Don Box

MSDN Magazine February 2002

...

Read more!

The common language runtime of the .NET Framework has its own secure execution model that isn't bound by the limitations of the operating system it's running on. In addition, unlike the old principal-based security, the CLR enforces security policy based on where code is coming from rather than who the user is. This model, called code access security, makes sense in today's environment because so much code is installed over the Internet and even a trusted user doesn't know when that code is safe.In this article, Don Box explains how code access ...

Read more!

The XSL Transformations (XSLT) specification defines an XML-based language for expressing transformation rules that map one XML document to another. XSLT has many of the constructs found in traditional programming languages, including variables, functions, iteration, and conditional statements. In this article you'll learn how to use the XSLT instructions and template rules, manage namespaces, control transformation output, use multiple stylesheets, and employ pattern-matching with template rules. A sidebar explains how to access XSLT from MSXML ...

Read more!

Popular Articles

Writing a Web application with ASP.NET is unbelievably easy. So many developers don't take the time to structure their applications for great performance. In this article, the author presents 10 tips for writing high-performance Web apps. The discussion is not limited to ASP.NET applications because they are just one subset of Web applications.

Rob Howard

MSDN Magazine January 2005

...

Read more!

One-time passwords offer solutions to dictionary attacks, phishing, interception, and lots of other security breaches. Here's how it all works.

Dan Griffin

MSDN Magazine May 2008

...

Read more!

Here we describe some of the more common challenges to concurrent programming and present advice for coping with them in your software.

Joe Duffy

MSDN Magazine October 2008

...

Read more!

C# allows developers to embed XML comments into their source files-a useful facility, especially when more than one programmer is working on the same code. The C# parser can expand these XML tags to provide additional information and export them to an external document for further processing. This article shows how to use XML comments and explains the relevant tags. The author demonstrates how to set up your project to export your XML comments into convenient documentation for the benefit of other developers. He also shows how to use comments ...

Read more!

A Sidebar gadget is a powerful little too that's surprisingly easy to create. Get in on the fun with Donavon West.

Donavon West

MSDN Magazine August 2007

...

Read more!

Our Blog

Windows Presentation Foundation (WPF) adds functionality to the Microsoft .NET Framework so that you actually can reliably keep bound controls synchronized with their data sources.

In the December 2008 issue of MSDN Magazine, Ken Getz demonstrates how to use the ObservableCollection class provided by WPF to keep bound controls in ...

Read more!

It’s helpful to think about secure design from a more holistic perspective by using threat models to drive your security engineering process.

In the November 2008 issue of MSDN Magazine, Michael Howard proposes using the threat model to help drive other SDL security requirements, primarily code review priority, fuzz testing priority, ...

Read more!

Silverlight provides a browser interoperability layer that allows managed code to access the document object model (DOM) of the underlying page. At the same time, JavaScript code running in the page can access the XAML content of the plug-in and even make modifications.

In the November 2008 issue of MSDN Magazine, Dino Esposito discusses the ...

Read more!

Choosing the best alternative is a common task in software development and testing. A group of beta users may need to choose the best user interface from a set of prototypes. Or imagine the members of an open source project voting for a policy.

In the November 2008 issue of MSDN Magazine, Dr. James McCaffrey describes five of the ...

Read more!

So many factors can affect the performance of a Web page—the distance between server and client, the size of the elements on the page, how the browser loads these elements, available bandwidth. Finding those bottlenecks and identifying the culprits is no easy task.

In the November 2008 issue of MSDN Magazine, Jim Pierson introduces ...

Read more!

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.
MSDN Magazine

Using ADO to Create XML-based Recordsets
Don Box

Code for this article: HouseofCOM0700.exe (46KB)

T
here is a lot of mythology surrounding ActiveX® Data Objects (ADO) and XML. ADO has supported XML since version 2.1. In general, this is a good thing, especially for developers looking for better ways to reach out to the four million programmers who use Visual Basic® and who also worship at the altar of the disconnected recordset. However, simply supporting XML in your application is not good enough because ADO uses a very stylized XML-based vocabulary to perform its magic. The way ADO uses XML allows ADO to interoperate with applications written in programming languages such as Java, Perl, or Python and on platforms such as Unix, Linux, and Apache. Yes, the developer who uses Visual Basic can deal with arbitrary XML using any number of XML parsers. However, morphing your XML into the vocabulary of ADO makes it a no-brainer for any typical application-focused programmer to access your information with Visual Basic.
      The utility of the XML support in ADO is especially apparent when crossing platform boundaries. Consider the case of a server appliance running Linux and Apache in a production environment. A system administrator running Windows® on her desktop may want to monitor activity on the server machine without resorting to a circa-1980 telnet session. Assuming that the Perl program shown in Figure 1 is installed under an Apache-executable directory on the Linux machine (typically in /cgi-bin), you could run a simple Visual Basic-based program as shown in Figure 2 to display the currently running processes. The code for this application is shown in Figure 3, but you should note that most of this code is simply the Visual Basic form template code. The Refresh button handler that does all of the work is only four lines long.
Figure 2 Viewing a Recordset in Visual Basic
Figure 2 Viewing a Recordset in Visual Basic

      Despite the simplicity of the code in Figure 3, you cannot simply sprinkle the magic XML pixie dust on an application and expect ADO to perform miracles. ADO is useless for reading or writing arbitrary XML vocabularies. ADO is not a replacement for the Simple API for XML (SAX) or the Document Object Model (DOM). Rather, ADO recordsets support an XML-based serialization format in parallel with its native Advanced Data Tablegram (ADTG) format. The two formats are functionally identical. The difference between the two is that ADTG is a binary format that is proprietary to Microsoft. XML is a text-based format that is supported by virtually every vendor, platform, and language on the planet. If you understand the XML vocabulary that ADO is expecting, you can generate a serialized ADO recordset without ever having ADO (or OLE DB, a database, or Microsoft® Windows) anywhere near your code.
      Technically, ADO does very little to support recordset persistence. ADO simply utilizes the OLE DB persistence provider to do the heavy lifting. If you are working in OLE DB natively, you must use the ADO Recordset Construction interface to create a transient recordset on top of your existing OLE DB rowset objects. The following code illustrates this technique from C++:



HRESULT GetRecordset(IRowset *prs, _Recordset **pprs) {
  CComPtr<IADORecordsetConstruction> prc;
  HR(prc.CoCreateInstance(__uuidof(Recordset)));
  HR(prc->put_Rowset(prs));
  return prc->QueryInterface(pprs);
}
Translating in the other direction is equally simple:


HRESULT GetRowset(_Recordset *prs, IRowset   **pprs) {
  CComPtr<IADORecordsetConstruction> prc;
  HR(prs->QueryInterface(&prc));
  CComPtr<IUnknown> punk;
  HR(prc->get_Rowset(&punk));
  return punk->QueryInterface(pprs);
}
At the time of this writing, there is no documented way to access the persistence provider except via ADO.
      To invoke the services of the OLE DB persistence provider, you call the _Recordset::Save method on an ADO Recordset. Save is defined as follows


HRESULT Save([in, optional] VARIANT Destination, 
             [in, optional, defaultvalue(0)] 
             PersistFormatEnum PersistFormat);
where PersistFormatEnum is an enumeration with two values:


enum {
    adPersistADTG = 0,
    adPersistXML = 1
} PersistFormatEnum;
The first parameter to Save can be either a file name or an IStream interface pointer. The following line of JScript® serializes a recordset to a file:


rs.Save("C:\\foo.rs");
Because no second parameter was provided, the format will be ADTG, not XML, just as if the method was called like this:


rs.Save("C:\\foo.rs", 0);
To force the serialization format to be XML, the second parameter must be adPersistXML or 1:


rs.Save("C:\\foo.rs", 1);
      When ADO serializes to XML, it uses the UTF-8 character encoding. Additionally, ADO does not emit the initial XML declaration, which is legal given that UTF-8 is in use.
      To recreate an ADO recordset from its serialized form, you call the _Recordset15::Open method:


HRESULT Open([in, optional] VARIANT Source, 
             [in, optional] VARIANT ActiveConnection, 

             [in, optional, defaultvalue(-1)] 
                            CursorTypeEnum CursorType, 
             [in, optional, defaultvalue(-1)] 
                            LockTypeEnum LockType, 
             [in, optional, defaultvalue(-1)] long Options);
When used with serialized recordsets, the first parameter of Open can be a file name, a URL, or an IStream implementation. The following JScript restores a serialized recordset that was saved as either ADTG or XML:


rs.Open("C:\\foo.rs");
The recordset can tell which format was used based on the initial few bytes of the file.
      The first parameter of Save can also be an IStream interface pointer. This means that you can pass an MSXML DOMDocument object to Save in order to use the facilities of the MSXML DOM over the underlying data.


var dom = new ActiveXObject("MSXML.DOMDocument");
rs.Save(dom, 1);
The data from the recordset can now be used with XPath, XSLT, or other XML-based technologies. The ASP Response object can also be used to stream an ADO recordset in an HTTP response message. Figure 4 shows an ASP file that returns a serialized ADO recordset corresponding to the Authors table in the Pubs database that ships with Microsoft SQL Server™. Assuming that this page is exposed via the URL http://example.com/get.asp, you could load the recordset directly via HTTP using the following code:


rs.Open("http://example.com/get.asp");
      When a URL is passed instead of a file name, the recordset simply uses WinInet to contact the server and fetch either the ADTG or XML, which it then processes as if it were a local file.
      Sending an ADO recordset over HTTP is trivial when the recordset is being sent from the HTTP server to the HTTP client. Sending a recordset in the other direction is fairly trivial as well if you are willing to use the MSXML XMLHTTPRequest object. The XMLHTTPRequest object wraps the WinInet calls needed to send an HTTP request and receive an HTTP response. Despite the fact that the XMLHTTPRequest object ships as part of an XML parser, it can be used to send any text-based MIME type, not just XML. Figure 5 shows the overall architecture of the process.
Figure 5 Database Updates via HTTP
Figure 5 Database Updates via HTTP

Figure 6 shows a client-side JScript file that fetches an ADO recordset from the get.asp script, modifies a record, and sends it back to the server using the XMLHTTPRequest object. Figure 7 shows the corresponding ASP script that receives the recordset and sends the changes back to the database using the UpdateBatch method. Note that because the Open method accepts an IStream reference, you can pass the ASP Request object to Open to read the recordset from the HTTP POST payload:



rs.Open(Request);
This technique avoids the problems related to calling Request.BinaryRead from scripting languages.
      So far I have avoided a discussion of the XML vocabulary used by ADO. In general, you can lead a rich and full life without ever seeing this vocabulary provided the Save and Open methods of the Recordset object are all you use. However, to write interoperable programs that do not use ADO, but instead generate or consume XML directly, you must understand the vocabulary well enough to fool ADO into thinking that the data was generated by the OLE DB persistence provider and not your random calls to printf.
      ADO expects a serialized recordset to be contained in an XML document whose root element is named xml. The root element of the document must contain two child elements. The first element contains the schema information describing the type of each row and column. The second child element represents the serialized OLE DB rowset. This rowset element contains one child element per row. Each row element subsequently contains one attribute per non-null column value.
      ADO makes extensive use of XML namespaces. In particular, the three XML namespaces shown in Figure 8 are used in a serialized ADO recordset. Note that the URIs identifying both the schema and datatypes namespaces do not match those found in the XML Data Reduced (XDR) specification. One reason they don't match is that ADO uses its own dialect of XDR that, at the time of this writing, is not supported outside of ADO.
       Figure 9 shows a minimal serialized recordset. The first child element contains an XDR description of the row element type:


<xdr:ElementType name='row' content='eltOnly'>
     <xdr:AttributeType name='name' />
     <xdr:AttributeType name='age'  />
     <xdr:extends type='rowset:rowbase' />
</xdr:ElementType>
This type definition is roughly analogous to the following Document Type Definition (DTD) fragment:


<!ELEMENT row EMPTY>
<!ATTLIST row name CDATA #REQUIRED age  CDATA #REQUIRED>
Both the XDR type definition and the DTD assume that row elements will look something like this:


<target:row name='Don' age='25' />
      Note that the element type definition is contained in a Schema element with an id attribute:


<xdr:Schema id='R' >
  <!-- type definition goes here -->
</xdr:Schema>
This schema is bound to a namespace prefix using a namespace declaration whose URI reference is simply a fragment identifier


xmlns:target='#R'
This means that while the following element is bound to the element type definition


<target:row name='Don' age='25' />
this one is not:


<row name='Don' age='25' />
Support for mapping schema definitions using fragment identifiers is a fairly new feature of MSXML.
      The OLE DB persistence provider expects to see the row data immediately following the schema definition. The data is contained in a data element that contains zero or more row elements as described by the XDR schema definition.


<rowset:data>
  <target:row name='Don' age='25' />
  <target:row name='Aaron' age='20' />
</rowset:data>
      Upon loading the serialized rowset, the persistence provider will simply create a row for each row element and populate the column data based on the name and age attributes.
      The schema definition shown earlier is technically legal and is sufficient if all you want to do is use the resultant recordset to display simple text-based information. However, if you want type-aware access to the individual fields, you must flesh out the AttributeType definitions to provide type descriptions of each indi- vidual column. This is accomplished via XDR's datatype element:


<xdr:AttributeType name='name' >
     <xdr:datatype ... />
</xdr:AttributeType>
      The datatype element allows the schema designer to describe the type of the corresponding character data used in the attribute value (if not specified, the assumed type is BSTR). The primary attribute of this element is the dt:type attribute, which must reference one of the predefined XDR datatypes shown in Figure 10.


<xdr:AttributeType name='age' >
     <xdr:datatype dt:type='i2' />
</xdr:AttributeType>
      This schema fragment states that the age attribute is a two-byte signed integer. For variable-length datatypes like string, XDR also allows a maximum length to be specified:


<xdr:AttributeType name='name' >
     <xdr:datatype dt:type='string' 
                   dt:maxLength='32'/>
</xdr:AttributeType>
In SQL terms, this type corresponds to the VARCHAR(32) datatype.
      The datatypes supported by OLE DB cannot be adequately described using XDR's intrinsic datatype support. For example, the type definition just shown does not distinguish between Unicode and ANSI strings, or between variable-length and fixed-length strings. To support full-fidelity type information, the OLE DB persistence provider augments the datatype definition with namespace-qualified attributes that capture the OLE DB characteristics without changing the core datatype described by the XDR schema definition. The first extended attribute I will look at is the rs:dbtype attribute. This attribute allows the corresponding OLE DB type name to be specified alongside the XDR datatype name.


   <xdr:datatype dt:type='string' rs:dbtype='str' 
              dt:maxLength='32' />
      By default, the XDR string type corresponds to the DBTYPE_WSTR datatype, which is Unicode. The rs:dbtype='str' attribute informs the persistence provider that the corresponding type is DBTYPE_STR, not DBTYPE_WSTR. Additionally, you can indicate whether a column is fixed-length or variable using the rs:fixedLength attribute:


<xdr:datatype dt:type='string' rs:dbtype='str' 
              dt:maxLength='32' rs:fixedLength='false'/>
<xdr:datatype dt:type='string' rs:dbtype='str' 
              dt:maxLength='32' rs:fixedLength='true'/>
The former datatype corresponds to the SQL VARCHAR(32) type; the latter corresponds to the SQL CHAR(32) type. Figure 10 shows how each of the OLE DB datatypes is represented as an extended XDR datatype.
      The persistence provider uses another extended attribute to indicate whether a given column can accept null values. This is indicated using the rs:maybenull attribute:


<xdr:datatype dt:type='i2' rs:maybenull='true'/>
This attribute allows the corresponding column attribute to be missing, which is interpreted by the persistence provider as a NULL value. Figure 11, Figure 12, and Figure 13 show the remaining OLE DB column attributes and properties and how they are serialized to XML.
      The OLE DB persistence provider supports a variety of rowset properties. To provide more granular control over resynchronizing the rowset with the original data source, the persistence provider supports its own property set (DBPROPSET_ADC) whose values are shown in Figure 14. The corresponding XML attribute must appear on the ElementType element in the schema definition. In particular, if you allow updating of the rowset, the updatable attribute must be set to true:


<xdr:ElementType name='row' content='eltOnly'
     rowset:updatable='true' >
Additionally, each column that is writable must have its AttributeType marked as either write or writeunknown:


<xdr:AttributeType name='age' rowset:write='true'/>
Recordsets that are opened with the adLockBatchOptimistic lock mode are written with these values automatically.
      To support ADO's optimistic concurrency control, when a persisted recordset is reserialized after being modified, the changes are not simply inserted into the stream of row elements. Rather, the updates, inserts, and deletes are called out using special elements that contain the original column values in order to detect concurrency violations at UpdateBatch time. These distinguished elements appear after the unmodified rows.
      When an existing row is updated, its serialized row element is removed from the normal sequence. Its original form will appear beneath an update element alongside the updated fields. Assuming that the following row


<target:row name='Don' age='25' />
were to be modified using the following JScript


rs.Fields(0) = "Donald";
the serialized form of the update would look like this:


<rowset:update>
    <rowset:original>
        <target:row name='Don' age='25' />
    </rowset:original>
    <target:row name='Donald' />
</rowset:update>
One of these update elements is written for each row that is modified. When a new row is inserted via ADO's AddNew method, the corresponding row element is serialized inside an insert element:


<rowset:insert>
    <target:row name='John' age='22' />
</rowset:insert>
      Finally, when a row is deleted via ADO's Delete method, it will not appear in the normal list of rows. Rather, its original form will appear as a child of a delete element:


<rowset:delete>
    <target:row name='Aaron' age='20' />
</rowset:delete>
The original values are written so that Update-Batch can verify that no changes have occurred since the original serialization.
      The update, insert, and delete elements must appear as child elements of the data element. All three element types must appear after any original rows. Additionally, all updates must precede the insert and delete elements, and all delete elements must follow the insert elements as if the following pseudo-DTD fragment were in effect:


<!ELEMENT rowset:data (target:row*,
                       rowset:update*, 
                       rowset:insert*, 
                       rowset:delete*)>
      At UpdateBatch time, the persistence provider attempts to submit all changes at once based on the update criteria rowset property. This property can use either timestamps or the stability of selected fields to determine whether a concurrency violation has occurred.
      It is interesting to compare the XML format used by the OLE DB persistence provider (and ADO) and the format used by the SQL Server XML technology preview (which is a precursor to SQL Server 2000). The OLE DB persistence provider uses a fairly inflexible XML vocabulary as its serialization format. There has been no attempt to make this vocabulary useful to the world at large. Rather, the format is completely dedicated to needs of the persistence provider. The SQL Server XML technology preview, on the other hand, seeks to satisfy a broader audience by providing a great deal of flexibility in terms of the generated XML. It is ironic, however, that the current technology preview does not support the OLE DB persistence provider's format as one of its native modes of operation. I hope that Microsoft's database and data access technologies will share a common vocabulary by the time SQL Server 2000 is released.

Don Box is a cofounder of DevelopMentor, a COM think tank that educates the software industry in COM, MTS, and ATL. Don wrote Essential COM, and coauthored the follow-up Effective COM (Addison Wesley). Reach Don at http://www.develop.com/dbox.

From the July 2000 issue of MSDN Magazine.

Page view tracker