Communicating XML Data over the Web with WebDAV
Craig Neable and Sean Lyndersay
Summary: This article discusses WebDAV, which has become an important communication protocol for the Web as an extension to HTTP 1.1. It describes what WebDAV is and how it could be useful in your client/server architecture.
With increased focus on Internet standards and network interoperability, WebDAV (Web Distributed Authoring and Versioning) has become an important communication protocol for the Web as an extension to HTTP 1.1 (see IETF RFC 2616 for more information). The WebDAV specification (see IETF RFC 2518 for more information) was published by the Internet Engineering Task Force (IETF) in February 1999, with significant contributions from Microsoft, and with support from many third-party vendors such as Netscape, Xerox, IBM, and Novell.
At Microsoft, WebDAV has found application in many different areas. It enables rich, collaborative publishing to Microsoft® Internet Information Services (IIS) 5.0 servers via the Web. It is the protocol behind the Microsoft Office 2000 Web Folders. And it is the technology that provides a Web interface to the Microsoft Exchange 2000 Web Storage System, allowing direct access to Exchange's object-oriented, hierarchical database over the Web.
Because of its inherent integration with Extensible Markup Language (XML), WebDAV not only has a large dependency on XML, but also has emerged as an excellent method for communicating XML data over the Web. However, before the strength of coupling these technologies can fully be understood, it is important to understand what WebDAV is and how it could be useful in your client/server architecture.
The success of the Web has proven HTTP 1.1 (Hypertext Transfer Protocol) to be a flexible, universal protocol for transferring data. However, HTTP has some obvious shortcomings that have limited its adoption as a comprehensive Internet communication protocol: it works well for static documents intended for viewing, but it is not sophisticated enough in its handling of documents to give clients rich authoring capabilities.
For example, when two authors make changes simultaneously to a document without consulting one another, the "lost update" problem occurs. Only the revisions made by the last person to upload the document back to the server will remain, and the changes made by the other author will be lost.
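The lost update problem can be illustrated with a short simulation. This is a minimal sketch in Python, not a WebDAV implementation: the `PlainStore` class and the author names are hypothetical stand-ins for a server that accepts unconditional uploads.

```python
# A toy in-memory "server" with plain last-writer-wins semantics (no locking).
# PlainStore is a hypothetical stand-in, not part of any WebDAV library.
class PlainStore:
    def __init__(self, text):
        self.text = text

    def get(self):
        return self.text

    def put(self, text):
        # Unconditional overwrite, like an upload with no lock check.
        self.text = text


def simulate_lost_update():
    store = PlainStore("Draft.")
    # Both authors download the same version before either one uploads.
    copy_a = store.get()
    copy_b = store.get()
    # Each author edits a private copy.
    copy_a += " Section by Author A."
    copy_b += " Section by Author B."
    # Author A uploads first; Author B uploads second and overwrites A's work.
    store.put(copy_a)
    store.put(copy_b)
    return store.get()


print(simulate_lost_update())
```

The final document contains only Author B's section, because nothing told either author that the other was editing the same resource.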
The goal of the IETF WebDAV working group was to design, in a standards-based forum, a protocol that provides the functionality any distributed authoring tool could require. The current WebDAV specification (IETF RFC 2518) tackles three major concerns of collaborative authoring tools:
- Overwrite Protection. HTTP 1.1 has no method of ensuring that clients can protect resources and make changes without fear of another client simultaneously editing them. With WebDAV, you can lock resources in a variety of ways to let other clients know you have an interest in the resource in question, or to prevent other clients from being able to access the resource.
- Resource Management. HTTP deals only with direct access to an individual resource. WebDAV provides a means of organizing data more efficiently. WebDAV introduces the notion of the collection (analogous to a file system folder), which can contain resources. Resource management via WebDAV includes the ability to create, move, copy, and delete collections, as well as the ability to do the same things to the resources or files within the collection.
- Document Properties. Different types of data have unique properties that help describe the data. For example, in an e-mail message, these properties might be the sender's name and time received. In a collaborative document, these properties might be the original author's name and the name of the last editor of the document. As the types of documents that people use diversify, the list of possible properties becomes effectively unbounded, so WebDAV needs an extensible vehicle for describing them. XML fills that role.
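To make the overwrite protection item more concrete, here is a minimal sketch of the XML body a client might send with a Lock request. The element names (lockinfo, lockscope, locktype, owner) come from RFC 2518. The `build_lockinfo` helper and the owner value are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
ET.register_namespace("D", DAV_NS)


def build_lockinfo(owner, exclusive=True):
    """Build a lockinfo body for a write lock (hypothetical helper)."""
    def dav(tag):
        # ElementTree spells namespaced tags as {namespace}localname.
        return "{%s}%s" % (DAV_NS, tag)

    root = ET.Element(dav("lockinfo"))
    scope = ET.SubElement(root, dav("lockscope"))
    # An exclusive lock keeps other clients out; a shared lock only
    # signals interest in the resource.
    ET.SubElement(scope, dav("exclusive" if exclusive else "shared"))
    locktype = ET.SubElement(root, dav("locktype"))
    ET.SubElement(locktype, dav("write"))
    ET.SubElement(root, dav("owner")).text = owner
    return ET.tostring(root, encoding="unicode")


# The owner string is an illustrative placeholder.
print(build_lockinfo("author-a@example.com"))
```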
HTTP 1.1 (see IETF RFC 2616, which obsoletes the earlier RFC 2068) provides a set of methods that clients can use to communicate with servers and specifies the format of responses from servers back to the clients that have issued requests. WebDAV fully adopts all of the methods of this specification, extends some of these methods, and introduces additional methods to provide the functionality described. The methods used in WebDAV are:
- Options, Head, and Trace. Primarily used by applications for discovering and tracking server support and network behavior.
- Get. Retrieves documents.
- Put and Post. Submit documents to the server.
- Delete. Destroys resources or collections.
- Mkcol. Creates collections.
- PropFind and PropPatch. Retrieve and set properties on both resources and collections.
- Copy and Move. Manage both collections and resources within the context of a namespace.
- Lock and Unlock. Provide overwrite protection.
The general structure of WebDAV requests follows the format of HTTP and comprises the following three components:
- The method. States the method (described previously) to be executed by the client.
- Headers. Describe instructions about how the task is to be completed.
- A body (optional). Defines the data used in the instruction, or additional instructions, about how the method is to be executed.
In the body component, XML becomes a crucial element in the overall picture of WebDAV.
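The three components can be sketched as a small function that assembles a request as it would appear on the wire. This is an illustrative sketch only: `build_request` is a hypothetical helper, and a real client would normally leave this work to an HTTP library.

```python
def build_request(method, path, host, headers=None, body=""):
    """Assemble the raw bytes of an HTTP/WebDAV request (hypothetical helper)."""
    body_bytes = body.encode("utf-8")
    all_headers = {"Host": host}
    all_headers.update(headers or {})
    if body_bytes:
        # Content-Length counts the bytes of the body, not its characters.
        all_headers["Content-Length"] = str(len(body_bytes))
    # 1. The method, with the target path and protocol version.
    request_line = f"{method} {path} HTTP/1.1"
    # 2. The headers, one per line.
    header_lines = [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join([request_line] + header_lines) + "\r\n\r\n"
    # 3. The optional body, separated from the headers by a blank line.
    return head.encode("ascii") + body_bytes


# A Mkcol request needs no body at all.
print(build_request("MKCOL", "/WebDavDocs/", "myserver.com").decode("ascii"))
```

A method such as PropFind would pass its XML as `body` and a `Content-Type` of `text/xml` in `headers`.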
WebDAV was designed to provide more methods for handling resources on a server. These additional methods generally require a great deal of information to be associated with both requests and responses to explicitly define the intention of the client or server. In plain HTTP, all of this information had to be carried in the headers of requests and responses. This imposes some limitations on transfers: it is difficult to apply header information to multiple resources in a single request, and it is difficult to represent hierarchy.
Because of its inherent extensibility, XML was chosen to describe how these instructions are communicated. XML is crucial to the operation of WebDAV because it provides:
- A method of formatting instructions describing how data is to be handled.
- A method of formatting complex responses from the server.
- A method of communicating customized information about the collections and resources handled.
- A flexible vehicle for the data itself.
At a high level, a WebDAV instruction processor is really a set of logic that interprets WebDAV methods, followed by an XML parser that interprets the majority of the information communicated.
How does the use of XML in WebDAV turn this technology into such a powerful tool? First, XML provides a way of separating data from either the methods that act on that data, or the way the data is presented. This enables straightforward and consistent abstraction of data. For this abstracted data, WebDAV provides a method of consistent, unified transfer between all tiers in the network architecture over channels that are familiar to existing network architectures. This technology enables a much higher level of interoperability between both Microsoft products and third-party applications.
Second, XML enhances WebDAV by providing a means for extensibility. XML allows clients to describe and set properties on a WebDAV server. These properties can then be used to index, search, and process resources on the server. Because of the inherent extensibility of XML, the types of properties and uses for these properties are infinite.
The following example submits data to a server (using the PropPatch method) that is to be associated with each of the resources on the server (in this case HTML documents), and then performs searches of these documents based on the custom properties that it had previously set. This example uses the raw WebDAV requests (the bits that are transferred over the wire) necessary to successfully complete these tasks, and then shows how the MSXML XMLHTTPRequest object can be used to create such requests.
Imagine that you need to easily identify the author of each document in a large pool of documents on a server. In a non-WebDAV world, in order to find all documents that are authored by a certain person, you might be able to search through the text in these documents looking for a particular author's name. A search of this nature would also return all documents in which that particular person was casually referenced as well. What about trying to populate a table containing all of these documents and the author of each document? This would be virtually impossible based solely on such a raw text search.
Using WebDAV requests encoded in XML, it is possible to set an Author property on each of the documents in a collection. This property could then be used for the organizational purposes described above.
Setting an Author Property Using PropPatch
The following WebDAV request would set an Author property on the document entitled Webdav-xml.htm in the collection called WebDavDocs on the MyServer.com server:
    PROPPATCH /WebDavDocs/webdav-xml.htm HTTP/1.1
    Host: myserver.com
    Content-Type: text/xml
    Content-Length: 138

    <?xml version="1.0"?>
    <d:propertyupdate xmlns:d="DAV:" xmlns:o="urn:schemas-microsoft-com:office:office">
      <d:set>
        <d:prop>
          <o:Author>Sean Purcell</o:Author>
        </d:prop>
      </d:set>
    </d:propertyupdate>
The first line of this request specifies the method that the client wishes to enact (PropPatch) and gives the path on the server of the file on which to set the property. The three lines that follow the method are headers that specify the server to which the method will be submitted, and tell the server the type and length of the content to expect.
The XML-encoded body is what tells the server exactly which property to set and the value that should be assigned to it. One important observation in this XML document is the use of namespace declarations. The first attribute in the <d:propertyupdate> element declares the WebDAV namespace for use throughout the document. A WebDAV-compliant server knows to interpret every element that carries this prefix according to the "DAV:" schema. In this case, these elements define how to set a property on a document.
The second namespace declaration is for the urn:schemas-microsoft-com:office:office namespace. An excellent strategy to use when designing XML properties is to thoroughly examine existing namespaces for properties that could be of use. Equally important, however, is to ensure that an existing property is used in the manner originally intended, to prevent property collision. Using an existing Office property in our scenario allows other clients that recognize this Microsoft custom namespace to interpret the property.
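The body described above can also be generated rather than typed by hand. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The `build_propertyupdate` helper is hypothetical, and the prefixes d and o simply mirror the request shown earlier.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
OFFICE_NS = "urn:schemas-microsoft-com:office:office"
# Ask ElementTree to emit the same prefixes used in the article's request.
ET.register_namespace("d", DAV_NS)
ET.register_namespace("o", OFFICE_NS)


def build_propertyupdate(properties):
    """Build a PropPatch body (hypothetical helper).

    properties maps (namespace URI, local name) pairs to string values.
    """
    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop_el = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    for (namespace, name), value in properties.items():
        ET.SubElement(prop_el, f"{{{namespace}}}{name}").text = value
    return '<?xml version="1.0"?>' + ET.tostring(root, encoding="unicode")


print(build_propertyupdate({(OFFICE_NS, "Author"): "Sean Purcell"}))
```

Keying each property by its namespace URI rather than by a prefix reflects the point made above: the namespace, not the prefix, identifies which schema a property belongs to.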
In response to this request, the server would send back a response indicating that the property was successfully set.
    HTTP/1.1 207 Multi-Status
    Server: Microsoft-IIS/5.0
    Date: Wed, 04 Aug 1999 21:52:58 GMT
    Content-Type: text/xml
    Content-Length: 310

    <?xml version="1.0"?>
    <a:multistatus xmlns:b="urn:schemas-microsoft-com:office:office" xmlns:a="DAV:">
      <a:response>
        <a:href>http://myserver.com/WebDavDocs/webdav-xml.htm</a:href>
        <a:propstat>
          <a:status>HTTP/1.1 200 OK</a:status>
          <a:prop>
            <b:Author/>
          </a:prop>
        </a:propstat>
      </a:response>
    </a:multistatus>
Retrieving All Documents in the Collection by Author
A PropPatch similar to the one issued above is issued to each of the resources in the WebDavDocs collection on the server, so that every one of the resources in this collection has an Author property associated with it. Now, to solve one of the problems outlined in the scenario, the following example retrieves the information necessary to populate a table listing the name and the author of each document in the collection.
The request to retrieve this information will be the following:
    PROPFIND /WebDavDocs/ HTTP/1.1
    Depth: 1,noroot
    Host: myserver.com
    Content-Type: text/xml
    Content-Length: 184

    <?xml version="1.0"?>
    <d:propfind xmlns:d="DAV:" xmlns:o="urn:schemas-microsoft-com:office:office">
      <d:prop>
        <d:displayname/>
        <o:Author/>
      </d:prop>
    </d:propfind>
This request adds the Depth header, which specifies the resources to which the method should be applied. In this case the value "1,noroot" specifies that the method should be applied to all immediate children of the specified URL, but not to the URL itself. (RFC 2518 itself defines only the Depth values 0, 1, and infinity; the noroot modifier is an extension supported by the Microsoft implementation.)
The body of the XML request contains the two properties to retrieve: the name of the document (a property in the "DAV:" namespace) and the Author property that was set using the PropPatch (from the "office:" namespace). The server sends the following response:
    HTTP/1.1 207 Multi-Status
    Server: Microsoft-IIS/5.0
    Date: Wed, 04 Aug 1999 22:38:42 GMT
    Content-Type: text/xml

    <?xml version="1.0"?>
    <a:multistatus xmlns:d="urn:schemas-microsoft-com:office:office" xmlns:a="DAV:">
      <a:response>
        <a:href>http://myserver.com/WebDavDocs/webdav-xml.htm</a:href>
        <a:propstat>
          <a:status>HTTP/1.1 200 OK</a:status>
          <a:prop>
            <a:displayname>webdav-xml.htm</a:displayname>
            <d:Author>Sean Purcell</d:Author>
          </a:prop>
        </a:propstat>
      </a:response>
      <a:response>
        <a:href>http://myserver.com/WebDavDocs/webdav-http-requests.htm</a:href>
        <a:propstat>
          <a:status>HTTP/1.1 200 OK</a:status>
          <a:prop>
            <a:displayname>webdav-http-requests.htm</a:displayname>
            <d:Author>Sean Purcell</d:Author>
          </a:prop>
        </a:propstat>
      </a:response>
      <a:response>
        <a:href>http://myserver.com/WebDavDocs/webdav-implementation-plan.xls</a:href>
        <a:propstat>
          <a:status>HTTP/1.1 200 OK</a:status>
          <a:prop>
            <a:displayname>webdav-implementation-plan.xls</a:displayname>
            <d:Author>Adam Barr</d:Author>
          </a:prop>
        </a:propstat>
      </a:response>
      <a:response>
        <a:href>http://myserver.com/WebDavDocs/webdav-search.doc</a:href>
        <a:propstat>
          <a:status>HTTP/1.1 200 OK</a:status>
          <a:prop>
            <a:displayname>webdav-search.doc</a:displayname>
            <d:Author>Laura Jennings</d:Author>
          </a:prop>
        </a:propstat>
      </a:response>
      <a:response>
        <a:href>http://myserver.com/WebDavDocs/webdav-info.txt</a:href>
        <a:propstat>
          <a:status>HTTP/1.1 200 OK</a:status>
          <a:prop>
            <a:displayname>webdav-info.txt</a:displayname>
          </a:prop>
        </a:propstat>
        <a:propstat>
          <a:status>HTTP/1.1 404 Resource Not Found</a:status>
          <a:prop>
            <d:Author/>
          </a:prop>
        </a:propstat>
      </a:response>
    </a:multistatus>
Each of the <a:response> elements in this XML document (that is, each response element in the "DAV:" namespace) represents one of the resources in the collection. This is a good example of how WebDAV responses use XML to represent hierarchy and different responses for different resources. In this request, the displayname and the Author properties were successfully retrieved for all items in the collection except for the last one, on which the Author property was not set. This is communicated in the <a:status> element with the text "HTTP/1.1 404 Resource Not Found", which refers to the Author property.
This data can be used to populate a table in our application that lists each document alongside its author.
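As a sketch of how a client might turn such a Multi-Status response into table rows, the following Python code reads each response element and keeps only the properties reported with a 200 status. The `rows_from_multistatus` function is a hypothetical helper, and the sample is a shortened version of the response shown above.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"
OFFICE = "{urn:schemas-microsoft-com:office:office}"

# Shortened version of the article's response: one complete resource and
# one resource whose Author property was never set.
SAMPLE = """<a:multistatus xmlns:d="urn:schemas-microsoft-com:office:office" xmlns:a="DAV:">
  <a:response>
    <a:href>http://myserver.com/WebDavDocs/webdav-xml.htm</a:href>
    <a:propstat>
      <a:status>HTTP/1.1 200 OK</a:status>
      <a:prop>
        <a:displayname>webdav-xml.htm</a:displayname>
        <d:Author>Sean Purcell</d:Author>
      </a:prop>
    </a:propstat>
  </a:response>
  <a:response>
    <a:href>http://myserver.com/WebDavDocs/webdav-info.txt</a:href>
    <a:propstat>
      <a:status>HTTP/1.1 200 OK</a:status>
      <a:prop><a:displayname>webdav-info.txt</a:displayname></a:prop>
    </a:propstat>
    <a:propstat>
      <a:status>HTTP/1.1 404 Resource Not Found</a:status>
      <a:prop><d:Author/></a:prop>
    </a:propstat>
  </a:response>
</a:multistatus>"""


def rows_from_multistatus(xml_text):
    """Return one (displayname, author) pair per resource; None if not set."""
    rows = []
    for response in ET.fromstring(xml_text).iter(DAV + "response"):
        name, author = None, None
        for propstat in response.findall(DAV + "propstat"):
            # The status text looks like "HTTP/1.1 200 OK".
            parts = propstat.findtext(DAV + "status", default="").split()
            if len(parts) < 2 or parts[1] != "200":
                continue  # Skip properties the server could not return.
            prop = propstat.find(DAV + "prop")
            if prop is None:
                continue
            name = prop.findtext(DAV + "displayname", default=name)
            author = prop.findtext(OFFICE + "Author", default=author)
        rows.append((name, author))
    return rows


for name, author in rows_from_multistatus(SAMPLE):
    print(f"{name:20} {author or '(not set)'}")
```

The code matches elements by namespace URI, not by the a and d prefixes, because the server is free to choose different prefixes in each response.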
Searching for Documents by Author
The final problem posed in the scenario is that of searching through a large pool of documents based on the Author property. The IETF DAV Searching and Locating (DASL) group is one of the groups that has been formed to extend the functionality provided by WebDAV. This group is concerned with defining a syntax that can be used for searching through WebDAV resources. Because the work of this group has not been finalized, the Exchange team has implemented a Search method as part of the WebDAV server component of Exchange 2000 that uses SQL syntax to perform searches. The following example demonstrates a WebDAV search request for all documents authored by "Sean Purcell" in our collection.
    SEARCH /WebDavDocs/ HTTP/1.1
    Host: myserver.com
    Content-Type: text/xml
    Content-Length: 295

    <?xml version="1.0"?>
    <g:searchrequest xmlns:g="DAV:">
      <g:sql>SELECT "DAV:displayname" as prop1,
        "urn:schemas-microsoft-com:office:office#Author" as prop2
        FROM SCOPE('SHALLOW TRAVERSAL OF "."')
        WHERE "prop2" = 'Sean Purcell'
      </g:sql>
    </g:searchrequest>
The response to this request returns all of the documents authored by Sean Purcell in the collection, and for each of them returns the displayname and Author properties, tagged by the aliases given in the query (prop1 and prop2):
    HTTP/1.1 207 Multi-Status
    Server: Microsoft-IIS/5.0
    Date: Wed, 04 Aug 1999 23:56:47 GMT
    Content-Type: text/xml

    <?xml version="1.0"?>
    <a:multistatus xmlns:b="urn:uuid:c2f41010-65b3-11d1-a29f-00aa00c14882/" xmlns:c="xml:" xmlns:a="DAV:">
      <a:response>
        <a:href>http://myserver.com/WebDavDocs/webdav-xml.htm</a:href>
        <a:propstat>
          <a:status>HTTP/1.1 200 OK</a:status>
          <a:prop>
            <prop1>webdav-xml.htm</prop1>
            <prop2>Sean Purcell</prop2>
          </a:prop>
        </a:propstat>
      </a:response>
      <a:response>
        <a:href>http://myserver.com/WebDavDocs/webdav-http-requests.htm</a:href>
        <a:propstat>
          <a:status>HTTP/1.1 200 OK</a:status>
          <a:prop>
            <prop1>webdav-http-requests.htm</prop1>
            <prop2>Sean Purcell</prop2>
          </a:prop>
        </a:propstat>
      </a:response>
    </a:multistatus>
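A client that searches for different authors would build the Search body from a parameter. The following is a minimal sketch in Python. The `build_search_body` helper is hypothetical, and doubling single quotes inside the author name is an assumption based on common SQL convention, not something the article confirms for the Exchange 2000 syntax.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
ET.register_namespace("g", DAV_NS)


def build_search_body(author):
    """Build a searchrequest body for one author (hypothetical helper)."""
    # Assumption: a single quote inside a literal is escaped by doubling it.
    safe_author = author.replace("'", "''")
    sql = (
        'SELECT "DAV:displayname" as prop1, '
        '"urn:schemas-microsoft-com:office:office#Author" as prop2 '
        "FROM SCOPE('SHALLOW TRAVERSAL OF \".\"') "
        f"WHERE \"prop2\" = '{safe_author}'"
    )
    root = ET.Element(f"{{{DAV_NS}}}searchrequest")
    # ElementTree escapes any XML-special characters in the query text.
    ET.SubElement(root, f"{{{DAV_NS}}}sql").text = sql
    return '<?xml version="1.0"?>' + ET.tostring(root, encoding="unicode")


print(build_search_body("Sean Purcell"))
```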
Using the XMLHTTPRequest Object to Create WebDAV Requests
The fragments presented above are raw WebDAV requests that accomplish some very useful functions. The question remains of how these requests can be created programmatically. The answer lies in the XMLHTTPRequest object that is a part of Msxml.dll. It allows programmers to create customized HTTP requests and read their responses, rather than relying on the load and save methods built into the XMLDOMDocument object. Because WebDAV requests have the exact structure of HTTP requests, this object can be used to create any WebDAV request.
The following Visual Basic subroutine would issue the PropPatch method discussed in the example above to our server:
    Const REQUEST_TEXT As String = "<?xml version=""1.0""?>" & _
        "<g:propertyupdate xmlns:g=""DAV:"" xmlns:o=""urn:schemas-microsoft-com:office:office"">" & _
        "<g:set>" & _
        "<g:prop>" & _
        "<o:Author>Sean Purcell</o:Author>" & _
        "</g:prop>" & _
        "</g:set>" & _
        "</g:propertyupdate>"
    Const RESOURCE_URL As String = "http://myserver.com/WebDavDocs/webdav-xml.htm"

    Public Sub DoPROPPATCH()
        Dim objDAVMethod As New XMLHTTPRequest
        Dim objXMLBody As New DOMDocument
        Dim strBody As String

        ' First, load the XML that will be the body of the request. In
        ' reality, this could be created using any of the DOMDocument
        ' methods, but we will use a simple load for simplicity's sake:
        objXMLBody.loadXML REQUEST_TEXT
        strBody = objXMLBody.documentElement.xml

        ' Open the object, assigning it a method.
        objDAVMethod.open "PROPPATCH", RESOURCE_URL, False

        ' Set the necessary headers for the request. Content-Length is
        ' taken from the string that is actually sent.
        objDAVMethod.setRequestHeader "Content-Type", "text/xml"
        objDAVMethod.setRequestHeader "Content-Length", CStr(Len(strBody))

        ' Send the request, using the XML document as the body.
        objDAVMethod.send strBody

        ' Output the response in a message box.
        MsgBox "HTTP/1.1 " & objDAVMethod.Status & " " & _
            objDAVMethod.StatusText & vbNewLine & vbNewLine & _
            objDAVMethod.getAllResponseHeaders & _
            objDAVMethod.responseText
    End Sub
For more information on the XMLHTTPRequest object, refer to the XML SDK documentation.
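The same idea carries over to other HTTP client libraries, since a WebDAV request is an ordinary HTTP request with a different method name. The sketch below uses Python's standard urllib.request module and is not part of the MSXML toolset discussed in this article. The function names are hypothetical, and the URL is the article's example server, so `do_proppatch` is defined but not called.

```python
import urllib.request

REQUEST_TEXT = (
    '<?xml version="1.0"?>'
    '<g:propertyupdate xmlns:g="DAV:" '
    'xmlns:o="urn:schemas-microsoft-com:office:office">'
    "<g:set><g:prop>"
    "<o:Author>Sean Purcell</o:Author>"
    "</g:prop></g:set>"
    "</g:propertyupdate>"
)
RESOURCE_URL = "http://myserver.com/WebDavDocs/webdav-xml.htm"


def make_proppatch_request(url, xml_text):
    """Prepare, but do not send, a PropPatch request (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        method="PROPPATCH",  # urllib accepts arbitrary method names
        headers={"Content-Type": "text/xml"},
    )


def do_proppatch(url=RESOURCE_URL, xml_text=REQUEST_TEXT):
    """Send the request and return (status, reason, response body)."""
    request = make_proppatch_request(url, xml_text)
    # urllib computes Content-Length from the body; a 207 reply counts as success.
    with urllib.request.urlopen(request) as response:
        return response.status, response.reason, response.read().decode("utf-8")


prepared = make_proppatch_request(RESOURCE_URL, REQUEST_TEXT)
print(prepared.get_method(), prepared.full_url)
```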
Efforts are already under way to extend WebDAV to fulfill the original vision of a protocol that meets the needs of all distributed authoring tools. Both the DASL working group and the recently created Delta-V (Web Versioning and Configuration Management) working group of the IETF are making efforts in this direction. Microsoft is heavily involved in both of these working groups. However, in order to deliver a comprehensive technology, Microsoft is making its own headway in both of these directions, as demonstrated by the addition of the Search method to the set of methods supported by the Exchange 2000 WebDAV implementation.
This strong combination of technologies will fulfill the requirements of many client/server technologies both within Microsoft and in third-party products.
- WebDAV.org, at http://www.webdav.org.
- Exchange 2000 official Web site, at http://www.microsoft.com/exchange/prodinfo/2000/default.htm.