Service Station

What's new in System.Xml 2.0?

Aaron Skonnard

Contents

System.Xml 2.0 Design Goals
Reading XML
Converting Values
Reading Large Values
Finding Nodes
Reading Subtrees
Writing
Validating
Loading and Saving
Navigating and Querying
Updating
Transforming
What Didn't Make It
Extrapolation

It's been a while since I've written on a core XML topic, and I miss it. Now that the Microsoft® .NET Framework 2.0 has shipped and is in the hands of countless developers worldwide, it seems like a good time to discuss the improvements found in System.Xml, which sits at the heart of all .NET-based Web service apps.

From early in the design process, the System.Xml team had some ambitious goals for version 2.0, but for better or worse, not everything made it into the final release. In fact, many features appeared in one beta and then quickly disappeared in the next as the team worked to finalize their offering based on customer feedback.

Mark Fussell, the Lead Program Manager on the System.Xml team, wrote "What's New in System.Xml for Visual Studio® 2005 and the .NET Framework 2.0 Release" (msdn.microsoft.com/library/en-us/dnxml/html/SysXMLVS05.asp) to document the new System.Xml features found in the Beta 1 release. While some of the article is still accurate, it discusses numerous features that were either changed or dropped en route to RTM. What's needed is a view of the final System.Xml 2.0 feature set available today. So, this month I'll highlight some of the new features and show you how they can simplify some of your common XML programming tasks.

System.Xml 2.0 Design Goals

Improving performance was a driving force behind many of the changes in System.Xml 2.0. The team invested significant resources and was quite successful. Compared to their .NET Framework 1.1 predecessors, the new XmlTextReader and XmlTextWriter classes are twice as fast, XSLT processing is three to four times as fast, and XML Schema validation is about 20 to 25 percent faster.

To achieve most of these performance improvements some significant redesign and targeted optimizations were required. For example, it took some serious reworking to make the .NET XSLT implementation just as fast as its unmanaged predecessor, MSXML 4.0. Now in System.Xml 2.0, the XSLT implementation builds Microsoft intermediate language (MSIL) directly, which is then JIT compiled by the .NET runtime and executed as machine code. The resulting performance is very similar to that of MSXML 4.0 in most cases.

Taking advantage of these performance benefits requires proper use of the new and improved System.Xml library, which I'll discuss in the sections that follow. In addition to performance, the other design goals included improved usability, compatibility, and standards compliance. Towards the end of the System.Xml 2.0 development cycle, it became apparent that the System.Xml team really cared about making their offering the most practical and usable solution on the market today. This required some tough choices regarding the feature set, but in the end the library's usability speaks for itself.

Now let's dive into the various XML programming tasks and see how you'd tackle them using the new and improved library.

Reading XML

When you need to read XML as quickly and efficiently as possible, you want to use XmlReader, which provides a pull model for reading one node at a time while traversing the document in a forward-only, read-only fashion. XmlReader is an abstract base class; there are several derived classes including XmlTextReader, XmlValidatingReader, and others that are used internally by the library.

One of the major changes in version 2.0 is how you create XmlReader objects. In the .NET Framework 1.1, you had to pick the XmlReader class, instantiate it, and configure it directly. This required you to be familiar with all of the XmlReader-derived classes and their individual settings. The recommended way to create an XmlReader in version 2.0 is to use the new static Create method, which shields you from the underlying implementation. The following shows how to create an XmlReader implementation to read an Order.xml file:

XmlReader r = XmlReader.Create("Order.xml");
while (r.Read())
{
   // read each node here
}
r.Close();

This creates an XmlReader implementation using a default configuration. You can configure the XmlReader by supplying an XmlReaderSettings object when calling Create. Here's an example that configures the underlying reader to ignore whitespace, comments, and processing instructions, and it also prohibits Document Type Definition (DTD) processing:

XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreWhitespace = true;
settings.IgnoreComments = true;
settings.IgnoreProcessingInstructions = true;
settings.ProhibitDtd = true;

XmlReader r = XmlReader.Create("Order.xml", settings);
while (r.Read())
{
   // read each node here
}
r.Close();

The primary benefit of this factory pattern is how it shields you from the internal implementation and configuration details. Now you simply focus on the XmlReaderSettings class and let the Create method take care of the rest.

The factory pattern also allows the library to leverage specialized internal optimizations. These optimizations provide some of the performance increases I've discussed. This shielding allows the library to evolve with new optimizations without affecting clients (as long as the XmlReader interface remains the same). You can still work with the individual XmlReader subclasses directly if you want, but you may forfeit some of the internal optimizations or conveniences when doing so.

Properly closing the XmlReader implementations in use is also important. In version 1.1 you had to make sure you called Close at the appropriate time. Now in System.Xml 2.0, the XmlReader base class implements IDisposable, which allows the following use:

using (XmlReader r = XmlReader.Create("Order.xml", settings))
{
    while (r.Read())
    {
       // read the nodes here
    }
}

When Dispose is called at the end of the using statement, the implementation of Dispose calls Close on the XmlReader object.

Converting Values

In addition to the improvements I've mentioned thus far, version 2.0 also simplifies the process of converting XML text to .NET values. I've provided a simple XML document called Order.xml in Figure 1 and, as I discuss some of these new features, I'll provide examples that process it.

Figure 1 Order.xml Sample Document

<Order xmlns="https://northwind/order" 
   xmlns:o="https://northwind/order" >
   <CustomerID>GREAL</CustomerID>
   <Addresses>
      <Address Label="ShipTo">
         <Addressee>Howard Snyder</Addressee>
         <Line1>2732 Baker Blvd.</Line1>
         <Line2>Suite 200</Line2>
         <City>Eugene</City>
         <Region>OR</Region>
         <PostalCode>97403</PostalCode>
         <Country>USA</Country>
      </Address>
      <Address Label="BillTo">
         <Addressee>Great Lakes Food Market</Addressee>
         <Line1>PO BOX 12345</Line1>
         <Line2></Line2>
         <City>Eugene</City>
         <Region>OR</Region>
         <PostalCode>97403</PostalCode>
         <Country>USA</Country>
      </Address>
   </Addresses>
   <Items>
      <Item>
         <Sku>6</Sku>
         <Price>25.00</Price>
         <Description>Grandma’s Boysenberry Spread</Description>
         <Quantity>10</Quantity>
      </Item>
      <Item>
         <Sku>24</Sku>
         <Price>4.50</Price>
         <Description>Guaraná Fantástica</Description>
         <Quantity>20</Quantity>
      </Item>
      <Item>
         <Sku>14</Sku>
         <Price>23.25</Price>
         <Description>Tofu</Description>
         <Quantity>30</Quantity>
      </Item>
   </Items>
</Order>

So let's begin by supposing that your task is to read the content of each <Price> element as a System.Double value. With the .NET Framework 1.1, you would typically read the <Price> content as a string and then use the XmlConvert class to convert the string manually into a System.Double value (XmlConvert is a special conversion class aware of the XML Schema lexical representations). Given how often you need to perform this task in XML applications, it becomes tedious to do it by hand. System.Xml 2.0 includes a suite of methods that automate these conversions for you.

There are two types of conversion methods to choose from: ReadContentAsXXX and ReadElementContentAsXXX. You can use ReadContentAsXXX when the cursor is already positioned on a text or attribute node. You use ReadElementContentAsXXX when the cursor is positioned on a start element node and you want to read its content and advance the cursor past the corresponding end element node.
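
To make the distinction concrete, here is a minimal sketch that uses one method from each family on the same element. The inline fragment and its Units attribute are assumptions made up for illustration; they are not part of Order.xml.

```csharp
using System;
using System.IO;
using System.Xml;

static class ContentConversionSketch
{
    // Hypothetical fragment: the Units attribute does not exist in Order.xml.
    const string Xml = "<Item Units='3'><Price>25.00</Price></Item>";

    public static double ReadLineTotal()
    {
        using (XmlReader r = XmlReader.Create(new StringReader(Xml)))
        {
            r.MoveToContent();                 // cursor on the <Item> start element
            r.MoveToAttribute("Units");        // cursor on the Units attribute node
            int units = r.ReadContentAsInt();  // attribute/text node: ReadContentAsXXX

            r.ReadToFollowing("Price");        // cursor on the <Price> start element
            // start element: ReadElementContentAsXXX reads the content and
            // moves the cursor past the matching end element
            double price = r.ReadElementContentAsDouble();

            return units * price;
        }
    }

    static void Main()
    {
        Console.WriteLine(ReadLineTotal());
    }
}
```

Calling ReadElementContentAsDouble while positioned on the attribute, or ReadContentAsInt while positioned on the start element, would fail, which is why the cursor position decides which family you call.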

Here's an example using ReadElementContentAsDouble:

using (XmlReader r = XmlReader.Create("Order.xml", settings))
{
    while (r.Read())
    {
        if (r.IsStartElement() && r.LocalName.Equals("Price"))
        {
            // convert the price content to a double
            double price = r.ReadElementContentAsDouble();
            Console.WriteLine("Price={0}", price);
        }
    }
}

There is a separate ReadElementContentAsXXX method for each of the primitive .NET types (such as ReadElementContentAsBoolean, ReadElementContentAsDateTime, ReadElementContentAsDecimal, and so forth). And you can always fall back to ReadElementContentAsString when you don't have an appropriate .NET type to which to convert. There are equivalent methods available for the ReadContentAsXXX variants for use when positioned on attribute/text nodes.

The .NET Framework 2.0 also brings general-purpose ReadContentAs and ReadElementContentAs methods (they take a System.Type argument; they are not generic methods in the C# sense), which allow you to specify the .NET type dynamically. One reason to use them is if you want to convert a space-delimited list into a .NET array. For example, consider the following <Price> element, which contains a list of prices:

<Price>25.00 26.00 27.00</Price>

You can read this <Price> element by specifying double[] as the type, as shown in Figure 2.

Figure 2 Convert Price to a Double Array

using (XmlReader r = XmlReader.Create("Order.xml", settings))
{
    while (r.Read())
    {
        if (r.IsStartElement() && r.LocalName.Equals("Price"))
        {
            // convert the price content to a double array
            double[] prices = (double[])r.ReadElementContentAs(
                typeof(double[]), null); 
            foreach(double p in prices)
                Console.WriteLine("Price={0}", p);
        }
    }
}
 

Reading Large Values

XmlReader has a new suite of methods for reading large amounts of data more efficiently. The ReadValueChunk method allows you to read one chunk of characters at a time. You can use ReadContentAsBase64 or ReadContentAsBinHex when you need to convert the text chunks from Base64 or BinHex encodings. The System.Xml team also provided ReadElementContentAsBase64 and ReadElementContentAsBinHex equivalents when you need to read from element content as opposed to a text node.
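
As a rough illustration of the chunked approach, the following sketch decodes Base64 element content a few bytes at a time. The <Payload> element name and the deliberately tiny 4-byte buffer are assumptions chosen for the example; a real application would use a much larger buffer and probably write to a file stream.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class LargeValueSketch
{
    // Decodes the Base64 content of the first <Payload> element in small chunks.
    // <Payload> is a hypothetical element name used only for this sketch.
    public static byte[] ReadPayload(string xml)
    {
        using (XmlReader r = XmlReader.Create(new StringReader(xml)))
        using (MemoryStream decoded = new MemoryStream())
        {
            if (!r.ReadToFollowing("Payload"))
                throw new InvalidOperationException("No <Payload> element found.");

            byte[] buffer = new byte[4];
            int count;
            // Each call decodes at most buffer.Length bytes; 0 means the
            // element content has been fully consumed.
            while ((count = r.ReadElementContentAsBase64(buffer, 0, buffer.Length)) > 0)
            {
                decoded.Write(buffer, 0, count);
            }
            return decoded.ToArray();
        }
    }

    static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("Tofu");
        string xml = "<Doc><Payload>" + Convert.ToBase64String(original) + "</Payload></Doc>";
        Console.WriteLine(Encoding.UTF8.GetString(ReadPayload(xml)));
    }
}
```

Because only one buffer's worth of decoded data is held at a time, the same loop works whether the element holds a few bytes or many megabytes.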

Finding Nodes

In addition to the conversion improvements, System.Xml 2.0 provides a few new methods to simplify finding nodes of interest in the document: ReadToDescendant, ReadToFollowing, and ReadToNextSibling. These methods search through the sequence of nodes along a particular axis. If you need to find a <Price> element within the current descendant tree, use ReadToDescendant. If you need to find a <Price> element that comes somewhere after the current node in the document, use ReadToFollowing. And finally, if you need to find the next sibling element with a given name, use ReadToNextSibling. Each of these methods is useful in common reading scenarios. The code in Figure 3 uses these methods to calculate and print the order total for Order.xml.

Figure 3 Calculate and Print Order Total

static void ReadOrder()
{
    string NS = "https://northwind/order";
    double total = 0;
    double price;
    int quantity;
    string customer;

    XmlReaderSettings settings = new XmlReaderSettings();
    settings.IgnoreWhitespace = true;
    settings.ProhibitDtd = true;
    settings.IgnoreComments = true;
    settings.IgnoreProcessingInstructions = true;
    settings.ConformanceLevel = ConformanceLevel.Document;

    using (XmlReader r = XmlReader.Create("Order.xml", settings))
    {
        r.ReadToDescendant("CustomerID", NS);
        customer = r.ReadElementContentAsString();

        while (r.ReadToFollowing("Price", NS))
        {
            price = r.ReadElementContentAsDouble();
            r.ReadToNextSibling("Quantity", NS);
            quantity = r.ReadElementContentAsInt();
            total += quantity * price;
        }

        Console.WriteLine("Order total for {0}: {1}", 
            customer, total);
    }
}
 

Accomplishing this same task using version 1.1 of the Framework would have required much more code for cursor management and type conversions, which ultimately makes the solution more complicated and error prone. These few methods offer a great productivity boost while making the code easier to both read and maintain.

Reading Subtrees

Another aspect of cursor management, which is quite difficult, is that of passing an XmlReader reference to a method so it can process a particular subtree. This requires the method to pay careful attention to cursor position so it can identify when it's finished processing the subtree, leaving the XmlReader object in an appropriate state for the calling code. For example, consider the following method that takes an XmlReader object as input—its job is to process the <Addresses> element subtree:

void ProcessAddresses(XmlReader r)
{
    while (r.Read())
    {
        ... // process Address nodes here
    }
}

This particular implementation ignores the cursor issues I just described; it blindly calls Read until it reaches the end of the stream, causing the cursor to advance past the end of <Addresses>. A correct implementation would need to identify the end of the <Addresses> element and stop there.

In System.Xml 2.0, a ReadSubtree method has been added to simplify this type of scenario. You call it when positioned on an element node and it creates another XmlReader object that reads only to the end of that element. For example, the following snippet shows how to supply an XmlReader for the <Addresses> subtree:

using (XmlReader r = XmlReader.Create("Order.xml", settings))
{
    r.ReadToDescendant("CustomerID", NS);
    customer = r.ReadElementContentAsString();

    // create a new XmlReader for the Addresses subtree
    ProcessAddresses(r.ReadSubtree());

    while (r.ReadToFollowing("Price", NS))
    {
        ...
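
A version of ProcessAddresses that relies on the subtree reader might look like the following sketch. The trimmed-down sample document, the returned list of labels, and the NextElementAfterAddresses helper are assumptions added for illustration. The sketch also disposes the subtree reader before using the parent reader again, since the parent is only repositioned (on the end element of the subtree) once the subtree reader has been closed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class SubtreeSketch
{
    const string NS = "https://northwind/order";

    // Trimmed-down stand-in for Order.xml (an assumption for this sketch).
    public const string SampleXml =
        "<Order xmlns='https://northwind/order'>" +
        "<CustomerID>GREAL</CustomerID>" +
        "<Addresses><Address Label='ShipTo'/><Address Label='BillTo'/></Addresses>" +
        "<Items/>" +
        "</Order>";

    // Reads to the end of whatever reader it is handed; with a subtree
    // reader that is the end of <Addresses>, not the end of the document.
    public static List<string> ProcessAddresses(XmlReader addresses)
    {
        List<string> labels = new List<string>();
        while (addresses.Read())
        {
            if (addresses.NodeType == XmlNodeType.Element &&
                addresses.LocalName == "Address")
            {
                labels.Add(addresses.GetAttribute("Label"));
            }
        }
        return labels;
    }

    // Returns the local name of the node the parent reader reaches next.
    public static string NextElementAfterAddresses(string xml, out List<string> labels)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;

        using (XmlReader r = XmlReader.Create(new StringReader(xml), settings))
        {
            r.ReadToFollowing("Addresses", NS);

            // Close the subtree reader before touching the parent again.
            using (XmlReader sub = r.ReadSubtree())
            {
                labels = ProcessAddresses(sub);
            }

            // The parent now sits on </Addresses>; one Read moves on.
            r.Read();
            return r.LocalName;
        }
    }

    static void Main()
    {
        List<string> labels;
        string next = NextElementAfterAddresses(SampleXml, out labels);
        Console.WriteLine("{0} addresses, next element: {1}", labels.Count, next);
    }
}
```

The simple while (r.Read()) loop that was wrong for the shared reader is acceptable here, because the subtree reader reports end-of-input at the close of <Addresses>.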

As you can see, System.Xml 2.0 provides numerous improvements to reading XML documents via XmlReader.

Writing

When you need to write XML as quickly and efficiently as possible, you want to use XmlWriter. The architecture of XmlWriter is very similar to that of XmlReader, but is focused instead on writing XML nodes.

Most of the improvements to XmlWriter are also similar to those for XmlReader. First, you can create XmlWriter objects using a static Create method. Second, there's an XmlWriterSettings class that you use to specify the various output settings for the object. And finally, XmlWriter implements IDisposable, as you might expect. The method in Figure 4 provides a complete example using XmlWriter. It generates the following text output when you supply "GREAL", 1037.50, and 3 as parameters to the Writing method:

<OrderSummary
    CustID="GREAL" xmlns="https://northwind/order">
    <Total>1037.5</Total>
    <NumberOfItems>3</NumberOfItems>
</OrderSummary>

Figure 4 Using XmlWriter

static void Writing(
    string customer, double total, int numberOfItems)
{
    string NS = "https://northwind/order";

    XmlWriterSettings settings = new XmlWriterSettings();
    settings.CloseOutput = true;
    settings.Encoding = Encoding.UTF8;
    settings.Indent = true;
    settings.IndentChars = "\t";
    settings.NewLineChars = "\r\n";
    settings.NewLineOnAttributes = true;
    settings.OmitXmlDeclaration = true;

    using (XmlWriter w = XmlWriter.Create("out.xml", settings))
    {
        w.WriteStartDocument();
        w.WriteStartElement("OrderSummary", NS);
        w.WriteAttributeString("CustID", customer);
        w.WriteElementString("Total", NS, 
            XmlConvert.ToString(total));
        w.WriteElementString("NumberOfItems", NS,  
            XmlConvert.ToString(numberOfItems));
        w.WriteEndElement();
        w.WriteEndDocument();
    }
}

In addition to these simplifications, XmlWriter now has support for type conversions through the new WriteValue method. There are several overloads of this method, each of which accepts a different .NET type (see Figure 5). They save you from having to explicitly convert .NET values to their equivalent XML Schema Definition (XSD) lexical representations. XmlWriter also has a new WriteNode overload to simplify copying an XPathNavigator node to the XmlWriter output (similar to what was provided in version 1.1 for copying an XmlReader node to the output). However, in general, not as many enhancements were made for writing documents as there were for reading documents.

Figure 5 WriteValue Overloads

using (XmlWriter w = XmlWriter.Create("out.xml", settings))
{
    w.WriteStartDocument();
    w.WriteStartElement("OrderSummary", NS);
    w.WriteStartAttribute("CustID");
    w.WriteValue(customer);
    w.WriteEndAttribute();
    w.WriteStartElement("Total", NS);
    w.WriteValue(total);
    w.WriteEndElement();
    w.WriteStartElement("NumberOfItems", NS);
    w.WriteValue(numberOfItems);
    w.WriteEndElement();
    w.WriteEndElement();
    w.WriteEndDocument();
}
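
The WriteNode overload for XPathNavigator mentioned above can be combined with WriteValue, as in the following sketch. The <Shipment> and <Shipped> element names and the small input document are assumptions invented for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;

static class WriteNodeSketch
{
    // Copies the first <Item> of a source document into a new <Shipment> element.
    // Element names other than <Item> and <Sku> are hypothetical.
    public static string BuildShipment(string orderXml)
    {
        XPathNavigator nav =
            new XPathDocument(new StringReader(orderXml)).CreateNavigator();
        XPathNavigator item = nav.SelectSingleNode("/Order/Item");

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        StringBuilder output = new StringBuilder();
        using (XmlWriter w = XmlWriter.Create(output, settings))
        {
            w.WriteStartElement("Shipment");

            w.WriteStartElement("Shipped");
            w.WriteValue(true);       // bool overload writes the XSD form "true"
            w.WriteEndElement();

            w.WriteNode(item, true);  // copy the navigator's node and its subtree

            w.WriteEndElement();
        }
        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(BuildShipment("<Order><Item><Sku>6</Sku></Item></Order>"));
    }
}
```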

Validating

When you want to validate a document while reading, you can simply configure the XmlReaderSettings object accordingly. The following example shows how to configure a new reader for XML Schema validation:

XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add("https://northwind/order", "Order.xsd");

using (XmlReader r = XmlReader.Create("Order.xml", settings))
{
    while (r.Read()) ; // do nothing, just validate
    Console.WriteLine("Order.xml is valid");
}

Here, calling Create causes multiple XmlReader objects to be instantiated and chained together: one for reading the text stream (XmlTextReaderImpl) and one for performing XSD validation (XsdValidatingReader). The beauty in this is that you don't have to deal with either of the underlying XmlReader objects directly. You can still program against XmlValidatingReader directly if you desire, but again, you should really use the factory pattern as much as possible.
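
The earlier snippet lets the first validation error surface as an exception. If you would rather collect every problem, you can attach a ValidationEventHandler to the settings, as in this sketch. The one-element schema is a deliberately tiny assumption; it is not the Order.xsd referred to above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class ValidationSketch
{
    // Hypothetical schema: declares a single <Quantity> element of type xs:int.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'" +
        " targetNamespace='https://northwind/order'" +
        " elementFormDefault='qualified'>" +
        "<xs:element name='Quantity' type='xs:int'/>" +
        "</xs:schema>";

    // Returns the validation messages reported while reading the document.
    public static List<string> Validate(string xml)
    {
        List<string> problems = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add("https://northwind/order",
            XmlReader.Create(new StringReader(Xsd)));
        // With a handler attached, errors are reported here instead of thrown.
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs e)
            {
                problems.Add(e.Message);
            };

        using (XmlReader r = XmlReader.Create(new StringReader(xml), settings))
        {
            while (r.Read()) { } // reading drives validation
        }
        return problems;
    }

    static void Main()
    {
        string ns = "https://northwind/order";
        Console.WriteLine(Validate("<Quantity xmlns='" + ns + "'>10</Quantity>").Count);
        Console.WriteLine(Validate("<Quantity xmlns='" + ns + "'>ten</Quantity>").Count);
    }
}
```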

In version 2.0 of the Framework, a few new validation features were also introduced. First, a new class named XmlSchemaValidator was added for performing custom validation routines in a way that doesn't require an XmlReader. The class provides a push-model API that can be used for a variety of advanced validation tasks. XmlDocument was also extended to include a Validate method for validating the entire document or for validating just a particular subtree against the document's schemas (exposed via the new Schemas property).
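
Here is a minimal sketch of XmlDocument.Validate being used to recheck a document after an in-memory edit. As before, the one-element schema is an assumption made up for the example, as are the Load and CountErrors helper names.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class DocumentValidationSketch
{
    public const string NS = "https://northwind/order";

    // Hypothetical schema: a single <Quantity> element of type xs:int.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'" +
        " targetNamespace='https://northwind/order'" +
        " elementFormDefault='qualified'>" +
        "<xs:element name='Quantity' type='xs:int'/>" +
        "</xs:schema>";

    public static XmlDocument Load(string xml)
    {
        XmlDocument doc = new XmlDocument();
        // Validate checks the document against the schemas in doc.Schemas.
        doc.Schemas.Add(NS, XmlReader.Create(new StringReader(Xsd)));
        doc.LoadXml(xml);
        return doc;
    }

    // Validates the whole document and counts the reported problems.
    public static int CountErrors(XmlDocument doc)
    {
        int errors = 0;
        doc.Validate(delegate(object sender, ValidationEventArgs e) { errors++; });
        return errors;
    }

    static void Main()
    {
        XmlDocument doc = Load("<Quantity xmlns='" + NS + "'>10</Quantity>");
        Console.WriteLine(CountErrors(doc));   // document as loaded

        doc.DocumentElement.InnerText = "ten"; // edit that breaks xs:int
        Console.WriteLine(CountErrors(doc));   // revalidated after the edit
    }
}
```

The first count should be zero and the second non-zero, since "ten" is not a valid xs:int. No reader or reload is involved in the second check.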

Loading and Saving

When you want to load an XML document into memory so you can use more sophisticated navigation techniques, System.Xml provides a few options. First, System.Xml provides an implementation of the W3C Document Object Model (DOM) API through the XmlDocument class, which most developers have used over the past few years.

XmlDocument exposes a Load method for reading an XML document from a stream (using an XmlReader implementation) and "loading" it into an in-memory tree of nodes. It also provides a Save method for writing it back out (using an XmlWriter implementation). Here's an example of how it works:

XmlDocument doc = new XmlDocument();
doc.Load("Order.xml");
... // navigate and update document
doc.Save("OrderUpdated.xml");

Once you have the document in memory you can use the various DOM APIs or XPathNavigator for traversing and updating the document. Although the DOM is widely understood and used today, it's a fairly heavyweight option with a number of issues. Most of the DOM's problems stem from the fact that the API attempts to mirror the XML 1.0 syntax (think entity references, CDATA sections, and so on). While this might make sense in some document-centric scenarios (for instance, if you were implementing an XML editor), it complicates matters in data-centric scenarios (such as SOAP processing). The DOM also exposes individual nodes directly to the developer, which makes it hard for the implementation to optimize internal storage.

Hence, System.Xml comes with an alternate in-memory store known as XPathDocument. XPathDocument is a read-only store optimized for queries and transformations. You navigate an XPathDocument using an XPathNavigator. You won't find any public methods on the XPathDocument class besides CreateNavigator, which returns an XPathNavigator object.

You can use the XPathNavigator to write the document back out using an XmlWriter. To do so you can either call XmlWriter.WriteNode, supplying the XPathNavigator you want to write, or you can call XPathNavigator.WriteSubtree, supplying the XmlWriter you want to write to (see Figure 6). The main benefit of using the XPathDocument store over XmlDocument is that it's faster when navigating, especially when performing queries and transformations. You'll typically reach for XmlDocument only when you need to update the document in memory.

Figure 6 Using XPathNavigator

public static void LoadAndSave()
{
    XPathDocument doc = new XPathDocument("Order.xml");
    XPathNavigator nav = doc.CreateNavigator();

    ... // navigate the document (XPathDocument is read-only)

    using (XmlWriter w = XmlWriter.Create(
        "Order2.xml", GetWriterSettings()))
    {
        // save the document back out
        nav.WriteSubtree(w);
        // the following line accomplishes the same
        // w.WriteNode(nav, true); 
    }
}

Navigating and Querying

Whether you're using XPathDocument or XmlDocument as your in-memory store, XPathNavigator is the recommended API for navigating and querying the document. It provides an extra layer of abstraction between your code and the implementation, preparing you to take advantage of future enhancements.

XPathNavigator provides numerous MoveToXXX methods for navigating through the underlying store along the different axes (descendants, following, siblings, and so forth). Like XmlReader, XPathNavigator uses a cursor approach for navigating the document. Once you're positioned on a node you can retrieve its attributes or text value. In version 2.0, several typed ValueAsXXX properties have been introduced to simplify type conversions. Figure 7 is a complete example showing how to total the order using XPathNavigator.

Figure 7 Total Order Using XPathNavigator

public static void LoadAndProcessOrder()
{
    string NS = "https://northwind/order";
    double total = 0;
    double price;
    int quantity;
    string customer;
    
    XPathDocument doc = new XPathDocument("Order.xml");
    XPathNavigator nav = doc.CreateNavigator();
    // navigate document and total order

    nav.MoveToFollowing("CustomerID", NS);
    customer = nav.Value;

    while (nav.MoveToFollowing("Price", NS))
    {
        price = nav.ValueAsDouble;
        nav.MoveToNext("Quantity", NS);
        quantity = nav.ValueAsInt;
        total += quantity * price;
    }

    Console.WriteLine("Order total for {0}: {1}",
        customer, total); 
   
}
 

It's probably more common, however, for developers to use XPath expressions to navigate through the document. XPathNavigator provides XPath support through its various Select methods. The general-purpose Select method takes an XPath expression as input and returns an XPathNodeIterator for processing the matched nodes. There are also more specific Select methods whose search scope is limited to a particular axis (SelectAncestors, SelectChildren, SelectDescendants, and so forth).

In System.Xml 2.0, XPathNavigator also comes with a SelectSingleNode method (just like the one on XmlDocument) to find the first matching node—this version returns an XPathNavigator positioned on the matched node.

Properly dealing with namespaces is one of the most confusing and tedious aspects of using XPath. In the .NET Framework 1.1, you had to populate an XmlNamespaceManager object with namespace mappings and supply it when calling Select. You could only use the namespace prefixes found within the XmlNamespaceManager object in your XPath expressions. However, intuition tells most developers that they should be able to use the namespace mappings found within the document they are querying.

Hence, in System.Xml 2.0, overloads have been added for specifying a namespace resolver when calling Select. The built-in XmlReader classes and XPathNavigator all implement IXmlNamespaceResolver, which means you can pass them to Select in order to leverage their namespace mappings automatically. You'll notice Order.xml in Figure 1 has a namespace declaration that maps the "o" prefix to the "https://northwind/order" namespace. Figure 8 illustrates how to leverage this prefix in your XPath expressions without manually creating an XmlNamespaceManager object.

Figure 8 Using a Prefix

public static void LoadAndQuery()
{
    string NS = "https://northwind/order";
    double total = 0;
    double price;
    int quantity;
    string customer;
    
    XPathDocument doc = new XPathDocument("..\\..\\Order.xml");
    XPathNavigator nav = doc.CreateNavigator();

    // move to root element so namespace mappings are in scope
    nav.MoveToChild(XPathNodeType.Element); 
    // supply nav for the namespace resolver
    customer = nav.SelectSingleNode(
        "/o:Order/o:CustomerID", nav).Value;

    // supply nav for the namespace resolver
    XPathNodeIterator prices = nav.Select("//o:Price", nav);

    // iterate over selected Price elements and process
    while (prices.MoveNext())
    {
        // clone so that moving to Quantity doesn't disturb the iterator's cursor
        XPathNavigator item = prices.Current.Clone();
        price = item.ValueAsDouble;
        item.MoveToNext("Quantity", NS);
        quantity = item.ValueAsInt;
        total += quantity * price;
    }

    Console.WriteLine("Order total for {0}: {1}",
        customer, total);   
}
 

It's important to note that if you don't use a prefix in your XPath expressions, the empty namespace is assumed rather than the document's default namespace mapping. That is why I needed to use the explicit prefix in my earlier example. Nevertheless, this is a great improvement that will simplify the most common XPath scenarios.
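
The effect is easy to see by counting matches for the same query with and without a prefix. The following sketch uses a cut-down stand-in for Order.xml (an assumption for brevity) that keeps both the default and the "o" namespace declarations; the Count helper is hypothetical.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

static class PrefixSketch
{
    // Cut-down stand-in for Order.xml with the same namespace declarations.
    const string Xml =
        "<Order xmlns='https://northwind/order'" +
        " xmlns:o='https://northwind/order'>" +
        "<Items>" +
        "<Item><Price>25.00</Price></Item>" +
        "<Item><Price>4.50</Price></Item>" +
        "</Items>" +
        "</Order>";

    // Counts the nodes matched by an XPath expression, using the
    // navigator itself as the namespace resolver.
    public static int Count(string xpath)
    {
        XPathNavigator nav =
            new XPathDocument(new StringReader(Xml)).CreateNavigator();
        // move to the root element so its namespace mappings are in scope
        nav.MoveToChild(XPathNodeType.Element);
        return nav.Select(xpath, nav).Count;
    }

    static void Main()
    {
        Console.WriteLine(Count("//Price"));   // 0: unprefixed means no namespace
        Console.WriteLine(Count("//o:Price")); // 2: prefix resolved via the document
    }
}
```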

Updating

When you need to update or modify a document in memory, you need to use XmlDocument as your in-memory store since XPathDocument is not editable. However, you can use XPathNavigator to perform the updates against the underlying XmlDocument. In System.Xml 2.0, several new methods were added to XPathNavigator to support underlying modifications (in version 1.1 of the Framework, it was a read-only API).

For example, you'll find methods for appending children, prepending children, inserting nodes before or after a specific node, replacing nodes, deleting nodes, and setting values. Many of the provided methods return an XmlWriter object that you can use to write new XML nodes directly into the underlying store. For example, the code in Figure 9 illustrates how to write a <Total> element as the last child of <Order>.

Figure 9 Write New XML Nodes

XmlDocument doc = new XmlDocument();
doc.Load("Order.xml");
XPathNavigator nav = doc.CreateNavigator();

// move to root element 
nav.MoveToChild(XPathNodeType.Element);

... // calculate total

// append <Total> element to <Order> element
using (XmlWriter writer = nav.AppendChild())
{
    // XmlConvert produces the culture-invariant XSD lexical form
    writer.WriteElementString(
        "Total", "https://northwind/order", XmlConvert.ToString(total));
}
 

There are numerous methods available for performing these types of updates. If you're used to the DOM API, these methods may feel foreign at first, but it's recommended that you use them anyway. If you do, your code will remain unaffected as changes are made to the internal implementation down the road.
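
To give a feel for a few of the other update methods, here is a small sketch that sets a value, inserts a sibling, and deletes a node through XPathNavigator. The sample fragment, including its <Note> element, is an assumption invented for the example.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

static class UpdateSketch
{
    public static string Edit()
    {
        XmlDocument doc = new XmlDocument();
        // Hypothetical fragment; <Note> is not part of Order.xml.
        doc.LoadXml(
            "<Item><Sku>6</Sku><Price>25.00</Price><Note>temp</Note></Item>");
        XPathNavigator nav = doc.CreateNavigator();

        XPathNavigator price = nav.SelectSingleNode("/Item/Price");
        price.SetValue("26.50");                      // replace the text content
        price.InsertAfter("<Quantity>10</Quantity>"); // new sibling after <Price>

        nav.SelectSingleNode("/Item/Note").DeleteSelf(); // remove <Note>

        // The edits were applied to the underlying XmlDocument.
        return doc.OuterXml;
    }

    static void Main()
    {
        Console.WriteLine(Edit());
    }
}
```

The same calls against a navigator created from an XPathDocument would throw, because that store is read-only; the navigator's CanEdit property tells you which case you are in.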

Transforming

When you need to execute an XSLT transformation in the .NET Framework 2.0, you have two options. First, the XslTransform class from version 1.1 is still available and works the same as before. System.Xml 2.0 also introduced a new class named XslCompiledTransform, which is similar to the design of XslTransform (simplifying migration across the classes), but with a major performance improvement.

The key benefit of XslCompiledTransform is that it compiles the XSLT into MSIL before executing the transformation. This increases load time, but also greatly increases transformation execution speed. Figure 10 shows how to use it. The new XSLT engine also supports the various MSXML XPath extension functions.

Figure 10 Using XslCompiledTransform

public static void Transform(
    XPathNavigator input, XmlWriter output, string xsltUri)
{
    XslCompiledTransform transform = new XslCompiledTransform();

    XsltSettings settings = new XsltSettings();
    settings.EnableDocumentFunction = true;
    settings.EnableScript = true;

    // compiles the XSLT into MSIL
    transform.Load(xsltUri, settings, null);

    // execute transformation
    transform.Transform(input, output);
}
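
For an end-to-end run that needs no files on disk, the following sketch loads a stylesheet from a string and transforms an in-memory order into plain text. The stylesheet and the CustomerOf helper are assumptions written for this example, and the sketch keeps the default XsltSettings, which leave script blocks and the document() function disabled.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

static class TransformSketch
{
    // Hypothetical stylesheet that emits the customer ID as plain text.
    const string Xslt =
        "<xsl:stylesheet version='1.0'" +
        " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'" +
        " xmlns:o='https://northwind/order'>" +
        "<xsl:output method='text'/>" +
        "<xsl:template match='/'>" +
        "<xsl:value-of select='o:Order/o:CustomerID'/>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    public static string CustomerOf(string orderXml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();

        // Load compiles the stylesheet; reuse the loaded object if you
        // run the same transformation more than once.
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(Xslt)))
        {
            transform.Load(stylesheet);
        }

        XPathDocument input = new XPathDocument(new StringReader(orderXml));
        StringWriter output = new StringWriter();
        transform.Transform(input, null, output); // no XSLT arguments
        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(CustomerOf(
            "<Order xmlns='https://northwind/order'>" +
            "<CustomerID>GREAL</CustomerID></Order>"));
    }
}
```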
 

What Didn't Make It

At one point during the design and implementation phases, the System.Xml team was actually planning to ship XPathDocument as an official replacement for XmlDocument, and XsltCommand as an official replacement for XslTransform. However, the team decided that the differences across the APIs would cause significant migration pain for developers. So in both cases they decided to keep the old and simply add the new.

They also wanted to minimize the number of new classes and interfaces developers would need to know and ditched several of them that you may have seen in Beta 1 including XPathEditableNavigator, IXPathEditable, IChangeTracking, IRevertibleChangeTracking, and IXPathChangeNavigable. They merged the update functionality with XPathNavigator, and XsltCommand was renamed to XslCompiledTransform and redesigned to more closely match XslTransform.

There were also some methods available in Beta 1 for bridging between the worlds of XmlSerializer and XmlReader/Writer. They were called ReadAsObject and WriteFromObject. Both of these methods were dropped in the final release.

The main feature that didn't make it into the final System.Xml release was support for XQuery. In Beta 1 there was an XQueryCommand class that provided client-side XQuery support. The problem: XQuery wasn't a W3C Recommendation and it still had a ways to go as the .NET Framework 2.0 ship date approached. After what must have been some long and heated debates, the team decided to drop XQuery support in System.Xml, and instead they shipped a well-defined subset as part of SQL Server™ 2005. It's unclear if the team will ever ship an XQuery API as part of System.Xml.

Extrapolation

System.Xml 2.0 provides enhancements that improve performance and productivity. The library provides clear solutions for the common XML tasks and really tries to simplify things. To distill what I've covered into key actions, here's what I'd recommend:

  • Always use the static Create factory methods for creating readers and writers, even when you need support for things such as validation.
  • If you care about performance, you should always use XPathDocument as your in-memory store when querying or transforming the document.
  • Only use XmlDocument when you need an editable store, and when you do need one, use XPathNavigator to write the updating logic.
  • Always use XslCompiledTransform to execute XSLT transformations when you're concerned about performance.
  • Take advantage of the various API improvements to simplify your code.
  • Read "What's New in System.Xml for Visual Studio 2005 and the .NET Framework 2.0 Release," by Mark Fussell.

By following this guidance you'll be able to take advantage of the various performance improvements, and your code will be easier to migrate and maintain over the coming years.

Send your questions and comments for Aaron to sstation@microsoft.com.

Aaron Skonnard is a cofounder of Pluralsight, a Microsoft .NET training provider. Aaron is the author of Pluralsight's Applied Web Services 2.0, Applied BizTalk Server 2006, and Introducing Windows Communication Foundation courses. Aaron has spent years developing courses, speaking at conferences, and teaching professional developers. Reach him at pluralsight.com/aaron.