The XML Files
XML Report from the Microsoft PDC 2003
This issue marks three years of The XML Files. As I reminisce over my personal XML journey and how simply it all began, I'm astonished at how firmly rooted XML is in today's software development landscape. I always said "XML will change the world," but only after reflection did I realize that it has actually happened.
XML was everywhere at the November 2003 PDC. There weren't many sessions that didn't at least mention XML or some other new "X" technology. Microsoft has long been a leader in XML, and its innovative use of XML in future technologies highlights its commitment to interoperability. In this installment of The XML Files, I'll summarize the juicy XML activity at the PDC. You can find the PDC session slides at msdn.microsoft.com/events/pdc/agendaandsessions/sessions. Please note that much of the information in this month's column covers beta releases and, as such, is subject to change at any time.
The next version of the Windows® operating system, code-named "Longhorn," comes with a new communications subsystem, code-named "Indigo." Indigo was the major XML-related topic at the show. Indigo is a set of Microsoft® .NET Framework technologies for building loosely coupled, connected systems. Although Indigo is a key ingredient of Longhorn, it will also run on Windows XP and Windows Server™ 2003.
The Indigo infrastructure is built around the advanced Web services architecture shown in Figure 1. The technologies that make up this architecture enable rich interoperability between systems, ranging from simple communications to more advanced features like security, reliability, and distributed transactions. With this foundation, Indigo is capable of providing solutions for many distributed application needs within a single unified programming model that's approachable by the masses.
Figure 1 Web Services Architecture
Indigo represents the unification of ASP.NET Web Methods (.ASMX), .NET Remoting, and Enterprise Services. Indigo also brings along COM+ and Microsoft Message Queue Server (MSMQ), the most widely used .NET Framework distributed technologies, and understands their protocols natively. For more background and detailed information on how to use the Indigo components, check out Don Box's article "A Guide to Developing and Running Connected Systems with Indigo" in the January 2004 issue of MSDN Magazine. The Web Services Developer Center can be found at http://msdn.microsoft.com/webservices.
To position your applications for the smoothest possible migration to Indigo, Microsoft's prescriptive guidance is to embrace ASMX and Enterprise Services, and to constrain your use of .NET Remoting to cross-AppDomain or cross-process scenarios. See The Road to Indigo Technology Roadmap for more information on the Indigo roadmap.
Overall, the Indigo unified programming model simplifies the developer experience and provides the flexibility required for a wide range of communications, from simple in-process scenarios and performance-sensitive intranet scenarios to highly scalable Web services scenarios.
For more information, check out the Indigo section in the Longhorn SDK.
System.Xml in "Whidbey"
The new version of Visual Studio® and the .NET Framework, code-named "Whidbey," will include new System.Xml features. Most of the System.Xml improvements in Whidbey center on performance or usability. In terms of performance, both XmlTextReader and XmlTextWriter are twice as fast as their version 1.1 implementations. XML Schema validation is a more modest 20 percent faster, while the XSLT transformation engine blows away previous versions with a whopping 400 percent performance increase (see Figure 2, which compares the time it takes to perform various tasks). These improvements are highly anticipated because performance has been one of the most common criticisms of the .NET Framework version 1.1.
Figure 2 Time to Complete Tasks
In terms of usability improvements, common language runtime (CLR) type accessors are being added on XmlReader, XmlWriter, and XPathNavigator. These accessors will allow you to write the following convenient code, eliminating the need to coerce text to CLR types manually:
double weight = reader.ValueAsDouble;
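For contrast, here is a minimal sketch of the manual coercion that version 1.1 requires and that the new accessors are meant to eliminate. It relies only on XmlTextReader and XmlConvert. The ManualCoercionDemo class, the ReadWeight helper, and the sample weight document are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ManualCoercionDemo
{
    // Finds the first <weight> element and converts its text by hand.
    // (ReadWeight and the <weight> element are illustrative assumptions.)
    public static double ReadWeight(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "weight")
            {
                // ReadString returns the element's text content; XmlConvert
                // parses it using the XSD lexical rules for xsd:double.
                return XmlConvert.ToDouble(reader.ReadString());
            }
        }
        throw new InvalidOperationException("No <weight> element found.");
    }

    public static void Main()
    {
        double weight = ReadWeight("<item><weight>12.5</weight></item>");
        Console.WriteLine(weight);
    }
}
```

The two-step read-then-convert call inside the loop is the part that a typed accessor collapses into a single property or method call.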
Microsoft also introduced a new class called XmlFactory for instantiating XmlReader, XmlWriter, and XPathNavigator implementations based on user-defined settings:
XmlFactory factory = new XmlFactory();
factory.ReaderSettings.XsdValidate = true;
XmlReader r =
Probably the biggest API change in version 2.0 is the shift away from XmlDocument, the .NET implementation of the W3C DOM Level 2 Recommendation. Although XmlDocument will still be present in the Whidbey release, the System.Xml team hopes that XPathDocument will supplant it as the primary in-memory XML store. Unlike the read-only version 1.1 implementation, XPathDocument will now provide editing capabilities through the XPathEditor class (see Figure 3).
Figure 3 XPathDocument
XPathDocument provides many advantages over XmlDocument, including a 20 to 40 percent performance boost for XSLT operations, built-in XML Schema validation, support for strongly typed values (for example, xsd:integer stored as int), automatic change tracking at the node level (equivalent to what the ADO.NET DataSet provides), and data-binding support with Windows Forms and ASP.NET controls. You'll still use an XPathNavigator to read and query an XPathDocument in Whidbey, although a new class called XPathChangeNavigator has been introduced to access the changes made to an XPathDocument since its creation.
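As a rough sketch of the read-and-query pattern that carries over, the following loads an XPathDocument and walks the results of an XPath expression through an XPathNavigator. The XPathDocument, XPathNavigator, and XPathNodeIterator calls already exist in version 1.1. The NavigatorDemo class, the SelectValues helper, and the sample blog document are hypothetical, and the beta editing and change-tracking classes are deliberately left out because their final shape may change.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class NavigatorDemo
{
    // Returns the string value of every node matched by the XPath expression.
    // (SelectValues is an illustrative helper, not a framework API.)
    public static List<string> SelectValues(string xml, string xpath)
    {
        XPathDocument document = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = document.CreateNavigator();

        List<string> values = new List<string>();
        XPathNodeIterator matches = navigator.Select(xpath);
        while (matches.MoveNext())
        {
            // Current is a navigator positioned on the matched node.
            values.Add(matches.Current.Value);
        }
        return values;
    }

    public static void Main()
    {
        string xml = "<blog><item><author>Aaron</author></item>" +
                     "<item><author>Don</author></item></blog>";
        foreach (string author in SelectValues(xml, "/blog/item/author"))
        {
            Console.WriteLine(author);
        }
    }
}
```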
Another major addition to the library is XML Views, a mechanism for mapping between relational data and XML types. This makes it possible to view data stored in SQL Server™ in a specific XML format. This functionality is equivalent to what's provided by SQLXML 3.0 today, but with several enhancements. There's also a new class called XmlAdapter for connecting an XPathDocument to a database through an XML view (equivalent to the ADO.NET DataAdapter).
Whidbey also introduces support for the W3C XQuery language. Think of XQuery as SQL for XML. Its overall functionality is virtually equivalent to XSLT but most developers will probably have an easier time with its SQL-like syntax. As a result, Microsoft believes that XQuery will become the preferred query language when working with XML sources.
Since both XSLT and XQuery are built on XPath, they both require the same fundamental query engine. Hence, in Whidbey, System.Xml provides a common query engine for processing queries written in multiple query languages, including XPath, XQuery, XSLT, and XML views. Expressions from any of these languages compile down into an intermediate query language representation. The intermediate representation is then fed into the query engine, which generates Microsoft intermediate language (MSIL) code. Microsoft calls this new design the Common Query Architecture.
The Common Query Architecture provides a flexible layer of abstraction that allows the introduction of new query languages down the road. You would simply need to write a compiler to generate the intermediate language representation for the new language. The architecture also provides great performance improvements due to the fact that it generates MSIL code, which then compiles down to machine code before executing.
System.Xml.Query is a new namespace that encapsulates the classes making up this architecture. It includes a class named XQueryProcessor for compiling and processing XQuery expressions, and another class named XsltProcessor for compiling and processing XSLT transformations. Both of these classes share the same underlying query engine.
System.Xml has become one of the industry's favorite XML libraries since it first shipped with .NET version 1.0. The innovations introduced with System.Xml have improved the experience for XML developers (for example, pull-parsing is becoming commonplace, even in Java stacks). The improvements and additional functionality described here should raise the bar even higher.
Whidbey XML Serialization
Another way to program XML is to rely on serialization. Serialization is the process of transforming an object graph into a byte stream, while deserialization is the process of transforming the serialized byte stream back into the same object graph. Serialization also includes the process of transforming a type from one type system into a type in another type system (transforming a CLR type into an XSD type or the reverse, for example).
There are two different serialization engines available in .NET Framework 1.1. One provides XML-based serialization (System.Xml.Serialization), while the other provides CLR-based serialization (System.Runtime.Serialization). The former makes it possible to serialize objects or types in an interoperable manner, using XML and XSD, while the latter is useful for serializing CLR objects between .NET environments. This fundamental architecture remains the same in Whidbey, although Microsoft has made several improvements, such as introducing support for generics and SqlTypes. The BinaryFormatter was also enhanced with version tolerance capabilities and performance improvements. The more significant changes, however, are slated for Indigo.
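To make the XML-based engine concrete, here is a minimal round-trip sketch using XmlSerializer from System.Xml.Serialization. The BlogEntry type, its fields, and the ToXml and FromXml helpers are hypothetical names invented for this illustration. Only the XmlSerializer, StringWriter, and StringReader calls come from the framework.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type used only for this illustration.
public class BlogEntry
{
    public int Id;
    public string Author;
}

public static class RoundTripDemo
{
    // Serialization: object graph to XML text.
    public static string ToXml(BlogEntry entry)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(BlogEntry));
        StringWriter writer = new StringWriter();
        serializer.Serialize(writer, entry);
        return writer.ToString();
    }

    // Deserialization: XML text back to an equivalent object graph.
    public static BlogEntry FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(BlogEntry));
        return (BlogEntry)serializer.Deserialize(new StringReader(xml));
    }

    public static void Main()
    {
        BlogEntry original = new BlogEntry();
        original.Id = 1;
        original.Author = "Aaron";

        string xml = ToXml(original);
        BlogEntry copy = FromXml(xml);

        Console.WriteLine(xml);
        Console.WriteLine(copy.Id + " " + copy.Author);
    }
}
```

By default, public fields are written as child elements named after the fields, which is what makes the output readable by non-.NET consumers.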
Moving towards Indigo has made two things clear when it comes to serialization: interoperability is a must, and there are different programming models where serialization comes into play. The interoperability requirement rules out the use of System.Runtime.Serialization as it exists today. There's an XML programming model where you start with XML Schema and use serialization to get back to the CLR. There's also a CLR programming model where you start with CLR types and use serialization to expose interoperable XSD contracts to the rest of the world.
Today System.Xml.Serialization supports both programming models. Supporting the XML programming model is challenging because there are so many aspects of XML Schema that don't map nicely to the CLR. Starting with CLR types is more straightforward because it's possible to define a standard XSD mapping for any CLR type. Catering to both programming models simultaneously, however, requires lowering the bar and restricting options. What's missing is a serialization layer that caters to a CLR programming model without sacrificing interoperability and loose coupling. Indigo introduces a new serialization engine accessible via XmlFormatter that fills this need. XmlFormatter can export all CLR types to XSD, providing support for a canonical mapping between the two. Once Indigo ships, XmlFormatter will likely become the serialization choice for most developers using .NET.
Visual Studio Whidbey
The XML editor that ships with Visual Studio .NET 2003 leaves most XML developers wanting more. This has been good news for ISVs because the void created a strong market for third-party tools. The Visual Studio .NET XML editor does a so-so job with IntelliSense® and offers little beyond that. The new XML editor included in Whidbey is a significant improvement.
First, and most importantly, the new XML editor does a much better job with IntelliSense and validation. I heard one happy developer proclaim that it's always right, even when namespaces come into play. IntelliSense can be driven by XML Schema definitions or Document Type Definitions (DTDs)—the original XML editor only provided support for XML Schema. The new editor also performs real-time validation against the target definition and provides traditional red squiggles along with helpful error messages to assist with detection and correction.
The other major new IDE feature is XSLT editing and debugging. The new editor provides complete XSLT IntelliSense as well as full-fledged interactive debugging support with language integration. Before, debugging support was only available through commercial third-party add-ons.
There are a few other usability improvements worth mentioning. One is XML outlining support, with expand and collapse functionality that makes it easier to navigate large XML files. Another is that you can select a piece of text and automatically wrap it in XML comments or CDATA sections. It also provides some nice reformatting features for automatically adjusting the style of your document. Overall, it appears that the new Whidbey XML editor will be much more enjoyable and productive to use.
Whidbey Service-oriented App Designer
At the 2003 PDC, Microsoft unveiled a new suite of Visual Studio tools for building service-oriented applications, code-named "Whitehorse." These tools have a simple objective: to make service-oriented development easier. The first version of Whitehorse is scheduled to ship with Whidbey, and the tools will continue to evolve into one of the major components in the subsequent version of Visual Studio, code-named "Orcas." The official name for Whitehorse is Visual Studio Whidbey Service-oriented Application Designer, but I'll simply refer to it as Whitehorse.
Whitehorse is intended to simplify the architecture, design, development, and deployment of applications that are composed from various distributed services. Whitehorse includes a distributed services designer for visualizing application architectures along with tools that cover data modeling, constraint validation, code generation, and reporting details. The tools provide friendly drag and drop-style environments that are intuitive for solution architects, abstracting away the details of lower-level code. These tools allow solution architects to direct the design without dealing with complex components like WSDL files directly. The generated code always stays in sync with the visual model and can be deployed to different environments.
Another major benefit is the support for matching application solutions with compatible deployment environments. The Whitehorse designers allow datacenter administrators and solution architects to work independently and simultaneously, each specifying their own constraints for their parts of the overall solution. Whitehorse can then ride in to help identify incompatibilities and assist in resolving conflicts early in the process.
The Whitehorse SOA designer was not included with the PDC Whidbey bits, but Microsoft says it will be included in a future Whidbey beta. In the meantime, you can check out Roadmap to the Future for various Whitehorse articles, including a roadmap to future Microsoft development tools (covers both Whidbey and Orcas), an overview of the Visual Studio Whidbey Service-oriented Application Designer, and a helpful FAQ. And for some Whitehorse screenshots, check out http://msdn.microsoft.com/vstudio/productinfo/roadmap.aspx.
SQL Server Yukon
Coverage of the next version of SQL Server, code-named "Yukon," was also one of the highlights of the show. One of the main Yukon features anticipated by developers is the expanded .NET language support. Yukon does this by hosting the CLR in the database engine, letting developers author stored procedures and user-defined types in any .NET-compatible language.
The other major new Yukon feature is richer XML support. In Yukon, XML is now a native datatype. This means that you can create a table with a column of type xml, as illustrated here:
CREATE TABLE BlogEntry(id int, entry xml)
In this case, entry is referred to as an untyped XML column. When you insert a value into an untyped XML column, the XML is checked for well-formedness before completing the operation. With untyped columns, the XML is stored in its native XML 1.0 format using the UCS-2 encoding. You can also create a typed XML column by associating it with an XML Schema definition. To do this, you must first register the schema through the CREATE XMLSCHEMA statement. Then you specify the schema mapping in the CREATE TABLE statement, as shown here:
CREATE XMLSCHEMA '<xsd:schema targetNamespace="http://example.org/blog"
CREATE TABLE BlogEntry(id int, entry xml ('http://example.org/blog'))
With the schema mapping in place, the database engine can perform schema validation on inserts or updates and it can optimize the way the XML is stored. Instead of storing it in XML 1.0 format, the database engine can create additional internal structures for storing the information in the XML document as native T-SQL types (such as varchar, int, and so on), which greatly improves performance and reduces storage overhead.
After you get the XML values into the database, there are various ways to get them back out. You can simply select the column if you want to retrieve the entire XML document. Yukon also provides native support for XQuery, which allows you to query XML columns for specific fragments or values:
SELECT entry::query('/item/author') FROM BlogEntry
In addition to the new XML datatype and XQuery support, Yukon also improves its support for Web services. Yukon can host Web services straight from the database using existing Web services standards including HTTP, XML Schema, SOAP, and WSDL.
Longhorn's X Technology: XAML
Longhorn also comes with a new presentation subsystem, code-named "Avalon," that relies heavily on XML. It provides a declarative programming model similar to what's used on the Web today. In fact, the easiest way to describe Avalon is to say that it brings together the best from the Web (for example, seamless deployment, flowable layout, and progressive download and rendering) and the best from Windows (unrestricted functionality, desktop integration, good offline support, performance, and so on).
The Avalon declarative programming model is based on a new XML vocabulary called Extensible Application Markup Language, code-named "XAML" (pronounced "zammel"). Writing an Avalon-based application starts with authoring a XAML file (typically through a designer) to define the layout of UI elements. Then you write the functionality of the application in procedural code, which you can embed in the XAML file itself or keep separate in a codebehind file (like in ASP.NET). When you build the application, the compiler reads the XAML markup and generates code using an equivalent object model (you can use the object model directly if you want). Avalon-based apps can run in a desktop window or be hosted in a browser. Figure 4 shows an example of a simple XAML document that contains a single button and a codebehind file. Figure 5 shows the codebehind file, which simply changes the button's text to "Hello World" when it's pressed. For more information on Longhorn, Avalon, and XAML, surf to http://longhorn.msdn.microsoft.com.
Figure 5 The CodeBehind
public partial class MyPage {
    void HandleClick(object sender, ClickEventArgs e) {
        Button1.Content = "Hello World";
    }
}
Figure 4 Simple XAML Doc
XML in Action: PDC Blogging
Over the past year, blogging has emerged as the latest XML trend as well as an effective form of communication among developers. There are several tools available today that make it easy to set up your own blog (.Text and dasBlog, for instance). Others can then subscribe to your blog through an RSS feed and read it with an RSS aggregator. An RSS aggregator makes reading blogs like reading e-mail (highlighted new entries, conversation threads, and so on). RSS is a simple XML vocabulary for structuring the content of a blog. Most developers find this new form of communication superior to mailing lists and newsgroups because you control your subscriptions, so you only have to see posts from those you want to listen to, thereby reducing noise and frustration.
Blogging was commonplace at this year's PDC. There were many attendees blogging in real-time during conference sessions, publishing major announcements, key points, and personal reactions. This made it possible for developers everywhere to vicariously experience the exciting moments without risking their lives in L.A. A few developers put together a Web site called Professional Developer Community Bloggers (http://pdcbloggers.net) to make it easier to find PDC-related posts. PDC Bloggers monitors numerous individual blogs and highlights pertinent posts for your convenience. If you didn't get a chance to go to the PDC, you should definitely spend some time on this site.
Send your questions and comments for Aaron to firstname.lastname@example.org.
Aaron Skonnard teaches at Northface University in Salt Lake City. Aaron coauthored Essential XML Quick Reference (Addison-Wesley, 2001) and Essential XML (Addison-Wesley, 2000), and frequently speaks at conferences. Reach him at http://www.skonnard.com.