From the April 2002 issue of MSDN Magazine

10/23/2019

MSDN Magazine

A Quick Guide to XML Schema

Download the code for this article: XML0204.exe (35KB)

Of all the XML technologies, XML Schema is most valuable to software developers because it finally makes it possible to add type information to XML documents. This column is the first in a two-part series that will cover the basics of XML Schema.
First, let's review what preceded XML Schema. The XML 1.0 specification came with a built-in syntax for describing XML vocabularies, called Document Type Definitions (DTD). DTDs have actually been around for quite some time considering XML 1.0 inherited the syntax from its predecessor, the Standard Generalized Markup Language (SGML).
DTDs allow you to describe the structure of XML documents. For example, say that you want to use the following XML vocabulary to describe employee information:

  <employee id="555-12-3434">
    <name>Monica</name>
    <hiredate>1997-12-02</hiredate>
    <salary>42000.00</salary>
  </employee>

The following DTD describes the structure of this document:

  <!-- employee.dtd -->
  <!ELEMENT employee (name, hiredate, salary)>
  <!ATTLIST employee
            id CDATA #REQUIRED>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT hiredate (#PCDATA)>
  <!ELEMENT salary (#PCDATA)>

This DTD can then be associated with the original document through a DOCTYPE declaration, as shown here:

  <!DOCTYPE employee SYSTEM "employee.dtd">
  <employee id="555-12-3434">
    <name>Monica</name>
    <hiredate>1997-12-02</hiredate>
    <salary>42000.00</salary>
  </employee>

      Validation is the main benefit of using DTDs. When a validating XML 1.0 parser reads this XML 1.0 file, it can also read the associated DTD and verify that the document conforms to the definition. Using DTDs for validation can reduce the amount of error handling that you must build into the application.
      Although DTDs were well suited for many SGML-based electronic publishing applications, their limitations quickly became apparent when applied to modern software development domains such as those surrounding today's Web initiatives. The main limitations of DTDs are that DTD syntax is not XML-compliant, and DTDs don't support namespaces, typical programming language data types, or defining custom types.
      Since the DTD syntax itself is not XML, you can't use standard XML tools to process the definitions programmatically. Most XML 1.0 processors support DTD validation, but they don't support programmatic access to the information found in the DTD due to the complexity of the syntax.
      Because DTDs were created before XML namespaces even existed, it's no surprise that they don't work well together. In fact, using DTDs to describe namespace-aware documents is like trying to pound a square peg into a round hole. For more on the hideous details of making this work, check out the May 2001 installment of The XML Files column in which I provide a sample namespace-aware DTD. As a result, most developers choose to use either DTDs or namespaces, but not both.
      DTDs were also specifically designed for document-centric systems in which programmatic data types typically don't exist. As a result, only a handful of type identifiers exist for describing attributes (see Figure 1). These type identifiers aren't like anything you're used to working with in your programming language. They're really just special cases of text (CDATA). Again, these types cannot be applied to text-only elements, only to attributes.
      And finally, the DTD type system is not extensible. This means you're stuck with the types described in Figure 1. Creating custom types that make sense in your problem domain is out of the question with DTDs. These limitations are enough to make any XML developer run from DTDs when presented with the exciting new future offered by XML Schema.

XML Schema Basics

XML Schema is itself an XML vocabulary for describing XML instance documents. I use the term "instance" because a schema describes a class of documents, of which there can be many different instances (see Figure 2). This is analogous to the relationship between classes and objects in today's object-oriented systems. A class is to a schema what an object is to an XML document. Therefore, while using XML Schema, you'll typically be working with more than one document: the schema and one or more XML instance documents.

Figure 2 Namespace Identifier Linkage

The elements used in a schema definition come from the http://www.w3.org/2001/XMLSchema namespace, which I'll bind to xsd throughout the rest of this column. The following is the basic schema template:

  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="https://example.org/employee/">
    <!-- definitions go here -->
  </xsd:schema>

Schema definitions must have a root xsd:schema element. There are a variety of elements that may be nested within xsd:schema including, but not limited to, xsd:element, xsd:attribute, and xsd:complexType, all of which I'll discuss.
The fact that a schema definition is an XML document solves the first DTD limitation. You can process schema definitions with standard XML 1.0 tools and services such as DOM, SAX, XPath, and XSLT. The intrinsic simplicity in doing so has fueled an onslaught of schema tools.

XML Schema and Namespaces

The definitions placed within the xsd:schema element are automatically associated with the namespace specified in the targetNamespace attribute. In the case of the previous example, the schema definitions would be associated with the https://example.org/employee/ namespace.
The namespace identifier is the key that links XML documents to the corresponding Schema definition (see Figure 2). For example, the following XML instance document contains the employee element from the https://example.org/employee/ namespace:

  <tns:employee xmlns:tns="https://example.org/employee/"/>

The employee element's namespace is the same as the targetNamespace in the schema definition.
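The linkage can be checked mechanically. The following is a minimal sketch in Python (an assumption for illustration; Python is not used elsewhere in this column) that relies only on the standard library's ElementTree module. The helper name namespace_of is hypothetical. It reads the targetNamespace from a schema and compares it with the namespace of the instance document's root element.

```python
import xml.etree.ElementTree as ET

SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  targetNamespace="https://example.org/employee/">
  <xsd:element name="employee"/>
</xsd:schema>
"""

INSTANCE = '<tns:employee xmlns:tns="https://example.org/employee/"/>'


def namespace_of(tag):
    """Return the namespace part of an ElementTree tag such as '{uri}local'."""
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""  # the element is in no namespace


schema_root = ET.fromstring(SCHEMA)
instance_root = ET.fromstring(INSTANCE)

target_ns = schema_root.get("targetNamespace")
instance_ns = namespace_of(instance_root.tag)

# The schema applies to the element only if the two strings match exactly.
print(instance_ns == target_ns)
```

Namespace names are compared as plain strings, so even a difference in a trailing slash or in the URI scheme would break the link.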
In order to take advantage of the schema while processing the employee element, the processor needs to locate the correct schema definition. How schema processors locate the schema definition for a particular namespace is not defined by the specification. Most processors, however, will allow you to load an in-memory cache of schemas that they will use while processing documents. For example, the following JScript®-based code illustrates a simple way to do this with MSXML 4.0:

  var sc = new ActiveXObject("MSXML2.XMLSchemaCache.4.0");
  sc.add("https://example.org/employee/", "employee.xsd");
  var dom = new ActiveXObject("MSXML2.DOMDocument.4.0");
  dom.schemas = sc;
  // load synchronously so the return value reflects parsing and validation
  dom.async = false;
  if (dom.load("employee.xml"))
    WScript.echo("success: document conforms to Schema");
  else
    WScript.echo("error: invalid instance");

It works similarly in Microsoft® .NET and in most other XML Schema-aware processors.
You can download a command-line validation utility from the link at the top of this article to experiment with the principles discussed in this column. The validation utility allows you to specify the instance document you would like to validate along with as many schema definitions as you need. The command-line usage is as follows:

  c:>validate instance.xml -s schema1.xsd -s schema2.xsd ...

XML Schema also provides the schemaLocation attribute to provide a hint in the instance document as to the whereabouts of the required schema definitions. The schemaLocation attribute is in the http://www.w3.org/2001/XMLSchema-instance namespace, which was set aside specifically for attributes that are only used in instance documents. I'll bind this namespace to the xsi prefix from now on. The xsi:schemaLocation attribute takes a space-delimited list of namespace identifier and URL pairs, as shown here:

  <tns:employee xmlns:tns="https://example.org/employee/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="https://example.org/employee/
                        https://develop.com/aarons/employee.xsd"
  />

In this case, if the processor doesn't already have access to the appropriate schema definition for the https://example.org/employee/ namespace, it can download it from https://develop.com/aarons/employee.xsd.
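To make the pairing concrete, here is a small Python sketch (Python and the helper name schema_hints are assumptions for illustration) that splits an xsi:schemaLocation value into namespace/URL pairs using the standard library. The instance namespace is written as http://www.w3.org/2001/XMLSchema-instance, the form defined by the W3C Recommendation.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

INSTANCE = """
<tns:employee xmlns:tns="https://example.org/employee/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="https://example.org/employee/
                      https://develop.com/aarons/employee.xsd"
/>
"""


def schema_hints(root):
    """Map each namespace identifier to the schema URL hinted for it."""
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    if len(tokens) % 2:
        raise ValueError("schemaLocation must hold namespace/URL pairs")
    # Even positions are namespace identifiers, odd positions are URLs.
    return dict(zip(tokens[0::2], tokens[1::2]))


hints = schema_hints(ET.fromstring(INSTANCE))
print(hints)
```

The value is only a hint: a processor that already holds a schema for the namespace is free to ignore the URL.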

Elements and Attributes

Elements and attributes can be defined as part of the targetNamespace by using the xsd:element and xsd:attribute elements, respectively. For example, let's say that you want to describe the following namespace-aware instance document:

  <tns:employee xmlns:tns="https://example.org/employee/"
    tns:id="555-12-3434">
    <tns:name>Monica</tns:name>
    <tns:hiredate>1997-12-02</tns:hiredate>
    <tns:salary>42000.00</tns:salary>
  </tns:employee>

The simplest way to accomplish this is through the following schema definition:

  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="https://example.org/employee/">
    <xsd:element name="employee"/>
    <xsd:element name="name"/>
    <xsd:element name="hiredate"/>
    <xsd:element name="salary"/>
    <xsd:attribute name="id"/>
  </xsd:schema>

Notice that simply placing the xsd:element/xsd:attribute declarations within the xsd:schema element automatically associates them with the https://example.org/employee/ namespace. These declarations are considered global in the schema since they are children of the root xsd:schema element.
Since this schema specifies that these elements/attributes are part of the https://example.org/employee/ namespace, they must be associated with that namespace in the instance document (as in the original instance that I listed earlier). Making subtle namespace changes to the instance will cause it to become invalid. For example, consider the following document, which contains unqualified name, hiredate, and salary elements along with an unqualified id attribute:

  <tns:employee xmlns:tns="https://example.org/employee/"
    id="555-12-3434">
    <name>Monica</name>
    <hiredate>1997-12-02</hiredate>
    <salary>42000.00</salary>
  </tns:employee>

Since the previous schema definition states that these elements/attributes are from the https://example.org/employee/ namespace and this time they aren't associated with a namespace, this instance is invalid according to the schema.
An even subtler change would be to modify the original document so that it uses a default namespace declaration instead of a namespace prefix:

  <employee xmlns="https://example.org/employee/"
    id="555-12-3434">
    <name>Monica</name>
    <hiredate>1997-12-02</hiredate>
    <salary>42000.00</salary>
  </employee>

Although in this case all of the elements are associated with the default namespace (https://example.org/employee/), the id attribute is still unqualified because default namespaces don't apply to attributes. As a result, this document instance is also considered invalid according to the schema.
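A namespace-aware parser makes the difference visible. The short Python sketch below (Python is an assumption for illustration, using only the standard library) parses a trimmed version of the document and prints the names the parser reports. ElementTree writes a qualified name as {namespace}local.

```python
import xml.etree.ElementTree as ET

DOC = """
<employee xmlns="https://example.org/employee/" id="555-12-3434">
  <name>Monica</name>
</employee>
"""

root = ET.fromstring(DOC)

print(root.tag)           # the element name picks up the default namespace
print(root[0].tag)        # so does the child element name
print(list(root.attrib))  # the attribute name stays unqualified
```

The element names come back qualified, while the attribute name is reported as plain id.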
As you can see, XML namespaces are at the very heart of XML Schema. When using XML Schema, you must fully understand how namespaces work because if the instance document doesn't agree with what the schema specifies, it will be invalid.
You may have also noticed that this simple example does not constrain the content of any of the elements nor does it define the structural relationship between the elements in the namespace. It's equivalent to the following DTD (omitting the attribute declaration for now):

  <!ELEMENT employee ANY>
  <!ELEMENT name ANY>
  <!ELEMENT hiredate ANY>
  <!ELEMENT salary ANY>

Thus the following XML instance document would also be valid according to the schema, even though the document doesn't make any sense:

  <tns:name xmlns:tns="https://example.org/employee/">
    <tns:employee>
      <tns:hiredate>42.000</tns:hiredate>
      <tns:salary tns:id="555-12-3434">Monica</tns:salary>
    </tns:employee>
  </tns:name>

XML Schema makes it possible to describe an element's structure through complex type definitions.

Defining Complex Types

With DTDs, an element's content model is defined within an ELEMENT declaration, as shown here:

  <!ELEMENT employee (name, hiredate, salary)>

This ELEMENT declaration states that an employee element contains a name element, followed by a hiredate element, followed by a salary element.
XML Schema makes it possible to define an element's content model in a similar fashion by nesting an xsd:complexType element within the xsd:element declaration, as shown here:

  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="https://example.org/employee/">
    <xsd:element name="employee">
      <xsd:complexType>
        <!-- employee's content model goes here -->
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>

The XML Schema model is more like a programming language where you bind variables to formal type definitions. The xsd:complexType allows you to define the type of an element, which conveys its structure. Nesting xsd:complexType within the element declaration effectively binds it to that element (like a variable). Thinking in terms of type definitions is a major paradigm shift from the way DTDs work.
What you put inside of the xsd:complexType element is similar to what you put inside parentheses in a DTD ELEMENT declaration. The previous employee ELEMENT declaration specifies an ordered sequence of name, hiredate, and salary elements. Using a pipe (|) separator instead of a comma changes the meaning to a choice of one element:

  <!ELEMENT employee (name | hiredate | salary)>

In XML Schema, you specify the characteristics of the content model through a compositor element, which is nested as a child of the xsd:complexType element. XML Schema defines three compositor elements: xsd:sequence, xsd:choice, and xsd:all (shown in Figure 3).
The xsd:sequence and xsd:choice elements are equivalent to the DTD examples just shown. However, xsd:all is a new concept—it specifies that the content model consists of all items in any order. This wasn't designed into the DTD syntax, although you could define such semantics by explicitly specifying all of the possible permutations as follows:

  <!ELEMENT employee ( (name, hiredate, salary) |
                       (name, salary, hiredate) |
                       (hiredate, name, salary) |
                       (hiredate, salary, name) |
                       (salary, name, hiredate) |
                       (salary, hiredate, name) ) >

As you can see, combinatorial mathematics begins to work against you quickly. The XML Schema approach is much cleaner since "all" is a first-class compositor like sequence and choice.
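To see how quickly the permutation approach grows, consider this Python sketch (Python and the helper name all_as_dtd are assumptions for illustration). It generates the permutation-style ELEMENT declaration for a list of child names and prints how many alternatives are needed as the list gets longer.

```python
from itertools import permutations
from math import factorial


def all_as_dtd(element, children):
    """Build a DTD ELEMENT declaration that allows the children in any order."""
    groups = ["(%s)" % ", ".join(order) for order in permutations(children)]
    return "<!ELEMENT %s ( %s )>" % (element, " | ".join(groups))


three = ["name", "hiredate", "salary"]
print(all_as_dtd("employee", three))

# The number of alternatives is n! for n child elements.
for n in range(3, 7):
    print(n, "children need", factorial(n), "alternatives")
```

Three children need six alternatives, as in the declaration above; a fourth child already requires 24.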
      Compositor elements may contain references to global element declarations, local element declarations, other compositors, and a few other constructs such as wildcards and group references. The schema example shown in Figure 4 shows how to define an xsd:complexType that references global elements defined elsewhere in the schema.
      Notice that the ref attribute takes a prefixed element name. Remember that once you declare a global element in the schema, it's automatically associated with the targetNamespace. When you reference global elements by name, they are treated as qualified names. Had I used ref="name" instead of ref="tns:name", the schema processor would have looked for the name element associated with no namespace (or the default namespace, had one been used) and it wouldn't have found one, since the only name element declared in the schema is the name element from the https://example.org/employee/ namespace.
      If I had made https://example.org/employee/ the default namespace for the document, then I could have referenced the global element names without using a namespace prefix (for example, ref="name"), as shown in Figure 5.
      The sample schemas in Figure 4 and Figure 5 are logically equivalent—they've simply been serialized somewhat differently. Both sample schemas constrain the content of the employee element. Now, the employee element must contain a name, hiredate, and salary element, all of which must be associated with the https://example.org/employee/ namespace.
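The way a ref value is resolved can be sketched in a few lines of Python (Python, the helper name resolve_qname, and the two prefix maps are assumptions for illustration; the maps are hypothetical stand-ins for the namespace declarations in the two schemas). An unprefixed name falls back to the default namespace if one is declared, and to no namespace otherwise.

```python
def resolve_qname(qname, in_scope):
    """Expand 'prefix:local' or 'local' to a (namespace, local) pair.

    in_scope maps prefixes to namespace names; the key "" holds the
    default namespace, if one is declared.
    """
    prefix, sep, local = qname.rpartition(":")
    if sep and prefix not in in_scope:
        raise ValueError("undeclared prefix: " + prefix)
    return (in_scope.get(prefix, ""), local)


XSD = "http://www.w3.org/2001/XMLSchema"
EMP = "https://example.org/employee/"

prefixed = {"xsd": XSD, "tns": EMP}   # target namespace bound to tns
defaulted = {"xsd": XSD, "": EMP}     # target namespace as the default

print(resolve_qname("tns:name", prefixed))  # the declared global element
print(resolve_qname("name", prefixed))      # no namespace, so no match
print(resolve_qname("name", defaulted))     # default namespace applies
```

The first and third calls resolve to the same qualified name, which is why the two serializations are logically equivalent.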

Local Element Declarations

Since employee is the only element that I plan to use as a top-level element in the instance documents, there is really no reason to define name, hiredate, and salary as global elements. Instead, I can just define the name, hiredate, and salary elements locally within the employee element's content model.
For example, the schema shown in Figure 6 contains an employee element declaration that contains a sequence of local element declarations. In this example, the name, hiredate, and salary elements are actually declared as part of the employee element and cannot be used elsewhere in an instance. The employee element declaration is the only one that appears globally as a child of the root xsd:schema element. This brings up an interesting question: should local elements be associated with the target namespace?

Local Scoping and Namespaces

To help understand the answer to this question, let's consider a similar example in a programming language that supports namespaces, like C#. Review the following C# class definition that has been defined within the "example" namespace:

  namespace example {
    public class employee {
      public string name;
      public string hiredate;
      public double salary;
    }
  }

What identifiers are actually visible within this namespace? Only one—employee. The name, hiredate, and salary identifiers are only visible within the employee class. Hence, you need to qualify employee with the namespace identifier but not the local member names, as shown here:

  // employee is namespace-qualified
  example.employee c = new example.employee();
  // local members are unqualified
  c.name = "Monica";
  c.hiredate = "1997-12-02";
  c.salary = 42000.00;
  // this does not work, nor make sense
  // c.example.name = "Monica";

The XML Schema designers seemed to be thinking along the lines of local scoping because by default it works the same way. In XML Schema, only global elements need to be qualified with the target namespace in an instance document, and local elements must remain unqualified. An XML document that is a valid instance of the complete schema that was shown in Figure 6 is shown here:

  <!-- global element qualified -->
  <tns:employee xmlns:tns="https://example.org/employee/">
    <!-- local elements unqualified -->
    <name>Monica</name>
    <hiredate>1997-12-02</hiredate>
    <salary>42000.00</salary>
  </tns:employee>

If you modify this instance so that the name, hiredate, and salary elements are qualified by the https://example.org/employee/ namespace, it becomes invalid according to the schema. Remember, even the subtle change to a default namespace declaration will cause this to happen.
Since not everyone was happy with this approach, the XML Schema designers made it possible to control whether local elements should be qualified or unqualified in the instance. You can control this on an element-by-element basis through the form attribute, as the following illustrates:

  <xsd:element name="employee">
    <xsd:complexType>
      <xsd:sequence>
        <!-- local element declarations -->
        <xsd:element name="name" form="qualified"/>
        <xsd:element name="hiredate"/>
        <xsd:element name="salary" form="qualified"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

A valid instance of the employee element would now have a qualified child name element and an unqualified child hiredate element, followed by a qualified child salary element, as in this valid instance:

  <tns:employee xmlns:tns="https://example.org/employee/">
    <tns:name>Monica</tns:name>
    <hiredate>1997-12-02</hiredate>
    <tns:salary>42000.00</tns:salary>
  </tns:employee>
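
The rule that the form attribute applies can be sketched in Python (Python, the helper name expected_tag, and the LOCAL_FORMS table are assumptions for illustration). The sketch derives the name each local element must carry in an instance and compares it with what a namespace-aware parser reports. It checks names only and is not a substitute for schema validation.

```python
import xml.etree.ElementTree as ET

TARGET_NS = "https://example.org/employee/"

# form value of each local declaration; None means the attribute is absent,
# so the schema-wide default ("unqualified") applies.
LOCAL_FORMS = [("name", "qualified"), ("hiredate", None), ("salary", "qualified")]

INSTANCE = """
<tns:employee xmlns:tns="https://example.org/employee/">
  <tns:name>Monica</tns:name>
  <hiredate>1997-12-02</hiredate>
  <tns:salary>42000.00</tns:salary>
</tns:employee>
"""


def expected_tag(local_name, form, default="unqualified"):
    """Return the ElementTree-style name an instance must use for a local element."""
    if (form or default) == "qualified":
        return "{%s}%s" % (TARGET_NS, local_name)
    return local_name


expected = [expected_tag(name, form) for name, form in LOCAL_FORMS]
actual = [child.tag for child in ET.fromstring(INSTANCE)]

print(actual == expected)
```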

You can also toggle the default setting for all local element declarations in the schema through the elementFormDefault attribute, as shown here:

  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="https://example.org/employee/"
    elementFormDefault="qualified">
   •••
  </xsd:schema>

Now by default all local elements must be qualified in the instance (assuming the form attribute hasn't been used to override the setting for a particular element), as in this valid instance:

  <tns:employee xmlns:tns="https://example.org/employee/">
    <tns:name>Monica</tns:name>
    <tns:hiredate>1997-12-02</tns:hiredate>
    <tns:salary>42000.00</tns:salary>
  </tns:employee>

Since all elements are qualified in this case, you could now choose to use a default namespace declaration and the instance would still be valid:

  <employee xmlns="https://example.org/employee/">
    <name>Monica</name>
    <hiredate>1997-12-02</hiredate>
    <salary>42000.00</salary>
  </employee>
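
The two serializations really are the same document as far as a namespace-aware processor is concerned. The following Python sketch (Python and the helper name qualified_names are assumptions for illustration) parses both forms with the standard library and compares the qualified element names.

```python
import xml.etree.ElementTree as ET

PREFIXED = """
<tns:employee xmlns:tns="https://example.org/employee/">
  <tns:name>Monica</tns:name>
  <tns:hiredate>1997-12-02</tns:hiredate>
  <tns:salary>42000.00</tns:salary>
</tns:employee>
"""

DEFAULTED = """
<employee xmlns="https://example.org/employee/">
  <name>Monica</name>
  <hiredate>1997-12-02</hiredate>
  <salary>42000.00</salary>
</employee>
"""


def qualified_names(xml_text):
    """List the namespace-qualified name of every element in document order."""
    return [elem.tag for elem in ET.fromstring(xml_text).iter()]


print(qualified_names(PREFIXED) == qualified_names(DEFAULTED))
```

The prefix is only a shorthand; what the schema processor compares is the namespace name plus the local name.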

Occurrence Constraints

      In a DTD you can control how many times an element occurs in a content model through the *, +, and ? modifiers. XML Schema did away with those modifiers and simply defined two attributes, minOccurs and maxOccurs, which can be used on element declarations, compositors, and a few other schema constructs.
      The minimum number of times an item must appear and the maximum number of times it can appear are specified by minOccurs and maxOccurs, respectively. The default value for both attributes is 1. You can also use the value "unbounded" in maxOccurs to specify that unlimited occurrences are acceptable.
      Consider the following DTD ELEMENT declaration:

  <!ELEMENT employee ( (fname, (middle | mi)?, lname, lname?),
                       (project, role)* )>

This ELEMENT declaration can be rewritten in XML Schema using several nested compositors along with minOccurs/maxOccurs, as shown in Figure 7.
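The correspondence between the DTD modifiers and the two attributes can be written down as a small table. The Python sketch below (Python and the names DTD_TO_OCCURS and occurs_attributes are assumptions for illustration) maps each modifier to its minOccurs/maxOccurs pair and renders only the attributes that differ from the default of 1.

```python
# DTD occurrence modifier -> (minOccurs, maxOccurs)
DTD_TO_OCCURS = {
    "": (1, 1),             # exactly once (the XML Schema default)
    "?": (0, 1),            # optional
    "*": (0, "unbounded"),  # zero or more
    "+": (1, "unbounded"),  # one or more
}


def occurs_attributes(modifier):
    """Render minOccurs/maxOccurs, omitting values equal to the default of 1."""
    low, high = DTD_TO_OCCURS[modifier]
    parts = []
    if low != 1:
        parts.append('minOccurs="%s"' % low)
    if high != 1:
        parts.append('maxOccurs="%s"' % high)
    return " ".join(parts)


for modifier in ["", "?", "*", "+"]:
    print(repr(modifier), "->", occurs_attributes(modifier) or "(defaults)")
```

Under this mapping, the (middle | mi)? group corresponds to a choice compositor with minOccurs="0", and the (project, role)* group to a sequence compositor with minOccurs="0" and maxOccurs="unbounded".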

Complex Types and Attributes

With DTDs, attributes were defined for a particular element. The following ATTLIST declaration associates the id attribute with the employee element:

  <!ELEMENT employee (name, hiredate, salary)>
  <!ATTLIST employee id CDATA #REQUIRED>

It was not possible to define global attributes in DTDs. They always had to be associated with a particular element, as just shown.
XML Schema makes it possible to define both global and local attributes just as with elements. Global attributes are defined using the xsd:attribute element within the root xsd:schema element. The first schema example shown defined a global attribute named id. Global attributes are meant for situations where you cannot anticipate where they will be used.
Attributes can also be included in an xsd:complexType definition, which makes them local to that specific type. When used within an xsd:complexType element, the xsd:attribute elements must follow the child compositor, as shown in the following global element declaration:

    <xsd:element name="employee">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="name"/>
        </xsd:sequence>
        <xsd:attribute name="id"/>
      </xsd:complexType>
    </xsd:element>

As with elements, you must qualify global attributes and must not qualify local attributes in the instance document by default. The following is a valid instance of the employee element defined previously:

    <tns:employee xmlns:tns="https://example.org/employee/"
         id='555-12-3434'>
      <name>Monica</name>
    </tns:employee>

However, as with elements, you can change this behavior through the form or attributeFormDefault attributes if you would rather make local attributes qualified.

Named Types and Reuse

Up to this point, I've been defining the type (or structure) of elements through xsd:complexType definitions. These type definitions, however, were not named as they were tied to the newly declared element. This approach is similar to using anonymous types in C++. For example, the following C++ code maps the pt variable to the anonymous struct that precedes it:

  struct {
    double x;
    double y;
  } pt;

The obvious downside to using anonymous types is that you cannot reuse the type definitions. Therefore, most C++ developers use named types:

  struct Point {
    double x;
    double y;
  };
  Point pt1;

XML Schema supports named types as well, and it's actually the approach preferred by most developers because of the valuable reuse potential. Not only can you reuse named types within a single schema, you can also reuse named types across schemas through the xsd:include and xsd:import elements.
You name an xsd:complexType through the name attribute. Then you can bind element declarations to a named type through the type attribute. The following schema illustrates how this element binding works:

  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="https://example.org/employee/"
    targetNamespace="https://example.org/employee/">
    <xsd:complexType name="EmployeeType">
      <xsd:sequence>
        <xsd:element name="name"/>
        <xsd:element name="hiredate"/>
        <xsd:element name="salary"/>
      </xsd:sequence>
    </xsd:complexType>
    <xsd:element name="employee" type="tns:EmployeeType"/>
  </xsd:schema>

Notice that since the xsd:complexType definition is now a global construct in the schema, it's automatically associated with the targetNamespace. This means you must use a qualified name when referring to EmployeeType in the type attribute. You could go back through all of the sample schemas shown in this article and convert the anonymous xsd:complexTypes into named types. It's easy to translate between the two techniques.
Another benefit of using named types is that you don't actually have to use global element declarations in the schema if you don't want to. Instead, you can explicitly specify the type of an element in the instance document through the xsi:type attribute, which also comes from the http://www.w3.org/2001/XMLSchema-instance namespace. For example, consider the following instance:

  <foo xsi:type="tns:EmployeeType"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:tns="https://example.org/employee/"
  >
    <name>Monica</name>
    <hiredate>1997-12-02</hiredate>
    <salary>42000.00</salary>
  </foo>

The foo element is not declared anywhere in the schema, but I've explicitly specified its type, which is enough for the processor to figure out how to treat its content. This technique is similar to a cast in most programming languages.
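As an illustration of the first step a processor has to take, the Python sketch below (Python and the helper name declared_type are assumptions for illustration) reads the xsi:type attribute from the root element and expands its prefix to a namespace name. It uses the standard library's iterparse with start-ns events to collect the prefix declarations, and it writes the instance namespace as http://www.w3.org/2001/XMLSchema-instance, the form defined by the W3C Recommendation.

```python
import io
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

INSTANCE = b"""<foo xsi:type="tns:EmployeeType"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:tns="https://example.org/employee/">
  <name>Monica</name>
</foo>"""


def declared_type(source):
    """Return (namespace, local name) of the root element's xsi:type, or None."""
    prefixes = {}
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # Namespace declarations are reported before the element's start.
            prefix, uri = payload
            prefixes[prefix] = uri
            continue
        # The first "start" event is the root element.
        qname = payload.get(XSI_TYPE)
        if qname is None:
            return None
        prefix, _, local = qname.rpartition(":")
        return (prefixes.get(prefix, ""), local)
    return None


print(declared_type(io.BytesIO(INSTANCE)))
```

With the type name expanded, the processor can look up EmployeeType among the schemas it holds for that namespace and validate the element's content against it.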

Introducing Data Types

I've covered the essentials of describing a document's structure through xsd:element, xsd:attribute, and xsd:complexType definitions. However, the schemas you've seen so far haven't described the characteristics of the name, hiredate, and salary elements nor those of the id attribute. At this point, they could legally contain anything. The name, hiredate, and salary elements as well as the id attribute should really only contain text, and each should be in a specific format.
In a DTD, you can specify that an element must only contain text through the #PCDATA token:

  <!ELEMENT name (#PCDATA)>

Unfortunately, you cannot specify anything about the format of the text contained within the element.

Figure 8 XML Schema Built-in Data Types

This is where XML Schema takes a significant stride past DTDs, especially for software developers. XML Schema defines a set of built-in data types that may be used to constrain the content of text-only elements and attributes (see Figure 8). Each data type has an explicitly defined value space as well as an explicitly defined lexical space (or in other words, the possible string formats used in the XML document). For example, the double value 4200 can be represented lexically in a variety of ways (see Figure 9). For more details on the value and lexical space details for a given data type, see Part 2 of the XML Schema specification (relevant URLs can be found in the "Recommended Reading" sidebar).

Figure 9 Lexical Representations
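
The distinction between the lexical space and the value space can be illustrated with a short Python sketch (Python and the particular strings are assumptions for illustration; they are not taken from Figure 9). Each string is a different lexical form, yet all of them denote the same double value.

```python
# Several lexical forms that denote one value in the value space.
lexical_forms = ["4200", "4200.0", "4.2E3", "4.2e3", "+4200", "42000E-1"]

values = {float(text) for text in lexical_forms}

print(values)
```

Python's float parser is used here only as a stand-in. It accepts some strings that xsd:double does not, so it should not be treated as a definition of the lexical space.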

Therefore, to constrain the text that's used within an element or attribute, simply choose the data type with the appropriate value/lexical space and use it in the declaration's type attribute, as shown in the following code:

  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="https://example.org/employee/"
    targetNamespace="https://example.org/employee/">
    <xsd:complexType name="EmployeeType">
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="hiredate" type="xsd:date"/>
        <xsd:element name="salary" type="xsd:double"/>
      </xsd:sequence>
      <xsd:attribute name="id" type="xsd:string"/>
    </xsd:complexType>
    <xsd:element name="employee" type="tns:EmployeeType"/>
  </xsd:schema>

When a schema processor validates an instance of the previous schema, it will ensure that the text contained within each of the elements/attributes conforms to a legal lexical representation of its defined type.
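The kind of check involved can be approximated in Python (Python, the helper names, and the regular expressions are assumptions for illustration). The sketch tests each child element's text against a simplified version of its declared type. It ignores time zones and years of more than four digits in xsd:date, and it is not a replacement for a schema processor.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

DOUBLE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?|-?INF|NaN")
DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

INSTANCE = """
<tns:employee xmlns:tns="https://example.org/employee/" id="555-12-3434">
  <name>Monica</name>
  <hiredate>1997-12-02</hiredate>
  <salary>42000.00</salary>
</tns:employee>
"""


def is_double(text):
    """Approximate check of the xsd:double lexical space."""
    return DOUBLE.fullmatch(text) is not None


def is_date(text):
    """Approximate check of xsd:date (no time zone, four-digit years only)."""
    match = DATE.fullmatch(text)
    if match is None:
        return False
    try:
        date(*(int(part) for part in match.groups()))
    except ValueError:  # for example, month 13 or February 30
        return False
    return True


# Declared type of each local element; any text is a legal xsd:string here.
CHECKS = {"name": lambda text: True, "hiredate": is_date, "salary": is_double}


def check(xml_text):
    """Map each child element name to whether its text fits the declared type."""
    root = ET.fromstring(xml_text)
    return {child.tag: CHECKS[child.tag]((child.text or "").strip()) for child in root}


print(check(INSTANCE))
```

A hiredate written as 12/02/1997 or a salary written as 42,000 would fail these checks, which is the kind of error that schema validation catches before the application sees the data.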
As you can see in Figure 8, there is a data type for just about every occasion. Nevertheless, you'll surely encounter situations where the built-in data types aren't precise enough for your needs. For example, in the previous schema definition the id attribute is defined to be of type string, but it really needs to be in the format of a Social Security Number. XML Schema does make it possible to define custom simple types for these situations, but I'll save that topic for Part 2 of this column.

Where Are We?

      XML Schema overcomes all of DTD's limitations and weaknesses. The XML Schema syntax is XML 1.0. XML Schema was completely designed around namespaces. And, most importantly, XML Schema supports typical programming language data types as well as custom simple and complex types.
      Although the W3C published the final XML Schema Recommendation only recently (in May 2001), there is already widespread support for the specification throughout a variety of XML and Web Service-related infrastructure. This infrastructure relies on XML Schema to automatically generate XML processing code, build dynamic proxies/stubs on the fly, provide IntelliSense® within editors and other tools, and simplify error handling through schema validation techniques. MSXML 4.0, the SOAP Toolkit 2.0, and .NET are all great examples of what you can accomplish by using XML Schema.
      This month's column only covers the basics of XML Schema. I've covered all the functionality provided by DTDs plus a bit more. In the second part of this series, I'll show you the more advanced features of the language.

Send questions and comments for Aaron to xmlfiles@microsoft.com.

Aaron Skonnard is an instructor/researcher at DevelopMentor, where he develops the XML and Web service-related curriculum. Aaron coauthored Essential XML Quick Reference (Addison-Wesley, 2001) and Essential XML (Addison-Wesley, 2000). Get in touch with Aaron at https://staff.develop.com/aarons.