
Part 1: Introduction to XML and Schemas
BizTalk schemas are documents that define the structure of the XML (Extensible Markup Language) data in BizTalk messages. Their purpose is to serve as templates for processing and validating XML messages. To create BizTalk schemas, you need a basic understanding of XML. The first part of this document outlines the basic concepts of XML and then discusses how schemas are used as template definitions for XML messages.
What Is XML?
Before the creation of XML, markup languages such as HTML were primarily used to define how text is displayed on a computer screen or printed page. XML was designed to describe the data itself rather than its presentation. Soon after XML was created, people realized that instead of just marking up text, they could use XML to structure and identify information. Many earlier data formats contained only raw data, whereas XML makes it possible to embed a description with each portion of the data. Having the description accompany the data makes it easier to reorganize that information into new categories.
XML provides a way to structure data in a flexible and efficient manner. XML quickly became popular because it enabled people to create documents that included both data and the definition of that data.
XML structures and identifies information by using markup codes to enclose fields of data in a hierarchical format. XML uses a handful of standardized symbols to set off information. A typical inventory file written in XML might look like this:
<Chair>
  <Name>Straight Back</Name>
  <Number>040754</Number>
  <Price>79.99</Price>
</Chair>
In this example, the information about a particular chair is contained in XML markup codes that define the name, number, and price. The words "Chair", "Name", "Number", and "Price" are surrounded by the standard XML code characters "<", ">", and "/". These standard codes are used to mark up the data so that a database program can read and write the information efficiently.
Each set of enclosing codes, and the word inside, is called a tag. A start tag, its matching end tag, and the data between them together form an element. In the previous example, the cost of the chair, 79.99, is surrounded by the tags "<Price>" and "</Price>", which together with the data make up the Price element. This combination of tags and enclosed data is the basic format for all XML documents, and provides an easy-to-read and easy-to-program way to store the description of the data together with the data itself.
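To illustrate how a program can read both the data and its description, the following is a minimal sketch in Python that parses the chair record with the standard library's xml.etree.ElementTree module. The function name read_record is hypothetical and chosen only for this illustration, and the sketch assumes a flat record with a single level of child elements.

```python
import xml.etree.ElementTree as ET

# The chair record from the inventory example, held as a string.
CHAIR_XML = """<Chair>
  <Name>Straight Back</Name>
  <Number>040754</Number>
  <Price>79.99</Price>
</Chair>"""


def read_record(xml_text):
    """Hypothetical helper: return (root name, {child name: text}).

    Assumes a flat record, that is, one root element whose children
    contain only text.
    """
    root = ET.fromstring(xml_text)
    fields = {child.tag: child.text for child in root}
    return root.tag, fields


record_name, record_fields = read_record(CHAIR_XML)

# Each value arrives together with the name that describes it.
print(record_name)
for name, text in record_fields.items():
    print(f"{name}: {text}")
```

Note that every value is read as text. Nothing in the XML document itself says that the price must be a number, which is the gap that schemas fill.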
What Are Schemas?
XML not only structures and identifies information with standardized markup codes, but also has the ability to use schemas. A schema is an XML document that works like a dictionary and is used as a reference by other XML documents. The schema defines the names of XML elements and the type of data enclosed by those elements. Using schemas provides an easy way for a program to process XML documents and to ensure that the structure and type of the information are correct.
In the previous chair inventory example, an XML schema document would need to be created in order to define the spelling and data type of each element. For example, it would define the spelling of "Chair," "Name," "Number," and "Price." The schema would not only include the exact spelling of these words, but would also define the requirements of the data in each element. In the case of the price, the schema would require that the price be a number, and not a text string.
A typical XML schema for the chair inventory example would look like this:
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="Chair">
    <complexType>
      <sequence>
        <element name="Name" type="string"/>
        <element name="Number" type="string"/>
        <element name="Price" type="decimal"/>
      </sequence>
    </complexType>
  </element>
</schema>
This chair inventory schema contains a declaration for each element of the XML documents that the schema describes. Each declaration defines the name and content of one element. For example, the schema uses the following line to make sure that the element called "Price" is spelled correctly and that the data type of its content is "decimal":
<element name="Price" type="decimal"/>
If an XML document had the value "Free" for the price instead of a decimal number, an error would be generated when the document was validated against the schema.
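The kind of check described here can be sketched in Python. Python's standard library does not include an XML Schema validator, so the following is only a hand-written approximation and not real schema validation. It reads the element declarations from a well-formed version of the chair schema and tests whether each value declared as "decimal" looks like a decimal number. The names declared_types and check_record are hypothetical, and the sketch assumes a single flat sequence of elements.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of XML Schema, in the {uri} form that ElementTree uses.
XSD_NS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA_XML = """<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="Chair">
    <complexType>
      <sequence>
        <element name="Name" type="string"/>
        <element name="Number" type="string"/>
        <element name="Price" type="decimal"/>
      </sequence>
    </complexType>
  </element>
</schema>"""

# Optional sign, then digits with an optional decimal point.
DECIMAL_PATTERN = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def declared_types(schema_text):
    """Hypothetical helper: map each declared child name to its type."""
    schema = ET.fromstring(schema_text)
    path = f"{XSD_NS}element/{XSD_NS}complexType/{XSD_NS}sequence"
    sequence = schema.find(path)
    return {decl.get("name"): decl.get("type") for decl in sequence}


def check_record(xml_text, types):
    """Hypothetical helper: return a list of problems (empty if none).

    Only checks element names and the "decimal" type. A real validator
    also checks order, required elements, and much more.
    """
    problems = []
    for child in ET.fromstring(xml_text):
        text = (child.text or "").strip()
        if child.tag not in types:
            problems.append(f"unexpected element <{child.tag}>")
        elif types[child.tag] == "decimal" and not DECIMAL_PATTERN.fullmatch(text):
            problems.append(f"<{child.tag}> is not a decimal: {text!r}")
    return problems


types = declared_types(SCHEMA_XML)
good = "<Chair><Name>Straight Back</Name><Number>040754</Number><Price>79.99</Price></Chair>"
bad = "<Chair><Name>Straight Back</Name><Number>040754</Number><Price>Free</Price></Chair>"

print(check_record(good, types))
print(check_record(bad, types))
```

The first call returns an empty list, and the second reports one problem for the Price element. In practice, validation is performed by a schema-aware XML processor rather than by hand-written checks like these.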
This is the basic principle of how schemas are structured. "Part 3: Creating Schemas" provides more detail about how to build schemas.