Troubleshooting XML Schemas in InfoPath 2003
Summary: Microsoft Office InfoPath 2003 allows you to create XML form solutions by loading an externally authored XML Schema (XSD) definition file into the InfoPath design environment. Learn how to take advantage of InfoPath support for using externally authored XSD files to create custom form templates, and find out how to troubleshoot common problems. (16 printed pages)
Andrew Begun, Alessandro Catorcini, Mark Roberts
Applies to: Microsoft Office InfoPath 2003, Microsoft Office InfoPath 2003 Service Pack 1
Introduction to Troubleshooting Schemas
A form template that you create with Microsoft Office InfoPath 2003 uses an XML Schema (XSD) to perform structural and data validation on the XML that is input, edited, and output from an InfoPath form. Every form template created in InfoPath design mode contains at least one XSD schema file (.xsd) that is used for validation at run time. This article discusses InfoPath support for externally authored XML Schemas used to create form templates.
To load an XSD schema file that was authored outside of InfoPath
On the File menu, click Design a Form.
In the Design a Form task pane, click New from XML Document or Schema.
You can specify any XSD file you want to use by manually editing your template files outside of design mode. The XSD file you specify, however, must be one that Microsoft XML Core Services (MSXML) 5.0 considers valid.
Microsoft Office InfoPath 2003 Service Pack 1 (SP 1) includes support for loading schema files that you previously could not load, as well as a number of other product enhancements to improve XSD support. For example, with InfoPath SP 1 installed, you can change the underlying schema while you are designing the form template. New controls such as the Choice Group, Repeating Choice Group, and Repeating Recursive Section are also added.
For additional information about XML Schema standards and implementation, see the following resources:
XML Schema Resources
W3C XSD Page
W3C Schema Primer
W3C Schema Structure Reference
W3C Schema Datatypes Reference
XML Schema Tutorial
MSXML 5.0 XML Schema Documentation. See the "XML Schemas" book in the MSXML 5.0 SDK documentation, which the InfoPath 2003 setup program installs in drive:\Program Files\Microsoft Office\OFFICE11\1033\XMLSDK5.CHM
Unsupported XSD Constructs
The following sections describe XSD constructs that InfoPath cannot handle at run time. Avoid these constructs when creating a form template in InfoPath design mode. Differences are called out between support in InfoPath 2003 and InfoPath 2003 SP 1.
ENTITY and ENTITIES Types
The ENTITY and ENTITIES types require a document type definition (DTD) for validation, which InfoPath does not support, with or without InfoPath SP 1. InfoPath does not allow you to design a form template against such a schema and displays an error message that recommends changing the ENTITY type to the NCName type from which ENTITY derives.
If you manually author an InfoPath form template outside of design mode, and it uses an XSD that includes ENTITY and ENTITIES types, the form template may work at run time if the Template.xml file contains the required DTD for these types.
Required xsd:any Element
A required xsd:any wildcard element (that is, an xsd:any element with a minOccurs attribute value greater than zero, or "required any") prevents InfoPath from deterministically creating a valid instance for this schema fragment. Because InfoPath must be able to create a valid instance when generating a form that uses this schema fragment, schemas with required xsd:any elements are not supported. As a workaround, you can modify the schema to specify exactly which elements you want to declare as choices at that point in the schema.
InfoPath 2003 SP 1
When you design a new form template from an XML Schema, as part of running the Data Source Wizard, InfoPath SP 1 prompts you to choose which schema element you want to use in place of the required xsd:any element.
Elements with an Abstract Complex Type
If an element is defined as having an abstract type, then the element cannot appear in the instance document as an instance of that type; rather, it must be specified as a type deriving from the abstract type. For the schema to validate correctly, the instance of that element must use a different type and must specify that type with an xsi:type attribute. The type referenced in this attribute must derive from the abstract type.
Without InfoPath SP 1 installed, you cannot design a form against a schema that uses this construct. One workaround is to change the element type to a derived type of the abstract type in the schema itself.
InfoPath 2003 SP 1
With InfoPath SP 1, design mode supports designing a form template against schemas that use abstract complex types. For example, if an element named shippingAddress has an abstract complex type named address that has two derivations, one of which is CanadianAddress, then InfoPath treats any instance of shippingAddress as a choice between the two derived types.
In this example, if the provided schemas contain no types that derive from address, then InfoPath requests an additional schema that fulfills this requirement.
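The abstract-type scenario described above could be sketched as the following schema. This is only an illustration: the child elements (street, city, province, state) and the second derived type name, USAddress, are assumptions, because the text names only address, CanadianAddress, and shippingAddress.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Abstract base type: it cannot be used directly in an instance -->
  <xsd:complexType name="address" abstract="true">
    <xsd:sequence>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- First derivation (named in the text) -->
  <xsd:complexType name="CanadianAddress">
    <xsd:complexContent>
      <xsd:extension base="address">
        <xsd:sequence>
          <xsd:element name="province" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
  <!-- Second derivation (hypothetical name) -->
  <xsd:complexType name="USAddress">
    <xsd:complexContent>
      <xsd:extension base="address">
        <xsd:sequence>
          <xsd:element name="state" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
  <!-- The element is declared with the abstract type -->
  <xsd:element name="shippingAddress" type="address"/>
</xsd:schema>
```

A valid instance of shippingAddress under this sketch must carry an xsi:type attribute (from the http://www.w3.org/2001/XMLSchema-instance namespace) that names one of the derived types, for example xsi:type="CanadianAddress".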
XSD Constructs with Reduced Functionality
The following sections describe XSD constructs that have reduced functionality when used to create a form template in InfoPath design mode. Differences are called out between support in InfoPath 2003 and InfoPath 2003 SP 1.
Substitution Groups
When you design a form template against an XSD that contains substitution groups, InfoPath ignores the substitution group members and does not display them in the Data Source task pane. The only exception is for abstract elements. In this case, a member of the substitution group for the abstract element takes the place of the element in the Data Source task pane.
If the substitution group for an abstract element contains no elements, InfoPath design mode fails to work against the provided XSD.
InfoPath 2003 SP 1
With InfoPath SP 1, all members of the substitution group appear in the Data Source task pane. InfoPath represents the substitution possibilities as a choice of all the substitution group members (including the defining element, if it is not abstract).
If an abstract element has no substitution group members, InfoPath prompts you to provide a schema that contains at least one element that belongs to its substitution group.
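As a rough illustration of the construct discussed in this section, the following sketch declares an abstract head element with two substitution group members. All element names here (payment, creditCard, check, order) are hypothetical and do not come from the original text.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Abstract head element: instances must use a group member instead -->
  <xsd:element name="payment" type="xsd:string" abstract="true"/>
  <!-- Members of the substitution group for payment -->
  <xsd:element name="creditCard" type="xsd:string" substitutionGroup="payment"/>
  <xsd:element name="check" type="xsd:string" substitutionGroup="payment"/>
  <xsd:element name="order">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Either creditCard or check may appear here -->
        <xsd:element ref="payment"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Based on the behavior described above, InfoPath SP 1 would be expected to present the content of order as a choice between creditCard and check.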
Unbounded Choice Elements
The following schema fragment shows an unbounded choice element:
<xsd:choice maxOccurs="unbounded">
  <xsd:element name="my_element_1"/>
  <xsd:element name="my_element_2"/>
</xsd:choice>
InfoPath treats unbounded choice elements the same as a sequence of unbounded elements in the Data Source task pane. Each element in the choice can have its own list control bound to it.
InfoPath 2003 SP 1
InfoPath SP 1 displays repeating choice elements as repeating choices in the Data Source task pane. There is also a Repeating Choice Group control that you can use to represent the heterogeneous list defined by the repeating choice element in the XSD.
Repeating Sequences
The following schema fragment shows a repeating sequence:
<xsd:sequence maxOccurs="unbounded">
  <xsd:element name="my_element_1"/>
  <xsd:element name="my_element_2" minOccurs="0"/>
</xsd:sequence>
You must be particularly careful when writing XPath expressions for repeating sequences. You can see, in the previous example of an XSD construct, that it is possible to have a list of elements named my_element_1, each optionally followed by a my_element_2 element. The InfoPath list controls do not work well with XML such as this. In design mode, InfoPath prompts you to choose how many instances of the sequence you want to design against. If you choose three instances, then three pairs of my_element_1 and my_element_2 to which to bind controls appear in the Data Source task pane.
InfoPath 2003 SP 1
As long as the repeating sequence contains a required element, InfoPath SP 1 loads the XSD without modifying it, and allows you to bind repeating section controls to the repeating sequence.
Choice of Model Groups
The following schema fragment shows the choice element containing several model groups.
<xsd:choice>
  <xsd:element name="my_element_1"/>
  <xsd:sequence>
    <xsd:element name="my_element_2"/>
    <xsd:element name="my_element_3"/>
  </xsd:sequence>
</xsd:choice>
If InfoPath encounters such a construct in the XSD, it prompts the form designer to choose which branch of the choice element to design against.
InfoPath 2003 SP 1
With InfoPath SP 1, design mode supports such XSD constructs without requiring any modification by the form designer. While it does not modify the meaning of the schema, InfoPath SP 1 simplifies a choice of a choice into an equivalent collapsed single choice in the Data Source task pane.
Optional Sibling with Same Qualified Name
The following schema fragment shows an optional sibling with the same qualified name (QName).
<xsd:sequence>
  <xsd:element name="my_element_1" minOccurs="0"/>
  <xsd:element name="my_element_2"/>
  <xsd:element name="my_element_1"/>
</xsd:sequence>
Because every potential XML instance must be accounted for in InfoPath design mode, the XPath expressions for these nodes can quickly become nontrivial. Design mode does not expose parts of the schema for which it may have difficulty creating correct XPath bindings, and it warns the user about the portions of the schema it ignored. Support for these constructs is the same in InfoPath and InfoPath SP 1.
XSD Constructs with Special Meaning in InfoPath
The following sections describe XSD constructs that have special meaning when used in creating a form template in design mode. These sections describe how you can use the constructs in your schema to enable certain behaviors. Support for these constructs is the same in InfoPath 2003 and InfoPath 2003 SP 1.
Adding New Element Fields and Groups with the Data Source Task Pane
You can construct your schema so that you can use the Data Source task pane to add new element fields and groups to an element at design time. To do so, you declare an element in your schema with an optional, unbounded xsd:any element that specifies the namespace attribute with the ##any wildcard. Then, in design mode, you can use the Data Source task pane to add new element fields and groups to that element. For example, you could add new content to the following element:
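A declaration of this kind might look like the following sketch. The element name my_element is a placeholder, and the wildcard is written with the standard XSD value ##any.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="my_element">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Optional, unbounded wildcard: room for new fields and groups -->
        <xsd:any namespace="##any" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```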
Adding New Attribute Fields with the Data Source Task Pane
As in the element case, you can declare an attribute with an anyAttribute element that has the namespace attribute specified as the ##any wildcard. At design time, you can use the Data Source task pane to add new content to that schema attribute.
Storing XML Signatures in the Data Source
To enable users to digitally sign a form at run time, the schema of the data source must declare an element named signature for storing the XML Signatures (digital signature) information that is created when a user signs the form. You make this declaration by using the xsd:any element with the namespace attribute specified as the XML Signatures namespace with a wildcard character, as follows:
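One plausible form of such a declaration is sketched below. The namespace URI is the standard W3C XML Signature namespace. The processContents="lax" setting and the occurrence bounds are assumptions, not values taken from the original text.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="signature">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Allow any elements from the XML Signature namespace -->
        <xsd:any namespace="http://www.w3.org/2000/09/xmldsig#"
                 processContents="lax"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```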
Binding a Field to a Rich Text Box Control
Rich Text Box controls in InfoPath generate generic XHTML; consequently, your schema must specify that any number of text and XHTML nodes is valid in the XML of the form instance. You can achieve this specification with the following XSD construct.
<xsd:element name="xhtml">
  <xsd:complexType mixed="true">
    <xsd:sequence>
      <xsd:any minOccurs="0" maxOccurs="unbounded"
               namespace="http://www.w3.org/1999/xhtml"
               processContents="lax"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
InfoPath never modifies the content of the schema file (.xsd), but it may logically infer a subset of it for design purposes. The schema file is always untouched within the form template at both design time and run time.
Debugging Common XSD Errors
If you load externally authored XSD files to create form templates in InfoPath design mode, you may receive either of two types of error messages: MSXML error messages or InfoPath error messages. MSXML error messages appear in the Details section of an InfoPath error message dialog box, and they always begin with a reference to the name or path of the schema file that is raising the error. Some valid XSD schema constructs are not supported by InfoPath; these are discussed in Unsupported XSD Constructs. This section describes some common errors that can cause schemas to fail to load successfully in InfoPath.
XSD Namespace Declaration
Similar to all W3C standards, XML Schemas (XSD) went through a lengthy review process on its way to becoming a recommendation. There were many working drafts, and consequently, many XSD files were written based on these evolving standards. During this process, Microsoft created a proprietary schema language called XML-Data Reduced (XDR) that was included with MSXML 3.0. With the release of MSXML 4.0 and later, Microsoft XML Core Services supports the full recommendation of XSD. Many programs for creating schemas did not wait for XSD to become a full recommendation. Older versions of these programs may produce outdated XSD files that the MSXML 5.0 infrastructure on which InfoPath depends does not support.
To ensure that an XSD file supports the full XSD recommendation, it should contain the following XML namespace declaration in the <schema> tag.
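The namespace URI for the XML Schema recommendation is http://www.w3.org/2001/XMLSchema, so a minimal schema root carrying this declaration would look like the following (shown here with the xsd prefix):

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- element, type, and attribute declarations go here -->
</xsd:schema>
```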
Similar to all XML namespace declarations, the XML prefix (in this case 'xsd') can be any valid prefix string. Some common prefixes you may see in practice are 'xsd', 'xs', and '' (no prefix). MSXML 5.0 usually reports an error about the root not being properly defined if this namespace declaration is missing.
Importing and Including Schemas
XSD schemas are extensible and can import and include other schemas. Generally, you should import a schema if its targetNamespace attribute differs from that of the current schema, and include it if its targetNamespace attribute is the same as that of the current schema.
The syntax for importing and including schemas is as follows.
<xsd:import namespace="[anyURI]" schemaLocation="[anyURI]"/>
<xsd:include schemaLocation="[anyURI]"/>
If the schemaLocation attribute is missing (as happens with some converters), then MSXML 5.0 raises an error because it cannot find the file. If you get this error, also check to make sure that the resource or location specified in the schemaLocation attribute is accessible by users of the form template. Obviously, errors occur if the schemaLocation attribute references a server or directory that is down or nonexistent or if users do not have access permissions. Also, be sure to examine all imported and included schemas to make sure they are valid.
Errors caused by problems with the schemaLocation attribute are an issue only when InfoPath first imports the schemas; that is, when you first start designing a form based on an existing schema. After that, InfoPath works with cached versions of the schema files that are stored in the form template.
An empty namespace attribute is allowed when importing a schema, if that schema does not specify a targetNamespace attribute. In general, the namespace on the import must match the targetNamespace specified in the schema that you import.
Nondeterministic Schemas
The MSXML 5.0 infrastructure that InfoPath depends upon can reliably detect and raise errors to alert you to nondeterministic schemas, but the resultant error message does not provide a line number to tell you which part of the schema is raising the error. This section discusses why it is important for XSD schema files to be deterministic and what it means to be nondeterministic, and it shows some common errors to avoid.
XSD schemas exist for the purpose of validating XML data structure and type semantics. To accomplish this task, the validating system (in this case, MSXML 5.0) must map XML nodes to XSD declarations. Without this mapping, the validating system cannot accomplish its task. If a mapping can be guaranteed, then the schema is deterministic. If there is a single XML instance that makes this mapping impossible, then the schema is nondeterministic.
The following example schema is nondeterministic.
<xsd:element name="file_Information">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="file_name"/>
      <xsd:choice>
        <xsd:element name="file_path"/>
        <xsd:sequence>
          <xsd:element name="file_path" minOccurs="0"/>
          <xsd:element name="URI"/>
        </xsd:sequence>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
To see why this XSD segment is nondeterministic, assume you have the following XML fragment that you want to validate with this schema.
<file_Information>
  <file_name>my_Schema.xsd</file_name>
  <file_path>c:\xsd</file_path>
</file_Information>
In this XML fragment, it is not clear whether the <file_path> element is the required node from the first part of the choice declaration or the optional one from the second part of the choice declaration. This distinction is important because:
If the XML fragment is validated against the first part of the choice declaration, then the XML is valid against the schema.
If the XML fragment is validated against the second part of the choice declaration, then the XML is not valid, because the required <URI> node is missing.
Some XSD validation systems err toward validating against this schema because there is a valid path. MSXML 5.0 is stricter and raises an error stating that the schema is nondeterministic.
Following are a few more examples of nondeterministic schemas. The first deals with optional elements. Often, these cases arise from XDR to XSD converters because of differences in the default cardinalities in the two schema languages. The first case to consider is optional elements declared with xsd:choice and xsd:sequence elements. Optional elements declared in an xsd:sequence element usually validate properly, as long as you do not have elements with the same name more than once, with only optional elements in between. For example:
<xsd:element name="container">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="aNode" />
      <xsd:element ref="anotherNode" minOccurs="0"/>
      <xsd:element ref="aNode" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
To understand why this schema segment is nondeterministic, assume you have the following invalid XML fragment:
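A fragment of this kind, reconstructed here from the description that follows rather than quoted from the original, might look like this:

```xml
<container>
  <aNode/>
  <aNode/>
  <anotherNode/>
  <aNode/>
</container>
```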
Just by glancing at this fragment, you can see why it is invalid: there are two <aNode> elements before the <anotherNode> element, when only one is allowed.
Now assume that you have the following XML instance to validate:
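An instance matching the question posed next would presumably contain just two aNode elements and no anotherNode element, as in this sketch:

```xml
<container>
  <aNode/>
  <aNode/>
</container>
```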
The challenge is to determine whether this instance is valid. Do you have two <aNode> elements where only one is allowed, or do you have an <aNode> element where it is allowed and another where it is allowed? Because there is no way to know, the schema is nondeterministic.
Similarly, optional elements declared in an xsd:choice element usually are problematic. In the following simplified example, there is no way to determine whether the choice occurred once with the optional element not being there or whether it never occurred at all.
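A minimal sketch of such a declaration, reusing the aNode and anotherNode names from the earlier example, is shown below. The exact form of the original example is not known, so treat this as an assumption. For an empty container element, it is unclear whether the optional choice was skipped or whether it occurred with the optional aNode omitted.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="aNode"/>
  <xsd:element name="anotherNode"/>
  <xsd:element name="container">
    <xsd:complexType>
      <!-- The choice itself is optional... -->
      <xsd:choice minOccurs="0">
        <!-- ...and so is one of its branches -->
        <xsd:element ref="aNode" minOccurs="0"/>
        <xsd:element ref="anotherNode"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```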
The final questionable practice is using an xsd:any element without a restrictive namespace definition, as in <xsd:any namespace="##other"/>, after other elements in an xsd:sequence element. This construct is especially troublesome when it follows an optional element. If you revisit the previous example and change just the last node to an xsd:any element, you can see that all the previous arguments about nondeterminism still apply, as follows:
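The sketch below assumes that "the previous example" refers to the container sequence shown earlier, and it uses an xsd:any element with no namespace attribute, which defaults to ##any. Here an anotherNode element that follows aNode could match either the optional anotherNode declaration or the wildcard, so the mapping is ambiguous.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="aNode"/>
  <xsd:element name="anotherNode"/>
  <xsd:element name="container">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="aNode"/>
        <xsd:element ref="anotherNode" minOccurs="0"/>
        <!-- Wildcard after an optional element: also matches anotherNode -->
        <xsd:any/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```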
Illegal Enumeration Values
Usually, XSD schemas do not perform any type validation until you validate an actual instance document. An exception to this is when you have an enumeration in your schema. In this case, the schema validates the enumeration values against the enumeration types to ensure they are proper node values. Here are two examples.
<xsd:simpleType name="showTimes">
  <xsd:restriction base="xsd:time">
    <xsd:enumeration value="18:30:00"/>
    <xsd:enumeration value="20:45:00"/>
    <xsd:enumeration value="eleven o'clock"/>
  </xsd:restriction>
</xsd:simpleType>
This schema is invalid because "eleven o'clock" is not a valid value for an element of type xsd:time.
The following is a more complex example:
<xsd:simpleType name="concession">
  <xsd:restriction base="xsd:NMTOKEN">
    <xsd:enumeration value="GummyBears"/>
    <xsd:enumeration value="SnowCaps"/>
    <xsd:enumeration value="M&amp;Ms"/>
  </xsd:restriction>
</xsd:simpleType>
To understand why this example is invalid, you must understand how the type xsd:NMTOKEN is defined. The W3C datatypes specification defines the NMTOKEN type as follows: "An NMTOKEN (name token) is any mixture of name characters."
If you investigate further, you find that '&' is not a valid name character, and therefore the value "M&Ms" (written as M&amp;Ms in the schema file so that the XML stays well formed) does not validate as an NMTOKEN type.
Empty Sequence or Choice Elements
MSXML 5.0 sometimes raises errors about schema declarations that contain empty xsd:choice or xsd:sequence elements, as shown in the following example:
<xsd:element name="emptyContainer">
  <xsd:complexType>
    <xsd:choice />
  </xsd:complexType>
</xsd:element>
Simply removing the empty <xsd:choice /> tag should fix this problem.
Regular Expression Patterns
MSXML 5.0 can have problems validating regular expression patterns on load. Regular expressions can be complicated, and you should be careful when using them. Many XSD parsers implement a flexible regular expression language; that is, they implement the official XSD regular expression language plus elements from other regular expression languages. If InfoPath design mode has a problem parsing a regular expression, then the sample data InfoPath generates might be invalid or might not be generated at all. This is acceptable at design time, because InfoPath uses only sample data for formatting. If, however, you use a regular expression that MSXML 5.0 does not support, then InfoPath cannot validate a value against it when a user is filling out a form in edit mode. The W3C tutorials completely describe what is supported in XSD regular expressions. For more information about XSD regular expressions and Unicode level 1 regular expressions, see the Unicode Home Page.
targetNamespace Attribute Issues
By default, the targetNamespace attribute applies only to top-level (global) declarations, although you can set elementFormDefault="qualified" to override this default behavior. As an example, assume you have the following XSD:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://ns">
  <xsd:element name="root">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="local"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Your XML instance document looks like the following:
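A matching instance, sketched here from the schema above, qualifies only the top-level root element with the target namespace. The ns prefix is an arbitrary choice.

```xml
<ns:root xmlns:ns="http://ns">
  <!-- local is declared locally, so it is not namespace-qualified -->
  <local/>
</ns:root>
```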
Because qualification is turned off by default, local definitions do not require the target namespace. However, if you change your local definition to be global, then your reference must be qualified with the namespace prefix. For example, the following schema is invalid:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://ns">
  <xsd:element name="root">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="global"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="global"/>
</xsd:schema>
This schema is invalid because "global" is in the namespace "http://ns". The simple ref="global" is not recognized because the default namespace is not "http://ns". To fix this, you must add a prefix for the target namespace and use that for all global references and type uses. The corrected schema looks like the following:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ns="http://ns"
            targetNamespace="http://ns">
  <xsd:element name="root">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="ns:global"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="global"/>
</xsd:schema>
Be sure that, if your schema has the targetNamespace attribute specified, all global references are qualified with the correct namespace prefix.
XML Processing Instruction Encoding (Unicode vs. ANSI)
Because XML fully supports only Unicode character sets, you may lose information if you save files that use ANSI characters. On the other hand, saving files as UTF-16 may be excessive for your particular use. To reduce the implementation cost of an XML reader, the XML author must state which encoding they are using in the top-level XML processing instruction. You may recognize something that looks like this:
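For a UTF-8 encoded schema file, the first line typically reads as follows. The schema root is included only to make the sketch a complete document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- schema declarations go here -->
</xsd:schema>
```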
This processing instruction tag is specifying that the encoding of the file is UTF-8. You must ensure that the file encoding is the same as the encoding stated in the processing instruction tag. You can determine the encoding by looking at the bytes of the file and looking for the Unicode byte order marks, but there is an easier way. If you are having problems opening an XSD schema, specify the encoding as "UTF-8", open it in a text editor such as Notepad, and then save the file using UTF-8 encoding (Notepad provides the Encoding drop-down list to specify this in the Save As dialog box). If you still have problems opening the file, it is not an encoding issue.
maxOccurs Attribute Inside the xsd:all Element
Because of the way nondeterminism is defined in the XML Schema recommendation, the only legal value for the maxOccurs attribute of an xsd:element element inside of an xsd:all element is 1. For example, the following is valid:
<xsd:all>
  <xsd:element name="x" minOccurs="0"/>
  <xsd:element name="docs" minOccurs="0"/>
</xsd:all>
However, this example is not valid:
<xsd:all>
  <xsd:element name="x" minOccurs="0" maxOccurs="unbounded"/>
  <xsd:element name="docs" minOccurs="0" maxOccurs="unbounded"/>
</xsd:all>
This example is invalid because the validation system cannot determine whether two occurrences of <x/> map to the single declaration or to the declaration and another illegal definition. Along the same lines, you cannot have two elements of the same name in an <xsd:all> tag either.
This example is also interesting because it allows you to have any number of <x/> and <docs/> nodes inside a containing element in any order. Although this construct is illegal, there is a workaround. By using the xsd:choice element, you can achieve the same result, as demonstrated in the following example:
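One way to write that workaround is an optional, unbounded choice over the same two elements. This is a sketch consistent with the description above, and the wrapping element name, container, is hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="container">
    <xsd:complexType>
      <!-- Any number of x and docs elements, in any order -->
      <xsd:choice minOccurs="0" maxOccurs="unbounded">
        <xsd:element name="x"/>
        <xsd:element name="docs"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```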
How to Edit or Author an XSD for InfoPath
The two examples in this section demonstrate how to edit or author a schema to produce specific results in InfoPath.
Allowing User-defined Elements to Be Inserted in the Data Source Task Pane
To allow user-defined elements to appear under a parent element in the Data Source task pane, you must insert an xsd:any element under the parent element. To allow user-defined elements to be inserted inside <your_node_name>, the XSD declaration should look something like the following:
<xsd:element name="your_node_name">
  <xsd:complexType>
    <xsd:sequence>
      <!-- namespace may be "##any" or "##other" -->
      <xsd:any namespace="##any" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
If you want to allow user-defined attributes as well, then you need to add <xsd:anyAttribute namespace="##any"/> (or namespace="##other") to the element declaration.
Allowing Rich Text Elements to be Bound in InfoPath Design and Edit Modes
If you want to declare an element that can be bound to a Rich Text Box control, then it should have the following form, which includes the xsd:any element with a namespace attribute set to "http://www.w3.org/1999/xhtml".
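Such a declaration might be sketched as follows. It mirrors the construct shown earlier for Rich Text Box controls, and the element name notes is a placeholder.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="notes">
    <!-- mixed="true" allows text alongside XHTML child elements -->
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:any namespace="http://www.w3.org/1999/xhtml"
                 processContents="lax"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```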
Conclusion
By taking advantage of InfoPath support for designing XML form solutions based on externally authored XML Schema (.xsd) files, you can create a form template that works with an industry-standard schema or custom schema created by your company or organization. With the information in this article, you can create custom XSD schema files that are compatible with InfoPath, and you can troubleshoot common issues you may encounter when loading externally authored XSD files into the InfoPath design environment.