How XML Standards Are Used in InfoPath 2003

Summary: Learn how Microsoft Office InfoPath 2003 is built using XML standards, including integrated support for XML Web services, providing full interoperability with XML-enabled systems.

Michael Hoffman, Microsoft Corporation

June 2003

Applies to: Microsoft Office InfoPath 2003

Contents

  • Introduction to XML Standards in InfoPath 2003

  • Extensible Markup Language (XML) 1.0 Second Edition

  • Namespaces in XML

  • XML Schema (XSD) 1.0

  • XML Path Language (XPath) 1.0

  • Extensible Stylesheet Language Transformations (XSLT) 1.0

  • Extensible Hypertext Markup Language (XHTML) 1.0

  • Cascading Style Sheets

  • Document Object Model (DOM) 1.0

  • XML Digital Signatures (XML DSig)

  • XML Web Services

  • Universal Description, Discovery, and Integration (UDDI) 1.0

  • Web Services Description Language (WSDL) 1.1

  • Simple Object Access Protocol (SOAP) 1.1

  • Hypertext Transfer Protocol (HTTP) 1.1

  • Terminology

  • Conclusion

  • Additional Resources

Introduction to XML Standards in InfoPath 2003

The following sections describe the use of each XML standard in Microsoft Office InfoPath.

InfoPath is built from the ground up on XML standards, including the following:

  • Extensible Markup Language (XML) 1.0 Second Edition

  • Namespaces in XML

  • XML Schema (XSD) 1.0 Part 1: Structures, and Part 2: Datatypes

  • XML Path Language (XPath) 1.0

  • Extensible Stylesheet Language Transformations (XSLT) 1.0

  • Extensible Hypertext Markup Language (XHTML) 1.0

  • Cascading Style Sheets (CSS)

  • Document Object Model (DOM) 1.0

  • XML Digital Signatures (XML DSig)

  • Universal Description, Discovery, and Integration (UDDI) 1.0

  • Web Services Description Language (WSDL) 1.1

  • Simple Object Access Protocol (SOAP) 1.1

  • Hypertext Transfer Protocol (HTTP) 1.1

InfoPath reads and produces files that conform to XML standards, providing interoperability with other systems including applications, operating systems, and middle-tier and back-end systems. The integrated use of standards such as SOAP makes it easy to share data with systems that are enabled for XML Web services.

XML standards are used for authoring individual XML documents belonging to your custom-defined XML schema and for designing InfoPath form templates. For example, as shown in Figure 1, the use of XML Schema provides validation and structured editing, and the use of XSLT provides multiple views that can present XML data in a different arrangement than in the XML document.

Figure 1. How XML standards are used during editing

Extensible Markup Language (XML) 1.0 Second Edition

The tags used in an XML document are not predefined. Instead, the World Wide Web Consortium (W3C) XML recommendation specifies a set of rules for creating your own meaningful set of elements and attributes. For example, sales reports, legal forms, and reports in the health industry each have different requirements for data content and structure. XML lets you define the data content and structure that each business need requires.

InfoPath uses XML as its native data format for input and output. When you edit an XML document, InfoPath enables you to add and remove valid XML elements and attributes belonging to your custom-defined XML schema. When you save or submit an InfoPath form, the XML document remains valid against the schema. Because InfoPath uses the XML standard, it can open and edit XML documents used by XML Web services and XML-enabled systems.

An XML input or output document can be an XML file or part of a SOAP packet. The XML document specifies the location of the form template it is based on. The form template contains all the information that is needed to work with the XML document as a form.

The XML document can be saved locally or e-mailed as an attachment to another user. For example, you can get an InfoPath form from a server, add to or edit the data, save the form and e-mail it to someone for review, and then submit the results to a server. This enables both centralized and peer-to-peer workflow, supporting various business scenarios.

Namespaces in XML

InfoPath supports the use of multiple namespaces inside an XML document. For example, the following XML document is a simplified sales report for a fictitious company called Contoso, showing the use of two namespaces. The <customer> and <product> elements are defined in the default namespace, which is mapped to the URI http://schemas.contoso.com/salesReport. The <pricing:unitsSold> and <pricing:pricePerUnit> elements are defined in a separate namespace that uses the pricing prefix and is mapped to the URI http://schemas.contoso.com/pricing.

<salesReport xmlns="http://schemas.contoso.com/salesReport" xmlns:pricing="http://schemas.contoso.com/pricing">
   <customer>
      <product>Pentosel</product>
      <pricing:unitsSold>100</pricing:unitsSold>
      <pricing:pricePerUnit>35</pricing:pricePerUnit>
   </customer>
</salesReport>
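
As a rough illustration of how an XML-aware consumer might resolve these two namespaces, the following Python sketch parses the sales report with the standard library. It is not InfoPath code; the variable names and the prefix mapping are illustrative assumptions, and only the namespace URIs come from the document.

```python
import xml.etree.ElementTree as ET

SALES_REPORT = """\
<salesReport xmlns="http://schemas.contoso.com/salesReport"
             xmlns:pricing="http://schemas.contoso.com/pricing">
   <customer>
      <product>Pentosel</product>
      <pricing:unitsSold>100</pricing:unitsSold>
      <pricing:pricePerUnit>35</pricing:pricePerUnit>
   </customer>
</salesReport>
"""

# The prefixes below are local to this script; only the URIs must match
# the ones declared in the document.
NS = {
    "sr": "http://schemas.contoso.com/salesReport",
    "pricing": "http://schemas.contoso.com/pricing",
}

root = ET.fromstring(SALES_REPORT)
customer = root.find("sr:customer", NS)

# <product> lives in the default namespace, the pricing elements in their own.
product = customer.find("sr:product", NS).text
units_sold = int(customer.find("pricing:unitsSold", NS).text)
price_per_unit = int(customer.find("pricing:pricePerUnit", NS).text)
total = units_sold * price_per_unit

print(product, total)
```

Because elements are matched by namespace URI rather than by prefix, the same lookup keeps working if the document later switches to a different prefix for the pricing namespace.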

As another example, XML namespaces are used for rich text fields. A rich text field is bound to an XML data element that references the XHTML schema namespace. For more information, see the Extensible Hypertext Markup Language (XHTML) 1.0 section of this article.

XML Schema (XSD) 1.0

InfoPath supports XML Schema 1.0, including both Part 1: Structures and Part 2: Datatypes. InfoPath can read and use custom-defined XML schemas. When designing a form template, there are three scenarios in which InfoPath uses an existing custom XML schema or creates a custom XML schema:

  • Using an existing schema. You can start designing a form by pointing to a custom-defined XML schema that has already been created by a tool that follows the XML Schema standard. The XML schema is shown in the Data Source task pane. When you drag and drop from a schema node to the layout area (called the form area), an appropriate user interface (UI) control is automatically suggested. Based on the schema, InfoPath generates rules for structured editing and validation.

  • Using an existing schema from a Web service. You can design a form template by starting with an XML schema that is read from WSDL information, by using the Data Source Setup Wizard. After the XML schema is read, the user interface enables you to create a form template that can generate XML that follows the schema in the SOAP message. When an end user creates a form based on the form template, the form generates schema-valid SOAP messages.

  • Deriving or defining a schema. If you open an XML file using the Data Source Setup Wizard, InfoPath generates an XML schema that describes the XML file. You can then create a form template based on this schema. Or, you can start creating a form template from scratch, defining the schema while you define views. When you map a UI control to a node of a schema you are designing, InfoPath automatically suggests an appropriate data type for the schema node. InfoPath generates simple, standard XML schemas that can be used by other business processes.

When entering data into a form, the XML Schema standard is used to support validation of XML data and to enable structured editing, as described in the following paragraphs.

Validation against a custom-defined XML schema during editing helps users create structured XML data that is ready for reuse by systems that require schema-validated XML data. InfoPath interactively validates the XML document against the schema, and prevents the user from submitting it to a Web service or other data source in an invalid state. An XML document must be fully valid, including its data types, before it can be submitted. A data validation error is indicated by a dashed red border around a field, a validation ScreenTip (called an inline alert), or a validation error dialog box (called a dialog box alert).

Structured hierarchical editing based on the custom-defined XML schema provides an easy user interface for adding and removing XML elements and attributes, without showing the elements and attributes. The InfoPath user interface provides a natural way to edit the DOM tree, including inserting an optional subtree, repeating a subtree, or replacing a subtree with another subtree (where the schema uses <xsd:choice>). In InfoPath, the structure of the DOM tree is always valid. Based on the XML schema, InfoPath shows the editing actions that are valid for the selected field or field group. The user edits the XML document by adding a repeating or optional field group, entering values in fields, or entering rich text. If the schema allows adding nodes to a node of the DOM tree, the field group in the view has a drop-down menu that enables the user to add a field group or field.
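
The idea of offering only schema-valid editing actions can be sketched in a few lines of Python. This is a simplified illustration, not how InfoPath is implemented: the MAX_OCCURS table is a hypothetical stand-in for the maxOccurs constraints that a real implementation would read from the XML schema, and the <notes> child and the function names are assumptions made for the example.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-in for schema constraints: parent tag -> child tag ->
# maximum occurrences (None means unbounded). A real tool would derive
# these from maxOccurs in the XSD.
MAX_OCCURS = {
    "customers": {"customer": None},  # repeating field group
    "customer": {"notes": 1},         # optional field group
}

def can_insert(parent, child_tag):
    """Return True if one more <child_tag> is allowed under parent."""
    allowed = MAX_OCCURS.get(parent.tag, {})
    if child_tag not in allowed:
        return False
    limit = allowed[child_tag]
    return limit is None or len(parent.findall(child_tag)) < limit

def insert_group(parent, template):
    """Append a copy of template to parent, refusing invalid structures."""
    if not can_insert(parent, template.tag):
        raise ValueError("cannot add <%s> to <%s>" % (template.tag, parent.tag))
    group = copy.deepcopy(template)
    parent.append(group)
    return group

customers = ET.fromstring(
    "<customers><customer><price>1230.00</price></customer></customers>"
)
first = customers.find("customer")
second = insert_group(customers, first)   # repeat the subtree
insert_group(first, ET.Element("notes"))  # add the optional subtree once
```

After these calls, can_insert(first, "notes") is False, which corresponds to the user interface no longer offering that insertion for the first field group.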

XML Path Language (XPath) 1.0

The XPath standard is used throughout InfoPath, including for custom validation, XSLT views, structured editing, and scripting the DOM.

InfoPath has three levels of validation. When you are editing an XML document, InfoPath ensures that the document is always valid according to the associated custom-defined XML schema. In addition to this schema-based validation, InfoPath enables you to define additional, custom validation rules that use XPath. You can also use scripting to define additional rules and business logic.

As an example of using the XPath standard to define custom validation, suppose that in a sales report, you want to require that a value in the Price field must not be greater than the value in the Maximum Price field. Suppose the underlying XML data is the following.

<salesReport>
   <customers>
      <customer>
         <price>1230.00</price>
      </customer>
      <sales>
         <maxPrice>1000.00</maxPrice>
      </sales>
   </customers>
</salesReport>

You can automatically define the XPath expression for custom validation on the Price field. Custom validation rules are stored in the manifest file for the form template. To define this custom validation, select the Price field in the view, then use the Data Validation dialog box to select the <maxPrice> schema node. The XPath expression for custom validation is then constructed automatically and transparently, as shown here.

<xsf:customValidation>
   <xsf:errorCondition match="/salesReport/customers">
      <xsf:errorMessage type="modeless"></xsf:errorMessage>
   </xsf:errorCondition>
</xsf:customValidation>
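
The rule itself (Price must not exceed Maximum Price) can be approximated outside InfoPath. The following Python sketch applies the comparison to the sample data using the limited XPath subset in the standard library. It does not reproduce the manifest syntax, and the function name and error text are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SALES_REPORT = """\
<salesReport>
   <customers>
      <customer>
         <price>1230.00</price>
      </customer>
      <sales>
         <maxPrice>1000.00</maxPrice>
      </sales>
   </customers>
</salesReport>
"""

def price_errors(xml_text):
    """Return one message per <price> that exceeds its <maxPrice>."""
    root = ET.fromstring(xml_text)  # root is <salesReport>
    errors = []
    for customers in root.findall("./customers"):
        max_price = float(customers.findtext("./sales/maxPrice"))
        for price in customers.findall("./customer/price"):
            if float(price.text) > max_price:
                errors.append(
                    "price %s exceeds maximum %.2f" % (price.text, max_price)
                )
    return errors

errors = price_errors(SALES_REPORT)
print(errors)
```

With the sample data, the single price of 1230.00 is greater than the maximum of 1000.00, so one error is reported.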

Extensible Stylesheet Language Transformations (XSLT) 1.0

XSLT is used for defining and displaying multiple views of the XML data in an InfoPath form. A form can include multiple views, such as overview and detailed views. A view contains field groups, which contain fields, rich text fields, and other field groups. Field groups are presented as nested sections, and fields are presented as UI controls such as a text box, check box, or drop-down list. Each view produced by InfoPath is stored as a separate, standard XSLT file that can be reused by other business processes.

A view is an XSLT-based view of the DOM data tree. When an end user opens a form, XSL Transformations (XSLTs) are applied to the DOM tree, producing views that show an appropriate presentation of the XML document to the user. Elements at the beginning of the XML document could be displayed at the bottom of a view and also in a different arrangement in another view.

When the end user edits the XML document, such as adding an optional or repeating field group, the data in the DOM is modified. InfoPath redisplays the changed part of the view in an optimized way, by applying only the required part of the XSLT to the DOM.

Because the XSLT that is generated by InfoPath strictly conforms to the XSLT standard, any standard XSLT processor can be used to process the XSLT file on the server and provide a read-only view of the InfoPath form as an HTML document, which can be displayed in any Web browser.

XSLT as an Ideal Basis for Views

The content of the views can be organized very differently than the structure of the XML data. To present the data in a way that makes the most sense for the user and makes the data easy to read and edit, the designer of a form template must be able to display data in a different sequence than in the DOM data tree, omit some data from a view, reorganize adjacent data tree nodes into separate views, and gather data from different parts of the data tree into a single view.

The order and structure of the content of the views must therefore be independent of the order and structure of the DOM tree nodes. This structural independence of presentation and data requires a complex, dynamic binding, or mapping, between the grouped fields in views and the nodes in the DOM tree.

To provide this complex mapping between views and data, InfoPath uses XSLT extensively. XSLT is a powerful stylesheet language that supports complex data transformations, providing rich views with dynamic, flexible presentation of content. Using a stylesheet is a common, well-established design approach in SGML and XML authoring tools, and XSLT is the W3C standard for stylesheets that are used for this type of complex transformation.
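
The independence of view structure from data structure can be illustrated without an XSLT processor. In the following Python sketch, two plain functions stand in for two XSLT views of the same data tree: one omits data, the other reorders it. The function names and labels are assumptions, and ordinary Python is used here in place of XSLT templates because the Python standard library has no XSLT engine.

```python
import xml.etree.ElementTree as ET

DATA = ET.fromstring(
    "<salesReport><customers>"
    "<customer><price>1230.00</price></customer>"
    "<sales><maxPrice>1000.00</maxPrice></sales>"
    "</customers></salesReport>"
)

def overview_view(root):
    # Omits the individual prices and shows only the ceiling.
    return [("Maximum price", root.findtext("customers/sales/maxPrice"))]

def detail_view(root):
    # Presents maxPrice first even though it comes last in the data tree.
    rows = [("Maximum price", root.findtext("customers/sales/maxPrice"))]
    rows += [("Price", p.text) for p in root.iterfind("customers/customer/price")]
    return rows

overview = overview_view(DATA)
detail = detail_view(DATA)
print(overview)
print(detail)
```

Neither function modifies the tree, so any number of views can be derived from the same data, which is the property that XSLT provides for InfoPath views.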

Partial Transformation, XSLT Generation, and Aggregated Documents

To avoid running the entire XSLT transformation every time the end user enters data in a view or clicks a formatting control for rich text, algorithms are used to determine which portion of the view needs to be refreshed. Then only the relevant portion of the XSLT stylesheet is applied to the DOM, and the affected portion of the view is refreshed.

You can easily design XSLT views of XML information. InfoPath automatically generates the XSLT code that maps between complex XML data and useful views of that data. When you drag and drop UI controls onto the form area, InfoPath suggests appropriate transformations between the data structure and the views. This helps prevent XML transformation from being a difficult hurdle when implementing forms.

You can define XSLT views that combine multiple XML documents into an aggregated XML document, such as a managerial summary. This merged document can be accessed directly within InfoPath or by using a form library in Microsoft Windows SharePoint Services.

Extensible Hypertext Markup Language (XHTML) 1.0

InfoPath uses the XHTML standard to enable end users to add formatted text such as detailed notes to an XML document using a rich text box. When the end user selects the rich text box, the related editing features in the user interface become available, allowing the end user to enter rich content such as formatted text, bulleted lists, hyperlinks, tables, and images. The data entered into the rich text box is stored as XML elements in the XHTML namespace.

To enable a node in an XML document to contain XHTML content, its schema definition must import the xhtml namespace. The document node can then be bound to a rich text box, which enables the user to edit the XHTML content.

For example, the following XML document fragment shows how a Notes rich-text field containing a numbered list is stored.

<notes>
   <ol xmlns="http://www.w3.org/1999/xhtml">
      <li>Identify and check with the primary contacts.</li>
      <li>Schedule Paul to detail the issues.</li>
   </ol>
</notes>

In the salesReport schema, the <notes> element is of type xhtml, which can contain any elements that import the xhtml namespace.
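
A downstream process can read this rich text like any other namespaced XML. The following Python sketch, offered as an illustration rather than as InfoPath code, extracts the list items from the fragment; the x prefix and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

NOTES = """\
<notes>
   <ol xmlns="http://www.w3.org/1999/xhtml">
      <li>Identify and check with the primary contacts.</li>
      <li>Schedule Paul to detail the issues.</li>
   </ol>
</notes>
"""

# Local prefix for the XHTML namespace used by the rich text content.
XHTML = {"x": "http://www.w3.org/1999/xhtml"}

notes = ET.fromstring(NOTES)
# <notes> itself is in no namespace; only its content is XHTML.
items = [li.text for li in notes.findall("x:ol/x:li", XHTML)]

for number, text in enumerate(items, start=1):
    print(number, text)
```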

Cascading Style Sheets

When you design a form template, the cascading style sheet (CSS) standard is used in conjunction with XSLT to describe the formatting of the nested field groups and fields in views. To display and format a view, the XSLT file produces XHTML output that includes the CSS <style> element.

For example, if you change the color scheme of a view, the CSS tagging in the XSLT file changes accordingly. When you select the "Blue" color scheme in the Color Schemes task pane, the XSLT file for the view includes the following tagging in the <style> element.

         {
   MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; 
}
. {
   COLOR: white; 
}
. {
   COLOR: black; 
}

If you change the color scheme to "Red," the color values in the XSLT file change accordingly.

Document Object Model (DOM) 1.0

InfoPath uses the DOM when you open, edit, save, or submit an XML document. When you open an XML document as an InfoPath form, an internal representation of the XML document is created in memory as a DOM tree. As you edit an XML document, the changes are validated against the XML schema and then saved in the DOM. When you save an XML document, the DOM tree is read out and saved as an XML file. When you submit an XML document, the DOM is read out and sent as an XML document.

You can programmatically access XDocument.DOM to change values in the form and to add or remove field groups and fields in ways that conform to the schema. To write a script that accesses the XML data, you use the familiar W3C DOM to access the data tree. The DOM provides a representation of the XML tree and an object model that you can script to, using either Microsoft Visual Basic Scripting Edition (VBScript) or Microsoft JScript. For example, the following code example sets the LastName attribute to "Smith".

XDocument.DOM.documentElement.setAttribute("LastName", "Smith")
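
Because the call is standard W3C DOM, the same operation can be tried outside InfoPath. The following Python sketch uses the standard library's minidom as a stand-in for XDocument.DOM; the <employee> element and its FirstName attribute are hypothetical sample data.

```python
from xml.dom.minidom import parseString

# Hypothetical document; in InfoPath the DOM comes from XDocument.DOM.
dom = parseString('<employee FirstName="Pat"/>')

# Same W3C DOM call as in the script example.
dom.documentElement.setAttribute("LastName", "Smith")

last_name = dom.documentElement.getAttribute("LastName")
xml_out = dom.documentElement.toxml()
print(xml_out)
```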

XML Digital Signatures (XML DSig)

InfoPath supports XML documents that are signed using the W3C XML Digital Signatures standard. An InfoPath form can be signed using one or more XML digital signatures. The form is signed at the level of the entire XML document. A signed form is opened as read-only; InfoPath does not allow any data in the XML document to be edited while it remains signed.

When opening an XML document, InfoPath checks whether it is digitally signed. If the XML document doesn't contain a signature, the form is immediately opened and presented to the user. If the XML document contains a signature, InfoPath decodes the document to see if it is consistent per the signature. If the XML document is valid per the signature, the form opens. If the XML document is not valid per the signature, a warning dialog box appears and the form is opened.
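
The decision flow described here can be sketched as follows. This Python example only models the branching; it performs no cryptography. The verify parameter is a hypothetical callback standing in for real signature verification (canonicalization, digest, and certificate checks), and the returned dictionary is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the W3C XML Signature recommendation.
DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"

def open_form(xml_text, verify):
    """Decide how a form opens, based on the presence of a signature.

    verify(root, signature) is a hypothetical callback that returns True
    when the document is consistent with its signature.
    """
    root = ET.fromstring(xml_text)
    signature = root.find(".//{%s}Signature" % DSIG_NS)
    if signature is None:
        # Unsigned: open immediately for editing.
        return {"read_only": False, "warning": False}
    # Signed: always read-only; warn if the signature does not check out.
    return {"read_only": True, "warning": not verify(root, signature)}

UNSIGNED = "<salesReport><customer/></salesReport>"
SIGNED = (
    "<salesReport><customer/>"
    '<Signature xmlns="http://www.w3.org/2000/09/xmldsig#"/>'
    "</salesReport>"
)

print(open_form(UNSIGNED, lambda root, sig: True))
print(open_form(SIGNED, lambda root, sig: False))
```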

The user can sign an XML document before submitting it. When designing a form template, you can specify whether to allow the forms to be signed. You can also specify whether a dialog box will appear when the user submits the form, to prompt the user to sign the form, if the form lacks a signature. To sign a form, the user must have a certificate installed on his or her computer.

The digital signature is located in a node inside of the XML document. When designing the form template, you specify which node the signature is stored in. The location of the signature node is stored as an XPath expression in the manifest file.

XML Web Services

InfoPath is designed using XML Web services standards, including UDDI, WSDL, and SOAP. Integrated support for Web services standards enables you to easily define views and forms for editing XML documents that will be exchanged in conformance with a Web service definition. Figure 2 illustrates the integrated support for Web services.

Figure 2. Integrated support for XML Web services in InfoPath

InfoPath helps you integrate XML forms with Web services-enabled back-end and middle-tier systems such as databases, workflow systems, enterprise resource planning (ERP), and customer relationship management (CRM) systems. Support for the UDDI, WSDL, and SOAP standards in InfoPath is described in the following sections.

Universal Description, Discovery, and Integration (UDDI) 1.0

The InfoPath user interface provides easy-to-use, integrated support for using UDDI to locate available XML Web services. InfoPath can interoperate with any UDDI server that implements UDDI 1.0 on any platform.

When you design a new form template and use the Data Source Setup Wizard, you can click Search UDDI to search for Web services. The Search Web Service dialog box, shown in Figure 3, enables you to search a specified UDDI Web services registry.

Figure 3. Search Web Service dialog box

The Search the following UDDI server list enables you to select or enter the URL of a UDDI server, such as http://uddi.microsoft.com/inquire or any other server that uses UDDI 1.0. The Search in the following field list specifies whether you want to search by the service provider description or by the Web service description. In the Search for box, you enter the search string to use.

A UDDI request is then automatically built by InfoPath and sent to the UDDI server, returning a list of matching Web services to select from. The matching Web services are shown in the format Service provider::Web service name. When you select a Web service, the WSDL file location appears in the Data Source Setup Wizard, such as http://www.contoso.com/Service.asmx?WSDL.

Web Services Description Language (WSDL) 1.1

InfoPath uses WSDL to read the description of a Web service, enabling you to design a form that can load initial data from a Web service, submit edited data to a Web service, or both. The WSDL file defines what operations are provided by a Web service and what parameters are used by each operation. The WSDL file also describes how to use SOAP and the schema for the Web service.
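
To make the role of the WSDL file concrete, the following Python sketch lists the operations in a small WSDL 1.1 document together with their input and output messages. The sample WSDL, its operation names (GetSales, SubmitSales), and the target namespace are hypothetical; only the WSDL 1.1 namespace URI is taken from the standard. It is a simplified illustration, not a description of how InfoPath reads WSDL.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, heavily trimmed WSDL; real files also contain types,
# bindings, and service elements.
SAMPLE_WSDL = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://www.contoso.com/sales"
             targetNamespace="http://www.contoso.com/sales">
   <portType name="SalesPort">
      <operation name="GetSales">
         <input message="tns:GetSalesRequest"/>
         <output message="tns:GetSalesResponse"/>
      </operation>
      <operation name="SubmitSales">
         <input message="tns:SubmitSalesRequest"/>
      </operation>
   </portType>
</definitions>
"""

def list_operations(wsdl_text):
    """Map each operation name to its (input, output) message names."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        inp = op.find("wsdl:input", WSDL_NS)
        out = op.find("wsdl:output", WSDL_NS)
        operations[op.get("name")] = (
            inp.get("message") if inp is not None else None,
            out.get("message") if out is not None else None,
        )
    return operations

operations = list_operations(SAMPLE_WSDL)
print(operations)
```

In this sample, one operation could serve to retrieve initial data and the other to submit the form, which mirrors the two roles a form designer assigns to operations.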

Use of Web Service Operations when Filling Out a Form

When you design a form, the Data Source Setup Wizard reads the WSDL file and enables you to specify how the form calls two Web service operations, which are used to retrieve initial data and submit the form. When an end user creates a form based on your form template, the first operation you selected is used by the query view, which is the default initial view. When the user enters or selects some values in the query view and clicks the query button, values from the view are sent as input parameters to the first Web service operation, which returns initial data as the output result.

The data entry view is then automatically displayed, containing the resulting initial data. The user can edit the data, enter additional data, and submit the form data to the Web service as an XML document. When the user submits the form, XML data from the form is sent as a parameter to the second Web service operation you selected when designing the form. The submitted XML document conforms to the schema for this operation.

For example, in a sales report that returns monthly sales, the user can set a value such as the sale price in the query view, edit the resulting data in the data entry view and add an additional sale entry, and then submit the form.

Selecting Operations and Schema Nodes for Their Parameters

To design a form based on a Web service, you start by using the Data Source Setup Wizard to specify the location of the Web services to use, either by searching a UDDI registry to find each WSDL file location, or by directly typing the WSDL locations, such as http://www.contoso.com/Service.asmx?WSDL.

The wizard enables you to select which operation of each Web service to use. Using WSDL, InfoPath extracts the XML schema that is associated with the selected Web service operation, and uses that schema as the data source for the form you are designing. You can select which schema node is associated with a parameter of an operation, to provide the data for the parameter. In some cases, InfoPath prompts you for sample values and uses the values to automatically call the selected operation and design the data structure from the output received.

When you have selected the Web services, operations, and schema nodes for parameters in the wizard, InfoPath automatically creates an empty query view and data entry view. The Data Source task pane shows the schema subtree for each Web service operation, including the XML structure for retrieving initial data and the XML structure for submitting a form.

Simple Object Access Protocol (SOAP) 1.1

InfoPath fits well with the loosely coupled model of Web services, in which data is sent between computers as entire XML documents. This coarse-grained communication model fits well with the asynchronous nature of the Web. As a high-level authoring tool for XML documents, InfoPath supports the document/literal SOAP encoding rather than Remote Procedure Call (RPC) SOAP encoding.

InfoPath enables users to view, visualize, and validly edit SOAP packets that move through business workflows. InfoPath is an ideal client for Web services, because it can natively read an XML schema specified by WSDL and then create a UI based on the schema, enabling end users to easily view and edit XML documents that are generated or received by the corresponding Web service.

InfoPath uses SOAP packets when requesting and receiving initial data from a Web service and when submitting an XML document to a Web service. For example, to retrieve initial data, the end user could enter a sales employee's name into a text box in the initial view. InfoPath wraps the resulting XML fragment in a SOAP envelope, as an input parameter for the Web service operation, and then sends the SOAP packet to the operation. The XML fragment conforms to the XML schema for the Web service operation that retrieves the initial data.

The Web service operation responds by returning a SOAP packet that contains the initial data as an XML fragment. The XML fragment conforms to the XML schema for the operation. The returned XML fragment is transformed using XSLT and presented to the user in the data entry view. The user can then edit the data and submit the form.
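
The wrapping and unwrapping steps can be sketched with the Python standard library. The SOAP 1.1 envelope namespace is the standard one; the GetSales element, its namespace, and the employee name are hypothetical, and a real exchange would also involve HTTP headers such as SOAPAction that are not shown here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1
SALES_NS = "http://www.contoso.com/sales"              # hypothetical

def wrap_in_envelope(fragment):
    """Place an XML fragment inside a SOAP 1.1 Envelope/Body."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    body.append(fragment)
    return envelope

def unwrap_body(envelope):
    """Return the first child of the SOAP Body (the payload fragment)."""
    body = envelope.find("{%s}Body" % SOAP_NS)
    return list(body)[0]

# Build a query fragment like the one a query view might produce.
query = ET.Element("{%s}GetSales" % SALES_NS)
ET.SubElement(query, "{%s}employeeName" % SALES_NS).text = "Example Employee"

packet = ET.tostring(wrap_in_envelope(query), encoding="unicode")

# The receiving side parses the packet and extracts the payload again.
payload = unwrap_body(ET.fromstring(packet))
employee = payload.findtext("{%s}employeeName" % SALES_NS)
print(employee)
```

Keeping the payload as a complete XML fragment inside the Body, rather than as encoded procedure arguments, is what allows the fragment to be validated against the operation's schema.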

A form can submit XML data to a Web service through the SOAP protocol. The manifest file stores the definitions of how the form requests initial data from a Web service and submits an XML document to a Web service.

Hypertext Transfer Protocol (HTTP) 1.1

In addition to using the HTTP protocol to read an XML document from the Internet, InfoPath can submit an XML document to an HTTP server using the POST method. When designing a form template, you add a Submit button in the view or enable the Submit command on the File menu, and specify a URL to submit the XML document to.

The URL can point to server code such as a CGI script that can process HTTP POST requests. When an end user submits a form based on the form template, InfoPath packages the XML document as the body of an HTTP POST request that is sent to the specified URL.
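
A minimal sketch of such a request, using the Python standard library, is shown below. The URL is hypothetical, and the Content-Type value is an assumption; the article does not state which headers InfoPath sends. The request is built but not sent, so the example runs without a network connection.

```python
import urllib.request

def build_submit_request(url, xml_document):
    """Package an XML document as the body of an HTTP POST request."""
    return urllib.request.Request(
        url,
        data=xml_document.encode("utf-8"),
        # Assumed header; adjust to whatever the receiving server expects.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )

# Hypothetical submit URL and document.
request = build_submit_request(
    "http://www.contoso.com/submit", "<salesReport/>"
)

# urllib.request.urlopen(request) would send it; omitted here.
print(request.get_method(), request.full_url)
```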

Terminology

field group: A section, repeating section, optional section, or repeating table. Sections and repeating tables are controls on a form that contain other controls and that can be added dynamically by users as needed. Users can insert multiple sections or rows when filling out the form.

DOM tree: The structure of the data source of the form. In particular, the collection of fields and groups that define and store the data for an InfoPath form.

Conclusion

InfoPath is built from the ground up using XML standards to provide flexible yet structured XML editing for data gathering. For example, InfoPath form templates are created by using open standards such as XML, XML Schema, XSLT, and XPath. The XML format, including namespaces, is used for input and output. XML DSig provides security when opening or submitting a form.

XPath is used throughout InfoPath, such as for custom validation rules that reference nodes in the XML data structure. The use of XML Schema provides validation and structured editing, enabling ordinary end users to create valid XML documents belonging to a custom-defined XML schema.

XSLTs enable the content of the UI views to be organized differently than the structure of the XML data. An advantage of using XSLT for views is that any standard XSLT processor can be used to process the XSLT file on the server and provide a read-only view of the InfoPath form as an HTML document, which can be displayed in any Web browser. CSS is used in conjunction with XSLT to describe the formatting of the views.

Rich text fields are implemented using XHTML. Elements and attributes of the DOM tree are mapped through XSLT to nested field groups containing text fields and other UI controls, providing an easy user interface for visualizing and editing hierarchical XML data.

Support for XML Web services, including UDDI, WSDL, and SOAP, makes it easy to define views for authoring XML documents that conform to a Web services schema. XML documents can be submitted to a Web service using SOAP, or to an HTTP server using the HTTP POST method.

The integrated use of XML standards throughout InfoPath provides full interoperability with XML-enabled systems including applications, operating systems, and middle-tier and back-end systems. All XML documents produced by InfoPath conform to your existing custom-defined XML schemas. This support for custom-designed XML schemas enables easy integration with back-end systems, using the existing schemas for the back end without requiring you to modify them.

Additional Resources

For more information, see the following: