InfoPath

Turn User Input into XML with Custom Forms Using Office InfoPath 2003

Aaron Skonnard

This article assumes you're familiar with Microsoft Office and XML

SUMMARY

Office InfoPath 2003 is a new Microsoft Office product that lets you design your own data collection forms that, when submitted, turn the user-entered data into XML for any XML-supporting process to use. With an InfoPath solution in place, you can convert all those commonly used paper forms into Microsoft Office-based forms and end the cycle of handwriting and reentering data into your systems. Today organizations are beginning to realize both the value of the mountains of data they collect every day and how hard that data is to access, and they are striving to mine it effectively. InfoPath will aid in the design of effective data collection systems. Here the author shows you how to get started.

Contents

InfoPath Overview
Designing Form Templates
Starting from XML Schema or WSDL
Saving and Publishing Forms
Filling Out Forms
Submitting Forms
Advanced Features
Summing it Up

Organizations depend on information to drive business processes and decision-making. The quality and accuracy of this information is key, as is the ability to gather and analyze it quickly.

Information is recorded in a variety of formats, the most primitive (and probably the most common) of which is the paper form. Just think of all those expense reports, customer evaluations, sales trip reports, and time cards. Paper forms may be easy to fill out, but eventually the data collected in them must be rekeyed into the computer system (into yet another data format) so it can be used by internal business processes. Rekeying invites redundancy and human error, both of which reduce productivity.

Office documents like those created in Microsoft® Word or Excel are also common. Working with these formats is challenging as well because it's difficult to extract the information and metadata from them programmatically.

The flexibility of XML allows organizations to define their own XML schemas for handling their specific data representation needs. With well-known XML schema definitions in place, any business process can be programmed to consume an XML document and understand its meaning thanks to XML support in virtually all platforms and programming languages.

The Web Services platform builds on XML by using it for information exchange over protocols like TCP, HTTP, SMTP, and potentially many others. Combining XML with these open protocols makes it possible to build an infrastructure for sharing information between business processes in a standard way.

All that is needed to reap the benefits across the enterprise is an easy way to get previously handwritten data into XML. InfoPath, previously known as XDocs, is a new member of the Microsoft Office System of products that lets you do just that.

InfoPath provides an environment for designing forms built around XML Schema or Web Services Description Language (WSDL) definitions. In a matter of seconds, you can use InfoPath to build a new form that's capable of outputting XML documents conforming to an XML Schema Definition (XSD) or communicating with a Web Service conforming to a WSDL definition. Organizations can use XML Web Services and InfoPath together to replace their legacy information-gathering techniques.

InfoPath is chock-full of functionality, including rich client functionality and off-line capabilities that surpass those of traditional Web Forms. Best of all, it's much easier to use than traditional Web Services development environments. This article will focus on the main features of InfoPath.

InfoPath Overview

InfoPath supports two main activities: designing form templates and filling out forms. You must have InfoPath installed on your machine to perform either task. When you first run InfoPath, you'll see a window like the one shown in Figure 1. Notice that both the File menu and the task pane (on the right) provide an option for each of these main tasks.

Figure 1 InfoPath Environment

InfoPath provides an easy-to-use WYSIWYG interface for designing new forms that can be based on XSD or WSDL. After you design a new form template, InfoPath allows you to publish it to a centralized location so others can access it. After a user browses to a form template, InfoPath opens the form and allows the user to fill it out. When the user is finished, the form can be saved or submitted for processing. The InfoPath save functionality lets you work with forms without being connected to the network, effectively providing an off-line mode. Submitting the form can include posting the XML file to a Web site directory or to a Web Service endpoint using SOAP.

Designing Form Templates

When you design a new form template, you can either start from scratch by selecting New Blank Form or base it on an existing data source by selecting New from Data Source (see Figure 1). The task list in the right-hand pane of a blank form provides links to the various tasks you might want to perform while designing a new form. Specifically, you'll want to control the form layout, place controls on the form for capturing different types of information, define how those controls map to your underlying data source, optionally define different views of the form, and ultimately publish it. After you select New Blank Form, InfoPath provides a blank design surface that you can drag controls onto and arrange in a variety of ways to suit your needs.

Alternatively, you can choose from a collection of built-in layout options including various tables and sections. The designer also makes it easy to change color schemes, fonts, and other aesthetics as you're building your form and provides a nice palette of built-in controls that fit a wide range of data representation needs. As you're designing, you can select Preview Form to see what it will look like when the user fills it out.

As you design a form from scratch, InfoPath automatically builds an XML Schema definition behind the scenes to represent the information that will be captured by your form. You can explicitly define how each control maps to a corresponding XML Schema datatype by manipulating the layout and properties of each control placed on the form. For example, consider the form shown in Figure 2, which captures employee information.

Figure 2 New Employee Form

When I designed this form, I defined the Name textbox to be of type xsd:string, the Salary textbox to be of type xsd:double, and the Date of birth textbox to be of type xsd:date. The schema produced by InfoPath to represent this information is shown in Figure 3. Unfortunately, you don't have much control over the schema when using this approach. The order in which you place the controls determines the hierarchy of the schema. Your only mechanisms for influencing the schema are the various layout and control properties available through the WYSIWYG designer.

Figure 3 Employee Form XML Schema Definition

<xsd:schema
    targetNamespace="http://schemas.microsoft.com/office/infopath/2003/myXSD/2003-06-04T16:51:19"
    xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2003-06-04T16:51:19"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="myFields">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="my:name" minOccurs="0"/>
        <xsd:element ref="my:salary" minOccurs="0"/>
        <xsd:element ref="my:birthdate" minOccurs="0"/>
      </xsd:sequence>
      <xsd:anyAttribute processContents="lax"
          namespace="http://www.w3.org/XML/1998/namespace"/>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="name" type="xsd:string"/>
  <xsd:element name="salary" nillable="true" type="xsd:double"/>
  <xsd:element name="birthdate" nillable="true" type="xsd:date"/>
</xsd:schema>

If you fill out this employee form and save it to disk, InfoPath will produce an XML document that conforms to the schema, like the one shown in Figure 4. Although the start-from-scratch approach limits InfoPath's usefulness as an XML Schema designer, schema design was never intended to be its primary role. The fundamental purpose of InfoPath is to dynamically build forms from existing XML Schemas or WSDL definitions.

Figure 4 Saved Employee Form

<?mso-infoPathSolution solutionVersion="1.0.0.5" productVersion="11.0.4920"
    PIVersion="0.9.0.0" language="en-us"
    href="file:///C:\Samples\SimpleEmployee.xsn"?>
<?mso-application progid="InfoPath.Document"?>
<my:myFields
    xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2003-06-04T16:51:19"
    xml:lang="en-us">
  <my:name>Bob</my:name>
  <my:salary>1000</my:salary>
  <my:birthdate>1965-12-31</my:birthdate>
</my:myFields>
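To make the "any XML-supporting process" claim concrete, here is a minimal Python sketch of a downstream consumer reading a saved employee form. It is an illustration only, not something InfoPath generates: the helper name `read_employee` is hypothetical, the sample document is an abbreviated copy of the saved form, and the namespace URI is assumed to be the `http://` form that InfoPath writes. The sketch also assumes every field was filled in; the schema marks salary and birthdate as nillable, so production code would need to handle empty values.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace of the form fields (assumed to match the saved form exactly).
MY_NS = "http://schemas.microsoft.com/office/infopath/2003/myXSD/2003-06-04T16:51:19"

# Abbreviated copy of a saved employee form, used here as sample input.
SAVED_FORM = """<?mso-application progid="InfoPath.Document"?>
<my:myFields
    xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2003-06-04T16:51:19"
    xml:lang="en-us">
  <my:name>Bob</my:name>
  <my:salary>1000</my:salary>
  <my:birthdate>1965-12-31</my:birthdate>
</my:myFields>"""


def read_employee(xml_text):
    """Return the form fields as native Python values (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    ns = {"my": MY_NS}
    return {
        # xsd:string -> str
        "name": root.findtext("my:name", namespaces=ns),
        # xsd:double -> float
        "salary": float(root.findtext("my:salary", namespaces=ns)),
        # xsd:date -> datetime.date (plain YYYY-MM-DD, no time zone suffix)
        "birthdate": date.fromisoformat(root.findtext("my:birthdate", namespaces=ns)),
    }


employee = read_employee(SAVED_FORM)
print(employee)
```

Because the fields were typed when the form was designed, the consumer can convert each value directly instead of guessing at its format.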

Starting from XML Schema or WSDL

When you design a form from an existing data source, you're essentially defining a mapping between the existing data source and the form template. InfoPath supports the following data sources: XML documents, XSD, databases (SQL Server™ or Microsoft Access), and Web Services. When you select New from Data Source, the Data Source Setup Wizard appears and prompts you to select the type of data source you want to use.

Regardless of the type of data source you ultimately decide to use in your project, InfoPath reads the available metadata from the data source and displays it in the Data Source view. You can then drag and drop data source fields onto the form as you're designing it. This approach allows the data source to drive the form's creation in a very dynamic way.

For example, consider the schema in Figure 5, which represents author information in a publication system. If you created a new form based on this schema, InfoPath would display all of the author schema fields in the Data Source view to the right of the design surface. You could then drag and drop the schema fields onto the form and arrange them as you like (see Figure 6).

Figure 5 Author Schema

<xs:schema id="Author"
    targetNamespace="https://www.develop.com/authors"
    elementFormDefault="unqualified"
    xmlns:tns="https://www.develop.com/authors"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="SSNumber">
    <xs:restriction base="xs:string">
      <xs:pattern value="\d{3}\-\d{2}\-\d{4}" />
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="PhoneNumber">
    <xs:restriction base="xs:string">
      <xs:pattern value="\d{3}\-\d{4}" />
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="TwoLetterState">
    <xs:restriction base="xs:string">
      <xs:minLength value="2" />
      <xs:maxLength value="2" />
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="ZipCode">
    <xs:restriction base="xs:string">
      <xs:pattern value="\d{5}" />
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="NameType">
    <xs:sequence>
      <xs:element name="first" type="xs:string" />
      <xs:element name="last" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string" />
      <xs:element name="city" type="xs:string" />
      <xs:element name="state" type="tns:TwoLetterState" />
      <xs:element name="zip" type="tns:ZipCode" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="AuthorType">
    <xs:sequence>
      <xs:element name="name" type="tns:NameType" />
      <xs:element name="phone" type="tns:PhoneNumber" />
      <xs:element name="address" type="tns:AddressType" />
      <xs:element name="contract" type="xs:boolean" />
    </xs:sequence>
    <xs:attribute name="id" type="tns:SSNumber" />
  </xs:complexType>
  <xs:element name="author" type="tns:AuthorType" />
</xs:schema>

Figure 6 New Author Form

When you fill out the form and save it, InfoPath creates an XML document that conforms to the existing schema definition that you started from. In this case, it would create an XML document that looks something like the document shown in Figure 7.

Figure 7 Saved Author Form

<?mso-infoPathSolution solutionVersion="1.0.0.5" productVersion="11.0.4920"
    PIVersion="0.9.0.0" language="en-us"
    href="file:///C:\Samples\NewAuthor.xsn"?>
<?mso-application progid="InfoPath.Document"?>
<tns:author id="333-33-3333" xmlns:tns="https://www.develop.com/authors">
  <name>
    <first>Mary</first>
    <last>Smith</last>
  </name>
  <phone>555-0444</phone>
  <address>
    <street>123 Main</street>
    <city>Layton</city>
    <state>UT</state>
    <zip>84041</zip>
  </address>
  <contract>true</contract>
</tns:author>
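One detail of the saved author document is easy to miss: because the schema sets elementFormDefault="unqualified", only the root author element is namespace-qualified, while its children carry no namespace at all. The following Python sketch shows how a consumer might account for that. The sample text is an abbreviated copy of the saved form, and the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Target namespace exactly as it appears in the author schema.
AUTHOR_NS = "https://www.develop.com/authors"

SAVED_AUTHOR = """<tns:author id="333-33-3333" xmlns:tns="https://www.develop.com/authors">
  <name><first>Mary</first><last>Smith</last></name>
  <phone>555-0444</phone>
  <address>
    <street>123 Main</street><city>Layton</city>
    <state>UT</state><zip>84041</zip>
  </address>
  <contract>true</contract>
</tns:author>"""

root = ET.fromstring(SAVED_AUTHOR)

# The root is a global element, so its tag carries the namespace in braces.
root_tag = root.tag

# Local elements are unqualified, so plain paths (no prefix) find them.
full_name = f"{root.findtext('name/first')} {root.findtext('name/last')}"
state = root.findtext("address/state")

# A qualified lookup for a child finds nothing, which is a common stumbling block.
qualified_name = root.find("tns:name", {"tns": AUTHOR_NS})

# xs:boolean permits the lexical forms true/false and 1/0.
under_contract = root.findtext("contract") in ("true", "1")

# The id attribute is unqualified as well.
author_id = root.get("id")

print(root_tag, full_name, state, under_contract, author_id)
```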

Designing a form for a Web Service follows the same process, but you must also specify the location of the Web Service and provide the WSDL definition. InfoPath extracts the schema definitions from the WSDL definition in order to build the data source view for you to work with. You must also indicate how the form will interact with the Web Service: you can build a form that only submits data, one that only receives data, or one that does both.

If you choose "Receive and submit data," InfoPath will create two views for your form: one view for submitting data to the service and another view for the data returned from the service. In this case, the data source view also contains two groups of fields. The queryFields contain the data that needs to be supplied when invoking the service. The dataFields contain the data returned from the Web Service. You drag the queryFields onto your Query view and the dataFields onto your Data view. When the user fills out the Query view and presses Calculate, InfoPath invokes the Web Service with the supplied information and displays the result in the Data view.

Saving and Publishing Forms

Once you've designed a form template, you need to save and/or publish it. You should note that clicking the Save button simply saves the form template to a file. You can return to the form template at any time and continue working on it. The form template is a standalone file that contains all the information necessary for another user to fill it out. You can extract the various files that constitute a form by selecting File | Extract Form Files. This will write various files to your hard drive including the XML Schema and XSLT files used internally.

Selecting Publish displays the Publishing Wizard, which allows you to distribute the finished form to a centralized location, accessible by other users. You can publish a finished form to a number of places: a network share, a SharePoint form library, or a virtual directory on a Web server, all of which users can easily browse to using their Web browser.

Filling Out Forms

The whole purpose of designing a form is to have users fill it out. Once a form has been published, users can access the form directly from InfoPath (by selecting Fill Out a Form) or by simply browsing to the file in Windows® Explorer or Microsoft Internet Explorer. When you browse to a form, InfoPath opens it in "fill out" mode and enables the user to enter data (see Figure 8). As I mentioned earlier, users can save forms to the local hard drive and return to work on them later even if disconnected from the network, then submit them when reconnected.

Figure 8 Filling Out a Form

A saved form is just an XML document that conforms to the schema or Web Service it was designed from. For example, the XML document shown in Figure 7 is the saved form shown in Figure 8. As you can see, InfoPath injects a few processing instructions (PIs) into the XML document. These PIs allow the loader to figure out that the document should be loaded back into InfoPath, not into the default application for XML documents (usually Internet Explorer). When you double-click an XML file that contains these PIs, Windows automatically launches InfoPath and allows the user to continue filling out the saved form.
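The processing instructions are ordinary XML, so any tool can inspect them. As a rough sketch (not a description of how Windows or InfoPath actually performs the lookup), the Python below reads the document-level PIs from a trimmed copy of a saved form and reports which application and template it names. The helper `processing_instructions` is a hypothetical name, and parsing the PI content as name="value" pairs is an assumption based on how the saved forms in this article are written.

```python
import re
from xml.dom import Node, minidom

# Trimmed copy of a saved form; a raw string keeps the Windows path intact.
SAVED_FORM = r"""<?mso-infoPathSolution solutionVersion="1.0.0.5" productVersion="11.0.4920" PIVersion="0.9.0.0" language="en-us" href="file:///C:\Samples\NewAuthor.xsn"?>
<?mso-application progid="InfoPath.Document"?>
<tns:author id="333-33-3333" xmlns:tns="https://www.develop.com/authors"/>"""


def processing_instructions(xml_text):
    """Map each document-level PI target to its pseudo-attributes."""
    doc = minidom.parseString(xml_text)
    found = {}
    for node in doc.childNodes:
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            # PI content is free text; here it is written as name="value" pairs.
            found[node.target] = dict(re.findall(r'([\w-]+)="([^"]*)"', node.data))
    return found


pis = processing_instructions(SAVED_FORM)

# The progid identifies the application that should open the document.
opens_in_infopath = pis.get("mso-application", {}).get("progid") == "InfoPath.Document"

# The href points back at the form template the document was filled out with.
template = pis["mso-infoPathSolution"]["href"]

print(opens_in_infopath, template)
```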

One of the main benefits of using an InfoPath form over a traditional Web form is the rich functionality offered by the runtime environment. For example, InfoPath provides automatic spell checking while you fill out a form, very much like you would find available in Word. If InfoPath detects a spelling error while you're entering data, it highlights the word with a red squiggle and offers suggestions for possible changes.

InfoPath offers a range of validation features to help ensure the quality of your data. It provides real-time validation against the form's underlying XML Schema definition including custom simple type definitions like those used in Figure 5. If the user enters a value that doesn't conform to the control's underlying XML Schema type, it's also highlighted with a red squiggle and a helpful error message is provided in the control's tooltip. Validation against the schema also happens at submission time.
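To show what those custom simple types actually constrain, here is a small Python sketch that transcribes the facets from the author schema by hand and checks values against them. It is not a schema validator and says nothing about how InfoPath implements its checks; the names `FACETS`, `conforms`, and `validate` are hypothetical. Note that XML Schema patterns must match the entire value, which is why the sketch uses `fullmatch`.

```python
import re

# Facets copied by hand from the schema's simple types.
FACETS = {
    "SSNumber": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "PhoneNumber": re.compile(r"\d{3}-\d{4}"),
    # minLength=2 and maxLength=2: any two characters, not only letters.
    "TwoLetterState": re.compile(r".{2}", re.DOTALL),
    "ZipCode": re.compile(r"\d{5}"),
}


def conforms(type_name, value):
    """True if the whole value satisfies the named type's facets."""
    return FACETS[type_name].fullmatch(value) is not None


def validate(fields):
    """Return (field, value) pairs that violate their declared type."""
    return [
        (name, value)
        for name, (type_name, value) in fields.items()
        if not conforms(type_name, value)
    ]


good = {
    "id": ("SSNumber", "333-33-3333"),
    "phone": ("PhoneNumber", "555-0444"),
    "state": ("TwoLetterState", "UT"),
    "zip": ("ZipCode", "84041"),
}
# Same record with a phone number missing its hyphen and a four-digit zip.
bad = dict(good, phone=("PhoneNumber", "5550444"), zip=("ZipCode", "8404"))

print(validate(good))
print(validate(bad))
```

The sketch also exposes a limit of the schema itself: TwoLetterState restricts only the length, so a value such as "12" would pass. Tightening that would take a pattern facet in the schema or one of InfoPath's additional validation rules.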

InfoPath also offers other advanced features like autocomplete, find and replace, drag and drop, and complete printing support. Generally speaking, working in InfoPath feels just like you're working in any other Microsoft Office product.

Submitting Forms

The final step is to submit the filled-in form. This step depends somewhat on how the form was designed. In some cases, the designer may have only intended for the form to be saved and placed in a shared directory somewhere. However, the form will usually need to contain some kind of Submit button. You can create a Submit button by dragging a button control onto the form, double-clicking it, and setting its action to Submit in its Properties window. Doing so displays a dialog that allows you to specify what should happen when the user presses the button.

Figure 9 Sending a Form Via E-mail

The form can be submitted to a Web Service, to a virtual directory on a Web server, or via custom script code. It's also possible to submit the completed form via e-mail by selecting File | Send to Mail Recipient. When you do this, InfoPath attaches the XML document to the e-mail and adds the HTML of the view you sent to the body of the e-mail (see Figure 9).
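Submitting to a virtual directory amounts to sending the form's XML to a Web server over HTTP. The Python sketch below builds a roughly equivalent request by hand so you can see what the receiving end would need to handle. Everything specific in it is an assumption made for illustration: the endpoint URL is hypothetical, the content type is a guess at a reasonable choice, and the code does not claim to reproduce the exact request InfoPath sends. The request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical endpoint; substitute the virtual directory your server exposes.
SUBMIT_URL = "http://example.com/forms/submit"

# A small filled-in form used as the payload.
PAYLOAD = (
    '<my:myFields xmlns:my="http://schemas.microsoft.com/office/'
    'infopath/2003/myXSD/2003-06-04T16:51:19">'
    "<my:name>Bob</my:name>"
    "</my:myFields>"
)


def build_submit_request(xml_text, url=SUBMIT_URL):
    """Wrap the form XML in an HTTP POST request (assumed content type)."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


request = build_submit_request(PAYLOAD)
print(request.get_method(), request.full_url, len(request.data), "bytes")

# To actually send it, you would call urllib.request.urlopen(request)
# against a server that accepts the document.
```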

Advanced Features

InfoPath has a variety of other advanced features that I'll cover only briefly. One such feature is the rich text control that can be used to input formatted text (such as text annotated with different fonts, bold, or italic). Rich text controls are mapped to XHTML in the underlying XML document. The rich text control makes it possible to enter free-form text, like you would in Word, without sacrificing the benefits of working with XML or losing the formatting information provided by the user.

Another advanced feature enables optional and repeating sections. Optional sections can be hidden and displayed on demand, making it possible to simplify forms and decrease clutter. Repeating sections (or repeating lists/tables) let you repeat blocks of information that occur multiple times in a schema. InfoPath also enables you to bind repeating controls to external data sources for automatic population of the control.

Figure 10 Creating a Validation Rule

Finally, InfoPath provides more advanced validation features than XML Schema provides. Pressing the Data Validation button on the Properties page for any form control gives you two options for additional validation. One option is to write custom validation code using script to respond to one of three events: OnBeforeChange, OnValidate, and OnAfterChange. The other option is to create a new validation rule using a set of predefined expressions and constraints, as illustrated in Figure 10.

Summing it Up

InfoPath is a new Microsoft Office product that facilitates the process of gathering information from heterogeneous sources through dynamic XML-based forms. Overall, InfoPath makes it easy for anyone to design, publish, and fill out electronic forms based on XML and Web Services technology, which offers many advantages over traditional techniques used today. By using InfoPath, organizations will find it much easier to share their previously inaccessible data with any application that supports XML.

For background information see:
The XML Files: A Quick Guide to XML Schema
The XML Files: A Quick Guide to XML Schema-Part 2
Understanding XML Namespaces
Understanding XML Schema

Aaron Skonnard is an instructor and researcher at DevelopMentor, where he develops the XML and Web Service-related curriculum. Aaron coauthored Essential XML Quick Reference (Addison-Wesley, 2001) and Essential XML (Addison-Wesley, 2000).