Microsoft Office System and XML: XML in Action

Office 2003

Microsoft Corporation

October 2003

Applies to:
    Microsoft® Office System

Summary: Over the past few years, much has been written about XML and the positive effect that widespread adoption of XML will have on business and business processes. While many companies and industries have seen great benefit from the use of XML, primarily for exchanging data between back-end servers, the introduction of XML-enabled desktop applications with support for customer-defined schemas carries the potential for even greater benefit. The Microsoft Office System enhances XML support in the 2003 versions of the Office applications and introduces new applications, which put XML in the hands of the information worker and enable businesses to reap the full benefit and promise of this revolutionary technology. (15 printed pages)


XML in Action: Scenarios
XML in the Microsoft Office System: Product Overview
For More Information


Today, XML is a public and widely accepted standard that enables exchange of data between disparate systems. The use of XML and the advent of the Web services architecture have revolutionized the way many companies—or, in some cases, entire industries—transact business on the Web. However, XML represents more than a simple data exchange protocol; its founders originally envisioned XML as a way to capture more of the meaning locked within business documents by defining the structure and context of the information these documents contain.

This brochure examines how the introduction of a suite of XML-savvy desktop applications brings us closer to fulfilling that vision by facilitating creation of XML documents and allowing users to retrieve and reuse that information in innovative new ways. Several scenarios demonstrate how companies can enhance their business processes and improve employee productivity by delivering XML capabilities to the desktops of information workers. Finally, brief descriptions of the new features of Microsoft® Office Professional Edition 2003 show how Microsoft is enabling these benefits and making the Microsoft Office System the premier platform for solutions that create and consume XML.

Why XML?

Today's businesses thrive on information. This information is generated by many channels and exists in many forms: as raw data collected from operational systems, as content and documents that are published and shared, and in countless e-mail messages exchanged and stored locally by users throughout the company.

Although we have well-established methods for storing and managing some kinds of data (for example, numerical data in databases), a significant portion of the information created in the business environment is not captured in any meaningful way. Workers everywhere generate reports, e-mail messages, and spreadsheets that contain vital, valuable information. But if they need to reuse this information, these same workers may also spend significant time searching for the appropriate files and subsequently spend effort to re-key, cut and paste, or otherwise import the relevant information into another document. The way these documents are created and handled limits the extent or ease with which the information can be used outside the original document.

While data capture and validation are well-established practices in traditional data management, the technology to similarly gather and manage the information contained in text-based reports and other common business documents has not been available. XML was originally created to solve this problem. XML enables businesses to capture all manner of business information in a way that maximizes its value. By facilitating reuse, indexing, search, storage, aggregation, and other practices more often associated with management of relational databases, XML brings the power of traditional data management to bear on documents.

As described in the following sections, the XML standard is ideally suited for defining all types of business information, and documents in particular. In addition, because it was designed specifically for delivering this information over the Web, XML is an ideal technology for companies that rely on enterprise networks, intranets, and the Internet for communication within the company or with partners, vendors, or suppliers. As will be shown, the adoption of XML standards across the enterprise and on the desktop will lead to significant benefits both for users and for the business itself.

What is XML?

XML is a markup language that is used to identify structure within a document. The XML standard is published and maintained by W3C, the consortium that maintains many of the standards for the World Wide Web.

Like other markup languages, XML uses tags to define specific elements within a document. XML tags define the document's structural elements and the meaning of those elements. Unlike HTML tags, which specify how a document looks or is formatted, XML can be used to define the document structure and content—not just the look and feel. By doing so, XML separates a document's content from its presentation, thereby enabling us to manage this content in a meaningful way.
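As a minimal sketch of what tagging content by meaning makes possible, the following Python fragment reads a small XML document using only the standard library. The tag names, the attribute, and the company name are hypothetical and were invented for this illustration; they are not taken from any Office schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical report fragment: the tag names are invented for illustration.
DOC = """
<report>
  <company>Contoso</company>
  <period>2003-Q3</period>
  <revenue currency="USD">1250000</revenue>
</report>
"""

root = ET.fromstring(DOC)

# Because the tags name the meaning of each value, a program can ask for
# "the revenue" directly instead of guessing from position or formatting.
company = root.findtext("company")
revenue = int(root.findtext("revenue"))
currency = root.find("revenue").get("currency")

print(f"{company}: {revenue} {currency}")
```

Nothing in the document says how the values should look on screen; that decision is left to whatever application consumes them.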

Although XML is a published standard, the XML specification does not specify the tags themselves. Instead, it provides a standard way to define tags and relationships and to add markup to documents. Because there is no predefined tag set, XML presents an extremely flexible meta-language, which can be used to model virtually any type of document.

This flexibility also results in a highly scalable model, which can be applied to very simple, text-based documents or to very complex hierarchical information. This scalable model enables the creation of "semi-structured" documents—documents that have regions of meaning, in the same way that columns in a database have meaning. As a result, its applications are virtually unlimited in terms of the types of documents or data structures it can describe.

The tags that can be used for a particular document type or information type are contained in XML schemas, which define the set of tags and the rules for applying them. Schemas define the structure and type of data that each data element in a document can contain and can be created by a user, a company, or at the industry level. Thus, XML schemas can be created to define and qualify content for virtually any application.

Another key distinction of XML is that it is text-based. Because XML files consist completely of text, they can be read by humans, thereby facilitating the creation of cross-platform tools and the exchange of files between applications. In addition, there are few limits on the tools, platforms, or even devices that create or consume XML documents. This presents broad opportunity for business, in that it allows exchange of data between applications, systems, and companies regardless of platform.

Text-based encoding also allows for remarkably flexible delivery of XML-based information, which can be displayed in a browser or surfaced in a full-featured application for editing. One technique that has proven useful employs style sheets written in XSLT (Extensible Stylesheet Language Transformations) to define how information is displayed in a particular instance. In addition to defining the formatting for a particular instance of a document, XSLT style sheets can also translate XML documents into the format required by the application that is consuming the information. Typically, an XSLT style sheet translates the original XML into XML that conforms to another schema (as required by an application) or into HTML (for display in a Web browser).
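The effect of applying different style sheets to the same content can be sketched as follows. Python's standard library does not include an XSLT processor, so this example uses two ordinary functions as a stand-in for two style sheets; it illustrates the idea of one source and several presentations, not XSLT itself. The function names and the sample document are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; the same text feeds both renderings below.
DOC = "<report><company>Contoso</company><revenue>1250000</revenue></report>"

def to_html_table(xml_text: str) -> str:
    """Render each child element of the root as one HTML table row."""
    root = ET.fromstring(xml_text)
    rows = [
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text or '')}</td></tr>"
        for child in root
    ]
    return "<table>" + "".join(rows) + "</table>"

def to_plain_text(xml_text: str) -> str:
    """Render the same content as 'tag: value' lines."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in root)

print(to_html_table(DOC))
print(to_plain_text(DOC))
```

In a real deployment, each function would more likely be an XSLT style sheet run by an XSLT processor, which keeps the presentation rules outside the application code.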

Because an XML document is structured, platform-independent, and text-based, XML documents can be opened and operated on by a range of editing programs (such as Microsoft Office System programs) or integrated into automated business processes.

The Benefits of XML on the Desktop

Today, XML is a public and widely accepted standard that enables exchange of data between disparate systems. For many companies, XML facilitates enterprise transactions and business-to-business data exchange, where it solves issues of cross-platform compatibility. Many companies are also adopting the Web services architecture as a way to expose data in back-end systems and to leverage the existing infrastructure for XML-based solutions.

But even though many businesses rely on XML for data exchange and transaction processing, and even though the necessary servers and architecture are in place on the Internet and at the enterprise level, XML has not been fully exploited on the desktop. The reason is that, historically, it has been extremely difficult to separate content from presentation in desktop applications, and doing so is the key to implementing XML in end-user scenarios.

The Microsoft Office System, which offers enhanced XML support in many of the Microsoft Office Professional 2003 productivity applications, enables rich solutions that bring the power of XML to the desktop. The Microsoft Office System opens the way for the next generation of Office solutions, in which XML plays a critical role in empowering users and enabling companies to capture the full value of their collective knowledge.

Intelligent Applications

Important  The information set out in this section of this topic is presented exclusively for the benefit and use of individuals and organizations outside the United States and its territories or whose products were distributed by Microsoft before January 2010, when Microsoft removed an implementation of particular functionality related to custom XML from Word. This information may not be read or used by individuals or organizations in the United States or its territories whose products were licensed by Microsoft after January 10, 2010; those products will not behave the same as products licensed before that date or licenses for use outside the United States.

XML offers exceptional potential for automating virtually any task that involves working with documents. Creating documents such as reports, spreadsheets, and forms with an attendant XML schema—even if that schema is hidden to the users—enables developers to build solutions that recognize the structure and meaning of the content within those documents and respond intelligently to the user. Application intelligence can also be used to validate information or data as it is input, avoiding errors and aiding in data cleansing and normalization.

The ability for companies to define their own schemas allows them to identify the unique regions of meaning within their documents and to create solutions that correlate these structures to their own business processes. Because content is separate from the presentation, these solutions can be tailored to display the same information in many different ways, as appropriate for a particular task, user, or process.

Moreover, the ability to identify sections of a document structurally—or to recognize specific content within a section—allows developers to create applications that respond intelligently to user input, offering context-sensitive actions and guidance, suggesting content, or providing supporting data or links to related information.

Because the client software understands the content of the document, intelligent applications present endless possibilities for helping users interact with documents. The advent of such solutions will revolutionize the way users create and work with documents. Intelligent applications will guide and facilitate the creation of documents, reducing the time spent on traditionally manual tasks.

Connecting Users to Data

XML is widely regarded as a standard for data exchange, and for exposing the information in databases or back-end systems. Web services use open, Internet standards to allow communication between business systems and data sources, or between systems that are written in different languages on different platforms. By providing XML-enabled applications on the desktop, companies take advantage of this infrastructure to empower employees, by enabling them to connect directly to enterprise systems and data sources.

Smart client applications are a class of powerful new desktop counterparts to distributed applications that rely on Web services. Companies can create smart clients based on Office technologies that can take full advantage of Web services by accessing the information directly and dynamically, and then surfacing the right information, in the right format, based on where it's needed: in the spreadsheet, word processor, or line-of-business application that will be used to analyze, format, or publish the information. The result is a flexible, richer, more integrated desktop environment.

Capture and Reuse of Information

XML-based documents enable companies to capture more of the intellectual property that is created on an ongoing basis, but that typically remains locked in documents and files. Once captured, this information becomes a valuable asset, which can be used and reused in many ways. When companies can define their own XML schema, they gain the ability to decide exactly what data to capture and how this data is structured.

Customer-defined schemas are analogous to the columns and tables in a database; thus, documents of all kinds become a source of information as rich as any other operational data store. With documents functioning at the storage level, companies have the capability to aggregate, parse, search, manage, and reuse documents and domain knowledge in the same way they do their business data.

For users, the ability to search for specific information and to aggregate information from numerous sources eliminates many of the time-consuming, error-prone tasks associated with document creation and update; for example, opening and closing files to find information; cutting and pasting information between documents; and searching for labels to combine data in like fields.

XML in Action: Scenarios

Regulatory Compliance and Preparation of Financial Data

Industry-specific schemas are being created and adopted by numerous industries, including financial services, insurance, health care, and others. Standardization across an entire industry streamlines regulatory compliance and reporting as well as exchange of data among partners and companies within the industry. The following scenario illustrates how adoption of a standard schema for corporate financial data could enable institutions, such as the FDIC and the SEC, to automate processing and storage of required filings. In addition, greater use of XML could allow financial services companies to automate creation of the reports and simplify and speed up dissemination of these reports to their customers through a variety of channels.


The financial services industry is characterized by stringent reporting requirements. Government regulators require businesses to submit numerous financial reports in compliance with government regulations. In the private sector, equity analysts depend on these filings as well as on extensive research to prepare their own reports, recommendations, and earnings estimates on industries and specific companies. This information is then provided to institutional investors.

Heightened scrutiny on financial markets has underscored the need for timely, accurate reporting and led to increased pressure for impartiality, transparency, and accountability in the equity research and fund management industry.


Research reports and SEC filings take a myriad of forms, each with its own standard format. In the case of SEC filings, these reports must be submitted on a tight schedule in compliance with strict requirements. In the private sector, reports and research are distributed through multiple channels, including fax, phone, print, e-mail, instant messenger, and Web portals. The delivery method may be dictated by customer preference or by content type. In addition, content is frequently translated for worldwide distribution and may be syndicated for delivery by third parties. The sheer number of reports circulated on a daily basis is driving both the government and the industry to pursue automation of the submittal and publishing processes.

A Desktop XML Solution

With hundreds of private and public companies and numerous government agencies involved, the first step toward successful automation is to establish standards for representing the data and other information common to the financial reports. Companies that adopt such a standard could use this schema to identify specific data elements within their financial records, as well as to define structures for the various report types. They could subsequently create custom applications or automated processes that extract and format the appropriate data in accordance with the reporting requirements of various governmental agencies.

For the government (and the other organizations that aggregate large numbers of SEC filings), use of a standard XML schema would enable complete automation of the processes that accept, validate, normalize, and store these filings.

The inherent portability of XML documents allows these agencies to easily automate submission of XML-based reports and filings over the Web or secure networks. Similarly, XML data can be easily republished and disseminated to commercial clients via Web services that retrieve and format the data to target virtually any device, format, or language. The Web services and orchestration servers that control these processes can accept XML documents regardless of the software tools used or the platform on which the report was created, so adoption of the standard does not require "re-tooling" of the entire industry.

The use of a standard schema enables validation of the financial data, whether by the service receiving the data or on the analyst's desktop, where the application uses the XML to provide context-sensitive guidance and information needed to successfully complete the report.

The use of XML-capable desktop applications, such as Microsoft Office Excel 2003, allows the analysts who prepare these reports and work with the data on a daily basis to embrace the industry's XML standard while continuing to work with familiar tools. With a few simple software extensions, these analysts can export data directly from their Excel-based models and analysis packages in a format that is ready to submit.

XML-savvy desktop applications also allow analysts to retrieve financial data directly into the application, thereby avoiding manual cut-and-paste from other electronic sources or typing from printed reports. Retrieval is not limited to data that conforms to a single schema, but can be performed on any customer-defined or industry standard XML format.

Easy retrieval also allows users to more easily combine data from multiple sources as they retrieve it into Excel 2003. For example, a financial executive who is creating a consolidated report across multiple divisions in a company can more easily match the revenues from each division because the actual revenue numbers are tagged by XML as "<revenue>."
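The consolidation described above can be sketched in a few lines of Python. The division documents here are hypothetical and deliberately laid out differently; the only assumption is that each one tags its revenue figure as <revenue>.

```python
import xml.etree.ElementTree as ET

# Hypothetical division reports: layouts differ, but revenue is tagged the same way.
DIVISION_REPORTS = [
    "<division name='East'><summary><revenue>400</revenue></summary></division>",
    "<division name='West'><revenue>250</revenue><costs>100</costs></division>",
]

def total_revenue(reports):
    """Sum every <revenue> element found in the given XML documents."""
    total = 0
    for xml_text in reports:
        root = ET.fromstring(xml_text)
        # iter() finds <revenue> at any depth, regardless of the surrounding layout.
        for node in root.iter("revenue"):
            total += int(node.text)
    return total

print(total_revenue(DIVISION_REPORTS))
```

Because the match is made on the tag rather than on a cell position or a label, the divisions do not need to agree on anything except the name of the element.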

In one such scenario, which is possible today, financial equity research companies expose data from public and private companies to their analysts through a solution that uses a standard XML schema for financial reporting and an Excel-based smart client solution. Cells in the Excel spreadsheet are mapped to various elements of the XML document. The spreadsheet retrieves a list of companies into the Excel task pane, and automatically enters individual or combined results into the spreadsheet as the analyst selects the companies.
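The idea of mapping cells to elements can be sketched outside Excel as a simple lookup table. This is not the Excel 2003 XML mapping feature or its object model; the cell addresses, element paths, and sample filing are all hypothetical, and a dictionary stands in for the worksheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from spreadsheet cell addresses to element paths.
CELL_MAP = {
    "B2": "company",
    "B3": "financials/revenue",
    "B4": "financials/netIncome",
}

# Hypothetical filing; element names are invented for illustration.
DOC = """
<filing>
  <company>Contoso</company>
  <financials>
    <revenue>1250000</revenue>
    <netIncome>90000</netIncome>
  </financials>
</filing>
"""

def fill_cells(xml_text, cell_map):
    """Return {cell address: element text} for every mapped cell."""
    root = ET.fromstring(xml_text)
    return {cell: root.findtext(path) for cell, path in cell_map.items()}

cells = fill_cells(DOC, CELL_MAP)
print(cells)
```

Refreshing the data then amounts to calling fill_cells again with a newer document; the mapping itself does not change.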

After analysis, a task pane in Microsoft Office Word 2003 exposes the XML data for preparation of a dynamic, customer-ready report. Because the solution relies on XML, both the Excel spreadsheet and the Word 2003 document are insulated from changes in both the spreadsheet and the format of the incoming data. And when data changes, the analysts need only refresh either file to immediately update the information in the appropriate places within the report.


  • Industry-standard schemas provide consistency for regulatory reporting and enable data exchange between partners and third parties.
  • XML-formatted documents enable agencies to automate submission and processing of reports and other filings.
  • XML-savvy desktop tools automate the creation of standardized reports.
  • With all data tagged as XML, analysts can easily aggregate information from multiple sources for analysis and modeling.

Field Service Execution and Management

This scenario describes how a mobile solution using XML-enabled forms can help field service technicians capture data efficiently and accurately, automate generation of data-based reports, and allow sophisticated analysis, aggregation, and storage of the information gathered in the field.


Manufacturers of large industrial equipment (powerful electric generators, excavation machines, farm equipment, construction loaders, tractors, etc.) routinely dispatch field service teams to conduct on-site repairs. These repairs can take from a week to many months, depending on the maintenance required.


During each repair, manufacturers collect large volumes of data so that they can help companies gauge the success of the service call. Today, engineers and repair teams typically collect this data on paper forms or in expensive custom-developed software applications. Because the engineers don't have persistent Internet connections, the data they collect must be transferred to a central database at the end of the service call.

The data is used in many ways, including generating formal, post-service reports for the customer. The creation of these reports often takes engineers as long as the service call itself, especially if data was collected on paper forms. The customer is not billed before delivery of the post-service report. In addition to using the data in post-service reports, the equipment manufacturers use data collected during the call in their own back-end systems, performing predictive analytics to spot trends that might signal a need for product modifications. Finally, the collected data is archived as part of the customer's history.

The use of paper collection forms creates inefficient, error-prone manual processes and makes it difficult to aggregate repair histories for a specific customer or type of equipment. In addition, the long latency between when the data is recorded in the field and when it is captured in the central database prevents the company from assessing status of on-going repairs or responding quickly to trends in equipment failures or customer requests.

A Desktop XML Solution

An XML-based solution would revolutionize the way these companies implement their processes, allowing them to accurately capture the wealth of information gathered by on-site technicians and to efficiently manage this information.

In the field, a notebook computer equipped with Microsoft Office InfoPath™ 2003, a forms-based, XML-enabled data-entry application, replaces the paper forms used by the engineers. The computer is also loaded with detailed repair and operational manuals, which are created and maintained in Word 2003 with an underlying XML schema.

InfoPath 2003 forms allow engineers to capture detailed data accurately, such as equipment performance statistics, on-site operational usage information, and repair needs. An XML schema developed by the company specifically to characterize this data defines the electronic form. The schema enables real-time validation of the data; when entries fall outside expected ranges they are flagged, and the engineer is prompted to explain the variance.
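The range check described above can be sketched as follows. This is a hand-written rule table, not InfoPath's validation engine or an XML Schema validator; the field names, the ranges, and the sample form are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names and expected ranges (low, high), both inclusive.
EXPECTED_RANGES = {
    "oilPressurePsi": (30.0, 80.0),
    "operatingHours": (0.0, 50000.0),
}

# Hypothetical form data as an engineer might have entered it.
FORM = """
<serviceForm>
  <oilPressurePsi>95</oilPressurePsi>
  <operatingHours>1200</operatingHours>
</serviceForm>
"""

def out_of_range_fields(xml_text, ranges):
    """Return (field, reason) pairs for entries that need an explanation."""
    root = ET.fromstring(xml_text)
    flagged = []
    for field, (low, high) in ranges.items():
        text = root.findtext(field)
        if text is None:
            flagged.append((field, "missing"))
            continue
        value = float(text)
        if not low <= value <= high:
            flagged.append((field, f"{value} outside {low}-{high}"))
    return flagged

for field, reason in out_of_range_fields(FORM, EXPECTED_RANGES):
    print(f"Please explain {field}: {reason}")
```

In a schema-driven form, the same limits would typically be declared in the schema itself rather than in application code, so that every tool that reads the schema enforces them consistently.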

When imported into Word, the forms data that has been captured by the engineer supports intelligent navigation through the repair manuals; for example, the equipment configuration described on the form could enable Word to automatically retrieve task-specific definitions, updated information, best practices, and even videos of repair techniques in the task pane.

When the notebook computer is synchronized with a desktop computer at the home office, the XML document containing the field service data is submitted to a Web service. The Web service adds the data to the master database and initiates several related processes through Microsoft BizTalk® Server, including billing, creation of customer reports, and analysis of repairs. The XML tagging in the raw data allows an intelligent application to automatically extract the relevant notes and data to create a near-final customer report and invoice. The data can also be extracted later and used in other ways; for example, to create aggregated analyses of repair trends for a specific machine or to update calculations of mean time to failure for a particular product line.

The company has the option to store the XML-tagged report as a document in its original form or to segment the informational elements into a relational database, which respects the XML tags and allows great scalability. In either case, the data could still be searched, sorted or aggregated according to the XML tagging, which was applied transparently as the engineer captured the data in the field.
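The second option, segmenting the tagged elements into a relational database, might look like the following sketch. It uses an in-memory SQLite database from the Python standard library purely for illustration; the table layout, element names, and sample report are hypothetical and do not describe any particular back-end system.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical field service report; names are invented for illustration.
REPORT = """
<serviceReport customer="Fabrikam">
  <repair machine="GEN-7"><part>bearing</part><hours>12</hours></repair>
  <repair machine="GEN-7"><part>seal</part><hours>3</hours></repair>
  <repair machine="EXC-2"><part>track</part><hours>20</hours></repair>
</serviceReport>
"""

def load_repairs(conn, xml_text):
    """Store one row per <repair> element; columns mirror the XML tags."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS repairs "
        "(customer TEXT, machine TEXT, part TEXT, hours REAL)"
    )
    rows = [
        (root.get("customer"), r.get("machine"),
         r.findtext("part"), float(r.findtext("hours")))
        for r in root.iter("repair")
    ]
    conn.executemany("INSERT INTO repairs VALUES (?, ?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load_repairs(conn, REPORT)

# Aggregate repair hours per machine, a query the XML tagging makes possible.
hours_by_machine = dict(
    conn.execute("SELECT machine, SUM(hours) FROM repairs GROUP BY machine")
)
print(hours_by_machine)
```

Keeping the column names aligned with the tag names is one way to preserve the meaning of the original tagging after the document has been broken into rows.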


  • XML-enabled forms allow engineers to capture data in the field accurately and efficiently.
  • Intelligent applications guide users and provide meaningful information based on context and content.
  • Pre-tagging of data gathered in the field enables Web service automation of data collection and analysis.
  • Smart document solutions automate creation of reports and other processes based on data gathered in the field.

Marketing and Brand Reports

This scenario describes how a market research firm can use XML to automate the creation of dynamic, data-driven reports for their clients—delivering more meaningful reports, and capturing more information, such as specific references to products and companies, for future reference and reuse.


The success of a brand, which represents the sum total of the customer's experience with a product, service, or company, can be directly correlated to customer loyalty and profits. The process of building, managing, and tracking brands for companies and specific products is critical to most businesses.

Anticipating shifts in a brand's relevancy, channel consistency, positioning, and alignment with customers' desires is critical to maintaining competitive advantage. Brand managers frequently rely on consultants to deliver market intelligence and to create dynamic "brand scorecards," which effectively track and assess a particular brand's strengths and weaknesses. This intelligence is typically delivered in periodic reports.


Thoroughly monitoring all the market intelligence necessary to assess the effectiveness of a brand requires companies to assimilate data from multiple sources, such as customer phone surveys, call center complaints, satisfaction data, Web stats, newsgroup postings, and others. This information is received in multiple formats, including XML, print documents, HTML, and raw data. The variations in format complicate the process of aggregating the information and creating a brand scorecard—a process which can take more than a month using current methods.

Because of the time required to create and distribute brand scorecard reports, the data in these reports can be out of date before they are even published. Moreover, the static nature of these reports (typically presented as Word documents or Microsoft Office PowerPoint® 2003 presentations) does not allow brand managers to monitor perception in real time or to respond to shifts in perception in a timely fashion.

In addition, for companies with numerous brands, the traditional, static reports do not facilitate comparison or 'roll up' of data across different brands or products. This makes it extremely difficult to compare data among brands (or with competitors) and can hinder the product development cycles that may rely on historical trends.

A Desktop XML Solution

With incoming brand data and market research tagged as XML, a company can improve the process of creating brand scorecards and make more effective use of this data in other processes, such as new product development.

To achieve this, the company must first create a custom XML schema that represents common data types as well as internal processes for market and brand research. Second, they must equip their brand managers with XML-enabled tools on their desktops for reporting on brand information. The combination of XML-formatted data with a rich set of desktop tools provides flexible, powerful options for creating reports.

The Microsoft Office FrontPage® 2003 Web site creation and management tool enables consultants to quickly create customized Web pages or portals for specific clients or brands. FrontPage 2003 allows users to connect to and work with any XML data source and to create XSLT style sheets using graphical tools that don't require programming skills. By creating Web pages that retrieve the XML data directly from the source, the consultants can deliver real-time information; clients don't have to wait for a monthly or quarterly report.

Microsoft Office Word 2003 supports customer-defined XML schemas, enabling users to create brand scorecards in a familiar environment. The addition of XML support also enables users to pull together data from many sources, including real-time data from Web services and archived data stored in previous reports or a central database.

A custom application hosted in Word could list these data sources in the task pane, allowing consultants to enter parameters, such as dates, products, or geographic regions for the information to be retrieved. The application could retrieve the data, displaying the resulting data set in the task pane and allowing the user to select all or part for inclusion in the Word document. In addition, as users create reports, they can tag keywords and important content—or use the smart tag capability of Word to automatically identify items such as product names and competitors that they wish to monitor.

Because the resulting document or report retains the XML structure, the data it contains can be sorted, filtered, indexed and reused. People can subsequently search against the brand report to find comparable reports that have similar findings or similar studies.


  • Customer-defined schemas allow users to create data structures and definitions appropriate to their unique processes.
  • XML-savvy tools allow users to quickly create dynamic reports in a variety of formats.
  • Smart document solutions guide users in the creation of sophisticated reports, suggesting content and offering links to related information.
  • Tagging data with XML facilitates storage, aggregation, analysis, and reuse.

Insurance Claims Processing


When an insured party informs his insurance company of a loss, the company initiates the claims process by collecting a variety of information. The process then moves offline where one or more claims handlers are assigned to manage the rest of the process. Some property and casualty claims are complex, especially when they correspond to an event and involve more than one covered item (for example, a tornado hitting an insured home and car), and may even involve some personal casualty. Other claims, like personal lines losses (theft, property damage, etc.) are less complex, but still require accurate data handling.

The data collected is already highly standardized; the ACORD (Association for Cooperative Operations Research and Development), a nonprofit insurance association, develops and publishes more than 450 standard insurance forms, which are used by more than 1000 companies and meet all regulatory requirements for the U.S. property and casualty market.

The in-field assessments of the reported damage are far less controlled. Often, the high volume of claims forces insurance companies to assign claims investigations to field adjustors by availability and physical proximity to the loss, instead of by appropriate skill set. An inexperienced adjustor can expose the insurer to the potential risk of paying fraudulent or out-of-policy claims.


Although the data collected follows a standard form, no validation is performed at time of initial customer contact. The notification process can rely on paper forms or custom software applications (depending on the size of the insurer), but, regardless of format, inaccuracies introduced during this step affect the entire process—and can delay payment.

The process is further complicated by the numerous handoffs of information between different parties that a complex claim typically involves. Poor handoffs can result in lost or inaccurate data.

A Desktop XML Solution

It is very important that an insurance company gather accurate data in order to make informed judgments about claims and to analyze collective data for trends and risks. Automating the notification process allows insurers to ensure accuracy by eliminating transcription errors and other mistakes.

Because the data is already well defined by the ACORD standards, the definition of a set of industry-specific XML schemas is relatively easy. Applying the appropriate schemas to electronic forms allows insurers to tag data with XML at the time it is captured—without requiring users to understand or even be aware of the encoding schemes. Once information is tagged, the insurers can automate processes, perform validation, and minimize opportunity for inaccuracy or error at each step in a complex process.

As in previous scenarios, an InfoPath form would allow the insurer to validate data at the time of entry and flag or highlight anomalies or data that falls outside a preset range. The XML documents themselves are highly conducive to workflow management and process orchestration tools, such as BizTalk Server, which can automate routing of claims information between insurers, banks, underwriters, and other entities, regardless of the systems or platforms used.
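The kind of entry-time check described above can be sketched in a few lines of Python. This is not how InfoPath implements validation; it only illustrates flagging a value that falls outside a preset range once the data is tagged. The claim element names and the AMOUNT_RANGE limits are hypothetical and are not taken from any ACORD schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical claim record. Element names are assumptions for illustration
# and do not come from an ACORD schema.
CLAIM_XML = """
<claim id="C-1001">
  <lossType>theft</lossType>
  <claimedAmount>125000</claimedAmount>
  <lossDate>2003-09-14</lossDate>
</claim>
"""

# Assumed preset ranges (low, high) per loss type; real limits would come
# from the insurer's own business rules.
AMOUNT_RANGE = {"theft": (0, 50000), "property damage": (0, 250000)}


def check_claim(xml_text):
    """Return a list of problems found in the claim; empty means none found."""
    claim = ET.fromstring(xml_text)
    problems = []
    loss_type = (claim.findtext("lossType") or "").strip()
    amount_text = (claim.findtext("claimedAmount") or "").strip()
    if loss_type not in AMOUNT_RANGE:
        problems.append(f"unknown loss type: {loss_type!r}")
        return problems
    try:
        amount = float(amount_text)
    except ValueError:
        problems.append(f"claimed amount is not a number: {amount_text!r}")
        return problems
    low, high = AMOUNT_RANGE[loss_type]
    if not low <= amount <= high:
        problems.append(
            f"claimed amount {amount:.0f} is outside {low}-{high} for {loss_type}"
        )
    return problems


print(check_claim(CLAIM_XML))
```

A form could run a check like this as each field is completed and highlight the offending field instead of rejecting the whole record.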

XML also facilitates storage of claims information. Because it is inherently scalable, XML is suited to any size application; small offices or individual agents can export data to a Microsoft Office Access 2003 database on the desktop or to a larger Microsoft SQL Server database, while industry leaders can channel the same data into the massive relational structures they rely on for actuarial data, analysis, and data mining.
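To make the storage point concrete, here is a minimal Python sketch that loads tagged claim records into a relational table. It uses an in-memory SQLite database purely as a stand-in for a desktop or server database such as Access or SQL Server, and the claims document, table name, and column names are assumptions invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical batch of tagged claims; names and values are illustrative only.
CLAIMS_XML = """
<claims>
  <claim id="C-1001"><lossType>theft</lossType><claimedAmount>1200</claimedAmount></claim>
  <claim id="C-1002"><lossType>property damage</lossType><claimedAmount>8400</claimedAmount></claim>
</claims>
"""


def load_claims(conn, xml_text):
    """Insert every <claim> element into a claims table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims "
        "(id TEXT PRIMARY KEY, loss_type TEXT, amount REAL)"
    )
    rows = [
        (c.get("id"), c.findtext("lossType"), float(c.findtext("claimedAmount")))
        for c in ET.fromstring(xml_text).iter("claim")
    ]
    conn.executemany("INSERT INTO claims VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# SQLite in memory stands in for the desktop or server database.
conn = sqlite3.connect(":memory:")
count = load_claims(conn, CLAIMS_XML)
total = conn.execute("SELECT SUM(amount) FROM claims").fetchone()[0]
print(count, total)
```

The loading code depends only on the tags, so the same records could be directed at a larger database by changing the connection rather than the data.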


  • XML-enabled forms allow pre-tagging of information during data entry.
  • XML documents facilitate workflow solutions and integration with orchestration/automation of business processes.
  • Tagging data with XML facilitates storage of data on small or large scale.

Flow of Information between Government Agencies

This scenario describes how the widespread adoption of XML could streamline the flow of information between government agencies, improving the efficiency of government and reducing bureaucratic "red tape."


Nowhere are administrative requirements as complex and onerous as in government. Thousands of city, state, and federal agencies generate millions of records, which are stored in disparate databases as well as in paper files distributed all over the world. The efficiency of this bureaucracy frequently depends on the ability of these agencies to share this information—to transfer records, look up data, or confirm reports. The inability to do so leads to many of the redundancies and delays that have come to be known as bureaucratic "red tape."


The amount of information that must be collected and maintained by the U.S. Government is staggering. While nearly all government agencies now have, or are in the process of moving toward, some form of electronic record-keeping, the myriad platforms, formats, databases, and systems used by the various agencies still pose a daunting challenge to the smooth flow of information. In addition, differences in how various agencies and organizations collect information often lead to redundant requirements for reporting.

The following examples illustrate two typical scenarios in which information collected from multiple points by a single agency is consumed by numerous downstream agencies.

  • Visas for foreign nationals wishing to enter the United States are typically issued at any of the hundreds of U.S. embassies abroad. Domestic agencies that accept applications from foreign nationals (for example, state driver licensing, university admission offices, or social services agencies) are responsible for verifying the type and status of the individual's visa; yet, no central, worldwide database for this information exists. The domestic agencies typically rely on the individuals themselves to produce the required documentation as proof of eligibility.
  • Immunization records present another case where information generated by a government agency (a public health office offering free immunizations) might be consumed by a completely separate organization (for example, a public school system), yet no standardized process exists for requesting or exchanging this information. Immunization requirements vary by state, and the processes for locally verifying and maintaining records vary by district or even by individual school.

The U.S. Government's Federal Enterprise Architecture (FEA) Program seeks technology solutions that simplify and unify government processes, in an effort to promote interactions across governmental organizations and perform business activities while continuously improving internal efficiency and effectiveness. A major component of the FEA is the adoption of XML data standards and the move toward a Web services architecture. Doing so will facilitate storage and consumption of data and streamline transactions across the broad range of applications, platforms, and systems in use today.

The guiding principles for the FEA include:

  • Standards: Establish federal interoperability standards.
  • Investments: Coordinate technology investments with the federal business and architecture.
  • Data Collection: Minimize the data collection burden.
  • Security: Secure federal information against unauthorized access.
  • Functionality: Take advantage of standardization based on common functions and customers.

A Desktop XML Solution

The adoption of XML and standardized XML schemas across the federal government will help fulfill these principles. As part of the new "E-Gov" initiatives, the government is encouraging adoption of Internet and Web standards, XML, portals, and new integration models such as Web services. These technologies help agencies overcome interoperability issues traditionally associated with inconsistencies in the underlying hardware and software platforms. In addition, XML improves efficiency by allowing constituents to enter information once, and then store, share, and aggregate that data for later reuse.

The use of XML-aware applications and Web service connectivity on the desktops of civil servants and in field locations across the world—from school nurses' offices to U.S. embassies to the customs desks at ports of entry—will enable the free flow of information between government agencies, reducing waste and streamlining hundreds of processes associated with endless paperwork.

These agencies already employ tightly defined standards for data capture (whether on paper or online); formalizing the structures of these forms through XML is simply a matter of defining schemas that describe the information. For agencies that use documents extensively, the use of XML-enabled versions of Word and Excel can present a seamless transition to an XML-based system, requiring little or no retraining of employees or replacement of existing systems. For agencies that capture data primarily through forms, deployment of InfoPath forms and XML-enabled Web pages built in FrontPage can dramatically streamline manual processes, providing significant improvements in employee productivity in addition to delivering the other benefits of XML.

With more and more data captured and stored in XML, the federal government can more efficiently share the data across continental, state, and agency boundaries to ensure that information is appropriately disseminated and efficiently used.

In the case of visa status, XML would streamline the process by which embassies and other issuing agencies capture new or updated visa information. New information could be input using XML to facilitate the move to a central database, while Web services could allow access to data now stored in distributed databases by the issuing agency. Given proper authorization, agencies across the country could retrieve or verify the specific information (e.g., student or work authorization, date of expiration, or last known address) they need to complete their function.

In the case of immunization records, XML would likewise enable public health organizations to submit records to a central database. When students move or transfer schools, the new school could easily verify records at the source and retrieve data in the format required by its own systems, rather than relying on parents to provide the data or requesting records from the previous school.

These examples could be extended to nearly every branch of government where agencies interact with each other or with the public. And in an era of cost-consciousness, it is important to note that federal agencies can receive the benefits of XML without the prohibitive expense of replacing costly information systems. In each case, citizens benefit from faster, better service when interacting with government agencies, as well as reduced waste and inefficiency within the bureaucracy itself.


  • XML-enabled desktop applications enable agencies to capture data pre-tagged as XML, to facilitate storage, reuse, and exchange of information.
  • Use of the familiar Office environment minimizes retraining.
  • XML and Web services promote interoperability across platforms and systems, minimizing replacement or upgrade of existing systems.

XML in the Microsoft Office System: Product Overview

This section highlights some of the new and enhanced features of the Microsoft Office System that enable the XML-based solutions described in the scenarios in the previous sections. Some of the XML capabilities in this document are only available in Microsoft Office Professional Edition 2003.

Microsoft Office Word 2003

Important  The information set out in this section of this topic is presented exclusively for the benefit and use of individuals and organizations outside the United States and its territories or whose products were distributed by Microsoft before January 2010, when Microsoft removed an implementation of particular functionality related to custom XML from Word. This information may not be read or used by individuals or organizations in the United States or its territories whose products were licensed by Microsoft after January 10, 2010; those products will not behave the same as products licensed before that date or licenses for use outside the United States.

Native XML support in Word 2003 enables authoring of rich content with customer-defined XML schemas, enabling the repurposing of document content across devices, platforms and processes. In addition, Word can act as a smart client and a host for smart document solutions.

Word offers two ways to save documents as XML. Support for XML as a native file format preserves the Word document, including formatting, hyperlinks and paragraphs. Support for customer-defined XML schemas enables users to preserve or extract from the document only the data or structural elements of interest to a particular application. In either case, users can create documents containing information marked by XML tags in a completely intuitive fashion; users need not learn or understand the concepts behind XML to realize the full benefit.
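The idea of extracting only the data elements of interest from a document that also carries formatting can be sketched as follows. This Python example does not use the actual Word XML file format; the two namespaces (urn:example:format for presentation and urn:example:scorecard for the customer schema) and all element names are hypothetical stand-ins.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the customer-defined schema (an assumption).
DATA_NS = "urn:example:scorecard"

# A mixed document: "f:" elements stand in for formatting markup and "d:"
# elements for customer-defined data. Neither is the real Word XML format.
DOC_XML = """
<doc xmlns:f="urn:example:format" xmlns:d="urn:example:scorecard">
  <f:para><f:bold>Brand:</f:bold> <d:brand>Contoso Cola</d:brand></f:para>
  <f:para>Score this quarter: <d:score>72</d:score></f:para>
</doc>
"""


def extract_data(xml_text, namespace):
    """Return {local element name: text} for elements in the given namespace."""
    prefix = "{" + namespace + "}"
    root = ET.fromstring(xml_text)
    return {
        el.tag[len(prefix):]: (el.text or "").strip()
        for el in root.iter()
        if isinstance(el.tag, str) and el.tag.startswith(prefix)
    }


print(extract_data(DOC_XML, DATA_NS))
```

The formatting elements are ignored entirely, so a downstream application sees only the fields defined by its own schema.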

Smart client solutions allow users to consume data from Web services within Word, effectively connecting the desktop to data sources across the enterprise or on the Internet. Smart document solutions can automate document creation and document-related business processes. Smart documents combine the familiar Word (or Excel) interface with an intuitive task pane, which offers relevant information and context-sensitive actions based on the location of the cursor within the document. XML support also makes it possible for developers to create document solutions that incorporate virtually any live or static XML content into a Word document.

Microsoft Office Excel 2003

Customers who use Excel 2003 for importing and analyzing business data will benefit from the enhanced XML capabilities introduced in this version. Like Word 2003, Excel 2003 can act as a smart client for Web services and a host for smart document solutions that require analytical and calculation capabilities rather than rich text formatting.

Additional Excel XML capabilities include a visual tool that simplifies mapping between a spreadsheet and a customer-defined XML schema. This enables developers or power users to more easily import or export data in Excel to or from enterprise data stores or Web services.

Microsoft Office InfoPath 2003

New with the Microsoft Office System, InfoPath 2003 uses a forms metaphor to capture information according to a customer-defined XML schema. InfoPath enables customers to gather and reuse information with predefined structure (pre-tagging) and as part of a business process.

The InfoPath interface allows users to easily create and gather information on top of the core XML model. InfoPath associates an XSL-T style sheet with the form interface, enabling users to view and edit XML forms. InfoPath provides all the functionality expected from a forms package, including the ability to structure and validate data, as well as the ease of use of word processing—all within the familiar Office user interface.

InfoPath supports complex forms with hierarchical structures, freeform text, tables, optional or repeated blocks, data validation, data aggregation, and multiple views. In a corporate environment, InfoPath streamlines data entry and data capture; native support for XML enables companies to create InfoPath solutions that send data from the desktop environment to backend systems via Web services.

Microsoft Office Access 2003

Access 2003 enables Office users to extract XML data from one or more tables in a database. From within Access, users can browse tables in a database and select some or all of the data to be exported. This enables simple extraction of relevant data (for example, to attach to an e-mail message) and integration with automated business processes. For importing, Access lets users directly import data respecting referenced XSDs (XML schema definitions) or create XSL-Ts that define how data is imported, so users can control exactly how the data is represented in Access tables.
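The export pattern described above, selecting some of the data in a table and writing it out as XML, can be sketched generically in Python. This is not the Access export feature or its output format; an in-memory SQLite database stands in for the Access database, and the orders table, its columns, and the export_table helper are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_table(conn, table, columns, where=None, params=()):
    """Export selected columns (and optionally selected rows) as an XML string.

    The table and column names are interpolated into the SQL text, so this
    sketch assumes they are trusted identifiers; row filters use parameters.
    """
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    root = ET.Element(table)
    for row in conn.execute(sql, params):
        record = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            ET.SubElement(record, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# SQLite in memory stands in for the Access database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Contoso", 250.0), (2, "Fabrikam", 90.5)],
)
xml_out = export_table(conn, "orders", ["id", "customer"], where="total > ?", params=(100,))
print(xml_out)
```

Choosing the columns and the row filter separately mirrors the option to export some or all of the data in a table.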

Microsoft Office FrontPage 2003

FrontPage 2003 lets users quickly build high-quality, data-driven Web sites that present dynamic views of information from enterprise systems or local data stores. FrontPage supports a complete set of tools for creating and editing Web pages that connect to a variety of data sources, including XML files that follow XML-defined schemas, databases, and Web services. Users control how data will be displayed on the Web by creating XSL-Ts using an intuitive, graphical editor. These XSL-T data views include industry-standard reporting tools for sorting, grouping, filtering, and conditionally formatting data.

Microsoft Office Visio 2003

Microsoft Office Visio® 2003 drawing and diagramming software gives users the capability to integrate information from a database into a diagram. Diagrams saved as Visio XML files can incorporate XML data that follows a customer-defined schema, and that data can later be retrieved from within the diagram. This enables developers to create rich Visio solutions that model business processes or that associate data from any XML data source with specific shapes or diagram elements.

For More Information