Access 2003 and XML
The new XML features in Access 2003 aren't just an interesting enhancement—they have the potential to redefine the role of Access in Office applications. Peter Vogel explains.
The ability to export data from Access into XML has been part of the Access developer's toolkit for some years now, either as a feature in ADO or as part of the Access object model. With Access 2003, the latest versions of these XML features, coupled with the XML support in the other Office applications, offer the potential to make Access and Jet the repository for all Office data. There's one fly in the ointment: Fully exploiting these opportunities requires using XSLT, an issue I'll discuss at the end of this article (see the sidebar "The Role of XSLT" for an introduction to XSLT). In the meantime, I'm going to show you what's possible with the XML features in Access 2003 in the hopes that you'll get so excited that the "need for XSLT" won't stop you from following up on these opportunities. I'll begin with the Access 2003 XML features that don't require you to learn yet another programming language.
Before I begin, let me pose a question: If XML documents are so important that all the Office applications can be saved as XML, where are you going to store your XML documents? One answer, of course, is to store your XML documents in Access databases in a memo field (after all, an XML document is just a big string). Unfortunately, data stored in a memo field is a single unanalyzable blob. However, if you could dismantle XML documents into database records or assemble database data into XML documents—well, that would be useful. Bear that in mind as we proceed.
XML in the user interfaceI won't review the existing XML functionality in Access (for that, see my article in the August 2000 issue of Smart Access, "XML in Access 2002"). In fact, the first thing that struck me about the new Access 2003 XML features is how much of the new functionality seems to be the logical extension of features introduced in the previous version of Access.
One example is the ability to export XML components to an XML file. If you select File | Export and, in the following dialog, set the Export As Type option to XML, the same "Export to XML" dialog comes up as it did in previous versions of Access. However, when you select the More Options button, you get a new dialog (shown in Figure 1).
The new dialog not only shows the item selected for export (in this case, the Employees table) but also any related tables. You can choose to select to export these related tables—and any tables related to those tables, and any tables related to them, and so on. In Figure 1, I've drilled down to get the Order records for each employee and the Order Detail records for those Orders. In addition, as the figure shows, the Order header records are related to Customer records as "Lookup Data."
A pause for some definitions: A "related table" is a table that's on the "many side" of a one-to-many relationship with the selected table; "Lookup Data" refers to tables on the "one side" of a one-to-many relationship with the selected table.
Clicking on the OK button causes all the records in the selected tables to be exported (not just the related records in the selected tables). You can't, however, from the user interface export a set of unrelated tables to the same XML file.
When importing XML data, the dialog box in the current version of Access implies that you should be able to import several tables at once—but you can't. In Access 2003, the new dialog provides a list of all of the tables in the file (see Figure 2). You can't, however, choose to select to import only some of the tables.
There are more useful XML features in the Access 2003 user interface. When exporting from an open datasheet (either a query or a table), the options shown in the upper right hand corner of the dialog in Figure 1 become available. You can choose, for instance, to export only the current record rather than all the records displayed in the datasheet. If you've applied a sort or filter to the datasheet, you can chose to export only the filtered data. You can also choose to have the data written to the XML file in the same order as the datasheet is currently sorted.
Exporting from codeThe most exciting (at least, exciting to me) new feature in exporting data, compared to previous versions of Access, is the ability to include additional tables in the export file. From your code you have more flexibility than you do from the Access user interface since you can export any set of tables that you want. This facility is handled through a new parameter that's been added to the Application object's ExportXML method. This new, final parameter accepts an AdditionalData object that you can use to specify the additional tables to include in the export.
Using the AdditionalData object is easy: Create the object, use its Add method to add table names to the object, and then pass the object to the ExportXML method. In this example, the ExportXML method is exporting the Employees table along with the Order Details and Orders table:
Dim adOtherTables As AdditionalData Set adOtherTables = Application.CreateAdditionalData adOtherTables.add "Order Details" adOtherTables.add "Orders" Application.ExportXML acExportTable, "Employees", _ "c:\sample.xml", , , , , , , adOtherTables
Other new features for the ExportXML method include the ability to specify a Where clause in the second last parameter of the ExportXML function to control which records are exported. You can also now export the results of SQL Server objects that return data (for instance, functions and stored procedures) by using the right option in the ExportXML's first parameter. When exporting table structure information, you can choose to omit the primary key and index information.
The ImportXML option is unchanged from the last version of Access.
Importing non-Access dataYou're not limited to importing only XML files that are created by Access; you can import any XML file. For instance, I tried importing this simple XML document into Access:
<?xml version="1.0"?> <so id="1134" type="Rush"> <Products> <Line> <ProductId>1234</ProductId> <Quantity>4</Quantity> </Line> <Line> <ProductId>Axyz</ProductId> <Quantity>5</Quantity> </Line> </Products> </so>
The result was the Access table shown in Figure 3.
However, while that import was a success, it's a qualified success. The data in the id and type attributes didn't make it into the table, for instance.
You may be even less happy with the results from importing more complex XML documents. Table 1 shows an Excel spreadsheet that seems like a likely candidate to be imported into Access. Figure 4 shows the import dialog that results from importing that Excel 2003 spreadsheet (after the spreadsheet was saved as XML). The results, after being imported into an Access table called "Cell," are shown in Table 2. Not quite the same as the data looked in Excel. Importing Word documents fares no better, even when the Word document consists of nothing but tables.
Table 1. An Excel table.
Table 2. The imported table in Access.
The results aren't much better exporting Access data into complex XML formats. For instance, after exporting the Employees and related tables into an XML file, I tried opening the resulting file in Excel. The results can be seen in Figure 5. Each Access table appears in the Excel spreadsheet, laid out horizontally (that is, columns B through G are from the Employees table, columns H through Q are from the Orders table, and so on). Column A holds the "generated" attribute from the dataroot element.
The import formatWhile Access can import files in a variety of formats (as shown by my Salesorder XML file), the process works best when the XML document being imported is in the format that Access expects. You can preview this format by exporting some XML data and opening the resulting file in Notepad.
The XML format used by and expected by Access is relatively straightforward. The opening lines in the file don't vary much from one instance to another. The first line is the xml declaration element; the second line is an element called dataroot. The dataroot element defines two namespaces (which I'll ignore for this article). The dataroot element includes two optional attributes: the location of a Schema file that describes the XML elements, and the date that the file was generated. Here's a typical example:
<?xml version="1.0" encoding="UTF-8"?> <dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Employees.xsd" generated="2003-09-01T16:15:31">
The dataroot element encloses all the elements in the Access XML file. Within the dataroot element, each exported table is represented by an element that consists of just the name of the table. Within these table elements, there's a separate element for each field in the record. This example shows a record from the Employees table followed by a record from the Customer table:
<Employees> <EmployeeID>1</EmployeeID> <LastName>Davolio</LastName> <FirstName>Nancy</FirstName> </Employees> <Customers> <CustomerID>WOLZA</CustomerID> <CompanyName>Wolski Zajazd</CompanyName> </Customers>
The problem in importing is to convert data into this format; in exporting, to convert the data into the format that the other application requires. Fortunately, there's a tool designed specifically for converting one XML format to another: XSLT. In the rest of this section, I'll show an XSLT stylesheet for converting from Excel XML to the Access format. If you aren't interested in the details of this XSLT stylesheet, skip to the start of the next section where I'll show you how to use XSLT from Access 2003.
In this example, I'll convert the Excel spreadsheet in Table 1 into a useful table in Access. The table will be called Customer and will contain fields called CustomerId, Quantity, and ProductId. The relevant section of the Excel spreadsheet looks like this in XML (you might want to review my article on Excel XML in the September 2003 issue of Smart Access):
<Worksheet ss:Name="Sheet1"> <Table ss:ExpandedColumnCount="3" ss:ExpandedRowCount="3" x:FullColumns="1" x:FullRows="1"> <Row> <Cell><Data ss:Type="String">Customer</Data></Cell> <Cell><Data ss:Type="String">Quantity</Data></Cell> <Cell><Data ss:Type="String">Product</Data></Cell> </Row> <Row> <Cell><Data ss:Type="String">Susan</Data></Cell> <Cell><Data ss:Type="Number">400</Data></Cell> <Cell><Data ss:Type="String">A123Y</Data></Cell> </Row> <Row> <Cell><Data ss:Type="String">Joe</Data></Cell> <Cell><Data ss:Type="Number">200</Data></Cell> <Cell><Data ss:Type="String">B456Y</Data></Cell> </Row> </Table>
The XML document that I want to create looks like this (I've left out any unnecessary elements):
<?xml version="1.0" encoding="UTF-8"?> <dataroot xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <Customer> <CustomerId>Susan</CustomerId> <Quantity>400</Quantity> <Product>A123Y</Product> </Customer> <Customer> <CustomerId>Joe</CustomerId> <Quantity>200</Quantity> <Product>B456Y</Product> </Customer> </dataroot>
The XSLT spreadsheet that I need looks like this (I've simplified this slightly by omitting namespace references):
<?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="1.0" xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> <xsl:template match="/"> <dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <xsl:apply-templates select="//Worksheet/Table"/> </dataroot> </xsl:template> <xsl:template match="Table"> <xsl:for-each select="ss:Row[position()>1]"> <Customer> <xsl:apply-templates select="Cell"/> <xsl:apply-templates select="Cell"/> <xsl:apply-templates select="Cell"/> </Customer> </xsl:for-each> </xsl:template> <xsl:template match="Cell"> <CustomerId><xsl:value-of select="Data"/></CustomerId> </xsl:template> <xsl:template match="Cell"> <Quantity><xsl:value-of select="Data"/></Quantity> </xsl:template> <xsl:template match="Cell"> <Product><xsl:value-of select="Data"/></Product> </xsl:template> </xsl:stylesheet>
Even without a background in XSLT you can probably puzzle through this code:
- The first template (the one with match="/") is run automatically by XSLT. Within this template, the dataroot element is added to the output (any non-xsl element is written to the output document).
- Also within this template, an apply-templates statement finds in the XML document the Excel Table element that's nested within the Workbook element.
- The next template (the one with match="Table") is run when the Table element is found in the XML document.
- Also within this template, a for-each statement selects each Row element in turn (skipping the first row, which has the column titles).
- As each Row element is found, the apply-templates statements inside the for-each statement select the first, second, and third Cell elements.
- The final three templates (the ones with match="Cell[position=...]") are run for each of the three Cell elements, adding the tags and data for the CustomerId, Quantity, and Product columns.
With this XSLT stylesheet created to handle the conversion, how do you use it from Access when importing data?
Incorporating XSLTIf you go back to the Import XML dialog in Figure 2, you'll notice an Options button on the right-hand side of the dialog. Clicking on that button adds a number of options to the dialog, including a button labeled "Transform..." (XSLT programs are referred to as "Transforms" because too many people know what a "program" is). Clicking on this button opens the Import Transforms dialog (see Figure 6). This dialog lets you add an XSLT transform to your Access application. Once added, you may select any XSLT stylesheet from this list and click the OK button. The XSLT transform will be applied to the input file selected when you started the Import. The result of this transformation will be displayed in the Import XML dialog instead of the original file. Clicking the OK button in the Import XML dialog will import the transformed file.
In my example, I'd begin by selecting my Excel XML file. After clicking the Options button, I'd select the XSLT transform from the previous section. Access will apply the transform to the Excel stylesheet to create the Access import file. Clicking the OK button imports the newly created Access version of the file.
You can also use XSLT when exporting data. The process is simple: Once the Export XML dialog comes up, click on the More Options button. The dialog that appears (see Figure 1) includes a Transforms... button on the right-hand side. Like the Transforms... button on the Import dialog, clicking this button brings up a dialog that lists all the XSLT transforms associated with the Access database (you can also add new XSLT transforms with this dialog). Once you select an XSLT transform, it's applied to the XML data that you're exporting to create a new file. This transformed output is what will be saved. Using XSLT in an export lets you convert exported Access XML into any format that you want—including Microsoft Word or Excel.
From code, you can use the TransformXML method of the Application object to apply XSLT transforms. This method can be used in conjunction with the import and export code that I showed earlier. The TransformXML method accepts three parameters: the name of the input XML document, the name of the XSLT stylesheet, and the file to put the result in. This example converts a file called Input.xml into a file called Newfile.xml using an XSLT transform called Transform.xsl:
Application.TransformXML "c:\Input.xml", _ "c:\Transform.xsl","c:\Newfile.xml"
As you've noticed, much of this new Access functionality depends on using XSLT. Learning XSLT isn't an insurmountable barrier. Like SQL, XSLT is a... different kind of language. It is, however, reasonably easy to learn. In addition, I expect XSLT templates for common Office activities to appear. Many developers will be able to implement their applications by modifying these templates. The Download for this article includes the XSLT stylesheet that I showed earlier along with an XSLT stylesheet for converting Access data to Microsoft Word. The XSLT files that accompany this article might be all you need—just tweak them to create the XSLT that you need (my article on Word XML in the August 2003 issue of Smart Access might be helpful here). In addition to the sample XSLT spreadsheets, I've included the Excel and Word XML documents that I used in this article along with the database I used in my examples.
I just hope you've seen that the opportunities and rewards of this new functionality are worth the effort of learning how to use it: From your Jet database, store and retrieve all data from any document.
Sidebar: The Role of XSLT
One of the major components of the XSLT world is the programming language XSLT (short for eXtensible Stylesheet Language Transform). There are actually two dialects of XSL, the other being XSL-FO (short for Formatting Objects). XSL-FO is a language for specifying formatting—think of it as HTML on steroids.
XSLT is a combination of procedural language features (if statements, loops, and so on) and rule-based language features. The rule-based features allow you to set up templates to be executed whenever a specific element, or type of element, is found in an XML document. Typically, a template generates output in some format. So the standard XSLT processing cycle is:
- Scan an XML document for elements that match a particular pattern.
- Find an XSLT template that matches those selected elements.
- Generate output based on the content of the selected XML document element.
With an XML document as input, XSLT is potentially capable of generating output in any format that you want.
Typically, XSLT is used to generate another XML document from the input XML document. For instance, in the examples that I use in the article, I convert Access XML formats to Excel and Word formats and vice versa.
To find out more about Smart Access and Pinnacle Publishing, visit their website at http://www.pinpub.com/
Note: This is not a Microsoft Corporation website. Microsoft is not responsible for its content.
This article is reproduced from the November 2003 issue of Smart Access. Copyright 2003, by Pinnacle Publishing, Inc., unless otherwise noted. All rights are reserved. Smart Access is an independently produced publication of Pinnacle Publishing, Inc. No part of this article may be used or reproduced in any fashion (except in brief quotations used in critical articles and reviews) without prior consent of Pinnacle Publishing, Inc. To contact Pinnacle Publishing, Inc., please call 1-800-788-1900.