Creating Word 2007 Templates Programmatically
2007 Microsoft Office Suites
Microsoft Office Word 2007
Summary: Read details about how to create document templates programmatically for Microsoft Office Word 2007, including information about using the new content controls, document building blocks, and XML mapping. Review some of the newest members of the Word 2007 object model. (27 printed pages)
Business Scenario: Stock Options Template
About Content Controls
What Is XML Mapping?
New Objects and Collections in the Word 2007 Object Model
Communicating with Custom XML Parts
We believe that template designers will be very pleased with Microsoft Office Word 2007. Word 2007 introduces several significant new features targeted toward template developers, such as:
- Separation of XML data from the document
- Content controls
- XML mapping
Combined, these new features make template creation much easier through programmatic advances and new features in the user interface (UI). Word 2007 isolates custom XML data from the document, providing true separation of data and view. Content controls provide an easy way to create blocks of content, such as drop-down menus, that are more secure from accidental modification by end users. After you add content controls, you can use XML mapping to map data from custom XML data to the document. This article uses a real-world scenario to demonstrate how to use these features. Following the scenario, you can read details about content controls and XML mapping to help you get started creating your own Word 2007 templates.
The following section presents a business scenario that illustrates what Word 2007 makes possible for template creation. It first describes how you would create a stock options template using Microsoft Office Word 2003, and then highlights how the new features in Word 2007 improve the development experience. The subsequent sections of this article explain in detail how to work with each of the Word 2007 features mentioned in the scenario.
Suppose a fictional financial company, Contoso, wants a Word template with the following content on the first page:
- A table displaying the data for a specific stock:
- Last price
- 52–week high
- 52–week low
- Daily volume
- 50–day moving average
- 200–day moving average
- A table displaying fundamental data, such as:
- Debt/equity ratio
- Gross margin
- Net profit margin
- Market cap
The second page of the document contains regular text.
The first page's tables may appear like this:
Figure 1. Tables displaying data for a specific stock
Word 2003 Approach: Using Tables
In Word 2003, to use tables, you must surround each cell in the two tables with an XML element. If you have done this before, then you are familiar with the pink tags that represent each element's begin and end tags. You can update the data in each XML element as necessary. This is a valid approach, but it leaves the data vulnerable to accidental modification. An end user who is not knowledgeable about XML might accidentally delete one of the pink begin or end tags of an XML element. Locking down the tables may sound like the obvious solution, but it is not a trivial task in Word 2003. Many template designers create a one-page section for the tables and lock it down completely, preventing end users from making any modifications whatsoever. They then create a second section for the page with text so that end users can update it. This is not ideal because it requires the use of sections and locks the entire set of tables, which prevents end users from modifying any cell.
Word 2007 Approach: Using Content Controls
Word 2007 offers a new feature called content controls that makes a solution to the scenario much more robust and elegant. There are several different types of content controls, including drop-down lists, combo boxes, rich text, plain text, calendar, and picture, all of which let a template designer create form-like editing in a document. When a content control is inserted in a document, it is identified by a border and a title in the UI, neither of which prints. This results in a highly visible piece of content that is easy to work with but, unlike the older controls in Word, does not contain unnecessary pieces that print on the page.
For this scenario, the data in each cell is contained in a content control for plain text. When a control is inserted, it looks like this in the document:
Figure 2. A content control in a Word document
Notice the title tag at the top. When you are in Design mode, a drop-down triangle appears on the title tag. By clicking the drop-down triangle, you can change the control's properties.
Locking Content Controls
After you create a content control, you can lock its content to prevent modification and protect the control itself from deletion by using the Properties dialog box. For more information, see Locking Content Controls.
You can check Content control cannot be deleted, Contents cannot be edited, or both. You can also enable each of these programmatically.
Nesting Content Controls
You can embed content controls within one another. Nesting controls helps make a template easier to work with. You can also create a custom document building block out of a group of nested content controls.
Document building blocks are predefined pieces of content, such as a cover page, a header, a footer, or a custom-built clause in a contract. Building blocks facilitate the quick creation of professional-looking Word documents. You can also create custom building blocks. For example, in a table, you can insert content controls into each cell and group the entire table into its own, larger document building block, and then you can insert the custom document building block elsewhere in the document.
Using XML Mapping
The new Microsoft Office Open XML Formats store custom XML data in document parts. Document parts help define sections of the overall contents of the file. The new file format consists of many different types of document parts, including a custom XML data part. All document content associated with an XML element maps to data in a custom XML data part. This separation of the XML from the document formatting and layout makes it easier to access data programmatically and provides a more robust document.
XML mapping is a feature of Word 2007 that enables you to create a link between a document and an XML file. This creates true data/view separation between the document formatting and the custom XML data.
XML mapping enables you to map an element in a custom XML part that is attached to the document. For our stock options template scenario, you can map most of the cells in the tables to custom XML. For example, look at the content control in the "Last Price" cell in the table. To map the "Last Price" content control to data in an element in a custom XML part, you do the following:
- Attach a CustomXMLPart object to the document. In this case, the custom XML part is an XML file that contains an element with data that is the last price of a stock. You can do this programmatically by calling the Add method on the CustomXMLParts collection of the Document object.
- Map the element with the last stock price to the content control either manually or programmatically. To do so:
- You can use an XPath to identify the specific element in the CustomXMLParts object.
- After you identify the element, the data in the element specified by the XPath is mapped to the content control. This mapping causes the data in the XML element (assumed to be a string in this case) to appear in the UI in the content control.
- If the data in this element is modified, then the data in the control reflects the modification.
- Similarly, if the end user edits the text in the control, then the data in the XML element to which it is mapped is also updated. However, if you do not want the end user to modify the text, you can lock the content control against editing or deletion.
- Create a small add-in that automatically updates the data in the custom XML part directly so that the last price is always reflected when the document is opened.
Note The code works against a custom schema attached to the document, not the Word 2007 schema.
This is accomplished using events. In this scenario, the document Open event triggers an add-in that retrieves the latest stock price for the company in question. This price is inserted into the element in the custom XML data to which the "last price" content control is mapped. The add-in can also perform calculations to determine figures, such as the 52–week high, the 52–week low, the moving averages, and others. All this happens behind the scenes when the document opens. The template user never does anything and always sees current data.
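The flow described above might be sketched in VBA as follows. This is a minimal illustration under stated assumptions, not the add-in itself: a template macro stands in for the add-in, the element names (stock, lastPrice), the ticker symbol, and the GetLatestPrice function are hypothetical, and GetLatestPrice is a stub that returns a fixed value where a real solution would query a data source.

```vba
' Place this code in the ThisDocument module of the template.
Option Explicit

' Run once by the template designer: attach the custom XML part and
' map a "Last Price" content control to its lastPrice element.
Public Sub SetUpLastPriceMapping()
    Dim oPart As Office.CustomXMLPart
    Dim oCC As Word.ContentControl

    ' Hypothetical custom XML; no namespace is used, to keep the XPath simple.
    Set oPart = ActiveDocument.CustomXMLParts.Add( _
        "<stock><lastPrice>0.00</lastPrice></stock>")

    ' Create a plain text content control at the selection (for example,
    ' with the insertion point in the "Last Price" table cell).
    Set oCC = ActiveDocument.ContentControls.Add(wdContentControlText, Selection.Range)
    oCC.Title = "Last Price"

    ' SetMapping returns True when the XPath resolves to a node in the part.
    If oCC.XMLMapping.SetMapping("/stock/lastPrice", , oPart) Then
        ' Keep end users from editing the value or deleting the control.
        oCC.LockContents = True
        oCC.LockContentControl = True
    End If
End Sub

' Runs when the document opens: write the latest price into the mapped element.
Private Sub Document_Open()
    Dim oPart As Office.CustomXMLPart
    Dim oNode As Office.CustomXMLNode

    For Each oPart In ActiveDocument.CustomXMLParts
        ' Skip the built-in parts (such as document properties).
        If Not oPart.BuiltIn Then
            Set oNode = oPart.SelectSingleNode("/stock/lastPrice")
            If Not oNode Is Nothing Then
                oNode.Text = GetLatestPrice("CNTS")
            End If
        End If
    Next oPart
End Sub

' Hypothetical stub: a real solution would call a pricing service here.
Private Function GetLatestPrice(ByVal strSymbol As String) As String
    GetLatestPrice = "42.17"
End Function
```

Because the code changes only the custom XML part, it never touches the table or its formatting; the mapped content control picks up the new value from the data.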
Modifying Custom XML Data
Because creating rich templates likely involves the use of custom XML data, it is worth briefly discussing how custom XML is stored in a document and how to work with this custom XML. Word 2003 technology and WordprocessingML introduced the capability of including custom XML data in a Word document. This created a multitude of possibilities for template developers by allowing you to associate pieces of content in the document with custom XML, so that if the XML changed, so did the content. This was a huge success. Word 2007 introduces several new enhancements to this functionality.
Custom XML in Word 2003 is immersed within the XML that describes the entire document. When the end user saves a document as XML in Word 2003, the result is an XML file conforming to the WordprocessingML schema that contains every bit of information necessary to describe the document. When you introduce custom XML, Word 2003 embeds it within the same WordprocessingML document. This means that programmatically manipulating the custom XML data poses some risk because any accidental invalidation of the custom XML can corrupt the entire document.
Word 2007 now uses the new Microsoft Office Word XML Format. The Word XML Format separates key pieces of a document from the body. Content such as headers, footers, endnotes, images, and lists is contained within separate XML files within the new file format. The new file format is a bundle of XML files contained inside a ZIP container that are mapped together with relationship files. If you attach custom XML to a Word 2007 document, then Word 2007 stores it in its own separate XML file within the ZIP container. This makes it easier to identify and manipulate custom XML programmatically by changing, deleting, or adding data to the file.
The new Microsoft WinFX System.IO.Packaging namespace enables you to access the Word XML Format programmatically. You use its PackagePart class to operate on an individual part, such as a custom XML part, in the document.
Even if a mistake occurs that invalidates the custom XML data, the rest of the document is untouched. Therefore, the document still opens successfully. When this happens, a dialog box indicating a problem appears, alerting the end user to the damaged part. You must fix the custom XML data, but all other data within the document is still intact.
About Content Controls
As demonstrated using the scenario discussed earlier in this document, Word 2007 templates can be much richer and more robust because of content controls. This section provides in-depth detail about how they work. Content controls enable you to:
- Specify structured regions in a template. Each structured region has its own unique ID, so that you can read from and write to it. Examples of types of structured regions (or content controls) include combo boxes, pictures, text blocks, and calendars.
- Determine the behavior of content controls. Each content control takes up a portion of a document and, as the template author, you can specify what each region does. For example, if you want a region of your template to be a calendar, you insert a calendar content control in that area of the document, which automatically determines what that block of content does. Similarly, if you want a section of a template to display an image, create a picture content control in that area. In this way, you can easily build a template with pre-defined block types.
- Restrict the content of content controls. You can restrict each content control, so that end users cannot inadvertently delete or edit it. This is useful if, for example, you have copyright information in a template that the end user should be able to read but not edit. You can also easily lock a template's content so that an end user does not accidentally delete portions of it. This makes templates more robust than in previous versions.
- Map the contents of a content control to data in a custom XML part that is stored with the document. For example, if you insert a content control that contains a table of stock prices, you can map the table cells to nodes in an XML file that contain the current stock prices. When the prices change, an add-in can programmatically update the attached XML file, which is mapped to each cell, so that the new, updated prices automatically appear in the table.
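As a small illustration of reading these structured regions, the following sketch lists every content control in the active document with its ID, type, title, and mapping state. The procedure name is hypothetical, and the property names (ID, Type, Title, XMLMapping.IsMapped) are taken from the released Word 2007 object model, which may differ from the build this article describes.

```vba
Sub ListContentControls()
    Dim oCC As Word.ContentControl

    ' Each content control carries its own ID, so it can be found again later.
    For Each oCC In ActiveDocument.ContentControls
        ' Type prints as the numeric WdContentControlType value.
        Debug.Print oCC.ID, oCC.Type, oCC.Title, oCC.XMLMapping.IsMapped
    Next oCC
End Sub
```

The output appears in the Immediate window of the Visual Basic Editor, one line per content control.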
The easiest way to create a content control is through the UI (although you can also create them programmatically). To create a content control through the UI, select the text that you want to turn into a content control and then, on the Developer tab of the Ribbon UI, choose the content control type that you want.
The following figure shows how to insert a plain text content control while building a template.
Figure 3. Create a plain text content control in your templates
There are seven types of content controls that you can add into a document. A new enumeration called wdContentControlType describes each control.
Table 1. Content control types and their wdContentControlType enumerations
|Content Control Type|Description|wdContentControlType Enumeration Value|
|date picker|A date-time picker.|wdContentControlDate|
|document building block|Allows the end user to choose from specified "document building blocks" types. It uses a rich-text block for its text.|wdContentControlBuildingBlockGallery|
|drop-down list|A drop-down list.|wdContentControlDropDownList|
|combo box|A combo box.|wdContentControlComboBox|
|picture|A picture.|wdContentControlPicture|
|rich text|A block of rich text.|wdContentControlRichText|
|plain text|A block of plain text.|wdContentControlText|
In the Word object model, there are two new collections (ContentControls and ContentControlListEntries) and three new objects (ContentControl, ContentControlListEntry, and XMLMapping) that pertain to content controls. Each is documented in more detail later in this article.
The ability to lock content controls enables you, as the template author, to create content controls without any fear of end users accidentally deleting or editing them. In the past, achieving this behavior was difficult. With the content controls in Word 2007, you can easily lock predefined portions of content against deletion and editing. Any portion of a template can also accept user input while other portions that are meant to be read-only are locked. You lock content controls using a dialog box that is located on the Developer tab of the Ribbon UI. By default, this dialog box is turned off, so the casual user cannot unlock content controls easily. You can also lock content controls programmatically, using the Word 2007 object model.
Another powerful feature of content controls is their ability to map to data in an XML file. Mapping a content control to XML data opens up possibilities that do not exist in forms. For example, you can map the options in a drop-down list content control to a node in an XML file. When you do this, each entry in the drop-down list in the UI is determined by the XML file's schema definition, and the drop-down list's current selection is stored in the mapped node.
The new Word XML Format is another change that creates additional possibilities. In the Word XML Format, XML data is stored in its own data store, completely separate from the document. This is a change from the WordprocessingML file format of Word 2003. This separation makes it easier to update XML data programmatically behind the scenes. Using the more flexible XML storage in Word 2007, you can programmatically change the items in a drop-down list content control by mapping each entry to an XML file. Simply modify each node that is mapped to an entry you want to change, and you see those changes in the drop-down list.
This article introduces how to add and work with content controls. It includes information about the new objects and collections in the Word 2007 object model. It shows you how to work with content controls programmatically and through the UI, and provides an object model reference for all collections and objects that are specific to content controls, including both new objects and current objects that have new members.
Several examples of how to modify or add a content control follow. To do these tasks, you must turn on the Developer tab, which is new to Word 2007.
To show the Developer tab
- Click the Microsoft Office Button and then click Word Options.
- In the Word Options dialog box, click Personalize.
- Check Show Developer tab in the Ribbon, as shown in Figure 4.
- Click OK.
Figure 4. The Developer tab in the Ribbon UI
Note When you are in developer mode, you see the Developer tab in the Ribbon UI.
Content Control Objects and Collections
There are two new collections and three new objects that are related to content controls:
- ContentControls. A collection of ContentControl objects. The Document, Range, and Selection objects each have a ContentControls collection.
- ContentControl. An object representing a single content control in a document. You can choose one of seven content control types in the wdContentControlType enumeration.
- ContentControlListEntries. A collection of ContentControlListEntry objects on ContentControl objects that are of type wdContentControlDropDownList or wdContentControlComboBox. Each ContentControlListEntry is an entry in either a drop-down list or a combo box. This collection does not pertain to other types of content controls. When you call a method or property on ContentControlListEntries for a content control that is not a drop-down list or a combo box, Word displays an error to the user.
- ContentControlListEntry. An object for a single entry in a content control that is either a drop-down list or a combo box. ContentControlListEntry objects do not pertain to other types of content controls. When you call a method or property on ContentControlListEntry for a content control that is not a drop-down list or a combo box, Word displays an error to the user.
- XMLMapping. An object that represents data that is mapped to a content control. XMLMapping is a member of the ContentControl object. You can use the XMLMapping object to map data to a content control.
The following figure shows a partial Word 2007 object model showing the ContentControl object.
Figure 5. Partial Word 2007 object model that shows the ContentControl object
Content Control Types
The calendar content control enables the end user to select a date from a calendar control. The calendar appears when the end user moves the pointer over the content control or clicks inside it. You can specify the format of the selected date either through the UI or programmatically.
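Setting the date format programmatically might look like the following sketch. It assumes the DateDisplayFormat property of the released Word 2007 object model; the procedure name, title, and format string are illustrative only.

```vba
Sub AddDateControl()
    Dim oCC As Word.ContentControl

    ' Insert a date picker content control at the current selection.
    Set oCC = ActiveDocument.ContentControls.Add(wdContentControlDate, Selection.Range)
    oCC.Title = "Report Date"

    ' Display the chosen date in a long format, for example "June 29, 2005".
    oCC.DateDisplayFormat = "MMMM d, yyyy"
End Sub
```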
Document Building Blocks
Document building blocks are new in Word 2007. A document building block is a predesigned piece of content, such as a cover page, a header, or a footer. A document building block content control allows the end user to choose from a defined list of document building blocks to insert into a document.
Note that you cannot map a document building block to data in an XML file. However, a document building block may contain content controls that you can map to data.
For example, if you want a list of stock quotes in a template to update automatically, you can create a document building block that is a table composed of cells, with each cell containing one plain text content control for a stock quote. Each text content control is mapped to a node in a custom XML part that contains a current stock price. In this case, you do not map the document building block to data, but you do map the text content controls inside it. After you create it, you can lock the whole document building block so that end users cannot accidentally delete or break it.
The drop-down list content control allows the end user to select from a list of options in a drop-down menu. You can map the data in each entry to a node in a custom XML part. Programmatically, drop-down list and combo box content controls are different from other content controls because they each have a ContentControlListEntries collection. The ContentControlListEntries collection contains each entry from which the user can choose in the content control.
The combo box content control functions the same as a drop-down content control. The only difference is that the region allows the user to edit the text directly, like a typical combo box.
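The following sketch shows one way to fill the list entries of a drop-down list content control from code. It assumes the released Word 2007 object model, in which the ContentControlListEntries collection is reached through the DropdownListEntries property of the content control; the procedure name, title, and entries are illustrative.

```vba
Sub AddCountryDropDown()
    Dim oCC As Word.ContentControl
    Dim oEntry As Word.ContentControlListEntry

    ' Insert a drop-down list content control at the current selection.
    Set oCC = ActiveDocument.ContentControls.Add(wdContentControlDropdownList, Selection.Range)
    oCC.Title = "Country/Region"

    ' Each entry has display text and an underlying value.
    With oCC.DropdownListEntries
        .Add "Canada", "CA"
        .Add "Mexico", "MX"
        .Add "United States", "US"
    End With

    ' List what the end user can now choose from.
    For Each oEntry In oCC.DropdownListEntries
        Debug.Print oEntry.Index, oEntry.Text, oEntry.Value
    Next oEntry
End Sub
```

Passing wdContentControlComboBox instead creates a combo box with the same entries, which also lets the end user type a value directly.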
The picture content control enables the user to insert a picture into a document. When the user clicks this content control, a control appears, enabling the end user to select an image to insert.
Note When you map an XML node to a picture content control, it contains a picture in base-64 binary format.
A rich text content control is a block of rich text. Unlike a plain text content control, you cannot map a rich text content control to data in a custom XML part.
A text content control is a block of plain text. Unlike the similar rich text content control, you can map the contents of a text content control to data in a custom XML part.
If you do not remove the ability to edit the contents of a text content control, and you map the content control to data in an XML node, the user can directly edit the XML data through the document. For example, a plain text content control mapped to a node that contains the word "hello" displays that word in the document. If a user edits that content control by deleting "hello" and replacing it with "good bye," the XML node updates so that the data it contains is the string "good bye."
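The "hello" example can be tried with a sketch like the following. The XML, element name, and procedure name are hypothetical, and the code simulates the end user's edit by setting the text of the control from code.

```vba
Sub MapGreeting()
    Dim oPart As Office.CustomXMLPart
    Dim oCC As Word.ContentControl

    ' Attach a custom XML part that contains a single node.
    Set oPart = ActiveDocument.CustomXMLParts.Add("<greeting>hello</greeting>")

    ' Insert a plain text content control and map it to that node.
    Set oCC = ActiveDocument.ContentControls.Add(wdContentControlText, Selection.Range)
    If oCC.XMLMapping.SetMapping("/greeting", , oPart) Then
        ' Print the text the control displays after mapping.
        Debug.Print oCC.Range.Text

        ' Simulate the end user replacing the text in the document.
        oCC.Range.Text = "good bye"

        ' Print the text now stored in the mapped XML node.
        Debug.Print oPart.SelectSingleNode("/greeting").Text
    End If
End Sub
```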
Adding Content Controls
You can add any type of content control to a document either through the UI or programmatically.
To add content controls using the UI
- Select text to convert to a content control.
You can also position the cursor at the location where you want to insert a content control with no text selected.
- On the Controls section of the Developer tab, select the content control that you want to insert (as shown previously, in Figure 3).
Adding Content Controls Programmatically
To add a content control programmatically, you call the Add method on a ContentControls collection. The ContentControls collection on the Range object, Selection object, or Document object can call the Add method. Add the following line of code into the Microsoft Visual Basic for Applications (VBA) immediate window to create a picture content control on the current selection in the document.
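A minimal sketch of such a call, assuming the enumeration value for a picture content control is wdContentControlPicture (the name used by the released Word 2007 object model):

```vba
Selection.Range.ContentControls.Add wdContentControlPicture
```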
This code creates a picture content control. To add other types of content controls, change the wdContentControlType argument to wdContentControlDate, wdContentControlBuildingBlockGallery, wdContentControlDropDownList, wdContentControlComboBox, wdContentControlRichText, or wdContentControlText.
Each Document, Range, and Selection object has its own ContentControls collection.
You can lock content controls to prevent deletion, content modification, or both. Locking content controls enables you to create, for example, a drop-down list populated with data from an XML file that the end user cannot delete or modify. The user can select any of the items in this drop-down list, but cannot modify the list itself. With this new feature, it is easy for you, as the template designer, to create a content control of rich text that contains, for example, copyright information that you do not want deleted or modified. In Word 2003, if you want to protect a part of a document you must protect a style and then apply that style to whatever text you want to lock. Content controls make this easier. Now, you can create rich documents with combo boxes, drop-down lists, text, pictures, and more, and protect any or all of the items so that the end user does not accidentally break the document.
You can lock a content control either through the UI or programmatically.
To lock content controls using the UI
- Select the content control.
- On the content control section of the Developer tab, click Properties.
- In the Content Control Properties dialog box, check Content control cannot be deleted, Contents cannot be edited, or both.
Locking Content Controls Programmatically
If you want to lock a content control programmatically, set either the LockContentControl property or the LockContents property of the corresponding ContentControl object to True. If you add the following lines of code into the immediate window of VBA, they add a combo box content control to an empty document, and then lock it against both deletion and content modification.
ActiveDocument.ContentControls.Add(wdContentControlComboBox)
ActiveDocument.ContentControls(1).LockContentControl = True
ActiveDocument.ContentControls(1).LockContents = True
LockContentControl corresponds to the Content control cannot be deleted check box in the Content Control Properties dialog box. LockContents corresponds to the Contents cannot be edited check box in the same dialog box.
Adding Titles to Content Controls
Each content control may also have a customizable label, or title, that appears above its contents. The title is optional. It provides a way to identify a content control on the document page.
Note The Tag property allows you to achieve the same result programmatically.
The title of a content control serves two purposes:
- In the UI, it appears in the tab on top of the content control. For example, in a template with content controls that help users add their mailing address, you might insert a drop-down list content control from which the user can choose their country/region. To help the user, you may assign a title for this content control called Country/Region.
- Programmatically, you can also use the title to identify a content control. You can retrieve a ContentControl object using its index in its ContentControls collection or using its title. If you already know the title of your content control, then you do not need to use the index to iterate through a collection in order to locate it.
To add a title to a content control in the UI
- Select the content control.
- On the content control section of the Developer tab, click Properties.
- Under the General tab, in the Title field, type a title for your content control.
Adding Titles to Content Controls Programmatically
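The following sample code demonstrates how to create a text content control and assign it a title. Because the VBA immediate window does not accept Dim statements, run the code as a procedure in the Visual Basic Editor.
Sub AddTitledTextControl()
    Dim strTitle As String
    Dim oCC As Word.ContentControl
    strTitle = "MyTitle"
    ' Insert a plain text content control at the current selection.
    Set oCC = Application.Selection.ContentControls.Add(wdContentControlText)
    oCC.Title = strTitle
End Sub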
Now that the content control has a title, you can refer to it using its title in the Item method. For example, to delete the content control with the title "MyTitle," add the following code in the VBA immediate window.
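One way to do this is sketched below. Rather than passing the title to the Item method, this sketch uses the SelectContentControlsByTitle method of the Document object from the released Word 2007 object model, on the assumption that it is available in your build. It is written as a procedure, and the procedure name is hypothetical.

```vba
Sub DeleteControlByTitle()
    Dim oCCs As Word.ContentControls

    ' Returns every content control in the document whose title matches.
    Set oCCs = ActiveDocument.SelectContentControlsByTitle("MyTitle")

    If oCCs.Count > 0 Then
        ' False removes the control but leaves its contents in the document;
        ' pass True to delete the contents as well.
        oCCs.Item(1).Delete False
    End If
End Sub
```

Titles are not required to be unique, so the method returns a collection and the sketch deletes only the first match.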
Modifying Placeholder Text for Content Controls
You can assign helpful instructional text to content controls. This text is called placeholder text. Placeholder text appears on the content control when its contents are empty. For example, placeholder text assigned to a rich text content control might say, "Please enter your comments here." When the user begins typing in the content control, the placeholder text goes away. If the user deletes all the content from the content control, then the placeholder text reappears. Also, if the content control is mapped to XML data and an event forces the data to update so that its contents are empty, the placeholder text reappears.
You can format placeholder text. This formatting is separate from the formatting of the contents of the content control. For example, you might format the placeholder text to have a font with a large point size and to be bold, so that the user notices it. But, you might want the style of the contents of the content control to match the style of text in the rest of the document.
Modifying Placeholder Text Programmatically
To modify the placeholder text of a content control programmatically, call the SetPlaceholderText method of the content control. SetPlaceholderText has three optional parameters, of which you supply one: an AutoTextEntry, a Range, or a String. For simplicity, this example passes a String literal containing "Type text here."
ActiveDocument.ContentControls(1).SetPlaceholderText ,,"Type text here"
You can set placeholder text to appear or not appear by using the ShowPlaceholderText property.
ActiveDocument.ContentControls(1).ShowPlaceholderText = True
If you need to get the placeholder text, you can get it as an AutoTextEntry using the PlaceholderText property of the ContentControl object.
What Is XML Mapping?
XML mapping is new to Word 2007. XML mapping is a feature that enables you to create a link between a document and an XML file. This creates true data/view separation between the document formatting and the custom XML data. A more formal definition of XML mapping is to say it is a property on a content control that links the content of the content control to an XML element in a data store that is stored alongside the document. The last part of this definition refers to the Word XML Format, in which parts of a document are contained in individual XML files inside a compressed Word 2007 XML document. Inside the compressed document, in their own specific directories, are XML files that contain data mapped to content controls.
This means that XML mapping now works on XML data that is separate from the document content. This separation of data from formatting enables you to create more robust documents than you could with Word 2003. If the XML data is damaged, the document itself is untouched. In addition, any changes you make to the document formatting do not affect the structure of the XML data. This is a significant improvement over Word 2003, where you could accidentally invalidate a schema simply by moving text around in the document.
Any content controls that you map to damaged XML may not work properly, but the rest of the document remains undamaged. If something happens to the XML data that prevents it from validating against its schema, the document itself still opens. This separation also creates additional possibilities because it enables you to change, update, or delete XML data that you map to content controls, thereby changing the contents of those content controls without having to worry about what happens with the formatting of the document.
What Is Included in an XML Mapping?
XML mapping, content controls, and the new file format are related. The following section of this article focuses on XML mapping.
Custom XML Parts
The 2007 release of Microsoft Office associates every XML mapping with unique XML within the XML data store for the document. The data store provides access to all of the custom XML parts that are stored in an open file. You may refer to any node within any custom XML part inside the data store.
Every XML mapping refers to a namespace. An XML mapping refers to the root namespace of the custom XML part with which it is associated.
To create an XML mapping, you use an XPath expression to the node in the custom XML data part to which you want to map a content control. Note that after you provide an XPath, Word retains the most simplified version of it by default. For example, if you provide //s:docTitle[@initialized='true'], Word, by default, resolves and stores \document\docHeader(1)\docTitle(4). A flag allows you to create an exception to this default behavior, so that you can retain the original XPath, although this may have a minor impact on performance.
Adding XML Mapping in Word 2007
To map data to a content control, you must do two things:
- Add a data store to the document that contains the information to which you want to map.
- Set an XML mapping on a content control that refers to a node in the added data store.
The data store in a document in the Word 2007 object model is contained in a new collection of the Document object, called CustomXMLParts. A CustomXMLParts collection contains CustomXMLPart objects. It points to all the data store items that are available to a document. A CustomXMLPart object represents a single custom XML part in the data store.
To load a custom XML part, you must first add a new data store to a Document object by calling the Add method. The Add method appends a new, empty data store to the document. Because it is empty, you cannot use it yet. Next, you must load a custom XML part from an XML file into the data store, or CustomXMLPart object, by calling its Load method, using a valid path to an XML file as the parameter.
Notice that, by default, there is always at least one data store on the document. For example, if you add the following code in the VBA immediate window on a new, blank document, you get a count of one, not zero.
? ActiveDocument.CustomXMLParts.Count
1
The default custom XML part contains the document's standard document properties; you cannot delete it. You can always inspect a custom XML part by calling its read-only XML property, which returns a string that contains the XML in that data store. For example, use the VBA immediate window to call the XML property on a blank document:
? ActiveDocument.CustomXMLParts(1).XML
The following XML appears.
<?xml version="1.0" standalone="yes"?>
<CoreProperties xmlns="http://schemas.microsoft.com/package/2005/06/metadata/core-properties">
  <Title></Title>
  <Subject></Subject>
  <Creator>Microsoft Employee</Creator>
  <Keywords></Keywords>
  <Description></Description>
  <LastModifiedBy></LastModifiedBy>
  <Category/>
  <Identifier/>
  <ContentType/>
  <ContentStatus/>
  <Language/>
  <Version/>
  <Revision xmlns="http://schemas.microsoft.com/package/2005/06/metadata/core-properties">1</Revision>
  <DateCreated xmlns="http://schemas.microsoft.com/package/2005/06/metadata/core-properties">2005-06-29T20:13:00Z</DateCreated>
</CoreProperties>
You can also access a specific item in the CustomXMLParts collection by indexing it with the root namespace of the part. For example:
? ActiveDocument.CustomXMLParts("http://schemas.microsoft.com/package/2005/06/metadata/core-properties").XML
The following sample code demonstrates how to attach an XML file to a document, so that it becomes an available data store item. Remember that if you identify your first added data store by passing an index to the CustomXMLParts collection (for example, CustomXMLParts(2).Load), you must use an index of two. This is because an index of one returns the default store with standard properties for the 2007 release. Alternatively, you can index CustomXMLParts by namespace.
Dim oCustomXMLPart As Office.CustomXMLPart
Dim strXMLPartName As String
strXMLPartName = "c:\myDataStoreFiles\myXMLDataStore.xml"
' First, add a new custom XML part to the document
Set oCustomXMLPart = ActiveDocument.CustomXMLParts.Add
' Second, load the XML file into the custom XML part
oCustomXMLPart.Load strXMLPartName
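When the XML is already available as a string, a part can also be created without a file on disk. The following is a minimal sketch, assuming the released Word 2007 object model, in which the Add method of the CustomXMLParts collection accepts an optional XML string. The function name, the book XML, and the urn:books namespace are hypothetical and serve only as an illustration.

```vba
' Sketch: create a custom XML part from an in-memory string.
' AddBookPart, the XML content, and urn:books are hypothetical.
Function AddBookPart() As Office.CustomXMLPart
    Dim strXml As String
    strXml = "<book xmlns=""urn:books"">" & _
             "<AuthorFirstName>Ana</AuthorFirstName>" & _
             "</book>"

    ' Add creates the part and loads the XML in one step
    Set AddBookPart = ActiveDocument.CustomXMLParts.Add(strXml)
End Function
```

Entering ? AddBookPart().XML in the VBA immediate window prints the XML that was stored.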
After you add a data store to your document (and the data store points to a valid XML file), you are ready to map one of its nodes to a content control. To do this, pass a String containing a valid XPath to a ContentControl object using its SetMapping method. An example of doing this with an XPath that refers to a data store node containing the first name of a book's author may look like the following.
' A custom XML part was added and loaded with information
' about books.
' First, create the XPath
Dim strXPath As String
strXPath = "/s:book/s:AuthorFirstName"
' Next, create an instance of a content control to work with
Dim oContentControl As Word.ContentControl
Set oContentControl = Application.Selection.ContentControls.Add _
    (wdContentControlComboBox)
' Last, map the data using the XPath
oContentControl.XMLMapping.SetMapping strXPath
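A mapping call can also state the namespace prefix and the source part explicitly, and check whether Word accepted the mapping. The sketch below assumes the released Word 2007 signature SetMapping(XPath, PrefixMapping, Source), which returns a Boolean. The function name, the b prefix, and the urn:books namespace are hypothetical.

```vba
' Sketch: map a new text content control to a node, stating the
' prefix mapping and the source part explicitly.
' MapAuthorControl, the "b" prefix, and urn:books are hypothetical.
Function MapAuthorControl(oPart As Office.CustomXMLPart) As Boolean
    Dim oControl As Word.ContentControl
    Set oControl = Selection.ContentControls.Add(wdContentControlText)

    ' SetMapping reports whether Word created the mapping
    MapAuthorControl = oControl.XMLMapping.SetMapping( _
        "/b:book/b:AuthorFirstName", _
        "xmlns:b='urn:books'", _
        oPart)
End Function
```

Passing the part as the third argument avoids relying on which part Word would search by default.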
XPath Links and Data
The link between a content control and the data in a data store does not change. That is, the XPath link is static. This means that after you map the data to the content control, the content of the content control is linked to the content of the node that is returned by the XPath, until you explicitly remove or change the XML mapping of that content control.
If a change occurs in the node's data, then the content control automatically reflects the change. As an example, suppose a document contains a text content control (a content control of type wdContentControlText) as follows. For this example, assume that the word "pear" is the content control.
"This is the text in my document. I would like a pear."
Now, suppose the content control is mapped to a <fruitType> node of the following custom XML part.
<tree>
  <fruit>
    <fruitType>pear</fruitType>
    <fruitType>banana</fruitType>
  </fruit>
</tree>
After this content control is mapped to the <fruitType> node, changes to the node are reflected in the content control. If you write an add-in that modifies the data store by adding a new <fruitType> as the first child of <fruit>, then the content control assumes the new data. The updated custom XML part looks like this.
<tree>
  <fruit>
    <fruitType>peach</fruitType>
    <fruitType>pear</fruitType>
    <fruitType>banana</fruitType>
  </fruit>
</tree>
The text in the document now appears like this.
"This is the text in my document. I would like a peach."
Notice that the content control updates. In a different scenario, suppose the <fruit> node is data mapped to a drop-down list content control. Suppose that, in this case, the schema attached to the document specifies three possible choices for the <fruitType> element, in this order:
- "Orange"
- "Apple"
- "Banana"
Therefore, these are the options available from the drop-down list and they appear in the order that you list them in the schema. In this case, the addition to the custom XML part of the third node, containing the string "Banana", changes the selection of the drop-down list to "Banana," from whatever was selected before the addition.
What Are Dangling References?
In a case where a content control cannot be successfully mapped to a node in a linked custom XML part (for example, if the XPath is invalid), the XML mapping is said to become a dangling reference. There is no UI that indicates that a content control contains a dangling reference. The only way to confirm the existence of a dangling reference is through the object model.
When a dangling reference occurs, the content control in the document does not change in appearance.
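Because no UI reports dangling references, a short routine can look for them through the object model. The following is a minimal sketch, not a definitive test: it assumes that a dangling mapping reports IsMapped as False while its XPath property still returns the retained XPath, and that a control that was never mapped returns an empty XPath. The function name is hypothetical.

```vba
' Sketch: count content controls whose mapping no longer resolves.
' Assumption: a dangling reference has IsMapped = False but keeps
' a non-empty XPath; CountDanglingMappings is a hypothetical name.
Function CountDanglingMappings() As Long
    Dim oControl As Word.ContentControl
    Dim lngCount As Long

    For Each oControl In ActiveDocument.ContentControls
        With oControl.XMLMapping
            If Not .IsMapped Then
                If Len(.XPath) > 0 Then
                    Debug.Print "Dangling: " & .XPath
                    lngCount = lngCount + 1
                End If
            End If
        End With
    Next oControl

    CountDanglingMappings = lngCount
End Function
```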
There are two types of dangling references.
Dangling XPath References
When you replace or remove a node from a custom XML part, the potential exists for one or more XML mappings to become dangling references. This happens when the XPath of the XML mapping no longer resolves to data, which can be caused by the removal of a mapped node. A change to a custom XML part can also append a child to a mapped node that the schema specifies must be a leaf node. Adding the child turns the mapped node from a leaf into a parent node.
In these cases, the XPath used in the XML mapping does not change, only the custom XML part does. Even though the XPath does not change, it now points to data it cannot use. That is a dangling XPath reference.
Each time you update the custom XML part, the Word document determines if any dangling references are resolved. In other words, it checks to see if any dangling references now point to a valid node in the custom XML part. If so, Word immediately updates the content control to reflect the updated, valid data.
As an example, suppose a document contains a text content control (a content control of type wdContentControlText), as follows. For this example, assume that the word "banana" is the content control.
"This is the text in my document. I would like a banana."
The XPath used to map this content control to a node in a custom XML part looks like this:
/tree/fruit/fruitType[3]
This XPath tells the content control to use the data in the third <fruitType> node. The custom XML part used is below.
<tree>
  <fruit>
    <fruitType>peach</fruitType>
    <fruitType>pear</fruitType>
    <fruitType>banana</fruitType>
  </fruit>
</tree>
Now, suppose the custom XML part is modified so that the node containing the string "banana" is removed.
<tree>
  <fruit>
    <fruitType>peach</fruitType>
    <fruitType>pear</fruitType>
  </fruit>
</tree>
This results in a dangling XPath reference. However, in the document, the content control does not change and it continues to display the word "banana."
"This is the text in my document. I would like a banana."
"Banana" is displayed even though the removal of the third node under <fruit> renders the XPath invalid.
Next, suppose that later you insert a new <fruitType> node at the beginning of the list, so that the "pear" node becomes the third node under <fruit>.
<tree>
  <fruit>
    <fruitType>plum</fruitType>
    <fruitType>peach</fruitType>
    <fruitType>pear</fruitType>
  </fruit>
</tree>
When this happens, the content control immediately updates to reflect this change. Because the XPath in the content control specifies a link to the third <fruitType> node, the content control now displays the data contained in the new third node, "pear."
"This is the text in my document. I would like a pear."
Dangling Custom XML Part References
If you remove an entire custom XML part from a document or replace it, all XML mappings that refer to that custom XML part immediately become dangling custom XML part references. When this happens, Word retains the last effective XPath in each dangling custom XML part reference in the content control.
When a set of links includes dangling custom XML part references, the document attempts to reattach the links to the first custom XML part. In code, the first custom XML part is described as ActiveDocument.CustomXMLParts(1). If any of the broken links resolve to a node in the first custom XML part, then every broken mapping immediately refers to that custom XML part and their content updates. If none of the links resolves to the first custom XML part, then the document attempts to resolve the broken links in the second custom XML part. If that does not work, then it attempts to resolve the broken links in the third custom XML part, and so on, until a link is resolved or until all custom XML parts are attempted. If no link is resolved, then each broken link becomes a dangling XPath reference and retains its last valid XPath.
Dangling References When You Change Mappings
When you change the mapping property on a content control, the mapping briefly goes through a state of being a dangling reference. This is completely expected. There is an order of actions that take place when the mapping of a content control is changed:
- The link to the original node is broken, creating a dangling XPath reference.
- The content control is linked to the new node.
- The content from the new node is pulled into the content control.
To the user, this is undetectable. Knowing what happens behind the scenes, however, helps you to understand how XML mapping works.
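The dangling XPath scenario with the three <fruitType> nodes can be reproduced in code to watch the mapping break and resolve again. The sketch below is an illustration under assumptions: it uses the released Office object model (CustomXMLParts.Add with an XML string, SelectSingleNode, AddNode, and the Text property of a node), and the procedure name is hypothetical. It prints what Word reports at each step rather than asserting a particular result.

```vba
' Sketch: reproduce the fruit scenario. DemoDanglingXPath is a
' hypothetical name; the XML comes from the example in the text.
Sub DemoDanglingXPath()
    Dim oPart As Office.CustomXMLPart
    Dim oControl As Word.ContentControl
    Dim oFruit As Office.CustomXMLNode

    Set oPart = ActiveDocument.CustomXMLParts.Add( _
        "<tree><fruit><fruitType>peach</fruitType>" & _
        "<fruitType>pear</fruitType>" & _
        "<fruitType>banana</fruitType></fruit></tree>")

    ' Map a new text content control to the third <fruitType> node
    Set oControl = Selection.ContentControls.Add(wdContentControlText)
    oControl.XMLMapping.SetMapping "/tree/fruit/fruitType[3]", , oPart

    ' Remove the third node; the XPath no longer resolves
    oPart.SelectSingleNode("/tree/fruit/fruitType[3]").Delete
    Debug.Print "Mapped after delete: " & oControl.XMLMapping.IsMapped

    ' Insert a new first child so that "pear" becomes the third node
    Set oFruit = oPart.SelectSingleNode("/tree/fruit")
    oPart.AddNode oFruit, "fruitType", , oFruit.FirstChild, _
        msoCustomXMLNodeElement
    oFruit.FirstChild.Text = "plum"

    Debug.Print "Mapped after insert: " & oControl.XMLMapping.IsMapped
    Debug.Print "Control text: " & oControl.Range.Text
End Sub
```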
New Objects and Collections in the Word 2007 Object Model
The following new items in the Word 2007 object model pertain to XML mapping.
Note: These new objects and collections are also present in the Microsoft Office Excel 2007 object model and the Microsoft Office PowerPoint 2007 object model, which also use XML mapping.
- CustomXMLPrefixMappings. Contains mappings between namespaces and prefixes.
- CustomXMLSchemaCollection. This collection represents schemas that are, or that are attached to, a custom XML part in the data store.
- CustomXMLSchema. This object represents a schema in a CustomXMLSchemaCollection.
- CustomXMLParts. This collection represents a set of CustomXMLPart objects. It is also called the data store of a document.
- CustomXMLPart. This object represents a single custom XML part in the data store.
- CustomXMLNodes. This collection represents a set of CustomXMLNode objects in the current document.
- CustomXMLNode. This object represents an XML node. A CustomXMLNode may be one of seven types in the new msoXMLDataNodeType enumeration.
Communicating with Custom XML Parts
The primary role of the interfaces is to enable users to get and work with a custom XML part that is associated with a document. Using the two interfaces through the CustomXMLParts collection and the CustomXMLPart object, the user can accomplish the following tasks.
Getting, Creating, and Deleting Custom XML Parts
- Create an XML store item.
This is the same as creating a CustomXMLPart object in the CustomXMLParts collection of a document. To do this, call the Add method of the CustomXMLParts collection. The Add method creates a new CustomXMLPart, or data store, on the document.
- Enumerate a set of store items, or CustomXMLPart objects.
You can enumerate the collection of data stores on a document by using the Item method of the CustomXMLParts collection. This Item method can take a Long parameter that specifies the index of the store you want or a String parameter that specifies the root namespace of the data store you want. Note that, if more than one CustomXMLPart object matches this root namespace, Word returns the first match in the index order.
- Get the interface for an existing store item.
You can do this by using the same Item method that enumerates the CustomXMLParts collection. You can also use two other methods. Each returns a CustomXMLPart object. These are the SelectByID method, which takes a String parameter containing the ID of the desired store, and the SelectByNamespace method, which takes a String parameter containing the root namespace of the desired store.
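The enumeration and lookup tasks above can be sketched as follows. This is an illustration that assumes the released Office object model, where, as far as I know, SelectByID returns a single CustomXMLPart and SelectByNamespace returns a CustomXMLParts collection of every part with that root namespace. The procedure name is hypothetical.

```vba
' Sketch: enumerate the data store and look parts up again.
' ListCustomXmlParts is a hypothetical name.
Sub ListCustomXmlParts()
    Dim oPart As Office.CustomXMLPart
    Dim oSame As Office.CustomXMLPart
    Dim oMatches As Office.CustomXMLParts

    For Each oPart In ActiveDocument.CustomXMLParts
        Debug.Print oPart.Id, oPart.NamespaceURI, oPart.BuiltIn

        ' Fetch the same part again by its ID
        Set oSame = ActiveDocument.CustomXMLParts.SelectByID(oPart.Id)
        Debug.Print "  found by ID: " & CStr(Not oSame Is Nothing)

        ' Count every part that shares this root namespace
        Set oMatches = ActiveDocument.CustomXMLParts.SelectByNamespace( _
            oPart.NamespaceURI)
        Debug.Print "  parts in namespace: " & oMatches.Count
    Next oPart
End Sub
```

Looking a part up by ID is the more stable choice when several parts share one root namespace.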
Working with Custom XML Parts
- Add to the custom XML part.
The CustomXMLPart object, which represents a custom XML part, has an AddNode method that adds an XML node to the part. When calling the AddNode method, a CustomXMLNode parameter containing a parent node is required, and the new node becomes the child of that node. Optional parameters include:
- A String containing the name of the node to add.
- A String containing the NamespaceURI of the node to add.
- A CustomXMLNode specifying which node should be the next sibling of the added node.
- An MsoCustomXMLNodeType specifying the type of node to add.
- A String containing the value of the node (valid on text nodes only).
You specify the type of node to add by passing an msoXMLDataNodeType enumeration value to the AddNode method.
- Replace parts of the custom XML part.
The CustomXMLNode object, which represents a node in a custom XML part, contains two methods: ReplaceChildNode and ReplaceChildSubtree. The first method replaces a single node and the second method replaces a node and its children. Both methods require that you specify the node to remove and the node to replace it with, as parameters.
- Delete the custom XML part.
You can delete an entire custom XML part by calling the Delete method of the CustomXMLPart object that you want to remove. You can also delete a single node by calling the Delete method of the CustomXMLNode object that you want to remove.
- Get and set values inside the custom XML part.
Values are retrievable from a single node or from entire subtrees of the custom XML part. The CustomXMLNode object contains a writeable property called NodeValue, which can either get or set the text in the node. Note that this works only on nodes that contain text. Text nodes are those of types msoXMLNodeText, msoXMLNodeComment, msoXMLNodeProcessingInstruction, and msoXMLNodeAttribute. To get the value of a subtree, use the read-only GetNodes property of a CustomXMLNode object, which returns a CustomXMLNodes collection, and then iterate through the collection calling NodeValue on each.
- Listen to events on the custom XML part.
The client can listen for and respond to changes on a node and, if desired, on all of its children. An add-in can respond to the following events.
On the CustomXMLParts collection:
- StreamAfterAdd. Allows a client to respond after a new store is added to the document.
- StreamBeforeDelete. Allows a client to respond before a store is removed from the document.
- StreamAfterLoad. Allows a client to respond after a store item is loaded with XML.
On the CustomXMLPart object:
- NodeAfterInsert. Allows a client to respond after a new node is added to a store. If the added node contains a subtree, the event fires once only, for the top-most node.
- NodeAfterDelete. Allows a client to respond after a node is deleted. If the deleted node contains a subtree, the event fires once only, for the top-most node.
- NodeAfterReplace. Allows a client to respond after an XML node is replaced in the store.
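The add, get, set, and delete tasks in this list can be sketched together in one routine. The example assumes the released Office object model, where the members used here are AddNode, DocumentElement, ChildNodes, BaseName, Text, SelectSingleNode, and Delete; member names in the prerelease build described above may differ. The function name and the <order> XML are hypothetical.

```vba
' Sketch: add, read, update, and delete nodes in a custom XML part.
' DemoNodeEditing and the <order> XML are hypothetical.
Function DemoNodeEditing() As String
    Dim oPart As Office.CustomXMLPart
    Dim oRoot As Office.CustomXMLNode
    Dim oNode As Office.CustomXMLNode

    Set oPart = ActiveDocument.CustomXMLParts.Add( _
        "<order><id>1</id></order>")
    Set oRoot = oPart.DocumentElement

    ' Add: append an empty <status> element under the root
    oPart.AddNode oRoot, "status"

    ' Set: write text into the new element
    oRoot.LastChild.Text = "open"

    ' Get: walk the children and print each name and value
    For Each oNode In oRoot.ChildNodes
        Debug.Print oNode.BaseName & " = " & oNode.Text
    Next oNode

    ' Delete a single node, then capture what is left
    oPart.SelectSingleNode("/order/id").Delete
    DemoNodeEditing = oPart.XML

    ' Delete the whole part so that the document is left unchanged
    oPart.Delete
End Function
```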
Modifying Mapped Data Through the Document
You can update data in an XML node that you map to a content control directly through the document. To demonstrate this, assume the following XML file, titled "test.xml", is located off the root of drive C.
<?xml version="1.0" standalone="no"?>
<root xmlns="urn:test">
  <a>NodeA</a>
  <b>NodeB</b>
</root>
Next, assume that the user inserts two plain text content controls into an otherwise blank document. At this point, you have not mapped the content controls to any data or added any custom XML parts to the document.
Now, add the XML file, C:\test.xml, as a custom XML part. The first of the following two lines of code creates and adds an empty custom XML part to the active document. The second line loads the XML file into the newly created custom XML part. Remember that there is already one default XML part containing document properties, so the first custom XML part that you add has an index of two.
ActiveDocument.CustomXMLParts.Add
ActiveDocument.CustomXMLParts(2).Load "c:\test.xml"
Next, map one text content control to the <a> node and the other to the <b> node by passing an XPath to the appropriate node to a content control in the active document's ContentControls collection:
ActiveDocument.ContentControls(1).XMLMapping.SetMapping "s:root/s:a"
ActiveDocument.ContentControls(2).XMLMapping.SetMapping "s:root/s:b"
After you execute these two lines, each text content control displays the text, or data, of the node to which it is respectively mapped. Therefore, in the document, the first text content control displays "NodeA" and the second one displays "NodeB".
It is assumed that neither text block is locked against content changes. Now, suppose a user edits the text of the first node in the document to be "Hello." When this happens, the data in the XML part <a> node instantly changes to be "Hello." To verify this, enter the following line of code into the VBA immediate window and execute it.
? ActiveDocument.CustomXMLParts(2).SelectSingleNode("s:root/s:a").NodeValue
<a xmlns="urn:test">Hello</a>
As expected, it returns node <a> with the word "Hello" as its data.
You can now see the tight link between a data-mapped content control and the custom XML part to which it is mapped. The data contained in mapped nodes can change programmatically with an add-in or directly in the document. Note that while an example like this requires the contents of the content control to be unlocked, you can still lock the content control from deletion.
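The same link also works in the other direction: a change made to the node in code shows up in the content control. The following minimal sketch assumes that the first content control in the document is already mapped, and it uses the IsMapped and CustomXMLNode members of XMLMapping from the released object model. The function name is hypothetical.

```vba
' Sketch: change a mapped node in code, then read the control's text.
' Assumes ContentControls(1) is already mapped to a node;
' UpdateMappedNode is a hypothetical name.
Function UpdateMappedNode(strNewText As String) As String
    Dim oControl As Word.ContentControl
    Set oControl = ActiveDocument.ContentControls(1)

    If oControl.XMLMapping.IsMapped Then
        ' Write to the node; Word pushes the value into the control
        oControl.XMLMapping.CustomXMLNode.Text = strNewText
    End If

    UpdateMappedNode = oControl.Range.Text
End Function
```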
XML Mapping and Events
Assume the same scenario just described in the previous section, that there is an XML file, C:\test.xml, and two text content controls. The XML file looks like this:
<?xml version="1.0" standalone="no"?>
<root xmlns="urn:test">
  <a>NodeA</a>
  <b>NodeB</b>
</root>
One of the powerful things that you can accomplish with XML mapping is to have one mapped text content control update immediately when a user updates another one. This is accomplished using events. First, create a method with events and run it.
Dim WithEvents oStream As CustomXMLPart

Sub Demo()
    Set oStream = ThisDocument.CustomXMLParts(2)
End Sub
Running the Demo subroutine sets up the oStream object to listen to events.
Remember from the previous scenario that the document has two text content controls, one data mapped to the <a> node and the other data mapped to the <b> node. You want to set up events so that when the <a> node is modified, the <b> node automatically does something. The following oStream_NodeAfterReplace subroutine accomplishes this.
Private Sub oStream_NodeAfterReplace(ByVal OldNode As Office.CustomXMLNode, _
        ByVal NewNode As Office.CustomXMLNode, ByVal InUndoRedo As Boolean)
    ' Check if NewNode, which is the node after the change, is
    ' the "a" node by looking at the BaseName of its ParentNode.
    If NewNode.ParentNode.BaseName = "a" Then
        oStream.DocumentElement.LastChild.Text = "You changed a!"
    End If
End Sub
This routine is triggered after the user changes the text in the first text content control, mapped to element <a>. If the <a> node changes, then the text of the last child in the custom XML part is updated. Because the stream has only two nodes, the last node is the <b> node. After the text of the <b> node is updated, the updated text of "You changed a!" automatically appears in the second text content control.
While this example is very simple, it shows what you can do with events, XML mapping, and content controls. You can use code such as this to update any text in a document when one text content control changes. This is powerful because it assumes nothing about the document formatting, and it does not work with the document formatting. Instead, it works against the schema that you attach to the document.
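The insert and delete events can be handled in the same way as the replace event. The sketch below is meant for the ThisDocument module and assumes the parameter lists of the released Office object model (NodeAfterInsert with NewNode and InUndoRedo; NodeAfterDelete with OldNode, OldParentNode, OldNextSibling, and InUndoRedo). The variable and procedure names are hypothetical.

```vba
' Sketch for the ThisDocument module: log inserts and deletes.
' oWatched and StartWatching are hypothetical names; the event
' signatures are assumed from the released Office object model.
Private WithEvents oWatched As Office.CustomXMLPart

Sub StartWatching()
    ' Index 2 assumes one custom part was added after the default part
    Set oWatched = ThisDocument.CustomXMLParts(2)
End Sub

Private Sub oWatched_NodeAfterInsert( _
        ByVal NewNode As Office.CustomXMLNode, _
        ByVal InUndoRedo As Boolean)
    Debug.Print "Inserted: " & NewNode.BaseName
End Sub

Private Sub oWatched_NodeAfterDelete( _
        ByVal OldNode As Office.CustomXMLNode, _
        ByVal OldParentNode As Office.CustomXMLNode, _
        ByVal OldNextSibling As Office.CustomXMLNode, _
        ByVal InUndoRedo As Boolean)
    Debug.Print "Deleted a child of: " & OldParentNode.BaseName
End Sub
```

Run StartWatching once before editing the part; the handlers then write to the immediate window as nodes are added or removed.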
XML Mapping and the Word XML Format
In the new Word XML Format, each custom part persists in its own XML part in the document container, which contains the file name and its relationship information. In Word 2007, the XML part is stored off the root of a file's container in a folder called dataStore.
The relationship file, stored inside a _rels folder, describes all the relationships from one XML part to all other XML parts within a Word XML document. There are two relationship types for custom XML parts:
- The relationship type for the XML is:
- The relationship type for the XML properties is:
An ID is stored with each relationship, enabling you to identify it uniquely within the data store.
The actual custom XML part is stored in its own file alongside the _rels folder. The file format for a custom XML part looks like the following.
<o:dataStoreItem>
  <o:dataStoreItem o:itemID="<ID for the custom XML part>"/>
  <o:schemaRefs>
    <o:schemaRef o:uri="<target namespace for schema>"/>
  </o:schemaRefs>
</o:dataStoreItem>
Templates are frequently created in Word. They are so popular that an entire section of the Office Online Web site is dedicated to them. If you are curious, see Templates. The developers and designers of Word listened to the Word community and answered with new features in Word 2007 that enable template designers to create more robust and rich templates. Content controls give template designers tools to easily add form-like content in predesigned pieces (such as drop-down menus, text blocks, calendars, and pictures) to a document. You can also lock these content controls to prevent accidental modification by end users.
Additionally, Word 2007 enhances the way documents work with custom XML, allowing simple mapping of data to content controls. Word separates XML data from the presentation of the document so that it is easier to modify both data and formatting programmatically. New events in the object model let you create quick and simple add-ins that update data using content controls mapped to elements in custom XML attached to a document.
This new functionality greatly increases the speed with which a template designer creates documents. Not only that, but the templates are more user-friendly and more robust. You can load content controls with a wealth of information by mapping to custom XML data.
For more information about developer enhancements in Word 2007, see these resources:
- Microsoft Windows Software Development Kit (SDK) for the February 2006 Community Technology Preview (CTP) for Windows Vista and WinFX Runtime Components
- Ecma International Download: Office Open XML Document Interchange Specification
- Blog: Brian Jones: Open XML Formats
- Channel 9 Video: Office 12 – Word-to-PDF File Translation
- Channel 9 Video: Open XML File Formats
- Channel 9 Video: Brian Jones - New Office File Formats Announced
Thank you to Mark Iverson for his contributions to this article.