Automating the Creation of Data-Rich Business Documents with Word 2007 and Visual Basic 2005

 

Ed Robinson, Intergen Ltd

Updated: March 2007

Applies to:
   Visual Basic 2005
   Word 2007

Summary: In Word, automation is a great mechanism for populating business documents (like invoices and reports) with data from backend systems. Learn three useful methods for populating Word documents with data. (11 printed pages)

Download the associated WordAutomationSample.exe sample code.

Contents

Introduction
Understanding How Word Automation works
What's in the Sample?
Fundamental Automation Techniques
Populating Documents with Data
VBA, Automation, and VSTO
Conclusion

Introduction

Visual Basic has always been the premier language for Microsoft Word automation. Word automation is a technique for "remote controlling" Word to start, load a document, add content, print, save and exit. Because Word is a fantastic tool for creating rich documents, automation is a natural technique for mechanizing the production of great looking reports, invoices, letters and other documents that combine structured data with unstructured content.

That said, automating Word to create data-rich business documents has long been a black art—a tedious process of trial and error full of crashes and reboots. Thankfully, enhancements in Visual Basic 2005 and Word 2007 make the process of populating invoices and updating reports with data from back-end systems a whole lot easier and more fun.

Word 2007 is more programmable than ever before with support for Visual Basic for Applications (VBA), Visual Studio Tools for Office (VSTO), and automation. The Word application itself is also more robust and easier to program. Visual Basic 2005 is a natural complement for Word 2007 with excellent support for interoperating with COM-based Office applications, powerful debugging, and an awesome editing environment. This combination of robustness in Word and great programming features in Visual Basic 2005 makes for a great RAD experience and enables Visual Basic programmers to create enhanced applications that will drive the next wave of productivity improvements for information workers.

Understanding How Word Automation works

Microsoft Word 2007 is a COM application and can be automated using a COM object library installed with Word. If you've had experience writing VBA Word macros, many of the objects used in automation will be familiar to you, since VBA uses the same object library. Because Visual Basic 2005 has great support for COM Interop, the process for automating Word is straightforward. To make programming even easier, Visual Basic 2005 includes F1 help for most of the objects in the Word 2007 library. The only limitation is that the help library is one version behind, targeting Word 2003. When writing the sample code for this article, we found it useful to refer to the latest online documentation, which can be found on MSDN: http://msdn2.microsoft.com/en-us/library/bb244515.aspx.

To automate Word

  1. Create a Word Application object.
  2. Create a new document (or open an existing document).
  3. Update the content.
  4. Print, save, or mail merge the document.

The Word object model has almost 300 objects, but in practice there are only four objects with which you should be familiar:

  • Application object
  • Document object
  • Paragraph object
  • Selection object

Application Object

The Application object represents the Word 2007 application itself. Creating an Application object is the first step in automation. When you create an Application object, a new instance of Word is started, enabling you to programmatically work with documents. Although Word can stay hidden during the automation process, most programmers choose to make Word visible so that users can see documents dynamically update. Only one instance of the application object needs to be created; you can use this single instance to work with multiple documents.

Document Object

After creating the Application object, the next step is to create or load one or more Document objects. The Document object represents a document in Word and you can programmatically add content just as you would when using Word in everyday situations.

Paragraph Object

Every document contains one or more paragraphs of content. In the automation programming model, a Paragraph object is used to represent each of these paragraphs. A paragraph can contain text, fields, a table or picture. Navigating the paragraphs in a document helps with placing your content in exactly the right place.

Selection Object

The Selection object represents the area of content currently selected in Word. If nothing is selected, it represents the insertion point. The Selection object is used to add content at the insertion point, either overwriting the current selection or inserting new content. Many programmers find the Selection object easier to use than the Paragraph object when adding content to a document. Although the two objects are similar, the Selection object differs from the Paragraph object in two important ways:

  • The Selection object is a member of the application object, not the document object. This reflects that the application has a single "active" document with a single insertion point at any one time.
  • The Selection object cannot be created. Instead, the Application object always contains a single Selection object that can be used without explicitly creating it.

Note   For more information on these and other members of the Word object model, see the MSDN article Word Object Model Overview.

Together with understanding the Word object model, it pays to understand the best practices for working with Word from Visual Basic 2005. Because Word is a separate application ("loosely coupled" to your Visual Basic 2005 application), many unexpected events can happen. For example, the user might close the Word application while your code is automating it. An unexpected problem might cause Word to show a dialog box, halting your automation code until the dialog box is dismissed. The user may inadvertently click around the document, causing the insertion point to move, or type something into the document. To handle these conditions gracefully, it is good practice to add validation and error checking to your automation code. The sample application demonstrates how.

What's in the Sample?

Let's start by looking at the sample application that accompanies this article.

The sample application—"WordAutomation"—demonstrates how to start Word, create a document, insert text and tables, replace fields with your own text, perform a mail merge, and create a Word 2007 table from an ADO.NET DataTable.

First we'll need to make sure your machine has all the necessary prerequisites installed:

  • Visual Studio 2005
  • Microsoft Word 2007

After installing the prerequisites, unzip the sample files to a directory on your local hard disk drive. To ensure the database connection remains valid, we recommend installing the sample to a directory named c:\WordAutomation (the sample solution file should be c:\WordAutomation\VB2005\WordAutomation.sln).

If you choose to place the files in a different folder, you'll need to update the WordAuto.My.MySettings.ConnectionString setting in the App.Config file to point to the new database path for the Microsoft Access database CustomersDB.mdb.

Now we're ready to walk through the sample.

  1. Open the Solution WordAutomation.sln.
  2. Press F5 to compile and run the application.
    Note   When the WordAutomation application runs, it opens a single window with buttons for automating Word. This window is set to stay on top of other windows, allowing easy access to the automation buttons while the Word document is updated in the background.

    Figure 1. WordAutomation application window, with buttons for automating Word.

  3. Click the New Document button, and the WordAutomation application will create a new Word application and document.
  4. Click Insert Text and Insert Table to add text and tables. The application will automatically apply a style to any tables you insert.

    Figure 2. Insert text and tables into your Word document.

  5. Click the Close button to close the document in preparation for the next step in the walkthrough.
  6. Click the Replace Field button to open the FormSample.docx document and replace the text in a content control.
  7. Click Insert As Table to copy the information in the DataGrid into a new table at the end of the document.
    Note   The sample application includes a reusable library so you can easily copy an ADO.NET DataTable to a table in a Word document in your own applications.

    Figure 3. Copy the information in the DataGrid into a new table at the end of the Word document.

  8. Click the Mail Merge button to run a mail merge to a new document.
    Note   This feature automates the entire mail merge process: opening a template document, configuring the mail merge DataSource, and performing the merge.

    Figure 4. The Mail Merge button will initiate a mail merge to a new document.

Fundamental Automation Techniques

The following code snippets illustrate the techniques used in the sample application.

Setting Up

You can try the following techniques in any Visual Basic client application. Before beginning, there are a few things you need to do to set up your project for Word automation:

  1. Add a project reference to the COM object Microsoft Word 12.0 Object Library.
  2. Import the Microsoft.Office.Interop namespace by adding the statement Imports Microsoft.Office.Interop to the top of any module, form, or class that contains Word automation code.
  3. If you are planning to use the Application or Document objects in multiple methods, you should consider defining application and document variables as member variables of the parent form or class so they don't go out of scope and your application won't lose its connection to Word.
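
The setup steps above can be sketched as a small class. This is only an illustration, assuming the project already references the Microsoft Word 12.0 Object Library; the class name WordHost and its method names are hypothetical and do not come from the sample application.

```vb
Imports Microsoft.Office.Interop

' Hypothetical wrapper class; the names are illustrative only.
Public Class WordHost
    ' Member variables keep the references to Word alive
    ' between method calls.
    Private axWord As Word.Application
    Private axDoc As Word.Document

    ' Starts Word and makes it visible.
    Public Sub StartWord()
        axWord = New Word.Application()
        axWord.Visible = True
    End Sub

    ' Creates a new document, starting Word first if necessary.
    Public Sub NewDocument()
        If axWord Is Nothing Then
            StartWord()
        End If
        axDoc = axWord.Documents.Add()
    End Sub
End Class
```

Because axWord and axDoc are fields rather than local variables, a later method such as an Insert Text button handler can keep using the same Word instance and document.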

Starting and Quitting Word

To start Word, use the following code, which creates a Word.Application object and makes Word visible:

Private axWord As Word.Application
axWord = New Word.Application
axWord.Visible = True

To quit Word, use this code:

axWord.Quit
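
If a document has unsaved changes, Word may stop and ask the user what to do when it quits. The following is a minimal sketch of one way to quit without a prompt and then release the COM reference. The module and procedure names are hypothetical, and discarding unsaved changes is an assumption that may not suit your application.

```vb
Imports System.Runtime.InteropServices
Imports Microsoft.Office.Interop

Module QuitSketch
    ' Hypothetical helper: quits Word without saving and releases
    ' the COM reference held by the caller.
    Sub QuitWord(ByRef axWord As Word.Application)
        If axWord Is Nothing Then
            Return
        End If
        Try
            ' Assumption: unsaved changes can be discarded, so Word
            ' does not need to prompt the user.
            axWord.Quit(Word.WdSaveOptions.wdDoNotSaveChanges)
        Catch ex As Exception
            ' Word may already have been closed by the user.
        Finally
            Marshal.ReleaseComObject(axWord)
            axWord = Nothing
        End Try
    End Sub
End Module
```

Passing the variable ByRef lets the helper set the caller's reference to Nothing, so later code can test for Nothing before trying to use Word again.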

Creating and Opening documents

To create a new document, first start Word, then create a document using the Application.Documents.Add method:

Dim axDoc As Word.Document
axDoc = axWord.Documents.Add

To open a document, use the Application.Documents.Open method:

Dim axDoc As Word.Document
axDoc = axWord.Documents.Open("c:\MyDocument.docx")

To save a document, use the Document.SaveAs method:

axDoc.SaveAs("C:\MyDocument.docx")
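
SaveAs also accepts an optional file format, and a document can be closed once it has been saved. The sketch below saves a copy in the older Word 97-2003 format and then closes the document. The procedure name and file path are placeholders, not part of the sample application.

```vb
Imports Microsoft.Office.Interop

Module SaveSketch
    ' Hypothetical helper: saves in Word 97-2003 format, then closes.
    ' The file path is a placeholder; substitute your own.
    Sub SaveLegacyCopyAndClose(ByVal axDoc As Word.Document)
        axDoc.SaveAs("C:\MyDocument.doc", _
            Word.WdSaveFormat.wdFormatDocument)
        ' The document was just saved, so close it without prompting.
        axDoc.Close(Word.WdSaveOptions.wdDoNotSaveChanges)
    End Sub
End Module
```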

Reading and Inserting Text

When automating Word, you will often find multiple methods to do the same thing. Reading and inserting text is a good example of this. You can read and insert text using a number of different objects, but the most commonly used are the Paragraph and Selection objects. We'll look at the Paragraph object first.

A Word document is composed of a collection of paragraphs. Paragraphs are sequentially numbered starting from 1, and include all text and content up to the next carriage return. Here is how to read the contents of the first paragraph of a document into a string:

Dim strText As String
strText = axDoc.Paragraphs(1).Range.Text

Here is how to set the text of the first paragraph:

axDoc.Paragraphs(1).Range.Text = "Hello from Visual Basic 2005"

The Selection object is the most flexible method for inserting text. Automating the Selection object is very similar to how you use Word when you write a document—first you use the selection object to move the insertion point to a location in the document, and then you insert some text. The following snippet uses the Selection object to insert text at the end of a document, insert text at the beginning of a document, then find the word "Foo" and insert text immediately after.

'Activate the document first
axDoc.Activate()
'Move to the end and add text
axWord.Selection.EndKey(Word.WdUnits.wdStory)
axWord.Selection.TypeText("This is the end")
'Move to the beginning and add text
axWord.Selection.HomeKey(Word.WdUnits.wdStory)
axWord.Selection.TypeText("This is the beginning")
'Locate Foo, then add text
'following it
axWord.Selection.Find.ClearFormatting()
axWord.Selection.Find.Text = "Foo"
axWord.Selection.Find.Execute()
axWord.Selection.MoveRight(Word.WdUnits.wdCharacter, 1)
axWord.Selection.TypeText("This is Foo")
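
The Selection approach depends on the insertion point, which the user can move while your code is running. As an alternative, here is a minimal sketch that performs the same find-and-insert using a Range, so it does not touch the Selection at all. The module and function names are hypothetical. The sketch also checks the Boolean result of Find.Execute, so nothing is inserted when the search text is missing.

```vb
Imports Microsoft.Office.Interop

Module FindSketch
    ' Hypothetical helper: inserts newText immediately after the first
    ' occurrence of searchText. Returns False if searchText is not found.
    Function InsertAfterText(ByVal axDoc As Word.Document, _
                             ByVal searchText As String, _
                             ByVal newText As String) As Boolean
        ' Search the whole document body.
        Dim rng As Word.Range = axDoc.Content
        rng.Find.ClearFormatting()
        rng.Find.Text = searchText
        If rng.Find.Execute() Then
            ' After a successful find, rng covers the matched text.
            rng.InsertAfter(newText)
            Return True
        End If
        Return False
    End Function
End Module
```

For example, InsertAfterText(axDoc, "Foo", "This is Foo") mirrors the last step of the Selection snippet.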

Inserting tables

Inserting tables is similar to inserting text. The following snippet inserts a 5x5 table at the end of a document, applies a style to the table, and changes the text of a cell within the table to read "Hello World":

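'Move the insertion point to the end of the document
axDoc.Activate()
axWord.Selection.EndKey(Word.WdUnits.wdStory)
'Insert a 5x5 table, style it, and fill the center cell
Dim axTable As Word.Table
axTable = axDoc.Tables.Add(axWord.Selection.Range, 5, 5)
axTable.Style = "Table Grid 8"
axTable.Cell(3, 3).Range.Text = "Hello World"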

Printing

To print a document, use the PrintOut method:

axDoc.PrintOut()

Robustness

Because the user may close your document—or Word itself—while your application is automating, it is worth your time to check that the document is still open before you attempt to automate the document. An easy way to do this is to attempt to reference the Document.Name property. If referencing the Name property raises an exception, then the document is not available and your application can react gracefully. The following snippet demonstrates this technique:

Dim strName As String
Dim blnIsAvailable As Boolean
Try
    strName = axDoc.Name
    blnIsAvailable = True
Catch ex As Exception
    blnIsAvailable = False
End Try
MsgBox("Document is available: " & blnIsAvailable)

Populating Documents with Data

Word is often used for creating business documents like invoices and reports combining business data with human readable text. Automation is a great mechanism for populating invoices and reports with data from backend systems. The sample application demonstrates three useful methods for populating Word documents with data.

Copying an ADO.NET DataTable into a Word Table

The sample application includes a method you can use as-is or modify for inserting the contents of a DataTable into a table inside a Word document. This is a useful mechanism when the data you want to add to a Word document is in table format. Here is the signature for the method which is included in the clsWordDoc class:

Function AddDataTable(ByVal tbl As DataTable) As Boolean
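
The body of the sample's method is not reproduced in this article. As a rough idea of how such a method could work, here is a minimal sketch; it is not the sample's actual implementation. Unlike the clsWordDoc method, this hypothetical version takes the target document as a parameter. It appends the table at Word's predefined "\endofdoc" bookmark and writes the column names into the first row.

```vb
Imports System.Data
Imports Microsoft.Office.Interop

Module DataTableSketch
    ' Hypothetical sketch, not the sample's implementation.
    ' Copies tbl into a new Word table at the end of axDoc.
    Function AddDataTable(ByVal axDoc As Word.Document, _
                          ByVal tbl As DataTable) As Boolean
        Try
            ' "\endofdoc" is a predefined Word bookmark.
            Dim rng As Word.Range = _
                axDoc.Bookmarks.Item("\endofdoc").Range

            ' One extra row holds the column headings.
            Dim axTable As Word.Table = axDoc.Tables.Add( _
                rng, tbl.Rows.Count + 1, tbl.Columns.Count)

            For col As Integer = 0 To tbl.Columns.Count - 1
                ' Word cell indexes start at 1; DataTable indexes at 0.
                axTable.Cell(1, col + 1).Range.Text = _
                    tbl.Columns(col).ColumnName
                For row As Integer = 0 To tbl.Rows.Count - 1
                    axTable.Cell(row + 2, col + 1).Range.Text = _
                        tbl.Rows(row)(col).ToString()
                Next
            Next
            Return True
        Catch ex As Exception
            ' The document may be closed, or the table may be empty.
            Return False
        End Try
    End Function
End Module
```

Writing cell by cell is simple but can be slow for large tables, so treat this as a starting point rather than a tuned routine.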

Content Controls

Word 2007 introduces a new mechanism for inserting data into documents named "content controls." Content controls are fields that can be edited manually, filled programmatically, or automatically populated from data in an XML file. There are nine different types of content control, including text, picture, and ComboBox. For information on how to add and use content controls in a Word document, see the article Content Controls.

Content controls are referenced by using the Document object's ContentControls collection. In this snippet, we programmatically set the text of the first content control in a document to read "Content Control Field":

Dim ccCollection As Word.ContentControls
ccCollection = axDoc.ContentControls
ccCollection.Item(1).Range.Text = "Content Control Field"

Figure 5. Using the ContentControls collection of the Document object to set the text "Content Control Field" in a document.
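
Referring to a content control by position breaks if someone later adds or reorders controls in the document. As a hedged alternative, the sketch below looks controls up by title using the Document object's SelectContentControlsByTitle method. The helper name is hypothetical, and the sketch assumes the matching controls are text controls whose contents are not locked.

```vb
Imports Microsoft.Office.Interop

Module ContentControlSketch
    ' Hypothetical helper: sets the text of every content control
    ' with the given title and returns how many were updated.
    Function SetContentControlText(ByVal axDoc As Word.Document, _
                                   ByVal title As String, _
                                   ByVal value As String) As Integer
        Dim matches As Word.ContentControls = _
            axDoc.SelectContentControlsByTitle(title)
        For Each cc As Word.ContentControl In matches
            ' Assumption: cc is a text control that can be edited.
            cc.Range.Text = value
        Next
        Return matches.Count
    End Function
End Module
```

A return value of 0 tells the caller that the document contains no control with that title.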

Mail Merge

Word mail merges are used to create a set of personalized envelopes or letters by filling mail merge fields in a document with information from a database or file. For example, names and addresses in a database can be used to populate name and address fields in a document. When a user performs a mail merge, Word combines the source document and database information to generate a new personalized envelope or letter for each record in the database. Mail merges are often a time-consuming, manual process. The sample application shows how to automate the entire process.

To perform a mail merge

  1. Open the source document.
  2. Set the Mail Merge DataSource to point to the database.
  3. Execute the merge. Word produces a new document containing a set of letters.

The sample application uses the following code to perform the process silently with no user intervention:

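'Activate the source document and mark it as a form letter
axDoc.Activate()
axDoc.MailMerge.MainDocumentType = _
    Word.WdMailMergeMainDocType.wdFormLetters
'Attach the data source; the empty strings and False values are
'options this example does not use (passwords, conversions, and so on)
axDoc.MailMerge.OpenDataSource(strDatabaseFilename, _
    Word.WdOpenFormat.wdOpenFormatAuto, False, False, False, False, _
    "", "", False, "", "", strConnectionString, strSQL, "", False, _
    Word.WdMergeSubType.wdMergeSubTypeOAL)
'Merge every record into a new document
axDoc.MailMerge.DataSource.FirstRecord = _
    Word.WdMailMergeDefaultRecord.wdDefaultFirstRecord
axDoc.MailMerge.DataSource.LastRecord = _
    Word.WdMailMergeDefaultRecord.wdDefaultLastRecord
axDoc.MailMerge.Destination = _
    Word.WdMailMergeDestination.wdSendToNewDocument
axDoc.MailMerge.Execute(False)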

Although the sample makes the process look simple, automating a mail merge requires careful configuration of the DataSource for the following reasons:

  • Word is particular about the connection string used. If Word has difficulty interpreting any of the settings, it will open a dialog box prompting the user to manually define the DataSource.
  • The easiest DataSource to set up is a Microsoft Office Address List (this is what the sample uses). A Microsoft Office Address List is an Access 2007 database that can be created from Word 2007.
  • Other types of data sources can be used for automation, but they may require careful configuration to ensure silent operation.
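
Given these pitfalls, it can help to confirm that a data source is attached before running the merge. The following is a minimal sketch with a hypothetical helper name. It checks the MailMerge.State property after OpenDataSource has been called and runs the merge only if Word reports both a main document and a data source. It does not prevent Word from showing a dialog box during OpenDataSource itself.

```vb
Imports Microsoft.Office.Interop

Module MergeSketch
    ' Hypothetical helper: runs the merge only when Word reports that
    ' a main document and a data source are both attached.
    ' Returns True if the merge ran.
    Function ExecuteMergeIfReady(ByVal axDoc As Word.Document) As Boolean
        Dim mm As Word.MailMerge = axDoc.MailMerge
        ' Note: documents that also use a separate header source report
        ' a different state and are rejected by this simple check.
        If mm.State <> Word.WdMailMergeState.wdMainAndDataSource Then
            Return False
        End If
        mm.Destination = Word.WdMailMergeDestination.wdSendToNewDocument
        mm.Execute(False)
        Return True
    End Function
End Module
```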

VBA, Automation, and VSTO

There are three different techniques for programming Word 2007: VBA code-behind-document that Word 2007 still produces when recording macros, Visual Studio Tools for Office (VSTO) for producing application-level add-ins, and Word Automation. So many choices may leave you scratching your head over when to use which.

Each technique lends itself to a particular purpose which actually makes it quite easy to choose between them:

  • For macros, use VBA.
  • For automation, use Word Automation (the techniques in this article).
  • For application-level add-ins, use VSTO.

VBA

Visual Basic for Applications (VBA) is the default language for programming Word. When people record macros, Word 2007 produces VBA code for each step. These macros can be added to toolbars or menu buttons and are saved inside Word templates and documents. Although VBA is a COM-based technology, lacking the connectivity, security and other modern features of .NET based languages, it is still a great language for creating macros that run behind a document.

Automation

As we've seen in this article, Word automation is an excellent technique for remote controlling Word 2007 from an external application to create and manipulate Word documents. Like VBA, automation still relies on the COM programming model.

VSTO

Visual Studio Tools for Office (VSTO) is the newest set of tools for programming Word. VSTO 2005 SE is a free download (to licensed Visual Studio users) that extends Visual Studio 2005 to support the generation of Office application-level add-ins. For example, using VSTO, you could create a richer task processing capability for Outlook. VSTO supports both the Visual Basic and C# programming languages. VSTO ships with a full set of managed APIs, which makes Word programming a natural experience for .NET developers.

Conclusion

In this article we introduced techniques for using Visual Basic 2005 and automation to remote control Word 2007. This article merely scratches the surface of what is possible with automation—Office is an incredibly rich platform for developing solutions that will improve information workers' productivity. The next article in this series will discuss how to use automation to update and regularly e-mail a PowerPoint-based report, and includes an introduction on how to automate the generation of graphs and charts.

Ed Robinson co-authored "Upgrading Visual Basic 6.0 to Visual Basic .NET", "Security for Visual Basic .NET", and numerous technology articles. Ed is the CIO for Intergen Ltd—one of New Zealand's most prominent Microsoft Gold Certified Partners.
