Retrieving Application Properties from Word 2010 Documents by Using the Open XML SDK 2.0
Summary: Use strongly typed classes in the Open XML SDK 2.0 for Microsoft Office to retrieve application document properties in a Microsoft Office Word 2007 or Microsoft Word 2010 document, without loading the document into Microsoft Word.
Last modified: September 12, 2012
Applies to: Excel 2010 | Office 2010 | Open XML | PowerPoint 2010 | VBA | Word 2010
Published: February 2012
Provided by: Ken Getz
The Open XML file formats let you retrieve application document properties from a Microsoft Office Word 2007 or Microsoft Word 2010 document without opening the document in Word. The Open XML SDK 2.0 adds strongly typed classes that simplify access to the Open XML file formats, including the part that stores these properties. The code sample that is included with this article shows how to use the SDK to retrieve application document properties from an Office Word 2007 or Word 2010 document.
To use the code sample, install the Open XML SDK 2.0 by using the link in the Explore It section. The code sample is modified from code that is included as part of a set of code examples for the Open XML SDK 2.0. The Explore It section also includes a link to the full set of code examples, although you can use the code sample without downloading and installing them. The sample application retrieves application document properties (that is, extended properties such as Company, Lines, and Manager that the application stores with the document) from a document that you supply.
Setting up references
To use the code from the Open XML SDK 2.0, you must add references to your project. The sample project includes these references, but in your own code, you must explicitly reference the following assemblies: WindowsBase and DocumentFormat.OpenXml (installed by the Open XML SDK 2.0).
You should also add the following using or Imports statements to the top of your code file.
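For the Visual Basic code shown in this article, one Imports statement for the SDK's packaging namespace should be sufficient. This is a minimal sketch based on the classes that the sample uses (WordprocessingDocument), not a list taken from the sample project; the C# equivalent would be a using directive for the same namespace.

```vb
' WordprocessingDocument and ExtendedFilePropertiesPart are defined in this namespace.
Imports DocumentFormat.OpenXml.Packaging
```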
Retrieving application properties
With the Open XML SDK 2.0, retrieving application document properties is simple enough that you do not need a special helper procedure. You retrieve the ExtendedFilePropertiesPart property of a WordprocessingDocument object, and then retrieve the specific application property that you need. First, set up a reference to the document. Because the code only reads properties, it can open the document read-only by passing False as the second argument.
Private Const FILENAME As String = "DocumentProperties.docx"

' Open the document read-only (second argument False).
Using document As WordprocessingDocument = WordprocessingDocument.Open(FILENAME, False)
    ' Code removed here...
End Using
Given the reference to the WordprocessingDocument object, the code can retrieve the document's ExtendedFilePropertiesPart property and, from that part, a reference to its Properties property. The Properties object provides its own properties, each of which exposes one of the application document properties.
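As a sketch of that step, the following listing opens the document and stores the reference in a variable named props, the name used by the sample's full listing. The module name is hypothetical, and the explicit type is spelled out only to show what the Properties property returns; the sample itself relies on type inference.

```vb
Imports DocumentFormat.OpenXml.Packaging

' Hypothetical module name, used only for this illustration.
Module PropsReferenceSketch
    Private Const FILENAME As String = "DocumentProperties.docx"

    Sub Main()
        ' Open read-only (second argument False); the code only reads properties.
        Using document As WordprocessingDocument = WordprocessingDocument.Open(FILENAME, False)
            ' Properties is the root element of the extended properties part.
            ' Its child elements are the individual application properties.
            Dim props As DocumentFormat.OpenXml.ExtendedProperties.Properties =
                document.ExtendedFilePropertiesPart.Properties

            ' Code removed here...
        End Using
    End Sub
End Module
```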
Given the reference to the Properties object of the ExtendedFilePropertiesPart, the code can then retrieve any of the application properties by using the code in the next example. Note that the code must confirm that the reference to each property isn't null (Nothing in Visual Basic) before retrieving its Text property. Unlike core properties, application properties aren't available if you (or the application) haven't specifically given them a value.
If props.Company IsNot Nothing Then
    Console.WriteLine("Company = " & props.Company.Text)
End If

If props.Lines IsNot Nothing Then
    Console.WriteLine("Lines = " & props.Lines.Text)
End If

If props.Manager IsNot Nothing Then
    Console.WriteLine("Manager = " & props.Manager.Text)
End If
The sample includes the following code.
Private Const FILENAME As String = "I:\Samples\DocumentProperties.docx"

Sub Main()
    ' Open the document read-only; the sample only reads properties.
    Using document As WordprocessingDocument = WordprocessingDocument.Open(FILENAME, False)
        Dim props = document.ExtendedFilePropertiesPart.Properties

        ' Each property can be missing, so check for Nothing before reading Text.
        If props.Company IsNot Nothing Then
            Console.WriteLine("Company = " & props.Company.Text)
        End If

        If props.Lines IsNot Nothing Then
            Console.WriteLine("Lines = " & props.Lines.Text)
        End If

        If props.Manager IsNot Nothing Then
            Console.WriteLine("Manager = " & props.Manager.Text)
        End If
    End Using
End Sub
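The sample assumes that the document contains an extended properties part. Documents saved by Word normally do, but a package produced by another tool might not, in which case ExtendedFilePropertiesPart returns Nothing. The following variation is a sketch of one way to guard against that case and to avoid repeating the Nothing check; WriteIfPresent and the module name are hypothetical and are not part of the SDK or of the sample.

```vb
Imports System
Imports DocumentFormat.OpenXml
Imports DocumentFormat.OpenXml.Packaging

' Hypothetical module name, used only for this illustration.
Module SafeAppPropertiesSketch
    Private Const FILENAME As String = "DocumentProperties.docx"

    ' Hypothetical helper: writes the property only if its element exists.
    ' Company, Lines, and Manager all derive from OpenXmlLeafTextElement.
    Private Sub WriteIfPresent(ByVal name As String, ByVal element As OpenXmlLeafTextElement)
        If element IsNot Nothing Then
            Console.WriteLine(name & " = " & element.Text)
        End If
    End Sub

    Sub Main()
        Using document As WordprocessingDocument = WordprocessingDocument.Open(FILENAME, False)
            Dim part As ExtendedFilePropertiesPart = document.ExtendedFilePropertiesPart

            ' The part itself is optional in the package, so check it first.
            If part Is Nothing OrElse part.Properties Is Nothing Then
                Console.WriteLine("The document has no application properties part.")
                Return
            End If

            WriteIfPresent("Company", part.Properties.Company)
            WriteIfPresent("Lines", part.Properties.Lines)
            WriteIfPresent("Manager", part.Properties.Manager)
        End Using
    End Sub
End Module
```

Whether the extra check is worth it depends on where your documents come from; for files that are known to have been saved by Word, the shorter sample code is usually enough.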
It is important to realize that the application properties provided by the ExtendedFilePropertiesPart class are defined in XML, and may or may not exist. To retrieve those properties, you must verify that they exist by comparing the reference to the property to null (Nothing in Visual Basic) before you attempt to retrieve the value of the property. In contrast, the PackageProperties class exposes the core properties as members of the class, so the properties themselves always exist, although a property that was never set has no value. In other words, when you retrieve core properties, you do not have to confirm that the property isn't null before you read it.
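To make the contrast concrete, the following sketch reads two core properties without any Nothing check on the property objects. It assumes the Open XML SDK 2.0, where WordprocessingDocument.PackageProperties returns a System.IO.Packaging.PackageProperties object from the WindowsBase assembly; the module name and the choice of Creator and Title are illustrative only.

```vb
Imports System
Imports DocumentFormat.OpenXml.Packaging

' Hypothetical module name, used only for this illustration.
Module CorePropertiesSketch
    Private Const FILENAME As String = "DocumentProperties.docx"

    Sub Main()
        Using document As WordprocessingDocument = WordprocessingDocument.Open(FILENAME, False)
            ' Assumption: SDK 2.0, where this property is typed as
            ' System.IO.Packaging.PackageProperties.
            Dim core As System.IO.Packaging.PackageProperties = document.PackageProperties

            ' Creator and Title are members of the class, so they can be read directly.
            ' An unset property reads as Nothing, which concatenates as an empty string.
            Console.WriteLine("Creator = " & core.Creator)
            Console.WriteLine("Title = " & core.Title)
        End Using
    End Sub
End Module
```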
The code examples in this article illustrate several of the issues that you encounter when you work with the Open XML SDK 2.0. Each example is slightly different, but the basic concepts are the same. Unless you understand the structure of the part that you are trying to work with, even the Open XML SDK 2.0 does not make it possible to interact with that part. Take the time to investigate the objects that you are working with before you start to write code; doing so will save you time.
About the Author
Ken Getz is a developer, writer, and trainer, working as a senior consultant with MCW Technologies, LLC, a Microsoft Solution Provider. He has co-authored several technical books for developers, including the best-selling ASP.NET Developer's Jumpstart, the Access Developer's Handbook series, and the VBA Developer's Handbook series. Ken is a lead courseware author for AppDev, and has authored many of their most popular titles. Ken has spoken for many years at technical conferences, including Microsoft TechEd.