Retrieving Application Properties from Word 2010 Documents by Using the Open XML SDK 2.0

Office Visual How To

Summary:  Use strongly typed classes in the Open XML SDK 2.0 for Microsoft Office to retrieve application document properties in a Microsoft Office Word 2007 or Microsoft Word 2010 document, without loading the document into Microsoft Word.

Applies to: Excel 2010 | Office 2010 | Open XML | PowerPoint 2010 | VBA | Word 2010

Published:  February 2012

Provided by:   Ken Getz

Overview

The Open XML file formats enable you to retrieve application document properties from a Microsoft Office Word 2007 or Microsoft Word 2010 document. The Open XML SDK 2.0 adds strongly typed classes to simplify access to the Open XML file formats. The SDK also simplifies the retrieval of application document properties, and the code sample that is included with this article describes how to use the SDK to retrieve those properties in an Office Word 2007 or Word 2010 document.

To use the code sample, install the Open XML SDK 2.0 by using the link in the Explore It section. The code sample is modified from code that is included as part of a set of code examples for the Open XML SDK 2.0. The Explore It section also includes a link to the full set of code examples, although you can use the code sample without downloading and installing them. The sample application retrieves application document properties (that is, properties such as Company, Lines, and Manager that the application stores with the document) from a document that you supply.

Code It

The code sample that accompanies this article includes the code that is required to retrieve application document properties in an Office Word 2007 or Word 2010 document.

Setting up references

To use the code from the Open XML SDK 2.0, you must add several references to your project. The sample project includes these references, but in your own code, you must explicitly reference the following assemblies:

  • WindowsBase. This reference might be set for you, depending on the kind of project that you create.

  • DocumentFormat.OpenXml. Installed by the Open XML SDK 2.0.

You should also add the following using or Imports statements to the top of your code file.

Imports DocumentFormat.OpenXml.Packaging

using System;
using DocumentFormat.OpenXml.Packaging;

Retrieving application properties

With the Open XML SDK 2.0, retrieving application document properties is simple enough that you do not need a special helper procedure. You retrieve the ExtendedFilePropertiesPart property of a WordprocessingDocument object, and then retrieve the specific application property that you need. First, open the document and set up a reference to it.

Private Const FILENAME As String = "DocumentProperties.docx"

' Open the document read-only (False); the code only reads properties.
Using document As WordprocessingDocument =
  WordprocessingDocument.Open(FILENAME, False)
  ' Code removed here...
End Using

const string FILENAME = "DocumentProperties.docx";

// Open the document read-only (false); the code only reads properties.
using (WordprocessingDocument document =
  WordprocessingDocument.Open(FILENAME, false))
{
  // Code removed here...
}

Given the reference to the WordprocessingDocument object, the code can retrieve the ExtendedFilePropertiesPart property of the document, and then the Properties object of that part. This object provides its own properties, each of which exposes one of the application document properties.

Dim props = document.ExtendedFilePropertiesPart.Properties

var props = document.ExtendedFilePropertiesPart.Properties;

Given the reference to the properties of ExtendedFilePropertiesPart, the code can then retrieve any of the application properties by using the code in the next example. Note that the code must confirm that the reference to each property isn't null before retrieving its Text property. Unlike core properties, application properties aren't available if you (or the application) haven't specifically given them a value.

If props.Company IsNot Nothing Then
  Console.WriteLine("Company = " & props.Company.Text)
End If

If props.Lines IsNot Nothing Then
  Console.WriteLine("Lines = " & props.Lines.Text)
End If

If props.Manager IsNot Nothing Then
  Console.WriteLine("Manager = " & props.Manager.Text)
End If

if (props.Company != null)
  Console.WriteLine("Company = " + props.Company.Text);

if (props.Lines != null)
  Console.WriteLine("Lines = " + props.Lines.Text);

if (props.Manager != null)
  Console.WriteLine("Manager = " + props.Manager.Text);
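
The repeated null checks described above can be folded into one small helper. The following is a minimal sketch, not part of the sample that accompanies this article: the AppPropertyReader class and its TextOrDefault method are hypothetical names, and the document path is assumed to arrive as the first command-line argument. The sketch also checks the part itself for null, on the assumption that a document might not contain an application properties part at all.

```csharp
using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;

static class AppPropertyReader
{
    // Hypothetical helper (not part of the SDK): returns the element's text,
    // or the fallback when the element is absent from the part.
    public static string TextOrDefault(OpenXmlLeafTextElement element, string fallback)
    {
        return element != null ? element.Text : fallback;
    }

    static void Main(string[] args)
    {
        // Assumption: the path to a .docx file is passed as the first argument.
        using (WordprocessingDocument document =
            WordprocessingDocument.Open(args[0], false))
        {
            // The part itself can be missing, so check it before its Properties.
            ExtendedFilePropertiesPart part = document.ExtendedFilePropertiesPart;
            if (part == null || part.Properties == null)
            {
                Console.WriteLine("The document has no application properties.");
                return;
            }

            var props = part.Properties;
            Console.WriteLine("Company = " + TextOrDefault(props.Company, "(not set)"));
            Console.WriteLine("Lines = " + TextOrDefault(props.Lines, "(not set)"));
            Console.WriteLine("Manager = " + TextOrDefault(props.Manager, "(not set)"));
        }
    }
}
```

The helper accepts OpenXmlLeafTextElement because the application property elements used here (Company, Lines, Manager) expose their value through the Text property of that base class.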

Sample procedure

The sample includes the following code.

Private Const FILENAME As String = "I:\Samples\DocumentProperties.docx"

Sub Main()
  Using document As WordprocessingDocument =
    WordprocessingDocument.Open(FILENAME, False)

    Dim props = document.ExtendedFilePropertiesPart.Properties
    If props.Company IsNot Nothing Then
      Console.WriteLine("Company = " & props.Company.Text)
    End If

    If props.Lines IsNot Nothing Then
      Console.WriteLine("Lines = " & props.Lines.Text)
    End If

    If props.Manager IsNot Nothing Then
      Console.WriteLine("Manager = " & props.Manager.Text)
    End If
  End Using
End Sub

private const string FILENAME = @"I:\Samples\DocumentProperties.docx";

static void Main(string[] args)
{
  using (WordprocessingDocument document = 
    WordprocessingDocument.Open(FILENAME, false))
  {
    var props = document.ExtendedFilePropertiesPart.Properties;

    if (props.Company != null)
      Console.WriteLine("Company = " + props.Company.Text);

    if (props.Lines != null)
      Console.WriteLine("Lines = " + props.Lines.Text);

    if (props.Manager != null)
      Console.WriteLine("Manager = " + props.Manager.Text);
  }
}

Read It

Keep in mind that the application properties exposed through the ExtendedFilePropertiesPart class are stored as XML elements, and any given element may or may not exist. Before you retrieve the value of one of these properties, verify that it exists by comparing the property reference to null. The PackageProperties class, by contrast, defines the core properties as members of the class, so the properties themselves always exist. You therefore do not have to confirm that a core property isn't null before you retrieve it.
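
The difference between the two kinds of properties can be sketched side by side. This example is illustrative only: the CoreVersusApplication class and its Print method are hypothetical names, and the document path is assumed to be the first command-line argument. It reads two core properties directly from PackageProperties, and then guards each step on the way to an application property.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

static class CoreVersusApplication
{
    // Hypothetical helper: writes a few core and application properties.
    public static void Print(WordprocessingDocument document, TextWriter output)
    {
        // Core properties: the properties are members of PackageProperties,
        // so they can be read without checking whether the property exists.
        var core = document.PackageProperties;
        output.WriteLine("Title = " + core.Title);
        output.WriteLine("Creator = " + core.Creator);

        // Application properties: the part and each element are optional,
        // so check every reference before using it.
        ExtendedFilePropertiesPart part = document.ExtendedFilePropertiesPart;
        var props = part != null ? part.Properties : null;
        if (props != null && props.Company != null)
            output.WriteLine("Company = " + props.Company.Text);
        else
            output.WriteLine("Company = (not set)");
    }

    static void Main(string[] args)
    {
        // Assumption: the path to a .docx file is passed as the first argument.
        using (WordprocessingDocument document =
            WordprocessingDocument.Open(args[0], false))
        {
            Print(document, Console.Out);
        }
    }
}
```

Passing a TextWriter rather than writing to the console directly is a design choice made only so that the output can be redirected or inspected.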

The code examples in this article include several of the issues that you encounter when you work with the Open XML SDK 2.0. Each example is slightly different. However, the basic concepts are the same. Unless you understand the structure of the part that you are trying to work with, even the Open XML SDK 2.0 does not make it possible to interact with that part. Take the time to investigate the objects that you are working with before you start to write code. You will save time.
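
One way to investigate a part before you write code against it is to list the elements that it actually contains. The following is a rough sketch under stated assumptions: PartInspector and Describe are hypothetical names, and the document path is assumed to be the first command-line argument. It walks the child elements of the application properties root and reports each element's name and text.

```csharp
using System;
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;

static class PartInspector
{
    // Hypothetical helper: lists each child element of a root element as
    // "LocalName = InnerText", so you can see which properties exist.
    public static List<string> Describe(OpenXmlElement root)
    {
        var lines = new List<string>();
        foreach (OpenXmlElement child in root.ChildElements)
        {
            lines.Add(child.LocalName + " = " + child.InnerText);
        }
        return lines;
    }

    static void Main(string[] args)
    {
        // Assumption: the path to a .docx file is passed as the first argument.
        using (WordprocessingDocument document =
            WordprocessingDocument.Open(args[0], false))
        {
            ExtendedFilePropertiesPart part = document.ExtendedFilePropertiesPart;
            if (part == null || part.Properties == null)
            {
                Console.WriteLine("The document has no application properties.");
                return;
            }

            foreach (string line in Describe(part.Properties))
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

For elements that contain nested content rather than simple text, InnerText returns the concatenated text of the descendants, so treat that output as a rough guide to the structure rather than a formatted value.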

See It

 

Watch the video

> [!VIDEO https://www.microsoft.com/en-us/videoplayer/embed/e7f9039e-459e-4649-a102-ad8ab43167d4]

Length: 00:06:01

Grab the Code

Explore It

 

About the Author

Ken Getz is a developer, writer, and trainer, working as a senior consultant with MCW Technologies, LLC, a Microsoft Solution Provider. He has co-authored several technical books for developers, including the best-selling ASP.NET Developer's Jumpstart, the Access Developer's Handbook series, and the VBA Developer's Handbook series. Ken is a lead courseware author for AppDev, and has authored many of their most popular titles. Ken has spoken for many years at technical conferences, including Microsoft TechEd.