How to: Extract Styles from a Word Processing Document

Applies to: Excel 2010 | Office 2010 | PowerPoint 2010 | Word 2010

In this article
ExtractStylesPart Method
Calling the Sample Method
How the Code Works
Find the Correct Styles Part
Retrieve the Part Contents
Sample Code

This topic shows how to use the classes in the Open XML SDK 2.0 for Microsoft Office to programmatically extract the styles or stylesWithEffects part from a word processing document to an XDocument instance. It contains an example ExtractStylesPart method to illustrate this task.

To use the sample code in this topic, you must install the Open XML SDK 2.0. You must explicitly reference the following assemblies in your project:

  • WindowsBase

  • DocumentFormat.OpenXml (installed by the Open XML SDK)

You must also use the following using directives or Imports statements to compile the code in this topic.

using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

Imports System.IO
Imports System.Xml
Imports System.Xml.Linq
Imports DocumentFormat.OpenXml.Packaging

ExtractStylesPart Method

You can use the ExtractStylesPart sample method to retrieve an XDocument instance that contains the styles or stylesWithEffects part for an Office Word 2007 or Word 2010 document. A document created in Office Word 2007 contains only a single styles part; Word 2010 adds a second, stylesWithEffects part. To support "round-tripping" a document from Word 2010 to Office Word 2007 and back, Word 2010 maintains both the original styles part and the new stylesWithEffects part. (The Office Open XML File Formats specification requires that Microsoft Word ignore any parts that it does not recognize, so Office Word 2007 does not notice the stylesWithEffects part that Word 2010 adds to the document.) You (and your application) must interpret the results of retrieving the styles or stylesWithEffects part.

The ExtractStylesPart procedure accepts two parameters. The first is a string that contains the path of the file from which you want to extract styles. The second indicates whether you want to retrieve the styles part or the newer stylesWithEffects part. To retrieve both parts from a Word 2010 document, you must call the procedure two times, once for each part. The procedure returns an XDocument instance that contains the complete styles or stylesWithEffects part that you requested, with all the style information for the document, or a null reference if the requested part does not exist.

public static XDocument ExtractStylesPart(
  string fileName,
  bool getStylesWithEffectsPart = true)

Public Function ExtractStylesPart(
  ByVal fileName As String,
  Optional ByVal getStylesWithEffectsPart As Boolean = True) As XDocument

The complete code listing for the method can be found in the Sample Code section.
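
Because a Word 2007 document has no stylesWithEffects part, a caller may want to try that part first and fall back to the styles part. The following is a minimal sketch of that idea and is not part of the original sample. The StylesFallback class and GetPreferredStyles method are hypothetical names, and the extract delegate is assumed to be a method with the same shape as ExtractStylesPart.

```csharp
using System;
using System.Xml.Linq;

public static class StylesFallback
{
    // Hypothetical helper. 'extract' is assumed to behave like
    // ExtractStylesPart(fileName, getStylesWithEffectsPart): it returns
    // null when the requested part does not exist.
    public static XDocument GetPreferredStyles(
        string fileName, Func<string, bool, XDocument> extract)
    {
        // Try the stylesWithEffects part first, then fall back to
        // the styles part if the first call returned null.
        return extract(fileName, true) ?? extract(fileName, false);
    }
}
```

Under these assumptions, a call might look like StylesFallback.GetPreferredStyles(filename, ExtractStylesPart). The result is still null if the document contains neither part.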

Calling the Sample Method

To call the sample method, pass a string for the first parameter that contains the file name of the document from which to extract the styles, and a Boolean for the second parameter that specifies whether to retrieve the stylesWithEffects part (true) or the styles part (false). The following sample code shows an example. After you have the XDocument instance, you can process it however you need; the following sample code writes its content to the console.

string filename = @"C:\Users\Public\Documents\StylesFrom.docx";

// Retrieve the StylesWithEffects part. You could pass false in the 
// second parameter to retrieve the Styles part instead.
var styles = ExtractStylesPart(filename, true);

// If the part was retrieved, send the contents to the console.
if (styles != null)
    Console.WriteLine(styles.ToString());

Dim filename As String = "C:\Users\Public\Documents\StylesFrom.docx"

' Retrieve the stylesWithEffects part. You could pass False
' in the second parameter to retrieve the styles part instead.
Dim styles = ExtractStylesPart(filename, True)

' If the part was retrieved, send the contents to the console.
If styles IsNot Nothing Then
    Console.WriteLine(styles.ToString())
End If
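
Beyond printing the XML, the returned XDocument can be queried with LINQ to XML. The following is a minimal sketch, not part of the original sample, that lists the type and ID of each w:style element. The StyleLister class and ListStyleIds method are hypothetical names, and the sketch assumes the part uses the standard WordprocessingML main namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class StyleLister
{
    // WordprocessingML main namespace (assumed to be the one used
    // by the elements and attributes of the styles part).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: returns "type:styleId" for every
    // w:style element found in the document.
    public static List<string> ListStyleIds(XDocument styles)
    {
        return styles
            .Descendants(W + "style")
            .Select(s => (string)s.Attribute(W + "type") + ":" +
                         (string)s.Attribute(W + "styleId"))
            .ToList();
    }
}
```

For example, you could pass the styles variable from the calling code above to StyleLister.ListStyleIds and write each returned string to the console.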

How the Code Works

The code starts by creating a variable named styles that the method returns before it exits.

// Declare a variable to hold the XDocument.
XDocument styles = null;
// Code removed here...
// Return the XDocument instance.
return styles;

' Declare a variable to hold the XDocument.
Dim styles As XDocument = Nothing
' Code removed here...
' Return the XDocument instance.
Return styles

The code continues by opening the document by using the Open method and indicating that the document should be opened for read-only access (the final false parameter). With the document open, the code uses the MainDocumentPart property to navigate to the main document part, and then declares a variable named stylesPart to hold a reference to the styles part.

// Open the document for read access and get a reference.
using (var document = 
    WordprocessingDocument.Open(fileName, false))
{
    // Get a reference to the main document part.
    var docPart = document.MainDocumentPart;

    // Assign a reference to the appropriate part to the
    // stylesPart variable.
    StylesPart stylesPart = null;
    // Code removed here...
}

' Open the document for read access and get a reference.
Using document = WordprocessingDocument.Open(fileName, False)

    ' Get a reference to the main document part.
    Dim docPart = document.MainDocumentPart

    ' Assign a reference to the appropriate part to the 
    ' stylesPart variable.
    Dim stylesPart As StylesPart = Nothing
    ' Code removed here...
End Using

Find the Correct Styles Part

The code next retrieves a reference to the requested styles part by using the getStylesWithEffectsPart Boolean parameter. Based on this value, the code retrieves a specific property of the docPart variable, and stores it in the stylesPart variable.

if (getStylesWithEffectsPart)
    stylesPart = docPart.StylesWithEffectsPart;
else
    stylesPart = docPart.StyleDefinitionsPart;

If getStylesWithEffectsPart Then
    stylesPart = docPart.StylesWithEffectsPart
Else
    stylesPart = docPart.StyleDefinitionsPart
End If

Retrieve the Part Contents

If the requested styles part exists, the code must return the contents of the part in an XDocument instance. Each part provides a GetStream method, which returns a Stream. The code passes the Stream instance to the XmlNodeReader.Create method (the static Create method that XmlNodeReader inherits from XmlReader), and then calls the XDocument.Load method, passing the resulting XmlReader as a parameter.

// If the part exists, read it into the XDocument.
if (stylesPart != null)
{
    using (var reader = XmlNodeReader.Create(
      stylesPart.GetStream(FileMode.Open, FileAccess.Read)))
    {
        // Create the XDocument.
        styles = XDocument.Load(reader);
    }
}

' If the part exists, read it into the XDocument.
If stylesPart IsNot Nothing Then
    Using reader = XmlNodeReader.Create(
      stylesPart.GetStream(FileMode.Open, FileAccess.Read))
        ' Create the XDocument:  
        styles = XDocument.Load(reader)
    End Using
End If
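
The stream-to-XDocument step can be tried in isolation, without a Word document. The following is a minimal sketch, not part of the original sample. The StreamLoader class and LoadXDocument method are hypothetical names. The sample reaches the static Create method through XmlNodeReader; because that method is defined on XmlReader, this sketch calls XmlReader.Create directly.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class StreamLoader
{
    // Hypothetical helper: reads XML from any readable stream
    // into an XDocument, as the sample does for a part's stream.
    public static XDocument LoadXDocument(Stream stream)
    {
        // Wrap the stream in an XmlReader and dispose the reader
        // after the document has been loaded.
        using (XmlReader reader = XmlReader.Create(stream))
        {
            return XDocument.Load(reader);
        }
    }
}
```

In the sample, the stream passed to such a helper would be the one returned by stylesPart.GetStream(FileMode.Open, FileAccess.Read).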

Sample Code

The following is the complete ExtractStylesPart code sample in C# and Visual Basic.

// Extract the styles or stylesWithEffects part from a 
// word processing document as an XDocument instance.
public static XDocument ExtractStylesPart(
  string fileName,
  bool getStylesWithEffectsPart = true)
{
    // Declare a variable to hold the XDocument.
    XDocument styles = null;

    // Open the document for read access and get a reference.
    using (var document = 
        WordprocessingDocument.Open(fileName, false))
    {
        // Get a reference to the main document part.
        var docPart = document.MainDocumentPart;

        // Assign a reference to the appropriate part to the
        // stylesPart variable.
        StylesPart stylesPart = null;
        if (getStylesWithEffectsPart)
            stylesPart = docPart.StylesWithEffectsPart;
        else
            stylesPart = docPart.StyleDefinitionsPart;

        // If the part exists, read it into the XDocument.
        if (stylesPart != null)
        {
            using (var reader = XmlNodeReader.Create(
              stylesPart.GetStream(FileMode.Open, FileAccess.Read)))
            {
                // Create the XDocument.
                styles = XDocument.Load(reader);
            }
        }
    }
    // Return the XDocument instance.
    return styles;
}

' Extract the styles or stylesWithEffects part from a 
' word processing document as an XDocument instance.
Public Function ExtractStylesPart(
  ByVal fileName As String,
  Optional ByVal getStylesWithEffectsPart As Boolean = True) As XDocument

    ' Declare a variable to hold the XDocument.
    Dim styles As XDocument = Nothing

    ' Open the document for read access and get a reference.
    Using document = WordprocessingDocument.Open(fileName, False)

        ' Get a reference to the main document part.
        Dim docPart = document.MainDocumentPart

        ' Assign a reference to the appropriate part to the 
        ' stylesPart variable.
        Dim stylesPart As StylesPart = Nothing
        If getStylesWithEffectsPart Then
            stylesPart = docPart.StylesWithEffectsPart
        Else
            stylesPart = docPart.StyleDefinitionsPart
        End If

        ' If the part exists, read it into the XDocument.
        If stylesPart IsNot Nothing Then
            Using reader = XmlNodeReader.Create(
              stylesPart.GetStream(FileMode.Open, FileAccess.Read))
                ' Create the XDocument:  
                styles = XDocument.Load(reader)
            End Using
        End If
    End Using
    ' Return the XDocument instance.
    Return styles
End Function
