
How to: Use Annotations to Minimize Serialization and Deserialization by Using the Open XML API

Office 2007

This content is outdated and is no longer being maintained. It is provided as a courtesy for individuals who are still using these technologies. This page may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

The Office Open XML Package specification defines a set of XML files that contain the content and define the relationships for all of the document parts stored in a single package. These packages combine the parts that comprise the document files for Microsoft® Office Excel® 2007, Microsoft Office PowerPoint® 2007, and Microsoft Office Word 2007. The Open XML Application Programming Interface (API) allows you to create packages and manipulate the files that comprise the packages. This topic walks through the code and steps to use annotations to minimize serialization and deserialization while associating data with a document part (file) of an Office Open XML package in Office Word 2007, although the steps are the same for each of the three 2007 Microsoft Office system programs that support the Office Open XML Format.

Note

The code samples in this topic are in Microsoft Visual Basic .NET and Microsoft Visual C#. You can use them in an add-in created in Microsoft Visual Studio 2008. For more information about how to create an add-in in Visual Studio 2008, see Getting Started with the Open XML Format SDK 1.0.

When you develop Open XML Format solutions, you often want to associate arbitrary data with a particular document part. After opening an Open XML document, you may want to read a document part into a System.Xml.Linq.XDocument object, query the XDocument object using LINQ to XML, perhaps modify the XDocument, and then serialize the XDocument back into the package.

Reading the XML from the document part, parsing it, modifying it, and then serializing it back into the package every time you access the XML results in poor performance. It is more efficient to read the XML from the document part only once, use it as needed, and then serialize it back into the part. If you add the XDocument instance as an annotation on the document part after reading the XML, you can retrieve the annotation instead of rereading the XML each time you access it. Annotations allow you to associate any object with an OpenXmlPartContainer object (the base class of OpenXmlPart) in a type-safe way.
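
The load-once idea can be tried without an Open XML package. The following minimal sketch uses only LINQ to XML: XObject exposes AddAnnotation and Annotation<T> methods of the same shape as the ones called on the document part in this topic, so an XElement stands in for the part. The names AnnotationCacheSketch, GetOrLoad, and loader are hypothetical and chosen only for illustration; this is not SDK code.

```csharp
using System;
using System.Xml.Linq;

public static class AnnotationCacheSketch
{
    // Hypothetical helper: returns the XDocument cached on the container,
    // or calls the loader once and caches its result as an annotation.
    public static XDocument GetOrLoad(XObject container, Func<XDocument> loader)
    {
        // Annotation<T>() returns null when no annotation of type T exists.
        XDocument cached = container.Annotation<XDocument>();
        if (cached != null)
            return cached;

        XDocument loaded = loader();
        container.AddAnnotation(loaded);
        return loaded;
    }

    public static void Main()
    {
        // Assumption for this sketch: an XElement stands in for the document part.
        XElement standInPart = new XElement("part");
        int loadCount = 0;
        Func<XDocument> loader = () =>
        {
            loadCount++; // Counts how often the "expensive" parse runs.
            return XDocument.Parse("<doc><p>text</p></doc>");
        };

        XDocument first = GetOrLoad(standInPart, loader);
        XDocument second = GetOrLoad(standInPart, loader);

        // Prints "Same instance: True"
        Console.WriteLine("Same instance: {0}", object.ReferenceEquals(first, second));
        // Prints "Loader calls: 1"
        Console.WriteLine("Loader calls: {0}", loadCount);
    }
}
```

Because the lookup is keyed by type, the caller gets back a strongly typed XDocument without casting, which is what makes the annotation approach type-safe.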

Note

The same approach applies if you are using a System.Xml.XmlDocument object.

Using Microsoft Visual C# 3.0 and Microsoft Visual Basic .NET 9.0, you can write an extension method that easily retrieves an XDocument from the document part. The extension method first checks for the existence of an annotation of type XDocument. If it exists, it is returned. If it does not exist, then the method populates the XDocument from the document part, and adds it as an annotation to the document part. The following code shows an extension method.

Note

The following code samples are edited to facilitate online viewing. Change the indentation and remove extra lines to work with this code.

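using System.IO;
using System.Xml;
using System.Xml.Linq;
using Microsoft.Office.DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions {
    // Extension method that returns the part's XML as an XDocument,
    // reading and parsing the part only on the first call.
    public static XDocument GetXDocument(this OpenXmlPart part) {
        // Return the cached XDocument if one was already attached.
        XDocument xdoc = part.Annotation<XDocument>();
        if (xdoc != null)
            return xdoc;

        // Otherwise load the XML from the part and cache it as an annotation.
        using (StreamReader streamReader = 
                               new StreamReader(part.GetStream()))
            xdoc = XDocument.Load(XmlReader.Create(streamReader));
        part.AddAnnotation(xdoc);
        return xdoc;
    }
}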

The following example shows the simplest use of this extension method.

class Program {

    // Get the XDocument part using the GetXDocument function.
    // This function executes quickly, as the XDocument is stored as an annotation.
    static void ModifyDocument(OpenXmlPart mainDocumentPart) {
        XDocument xdoc = mainDocumentPart.GetXDocument();
        Console.WriteLine("Count of nodes:{0}", 
                           xdoc.DescendantNodes().Count());
    }

    // Simple use of extension method.
    static void Main(string[] args) {
        using (WordprocessingDocument wordDoc = 
                WordprocessingDocument.Open(@"C:\Test.docx", true)) {
            XDocument xdoc = wordDoc.MainDocumentPart.GetXDocument();

            // Query the document, and modify it as necessary.
            Console.WriteLine("Count of nodes:{0}",
                               xdoc.DescendantNodes().Count());

            // Call another function, passing the MainDocumentPart part.
            ModifyDocument(wordDoc.MainDocumentPart);

            // Serialize the XDocument object back to the package.
            using (XmlWriter xw =
                XmlWriter.Create(wordDoc.MainDocumentPart.GetStream
                (FileMode.Create, FileAccess.Write))) {
                xdoc.Save(xw);
            }
        }
    }
}
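
The note earlier in this topic says that the same approach applies to System.Xml.XmlDocument. A minimal sketch of that variant might look like the following. It assumes the Open XML Format SDK 1.0 packaging namespace used by the other listings in this topic, and the names XmlDocumentExtensions and GetXmlDocument are hypothetical.

```csharp
using System.IO;
using System.Xml;
using Microsoft.Office.DocumentFormat.OpenXml.Packaging;

public static class XmlDocumentExtensions
{
    // Hypothetical helper: returns the part's XML as an XmlDocument,
    // loading it from the part only on the first call.
    public static XmlDocument GetXmlDocument(this OpenXmlPart part)
    {
        // Reuse the cached XmlDocument if one was already attached.
        XmlDocument doc = part.Annotation<XmlDocument>();
        if (doc != null)
            return doc;

        // Otherwise load the XML from the part's stream.
        doc = new XmlDocument();
        using (Stream stream = part.GetStream())
            doc.Load(stream);

        // Cache the loaded document as an annotation on the part.
        part.AddAnnotation(doc);
        return doc;
    }
}
```

A caller would use it the same way as GetXDocument, for example `XmlDocument doc = wordDoc.MainDocumentPart.GetXmlDocument();`.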

The following procedure walks through a more sophisticated approach.

To use an extension method

  1. Add an event handler to the XDocument object that watches for any changes to the tree.

  2. If the event handler is called, then remove the event handler, and add a semaphore annotation to the XDocument that indicates that the XDocument was changed.

  3. When finished with the Open XML document, before serializing back into the document part, check for the existence of the semaphore annotation. Only serialize back into the package if the semaphore annotation exists.

The following code example demonstrates this technique.

Note

The following code samples are edited to facilitate online viewing. Change the indentation and remove extra lines to work with this code.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using Microsoft.Office.DocumentFormat.OpenXml.Packaging;
using System.Xml;
using System.Xml.Linq;

namespace OpenXmlFormatSDKAnnotationSamples {
    public static class LocalExtensions {

        private class ChangedSemaphore { }

        private static EventHandler<XObjectChangeEventArgs>
                                                 ElementChanged = null;

        // Add an event handler to the XDocument that watches for any 
        // changes to the tree.
        private static void ElementChangedHandler(object sender,
                                            XObjectChangeEventArgs e) {
            XObject xSender = (XObject)sender;
            XDocument xDocument = xSender.Document;

            // Sometimes while moving a node, this event handler may 
            // receive an event for a node that has been removed from 
            // its parent (and therefore its document), in which case
            // it is not necessary to remove the event handler and add
            // an annotation.

            // If the event handler is called, remove the event handler
            // and add a semaphore annotation to the XDocument object to 
            // indicate that the XDocument object changed.
            if (xDocument != null) {
                // The handler was attached to Changed, so detach it from Changed.
                xDocument.Changed -= ElementChanged;
                xDocument.AddAnnotation(new ChangedSemaphore());
            }
        }

        public static XDocument GetXDocument(this OpenXmlPart part) {
            if (ElementChanged == null)
                ElementChanged = 
       new EventHandler<XObjectChangeEventArgs>(ElementChangedHandler);

            XDocument xdoc = part.Annotation<XDocument>();
            if (xdoc != null)
                return xdoc;
            using (StreamReader streamReader = new StreamReader(part.GetStream()))
                xdoc = XDocument.Load(XmlReader.Create(streamReader));
            part.AddAnnotation(xdoc);
            xdoc.Changed += ElementChanged;
            return xdoc;
        }

        public static void PutXDocument(this OpenXmlPart part) {
            XDocument xdoc = part.GetXDocument();
            if (xdoc != null) {
                // Before serializing back into the document part, check for
                // the semaphore annotation. Serialize back into the package
                // only if the semaphore annotation exists.
                if (xdoc.Annotation<ChangedSemaphore>() != null) {
                    Console.WriteLine("The XDocument was changed. Serialize back into the part.");

                    // Serialize the XDocument object back to the package.
                    using (XmlWriter xw =
                        XmlWriter.Create(part.GetStream
                       (FileMode.Create, FileAccess.Write))) {
                        xdoc.Save(xw);
                    }
                }
                else {
                    Console.WriteLine(
                        "No need to serialize back to part. XDocument was not changed.");
                }
            }
        }
    }

    class Program {
        // This function changes the first paragraph to upper case.
        static void ModifyDocument(OpenXmlPart mainDocumentPart) {
            XNamespace w = 
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

            // Get the XDocument object using the GetXDocument function.
            // This function executes quickly, as the XDocument object is 
            // stored as an annotation.
            XElement paraNode = mainDocumentPart.GetXDocument()
                                .Root
                                .Element(w + "body")
                                .Descendants(w + "p")
                                .FirstOrDefault();

            // FirstOrDefault returns null if the document has no paragraphs.
            if (paraNode == null)
                return;

            string paraText = paraNode
                              .Elements(w + "r")
                              .Elements(w + "t")
                              .Aggregate(new StringBuilder(), (s, i) => 
                               s.Append((string)i), s => s.ToString());

            // Remove all text runs.
            paraNode.Descendants(w + "r").Remove();

            paraNode.Add(
                new XElement(w + "r",
                    new XElement(w + "t", paraText.ToUpper())
                )
            );
        }

        static void Main(string[] args) {
            using (WordprocessingDocument wordDoc = 
                  WordprocessingDocument.Open(@"C:\Test.docx", true)) {
                XDocument xdoc = 
                               wordDoc.MainDocumentPart.GetXDocument();

                // Query the document, and modify it as necessary.
                Console.WriteLine("Count of nodes:{0}",
                                       xdoc.DescendantNodes().Count());

                //Call another function, passing the MainDocumentPart.
                ModifyDocument(wordDoc.MainDocumentPart);

                wordDoc.MainDocumentPart.PutXDocument();
            }
        }
    }
}

To use an extension method and annotations

  1. In this console application, you open a document as a WordprocessingDocument object.

  2. You retrieve an XDocument from the MainDocumentPart part by calling GetXDocument.

  3. You query the document and modify it as necessary. In this case, the ModifyDocument method changes the first paragraph of the document to upper case.

  4. You call PutXDocument, which serializes the XDocument back into the document part only if the semaphore annotation exists.
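
The change-tracking part of this technique can also be tried in isolation. The following minimal sketch uses only LINQ to XML and no Open XML package. The names ChangeTrackingSketch, ChangedMarker, Track, and IsChanged are hypothetical. One difference from the listing above is a design choice of this sketch: the handler captures the document in a closure instead of reading it from the sender, so it does not depend on the sender still being attached to the tree.

```csharp
using System;
using System.Xml.Linq;

public static class ChangeTrackingSketch
{
    // Marker type: its presence as an annotation means the tree was modified.
    private sealed class ChangedMarker { }

    // Hypothetical helper: subscribes a one-shot handler that flags the
    // document the first time anything in its tree changes.
    public static void Track(XDocument doc)
    {
        EventHandler<XObjectChangeEventArgs> handler = null;
        handler = delegate(object sender, XObjectChangeEventArgs e)
        {
            // One notification is enough, so unsubscribe before flagging.
            doc.Changed -= handler;
            doc.AddAnnotation(new ChangedMarker());
        };
        doc.Changed += handler;
    }

    // True if the marker annotation has been added to the document.
    public static bool IsChanged(XDocument doc)
    {
        return doc.Annotation<ChangedMarker>() != null;
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse("<doc><p>hello</p></doc>");
        Track(doc);

        // Prints "Changed after load: False"
        Console.WriteLine("Changed after load: {0}", IsChanged(doc));

        // Any edit to the tree raises the Changed event on the document.
        doc.Root.Element("p").Value = "HELLO";

        // Prints "Changed after edit: True"
        Console.WriteLine("Changed after edit: {0}", IsChanged(doc));
    }
}
```

A save routine would check IsChanged before writing, in the same way that PutXDocument checks for the semaphore annotation.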
