
Identifying Open XML Word-Processing Documents with Tracked Revisions

Office 2010

Summary: Determining whether an Open XML WordprocessingML document contains tracked revisions is important, because you can significantly simplify the code that processes WordprocessingML if you know that the document contains none. This article describes how to determine whether a document contains tracked revisions.

Processing tracked changes (also known as tracked revisions) is an important task that you should fully understand when you write Open XML applications. If you accept all tracked revisions first, processing or transforming the WordprocessingML becomes significantly easier.

To review in detail the semantics of the WordprocessingML elements and attributes that hold tracked changes information, see Accepting Revisions in Open XML Word-Processing Documents. In addition, you can download the code sample, RevisionAccepter.zip, from the PowerTools project on CodePlex (CodePlex.com/PowerTools). To download it, go to the Downloads tab, and then click RevisionAccepter.zip.

In other scenarios, you want to process only documents that you know contain no tracked changes, but business requirements prevent you from accepting tracked changes automatically. For example, suppose that you have a SharePoint document library that must not contain any documents with tracked changes. Before users add a document to that library, you want them to consciously and intentionally review and accept all tracked revisions. Accepting revisions automatically as part of checking the document into the library would circumvent this manual step, in which each person examines their documents and resolves any issues.

As an alternative to accepting revisions with the RevisionAccepter class, you can validate that the document contains no tracked revisions, and refuse to let it be checked into the document library until all tracked changes have been accepted.

The code is not complex. It defines an array of revision tracking element names; if any of these elements occurs in any of the parts that can contain tracked revisions, then the document contains tracked revisions. You can use a LINQ query to determine whether any of the revision tracking elements exist in the markup. This article presents four versions of the code to determine whether a document contains tracked revisions.
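The core of that check can be sketched on a small in-memory fragment before working with whole packages. This is a minimal illustration only, using LINQ to XML: the RevisionCheck class, the ContainsRevisionElements method, the sample markup, and the deliberately abbreviated list of element names are assumptions made for this sketch, not part of the article's listings.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class RevisionCheck
{
    const string WordNs =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace w = WordNs;

    // Abbreviated on purpose; a real check needs the complete set of names.
    static readonly XName[] revisionElements =
    {
        w + "ins", w + "del", w + "moveFrom", w + "moveTo",
        w + "pPrChange", w + "rPrChange",
    };

    // True if the element or any descendant is a revision tracking element.
    public static bool ContainsRevisionElements(XElement root)
    {
        return root.DescendantsAndSelf()
            .Any(e => revisionElements.Contains(e.Name));
    }

    public static void Main()
    {
        // A paragraph with ordinary run content only.
        XElement clean = XElement.Parse(
            "<w:p xmlns:w='" + WordNs + "'>" +
            "<w:r><w:t>Unchanged text</w:t></w:r></w:p>");

        // The same kind of paragraph with a tracked insertion (w:ins).
        XElement revised = XElement.Parse(
            "<w:p xmlns:w='" + WordNs + "'>" +
            "<w:ins w:id='1' w:author='Reviewer'>" +
            "<w:r><w:t>Inserted text</w:t></w:r></w:ins></w:p>");

        Console.WriteLine(ContainsRevisionElements(clean));   // False
        Console.WriteLine(ContainsRevisionElements(revised)); // True
    }
}
```

Because Any stops at the first match, the query does not have to walk the rest of the markup once a single revision tracking element has been found.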

  • Using C# and LINQ to XML.

  • Using C# and the Open XML SDK strongly-typed object model.

  • Using Visual Basic and LINQ to XML.

  • Using Visual Basic and the Open XML SDK strongly-typed object model.

Determining whether a document contains tracked revisions is slightly more involved than checking a single part, because five kinds of parts in an Open XML WordprocessingML document can contain tracked revisions.

  • Main document part

  • Header parts. There can be multiple header parts for each section. A document can contain multiple sections. Therefore, there may be a fair number of header parts.

  • Footer parts. Again, there can be multiple footer parts for each section.

  • Endnotes part. There is either zero or one endnotes part.

  • Footnotes part. There is either zero or one footnotes part.
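The five kinds of parts listed above can be gathered into a single sequence, so that each part is checked in the same way. The following is a sketch under stated assumptions, not the approach used in the article's listings: the RevisionParts class and the PartsThatCanContainRevisions method are hypothetical names, and the code assumes a project reference to the Open XML SDK.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;

public static class RevisionParts
{
    // Hypothetical helper: yields every part that can hold tracked revisions.
    public static IEnumerable<OpenXmlPart> PartsThatCanContainRevisions(
        WordprocessingDocument doc)
    {
        MainDocumentPart main = doc.MainDocumentPart;
        if (main == null)
            yield break;                      // no main part, nothing to check

        yield return main;                    // main document part

        foreach (HeaderPart header in main.HeaderParts)
            yield return header;              // zero or more header parts

        foreach (FooterPart footer in main.FooterParts)
            yield return footer;              // zero or more footer parts

        if (main.EndnotesPart != null)
            yield return main.EndnotesPart;   // zero or one endnotes part

        if (main.FootnotesPart != null)
            yield return main.FootnotesPart;  // zero or one footnotes part
    }
}
```

With a helper of this shape, a whole-document check could be written as a single query, for example RevisionParts.PartsThatCanContainRevisions(doc).Any(PartHasTrackedRevisions), where PartHasTrackedRevisions is the per-part method from the article's listings and System.Linq is imported.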

After you use the code that is presented in this article to confirm that a document contains no tracked revisions, you can ignore the elements and attributes that hold tracked revisions. This simplifies processing WordprocessingML.

The following C# example uses the strongly-typed object model of the Open XML SDK 2.0 for Microsoft Office to determine whether a document contains tracked revisions.

To build this example, you must download and install the Open XML SDK 2.0. Next, add a reference to the Open XML SDK and a reference to the WindowsBase assembly to your project.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class Program
{
    public static System.Type[] trackedRevisionsElements = new System.Type[] {
        typeof(CellDeletion),
        typeof(CellInsertion),
        typeof(CellMerge),
        typeof(CustomXmlDelRangeEnd),
        typeof(CustomXmlDelRangeStart),
        typeof(CustomXmlInsRangeEnd),
        typeof(CustomXmlInsRangeStart),
        typeof(Deleted),
        typeof(DeletedFieldCode),
        typeof(DeletedMathControl),
        typeof(DeletedRun),
        typeof(DeletedText),
        typeof(Inserted),
        typeof(InsertedMathControl),
        typeof(InsertedRun),
        typeof(MoveFrom),
        typeof(MoveFromRangeEnd),
        typeof(MoveFromRangeStart),
        typeof(MoveFromRun),
        typeof(MoveTo),
        typeof(MoveToRangeEnd),
        typeof(MoveToRangeStart),
        typeof(MoveToRun),
        typeof(NumberingChange),
        typeof(ParagraphMarkRunPropertiesChange),
        typeof(ParagraphPropertiesChange),
        typeof(RunPropertiesChange),
        typeof(SectionPropertiesChange),
        typeof(TableCellPropertiesChange),
        typeof(TableGridChange),
        typeof(TablePropertiesChange),
        typeof(TablePropertyExceptionsChange),
        typeof(TableRowPropertiesChange),
    };

    public static bool PartHasTrackedRevisions(OpenXmlPart part)
    {
        return part.RootElement.Descendants()
            .Any(e => trackedRevisionsElements.Contains(e.GetType()));
    }

    public static bool HasTrackedRevisions(WordprocessingDocument doc)
    {
        if (PartHasTrackedRevisions(doc.MainDocumentPart))
            return true;
        foreach (var part in doc.MainDocumentPart.HeaderParts)
            if (PartHasTrackedRevisions(part))
                return true;
        foreach (var part in doc.MainDocumentPart.FooterParts)
            if (PartHasTrackedRevisions(part))
                return true;
        if (doc.MainDocumentPart.EndnotesPart != null)
            if (PartHasTrackedRevisions(doc.MainDocumentPart.EndnotesPart))
                return true;
        if (doc.MainDocumentPart.FootnotesPart != null)
            if (PartHasTrackedRevisions(doc.MainDocumentPart.FootnotesPart))
                return true;
        return false;
    }

    static void Main(string[] args)
    {
        foreach (var documentName in Directory.GetFiles(".", "*.docx"))
        {
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(documentName, false))
            {
                if (HasTrackedRevisions(wordDoc))
                    Console.WriteLine("{0} contains tracked revisions", documentName);
                else
                    Console.WriteLine("{0} does not contain tracked revisions", documentName);
            }
        }
    }
}

The following C# example uses LINQ to XML to determine whether a document contains tracked revisions.

To build this example, you must download and install the Open XML SDK 2.0. Next, add a reference to the Open XML SDK and a reference to the WindowsBase assembly to your project.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions
{
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument partXDocument = part.Annotation<XDocument>();
        if (partXDocument != null)
            return partXDocument;
        using (Stream partStream = part.GetStream())
        using (XmlReader partXmlReader = XmlReader.Create(partStream))
            partXDocument = XDocument.Load(partXmlReader);
        part.AddAnnotation(partXDocument);
        return partXDocument;
    }
}

public static class W
{
    public static XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static XName cellDel = w + "cellDel";
    public static XName cellIns = w + "cellIns";
    public static XName cellMerge = w + "cellMerge";
    public static XName customXmlDelRangeEnd = w + "customXmlDelRangeEnd";
    public static XName customXmlDelRangeStart = w + "customXmlDelRangeStart";
    public static XName customXmlInsRangeEnd = w + "customXmlInsRangeEnd";
    public static XName customXmlInsRangeStart = w + "customXmlInsRangeStart";
    public static XName del = w + "del";
    public static XName delInstrText = w + "delInstrText";
    public static XName delText = w + "delText";
    public static XName ins = w + "ins";
    public static XName moveFrom = w + "moveFrom";
    public static XName moveFromRangeEnd = w + "moveFromRangeEnd";
    public static XName moveFromRangeStart = w + "moveFromRangeStart";
    public static XName moveTo = w + "moveTo";
    public static XName moveToRangeEnd = w + "moveToRangeEnd";
    public static XName moveToRangeStart = w + "moveToRangeStart";
    public static XName numberingChange = w + "numberingChange";
    public static XName pPrChange = w + "pPrChange";
    public static XName rPrChange = w + "rPrChange";
    public static XName sectPrChange = w + "sectPrChange";
    public static XName tblGridChange = w + "tblGridChange";
    public static XName tblPrChange = w + "tblPrChange";
    public static XName tblPrExChange = w + "tblPrExChange";
    public static XName tcPrChange = w + "tcPrChange";
    public static XName trPrChange = w + "trPrChange";
}

class Program
{
    public static XName[] trackedRevisionsElements = new[]
    {
        W.cellDel,
        W.cellIns,
        W.cellMerge,
        W.customXmlDelRangeEnd,
        W.customXmlDelRangeStart,
        W.customXmlInsRangeEnd,
        W.customXmlInsRangeStart,
        W.del,
        W.delInstrText,
        W.delText,
        W.ins,
        W.moveFrom,
        W.moveFromRangeEnd,
        W.moveFromRangeStart,
        W.moveTo,
        W.moveToRangeEnd,
        W.moveToRangeStart,
        W.numberingChange,
        W.pPrChange,
        W.rPrChange,
        W.sectPrChange,
        W.tblGridChange,
        W.tblPrChange,
        W.tblPrExChange,
        W.tcPrChange,
        W.trPrChange,
    };

    public static bool PartHasTrackedRevisions(OpenXmlPart part)
    {
        return part.GetXDocument()
            .Descendants()
            .Any(e => trackedRevisionsElements.Contains(e.Name));
    }

    public static bool HasTrackedRevisions(WordprocessingDocument doc)
    {
        if (PartHasTrackedRevisions(doc.MainDocumentPart))
            return true;
        foreach (var part in doc.MainDocumentPart.HeaderParts)
            if (PartHasTrackedRevisions(part))
                return true;
        foreach (var part in doc.MainDocumentPart.FooterParts)
            if (PartHasTrackedRevisions(part))
                return true;
        if (doc.MainDocumentPart.EndnotesPart != null)
            if (PartHasTrackedRevisions(doc.MainDocumentPart.EndnotesPart))
                return true;
        if (doc.MainDocumentPart.FootnotesPart != null)
            if (PartHasTrackedRevisions(doc.MainDocumentPart.FootnotesPart))
                return true;
        return false;
    }

    static void Main(string[] args)
    {
        foreach (var documentName in Directory.GetFiles(".", "*.docx"))
        {
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(documentName, false))
            {
                if (HasTrackedRevisions(wordDoc))
                    Console.WriteLine("{0} contains tracked revisions", documentName);
                else
                    Console.WriteLine("{0} does not contain tracked revisions", documentName);
            }
        }
    }
}

Determining whether an Open XML WordprocessingML document contains tracked revisions enables scenarios such as the following. You can prevent a document from being processed if it contains tracked revisions, which may be important for your business processes. You can also make sure that a document contains no tracked revisions before you transform it to another format. This significantly simplifies the code that you must write to process word-processing documents.

© 2014 Microsoft