Deleting Comments By All Authors or Specific Authors in Word 2010 Documents by Using the Open XML SDK 2.0

Office Visual How To

Summary:  Use the strongly typed classes in the Open XML SDK 2.0 to delete comments by all or specific authors in a Word document, without loading the document into Microsoft Word.

Applies to: Excel 2010 | Office 2010 | Open XML | PowerPoint 2010 | VBA | Word 2010

Published:  January 2011

Provided by:  Ken Getz, MCW Technologies, LLC

Overview

The Open XML file formats allow you to delete comments in Microsoft Word documents, but doing this requires some effort. The Open XML SDK 2.0 adds strongly typed classes that grant access to the Open XML file formats. In doing so, the SDK simplifies the tasks of retrieving a list of comments and deleting them. The code sample that is included with this Visual How To shows how to use the SDK to accomplish this goal.

Code It

The sample provided with this Visual How To includes the code that you need to delete comments by one or all authors in a Word 2007 or Word 2010 document. The following sections walk through the code in detail.

Setting Up References

To use the code from the Open XML SDK 2.0, you must add two references to your project. The sample project already includes these references, but in your own code, you would have to explicitly reference the following assemblies:

  • WindowsBase: This reference may be set for you, depending on the kind of project that you create.

  • DocumentFormat.OpenXml: Installed by the Open XML SDK 2.0.

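Also, you should add the following using/Imports statements to the top of your code file. The sample code also uses List(Of T) and LINQ extension methods such as Where and Select, so the System.Collections.Generic and System.Linq namespaces must be imported as well if your project template has not already done so.

Imports System.Collections.Generic
Imports System.Linq
Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Wordprocessing
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;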

Examining the Procedure

The WDDeleteComments procedure accepts two string parameters: the name of the document to modify, and optionally the name of the author whose comments you want to delete. If you supply an author name, the code deletes comments by the specified author; if you do not supply a name, the code deletes all comments.

Public Sub WDDeleteComments(
 ByVal fileName As String, Optional ByVal author As String = "")
public static void WDDeleteComments(
  string fileName, string author = "")

The procedure modifies the document that you specify, by deleting all comments in the document, or deleting only comments by a specific author. To call the procedure, pass the parameter values, as shown in the example code. For the purpose of this example, verify that a document named C:\temp\Comments.docx exists on your computer. Before you run the sample code, also ensure that the file contains at least one comment.

The following example shows how to call WDDeleteComments to remove all comments in the specified document. To limit the deletion to comments by a particular author, add the name of the author as the second parameter.

WDDeleteComments("C:\temp\comments.docx")
WDDeleteComments(@"C:\temp\comments.docx");

Accessing the Document

The code starts by opening the document, by using the [WordprocessingDocument.Open] method. With the final parameter set to true, this call indicates that the document should be opened for read-write access. Next, the code retrieves a reference to the [MainDocumentPart.WordprocessingCommentsPart] property of the word processing document. If the comments part is missing, there are no comments to delete, and therefore no reason to continue.

Using document As WordprocessingDocument = WordprocessingDocument.Open(fileName, True)
  ' Set commentPart to the document WordprocessingCommentsPart, 
  ' if it exists.
  Dim commentPart As WordprocessingCommentsPart =
    document.MainDocumentPart.WordprocessingCommentsPart

  ' If no WordprocessingCommentsPart exists, there can be no comments. 
  ' Stop execution and return from the method.
  If (commentPart Is Nothing) Then
    Return
  End If
  ' Code removed here…
End Using
using (WordprocessingDocument document = 
  WordprocessingDocument.Open(fileName, true))
{
  // Set commentPart to the document WordprocessingCommentsPart, 
  // if it exists.
  WordprocessingCommentsPart commentPart =
    document.MainDocumentPart.WordprocessingCommentsPart;

  // If no WordprocessingCommentsPart exists, there can be no comments. 
  // Stop execution and return from the method.
  if (commentPart == null)
  {
    return;
  }
  // Code removed here…
}
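
Because the sample expects the document to contain at least one comment, it can be useful to check a file before modifying it. The following is a minimal sketch of such a check, not part of the sample: CountComments and the CommentCheck class are hypothetical names. It follows the same access pattern shown above, but passes false as the final parameter to open the document read-only.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class CommentCheck
{
    // Hypothetical helper (not in the sample): returns the number of
    // comments in the document, optionally limited to one author.
    public static int CountComments(string fileName, string author = "")
    {
        // Passing false opens the package for read-only access.
        using (WordprocessingDocument document =
            WordprocessingDocument.Open(fileName, false))
        {
            WordprocessingCommentsPart commentPart =
                document.MainDocumentPart.WordprocessingCommentsPart;

            // No comments part means the document has no comments.
            if (commentPart == null)
            {
                return 0;
            }

            // Same author comparison that the sample uses.
            return commentPart.Comments.Elements<Comment>()
                .Count(c => String.IsNullOrEmpty(author) || c.Author == author);
        }
    }
}
```

Calling CommentCheck.CountComments(@"C:\temp\comments.docx") before and after WDDeleteComments gives a quick way to confirm that the deletion had the expected effect.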

Creating the List of Comments

The code next takes on two tasks: to create a list of all the comments to delete, and to create a list of comment IDs that corresponds to the list of comments to delete. The code will then perform two additional actions: delete the comments from the part that contains the comments themselves, and delete the references from the document part.

The sample code starts by retrieving a list of Comment elements. To retrieve the list, the code calls the Elements method on the Comments property of the commentPart variable, and converts the resulting collection into a list of Comment objects.

Dim commentsToDelete As List(Of Comment) = _
 commentPart.Comments.Elements(Of Comment)().ToList()
List<Comment> commentsToDelete =
  commentPart.Comments.Elements<Comment>().ToList();

At this point, the list contains every comment in the document. If the author parameter is not an empty string, the code must limit the list to only those comments whose Author property matches the parameter that you supplied.

If Not String.IsNullOrEmpty(author) Then
  commentsToDelete = commentsToDelete.
   Where(Function(c) c.Author = author).ToList()
End If
if (!String.IsNullOrEmpty(author))
{
  commentsToDelete = commentsToDelete.
    Where(c => c.Author == author).ToList();
}

Before deleting any comments, the code retrieves a list of comment ID values. The code uses these values later to delete matching elements from the document part. The call to the Select method projects the list of comments, producing an IEnumerable of Strings that contains all the comment ID values.

Dim commentIds As IEnumerable(Of String) =
  commentsToDelete.Select(Function(r) r.Id.Value)
IEnumerable<string> commentIds = 
  commentsToDelete.Select(r => r.Id.Value);
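
One detail worth noting is that Select is evaluated lazily, so commentIds is a query over commentsToDelete that runs again each time Contains is called in the later steps. The sample works as written, because the Comment objects keep their Id values after they are removed from the comments part. The following is a minimal sketch of an optional variation that copies the IDs into a HashSet once. GetIdSet and CommentIdHelper are hypothetical names, not part of the sample, and the null check on Id is an added precaution.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

public static class CommentIdHelper
{
    // Hypothetical helper (not in the sample): evaluates the projection
    // once and stores the IDs in a set for fast repeated lookups.
    public static HashSet<string> GetIdSet(IEnumerable<Comment> comments)
    {
        return new HashSet<string>(
            comments
                .Where(c => c.Id != null)   // skip comments without an ID
                .Select(c => c.Id.Value));
    }
}
```

A set built this way could stand in for commentIds in the later steps, because HashSet also exposes a Contains method.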

Deleting Comments and Saving the Part

Given the commentsToDelete collection, the code can easily loop through all the comments that require deleting and perform the deletion. The code then saves the comments part.

For Each c As Comment In commentsToDelete
  c.Remove()
Next
' Save comment part change.
commentPart.Comments.Save()
foreach (Comment c in commentsToDelete)
{
  c.Remove();
}
// Save comment part change.
commentPart.Comments.Save();

Deleting Comment References in the Document

Although the code has now removed the comments themselves, there is still more to do: the code must also remove the references to those comments from the document part. Each comment is referenced in the document by three elements, [CommentRangeStart], [CommentRangeEnd], and [CommentReference], and the code must remove all three for each deleted comment.
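
To make the three elements concrete, the following sketch shows how they appear in the underlying markup and removes them by ID. It is an illustration only: it uses System.Xml.Linq on a small hand-written fragment instead of the Open XML SDK, and the CommentMarkupDemo class, SampleBody fragment, and RemoveCommentMarkup method are hypothetical names that are not part of the sample.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CommentMarkupDemo
{
    // WordprocessingML main namespace.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hand-written fragment: one comment (id 0) anchored to one run.
    public const string SampleBody =
@"<w:body xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>
  <w:p>
    <w:commentRangeStart w:id='0'/>
    <w:r><w:t>Commented text</w:t></w:r>
    <w:commentRangeEnd w:id='0'/>
    <w:r><w:commentReference w:id='0'/></w:r>
  </w:p>
</w:body>";

    // Removes the three kinds of comment markup whose w:id is in ids,
    // and returns the number of elements removed.
    public static int RemoveCommentMarkup(XElement root, ICollection<string> ids)
    {
        XName[] names =
        {
            W + "commentRangeStart",
            W + "commentRangeEnd",
            W + "commentReference"
        };

        // Materialize the matches first, then remove them, so the tree
        // is not modified while it is being enumerated.
        List<XElement> matches = root.Descendants()
            .Where(e => names.Contains(e.Name) &&
                        ids.Contains((string)e.Attribute(W + "id")))
            .ToList();

        foreach (XElement e in matches)
        {
            e.Remove();
        }
        return matches.Count;
    }
}
```

As in the sample, the run that held the commentReference element is left in place, now empty; only the three comment-related elements are removed.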

Before performing any deletions, the code first retrieves a reference to the document part itself.

Dim doc As Document = document.MainDocumentPart.Document
Document doc = document.MainDocumentPart.Document;

Given a reference to the document, the code can perform the deletion loop three times, once for each of the different elements. In each case, the code looks for all descendants of the correct type: [CommentRangeStart], [CommentRangeEnd], or [CommentReference]. The code then limits the list to those whose [Id.Value] property is contained in the list of comment IDs to be deleted. Given the list of elements to be deleted, the code removes each element in turn.

Dim commentRangeStartToDelete As List(Of CommentRangeStart) = _
  doc.Descendants(Of CommentRangeStart). _
  Where(Function(c) commentIds.Contains(c.Id.Value)).ToList()
For Each c As CommentRangeStart In commentRangeStartToDelete
  c.Remove()
Next

' Delete CommentRangeEnd for each deleted comment within main document.
Dim commentRangeEndToDelete As List(Of CommentRangeEnd) = _
 doc.Descendants(Of CommentRangeEnd). _
 Where(Function(c) commentIds.Contains(c.Id.Value)).ToList()
For Each c As CommentRangeEnd In commentRangeEndToDelete
  c.Remove()
Next

' Delete CommentReference within main document.
Dim commentRangeReferenceToDelete As List(Of CommentReference) = _
 doc.Descendants(Of CommentReference). _
 Where(Function(c) commentIds.Contains(c.Id.Value)).ToList()
For Each c As CommentReference In commentRangeReferenceToDelete
  c.Remove()
Next
// Delete CommentRangeStart within main document.
List<CommentRangeStart> commentRangeStartToDelete =
  doc.Descendants<CommentRangeStart>().
  Where(c => commentIds.Contains(c.Id.Value)).ToList();
foreach (CommentRangeStart c in commentRangeStartToDelete)
{
  c.Remove();
}

// Delete CommentRangeEnd within the main document.
List<CommentRangeEnd> commentRangeEndToDelete =
  doc.Descendants<CommentRangeEnd>().
  Where(c => commentIds.Contains(c.Id.Value)).ToList();
foreach (CommentRangeEnd c in commentRangeEndToDelete)
{
  c.Remove();
}

// Delete CommentReference within main document.
List<CommentReference> commentRangeReferenceToDelete =
  doc.Descendants<CommentReference>().
  Where(c => commentIds.Contains(c.Id.Value)).ToList();
foreach (CommentReference c in commentRangeReferenceToDelete)
{
  c.Remove();
}
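
The three loops above differ only in the element type. As a sketch of one possible refactoring, not the form used in the sample, a generic helper could perform the same find-and-remove step for any element type. RemoveMatching and CommentCleanup are hypothetical names, and the ID is read through a caller-supplied delegate so that the helper makes no assumption about a shared base class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;

public static class CommentCleanup
{
    // Hypothetical helper (not in the sample): removes every descendant
    // of type T whose ID, as returned by getId, is contained in ids.
    // Returns the number of elements removed.
    public static int RemoveMatching<T>(
        OpenXmlElement root,
        Func<T, string> getId,
        ICollection<string> ids) where T : OpenXmlElement
    {
        // Build the list before removing anything, as the sample does.
        List<T> matches = root.Descendants<T>()
            .Where(e => ids.Contains(getId(e)))
            .ToList();

        foreach (T e in matches)
        {
            e.Remove();
        }
        return matches.Count;
    }
}
```

With this helper, and assuming the IDs have been copied into a collection named ids, each loop would reduce to a single call such as CommentCleanup.RemoveMatching<CommentRangeStart>(doc, c => c.Id.Value, ids).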

The code finishes by saving the document.

doc.Save()
doc.Save();

Sample Procedure

The following code example contains the complete sample procedure.

Public Sub WDDeleteComments(ByVal fileName As String,
                            Optional ByVal author As String = "")

  ' Get an existing Wordprocessing document.
  Using document As WordprocessingDocument =
    WordprocessingDocument.Open(fileName, True)
    ' Set commentPart to the document 
    ' WordprocessingCommentsPart, if it exists.
    Dim commentPart As WordprocessingCommentsPart =
      document.MainDocumentPart.WordprocessingCommentsPart

    ' If no WordprocessingCommentsPart exists, there can be no comments. 
    ' Stop execution and return from the method.
    If (commentPart Is Nothing) Then
      Return
    End If

    ' Create a list of comments by the specified author, or
    ' if the author name is empty, all authors.
    Dim commentsToDelete As List(Of Comment) = _
     commentPart.Comments.Elements(Of Comment)().ToList()
    If Not String.IsNullOrEmpty(author) Then
      commentsToDelete = commentsToDelete.
        Where(Function(c) c.Author = author).ToList()
    End If
    Dim commentIds As IEnumerable(Of String) =
      commentsToDelete.Select(Function(r) r.Id.Value)

    ' Delete each comment in commentToDelete from the Comments collection
    For Each c As Comment In commentsToDelete
      c.Remove()
    Next

    ' Save comment part change.
    commentPart.Comments.Save()

    Dim doc As Document = document.MainDocumentPart.Document

    ' Delete the CommentRangeStart for each 
    ' deleted comment within main document.
    Dim commentRangeStartToDelete As List(Of CommentRangeStart) = _
      doc.Descendants(Of CommentRangeStart). _
      Where(Function(c) commentIds.Contains(c.Id.Value)).ToList()
    For Each c As CommentRangeStart In commentRangeStartToDelete
      c.Remove()
    Next

    ' Delete CommentRangeEnd for each deleted comment within main document.
    Dim commentRangeEndToDelete As List(Of CommentRangeEnd) = _
     doc.Descendants(Of CommentRangeEnd). _
     Where(Function(c) commentIds.Contains(c.Id.Value)).ToList()
    For Each c As CommentRangeEnd In commentRangeEndToDelete
      c.Remove()
    Next

    ' Delete CommentReference within main document.
    Dim commentRangeReferenceToDelete As List(Of CommentReference) = _
     doc.Descendants(Of CommentReference). _
     Where(Function(c) commentIds.Contains(c.Id.Value)).ToList()
    For Each c As CommentReference In commentRangeReferenceToDelete
      c.Remove()
    Next

    ' Save main document.
    doc.Save()
  End Using
End Sub
public static void WDDeleteComments(string fileName, 
  string author = "")
{
  // Get an existing Wordprocessing document.
  using (WordprocessingDocument document = 
    WordprocessingDocument.Open(fileName, true))
  {
    // Set commentPart to the document WordprocessingCommentsPart, 
    // if it exists.
    WordprocessingCommentsPart commentPart =
      document.MainDocumentPart.WordprocessingCommentsPart;

    // If no WordprocessingCommentsPart exists, 
    // there can be no comments. 
    // Stop execution and return from the method.
    if (commentPart == null)
    {
      return;
    }

    List<Comment> commentsToDelete =
      commentPart.Comments.Elements<Comment>().ToList();

    // Create a list of comments by the specified author.
    if (!String.IsNullOrEmpty(author))
    {
      commentsToDelete = commentsToDelete.
        Where(c => c.Author == author).ToList();
    }
    IEnumerable<string> commentIds = 
      commentsToDelete.Select(r => r.Id.Value);

    // Delete each comment in commentToDelete from the 
    // Comments collection.
    foreach (Comment c in commentsToDelete)
    {
      c.Remove();
    }

    // Save comment part change.
    commentPart.Comments.Save();

    Document doc = document.MainDocumentPart.Document;

    // Delete CommentRangeStart within main document.
    List<CommentRangeStart> commentRangeStartToDelete =
      doc.Descendants<CommentRangeStart>().
      Where(c => commentIds.Contains(c.Id.Value)).ToList();
    foreach (CommentRangeStart c in commentRangeStartToDelete)
    {
      c.Remove();
    }

    // Delete CommentRangeEnd within the main document.
    List<CommentRangeEnd> commentRangeEndToDelete =
      doc.Descendants<CommentRangeEnd>().
      Where(c => commentIds.Contains(c.Id.Value)).ToList();
    foreach (CommentRangeEnd c in commentRangeEndToDelete)
    {
      c.Remove();
    }

    // Delete CommentReference within main document.
    List<CommentReference> commentRangeReferenceToDelete =
      doc.Descendants<CommentReference>().
      Where(c => commentIds.Contains(c.Id.Value)).ToList();
    foreach (CommentReference c in commentRangeReferenceToDelete)
    {
      c.Remove();
    }

    // Save changes back to the MainDocumentPart part.
    doc.Save();
  }
}

Read It

The sample that is included with this Visual How To describes code that deletes comments from a Word document. To use the sample, you must install the Open XML SDK 2.0, available from the link listed in the Explore It section. The sample also uses a modified version of code included as part of a set of code examples for the Open XML SDK 2.0. The Explore It section also includes a link to the full set of code examples, although you can use the sample without downloading and installing the code examples.

The sample application demonstrates only some of the properties and methods provided by the Open XML SDK 2.0 that you can use when modifying document structure. For more information, investigate the documentation that is included with the Open XML SDK 2.0 Productivity Tool: Click the Open XML SDK Documentation tab in the lower-left corner of the application window, and search for the class that you would like to study. Although the documentation does not currently include code examples, given the sample shown here and the documentation, you should be able to successfully modify the sample application.

See It

Watch the video

> [!VIDEO https://www.microsoft.com/en-us/videoplayer/embed/11b26ae1-b5d6-4ae7-a7ac-e8c0c2370855]

Length: 00:20:29

Explore It

About the Author
Ken Getz is a senior consultant with MCW Technologies. He is coauthor of ASP.NET Developers Jumpstart (Addison-Wesley, 2002), Access Developer's Handbook (Sybex, 2001), and VBA Developer's Handbook, 2nd Edition (Sybex, 2001).