Using VBA to Enumerate All Word 2007 Documents Containing Revisions and Comments
Summary: Learn how to use Microsoft Visual Basic for Applications (VBA) to enumerate comments and revisions in Microsoft Word 2007 documents contained in a single directory. This article shows you how to take this information and populate a Microsoft Excel 2007 spreadsheet. (7 printed pages)
Josh Cassell, Microsoft
Applies to: Microsoft Office Word 2007, Microsoft Office Excel 2007, Visual Basic for Applications
It is not unusual for people to work together on groups of documents. This may mean that you have multiple users working on a document or group of documents making multiple revisions and comments. Imagine how useful a report of all the changes in a group of documents could be.
When you modify a Microsoft Office Word 2007 document using Track Changes or insert a New Comment, a collection of revisions is created. By default, Word 2007 uses balloons to display deletions, comments, formatting changes and content that has moved. This article shows you how to use Microsoft Visual Basic for Applications (VBA) in Microsoft Office Excel 2007 to loop through a given directory, and for each Word 2007 document found in that directory, collect information about revisions and comments and populate an Excel 2007 workbook with this information.
Let's briefly examine how comment and revision information is stored in an Office Open XML document. The body of the comments are stored in a separate file named comments.xml in the document package. In the document.xml part, the comment ranges are wrapped in w:commentRangeStart and w:commentRangeEnd tags. The comment is linked back to the comments.xml part with a w:commentReference specifying the ID of that comment.
Revision changes are specified with multiple tags in document.xml. Some examples include w:ins for insertions, w:del for deletions, and w:tcPrChange for table property changes.
You can work with these Open XML elements directly; however this article shows you how to write a VBA macro that enumerates through the corresponding Comments and Revisions collection objects in Word 2007.
To create a report of comments and revisions, you first must get the path to the directory of Word 2007 documents from cell B1 in the Excel 2007 file that contains the macro. Then, create a new Excel 2007 document and set it up with one worksheet for comments and another worksheet for revisions. Next, loop through the directory, and for each file with a docx extension, populate the Excel 2007 workbook with information about the comments and revisions found within that file. To demonstrate this, I've divided this code into the following four procedures:
StartReporting – sets up the environment, creates the Excel 2007 workbook and sets up the column headers, creates the Word 2007 application, and calls the Report subroutine.
Report – loops through and finds all the Word 2007 documents in the specified directory, and passes along each document to the DoComments and DoRevisions subroutines.
DoComments – loops through the comments in the specified Word 2007 document and populates the reporting worksheet with the name of the author and the text of the comment.
DoRevisions – loops through the revisions in the specified Word 2007 document and populates the reporting worksheet with the author's name, the date of the revision, and its type.
Before you begin writing your VBA macro, you need to set up your project.
To set up the Excel Workbook and VBA project
Adding Global Variables
The StartReporting() function opens a new Excel 2007 workbook, sets up two worksheets to hold comment and revision information, launches Word 2007 and then calls the function that searches the directory and starts the reporting.
Sub StartReporting() 'Turn alerts off. Application.DisplayAlerts = False 'Turn screen updating off. Application.ScreenUpdating = False 'Create the Report Workbook and set up the worksheets. Set RptBook = Application.Workbooks.Add RptBook.Sheets(3).Delete RptBook.Sheets(2).Name = "Comments" RptBook.Sheets(2).Range("$A$1").Value = "Document Name" RptBook.Sheets(2).Range("$B$1").Value = "Author" RptBook.Sheets(2).Range("$C$1").Value = "Comment Text" RptBook.Sheets(1).Name = "Revisions" RptBook.Sheets(1).Range("$A$1").Value = "Document Name" RptBook.Sheets(1).Range("$B$1").Value = "Revision Type" RptBook.Sheets(1).Range("$C$1").Value = "Revision Author" RptBook.Sheets(1).Range("$D$1").Value = "Revision Date" 'Open Word and make it visible. Set WdApp = New Word.Application WdApp.DisplayAlerts = wdAlertsNone WdApp.Visible = True 'Start the reporting loop. For this example, you get the path 'where the docs are stored from cell B1. Report (Workbooks("Main.xlsm").Sheets(1).Range("$b$1").Text) 'Close Word. WdApp.Quit Set WdApp = Nothing Application.DisplayAlerts = True Application.ScreenUpdating = True End Sub
In this code, you set the DisplayAlerts and ScreenUpdating properties of the reporting workbook to False to prevent any modal dialog boxes and to prevent flicker on the screen of the report workbook you are updating. This also has the added benefit of speeding up the macro. You also set Word 2007 to visible. This prevents any pauses from modal dialogs that are shown in Word 2007 due to issues such as when one of the files you are accessing is currently open.
Next, add the code to loop through the directory, open the documents, and call separate procedures to handle the comments and revisions.
Sub Report(path As String) Dim wdDoc As String Dim curDoc As Word.Document 'Get first document in directory wdDoc = Dir(path & "\*.docx") 'Loop until we don't have anymore documents in the directory Do While wdDoc <> "" 'Open the document Set curDoc = WdApp.Documents.Open(path & "\" & wdDoc) 'Get comments DoComments curDoc 'Get revisions DoRevisions curDoc 'Close the document curDoc.Close 'Get the next document wdDoc = Dir() Loop End Sub
The following subroutine processes the comments for each document. You will handle the revisions separately.
Sub DoComments(cd As Word.Document) On Error Resume Next Dim x As Integer Dim cm As Word.Comment Dim rw As Integer 'Find the last row in the comments worksheet. Enumerating in 'reverse is common practice in case there are blank rows (this 'should not be the case here). rw = RptBook.Sheets("comments").Range("$a$1048576").End(xlUp).Row _ + 1 For x = 1 To cd.Comments.Count Set cm = cd.Comments(x) RptBook.Sheets("Comments").Range("$A$" & rw).Value = cd.Name RptBook.Sheets("Comments").Range("$B$" & rw).Value = cm.Author RptBook.Sheets("Comments").Range("$C$" & rw).Value = cm.Range rw = rw + 1 Next x End Sub
Now add the code to process the revisions.
Revisions can be nested and so a simple For Each loop will not discover them all. Instead, use a For-Next loop statement that depends on the Revisions.Count property. It is important to consider the nested relation of revisions when working with them. You can use a Select-Case statement to determine the type of the revision. The types of possible revisions are enumerated in the Word.wdRevisionType enumeration
Sub DoRevisions(cd As Word.Document) On Error Resume Next Dim x As Integer Dim oRev As Revision Dim rw As Integer Dim t As String 'Find the last row in the revisions worksheet. Enumerating in 'reverse is common practice in case there are blank rows (this 'should not be the case here). rw = RptBook.Sheets("Revisions").Range("$a$1048576").End(xlUp).Row _ + 1 'Use a For Next loop with Revisions.Count instead 'of a ForEach because revisions can be nested. For x = 1 To cd.Revisions.Count Set oRev = cd.Revisions(x) 'This is the type of the Revision. Select Case oRev.Type Case 0: t = "No revision." Case 1: t = "Insertion." Case 2: t = "Deletion." Case 3: t = "Property changed." Case 4: t = "Paragraph number changed." Case 5: t = "Field display changed." Case 6: t = "Revision marked as reconciled conflict." Case 7: t = "Revision marked as a conflict." Case 8: t = "Style changed." Case 9: t = "Replaced." Case 10: t = "Paragraph property changed." Case 11: t = "Table property changed." Case 12: t = "Section property changed." Case 13: t = "Style definition changed." Case 14: t = "Content moved from." Case 15: t = "Content moved to." Case 16: t = "Table cell inserted." Case 17: t = "Table cell deleted." Case 18: t = "Table cells merged." End Select RptBook.Sheets("Revisions").Range("$A$" & rw).Value = cd.Name RptBook.Sheets("Revisions").Range("$B$" & rw).Value = t RptBook.Sheets("Revisions").Range("$C$" & rw).Value = oRev.Author RptBook.Sheets("Revisions").Range("$D$" & rw).Value = oRev.Date rw = rw + 1 Next x End Sub
Using VBA to automate Word 2007 is a good way to gather comments and revisions in documents and summarize them. This helps you get an overview of changes in a group of documents without examining each one separately. Group work greatly benefits from this feature, because everyone can now see the changes that are being made. It's important to note that, especially in collaborative environments, revisions can be nested and so you must take that into account when working with them.