
Protecting Personal Data in Your Microsoft Word Documents

This content is no longer actively maintained. It is provided as is, for anyone who may still be using these technologies, with no warranties or claims of accuracy with regard to the most recent product version or service release.
 

Frank C. Rice
Microsoft Corporation

August 2002

Applies to:
   Microsoft® Word 2002
   Microsoft Word 2000

Summary: Learn about some of the places that Microsoft Word stores information in your documents and ways to prevent that information from being shared with others.

Contents

Introduction
Background
Viewing Personal or Hidden Information
Removing Personal Information During Document Saves
Manually Removing Personal Information
Removing Personal Information Programmatically
Displaying Hidden Items
Features That Store Hidden Information
Hyperlinks
Removing Your Name from Macros
Document Variables
See Also

Introduction

Microsoft® Word stores a lot of information in a document, not only about when and how the document was created, and what changes it has undergone in its life, but also about who created it and who made changes to the document. In this article, I will discuss some of the places this information is stored and ways to remove or hide the information.

Privacy and the protection of personal information have become vitally important. Hackers, identity thieves, and even your competitors continue to employ increasingly sophisticated ways to gain access to, and exploit sensitive information about, companies and individuals.

Most software applications store information about the files they use (also known as metadata) in the files themselves. This metadata provides and maintains a history of each file, aids in searching for and retrieving documents, and keeps all of a file's information in one central location.

Much of this information is stored as part of a product feature or property setting, sometimes without the user being aware of what is being stored or where. Even seemingly innocuous features and settings can store details that reveal a great deal about you or your company to prying eyes. For example, Word and some other document processing applications allow you to store different versions of a document as hidden information in the document file.

Let's say you've spent several weeks working with your marketing and editorial staff to create a document outlining the features of a new product. You plan on sending this document to your sales staff as part of a new marketing campaign. At the last minute, you decide to remove a couple of features from the product that require more testing but that will definitely be included in the next version of the product. You are unaware that the versioning feature has been turned on for this document, so each one of your revisions, including the version with the removed product features, has been saved with the document file. After you send the document to the sales team by e-mail, a copy of the e-mail attachment falls into the hands of a competitor who, after viewing the different versions of the document, sees the version detailing the removed features and sends that information to their engineering team.

Likewise, a document's properties may contain the name and e-mail address of the document's creator, which, at the least, may result in unsolicited e-mail. However, you can protect yourself if you know how and where your personal information is being stored in the documents you create and use.

In this article, we will examine areas in a document where metadata can exist and describe some ways that you can remove this information. With an understanding of where sensitive information may exist in your document and how it got there, you can remove this information with just a little effort. By using the information discussed in this article, you can prevent sensitive information from falling into the wrong hands.

Background

Many document processing applications, including Word, store a number of different types of metadata. These can include:

  • Your name
  • Your initials
  • Your company or organization name
  • The name of your computer
  • The name of the network server or hard disk where you saved the document
  • The names of previous document authors
  • Document revisions
  • Document versions
  • Template information
  • Hidden text
  • Comments

Storing metadata in a document has benefits: it ensures that important information won't be separated from the document, and it helps safeguard proprietary information in the event of plagiarism or copyright infringement. However, these benefits must be carefully weighed against the risks of unintentionally exposing this information. By understanding where and what types of metadata can exist in a document, you are in a better position to control what information you expose.

Now let's look at some ways to safeguard and remove sensitive information from Word documents.

Viewing Personal or Hidden Information

Discovering the metadata in a document isn't that difficult. For example, one feature of Word lets you open a document that has become corrupt by viewing the text without the formatting. This feature can also be used to view some of the metadata associated with a document. To test this, perform the following steps on a document of your own:

  1. Start Word.
  2. On the File menu, click Open.
  3. In the Files of type list, click Recover Text from Any File, locate a Word document (.doc) file, and then click Open.

The document opens without any formatting. After scrolling through the document, you may see information such as the name of the author of the document and the path of the stored document.

Before you provide others with a copy of your document, it's a good idea to view any hidden information and decide whether it's appropriate to leave this information in the document. For example, if you have used Track Changes on the Tools menu or Versions on the File menu, or have selected the Allow fast saves option on the Save tab of the Options dialog box on the Tools menu, you should consider removing any hidden or deleted information that might remain in your document. You can do this by completing the tasks in the following sections.

Removing Personal Information During Document Saves

You can ensure that certain personal information is removed when saving your documents. To enable this option, use the following steps:

  1. On the Tools menu, click Options, and then click the Security tab.
  2. Select the Remove personal information from this file on save check box in the Privacy options area, and then click OK.
  3. Save the document.

When you select this check box, the following personal information is removed from your document:

  • File properties: Author, Manager, Company, and Last saved by.
  • Names associated with comments or tracked changes; names are changed to Author.
    Note   Tracked changes are marks that show where a deletion, insertion, or other editing change has been made in a document.
  • Routing slip: The routing slip is removed.
  • E-mail message header: The header that's generated with the E-mail toolbar button is removed.
  • Versioning: The name under Saved by is changed to Author.
Note   The Remove personal information from this file on save check box is not selected by default. In addition, when this check box is selected, it applies only to the active document and not to any existing or new documents. So you will need to select this check box for each document.

While selecting the Remove personal information from this file on save check box can remove the metadata described above, there is other metadata in a document that is not removed by this check box.

Manually Removing Personal Information

Document properties store information about a document, such as the document's file name, storage location, creation date, and file attributes. However, the document properties can also store more personal metadata, such as the author's name, the author's company, and the document's editor. You can manually remove this information from the properties of a document by using one or both of the following procedures.

In Word:

  1. Open the document.
  2. On the File menu, click Properties.
  3. The Summary, Statistics, Contents, and Custom tabs may each contain information that you will want to remove. To remove, highlight the information in each box and press DELETE.

In Microsoft Windows® Explorer:

  1. Locate the document file in the Explorer pane.
  2. Right-click the file and then click Properties.
  3. The Summary tab may contain information that you will want to remove. To remove, highlight the information in each box and press DELETE.
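
The property cleanup described above can also be sketched as a macro. This is a minimal illustration, not a complete scrubber: the macro name ClearDocumentProperties is hypothetical, the list of built-in properties it blanks is an assumption about which Summary fields you consider sensitive, and it deletes every custom property without asking. Statistics such as the last-saved-by name are maintained by Word and may be rewritten on the next save.

```vba
Sub ClearDocumentProperties()
    ' Hypothetical macro name. Blanks a handful of Summary-tab fields
    ' and deletes every custom property in the active document.
    Dim varProp As Variant
    Dim lngIndex As Long

    With ActiveDocument
        ' Assumed list of sensitive built-in properties; adjust as needed.
        For Each varProp In Array(wdPropertyTitle, wdPropertySubject, _
                wdPropertyAuthor, wdPropertyManager, wdPropertyCompany, _
                wdPropertyKeywords, wdPropertyComments)
            .BuiltInDocumentProperties(varProp).Value = ""
        Next varProp

        ' Count down so that each deletion does not shift the
        ' positions of the properties still to be processed.
        For lngIndex = .CustomDocumentProperties.Count To 1 Step -1
            .CustomDocumentProperties(lngIndex).Delete
        Next lngIndex
    End With
End Sub
```

Run the macro on a copy of the document first, and check the Properties dialog box afterward to confirm that the fields you care about are empty.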

Removing Personal Information Programmatically

The options covered so far are good for finding and removing confidential metadata, but they require that you select each of the options manually. Trying to remember to do this for each of your documents, or worse, ensuring that others in your company do this, isn't always practical. Fortunately, Word also provides the RemovePersonalInformation property, which, when set to True, removes all user information from comments, revisions, and the Properties dialog box when the user saves a document.

For example, the following procedure creates a new document, and then adds code to the document's Open event that sets the RemovePersonalInformation property to True. This ensures that personal information will be removed from the document whenever the user saves it. Because the code runs in the Open event, you must close and then reopen the document for the procedure to take effect. Here's how to set the RemovePersonalInformation property for the current document:

  1. Start Microsoft Word.
  2. Create a new blank document.
  3. On the Tools menu, point to Macro and then click Visual Basic Editor.
  4. In the Project Explorer window, under the folder for the current document, double-click ThisDocument under the Microsoft Word Objects folder.
  5. In the Code window, click the arrow beside the Object drop-down list (left drop-down list), and click Document.
  6. Click the arrow beside the Procedure drop-down list (right drop-down list), and then click Open.
  7. Insert the following statement between Sub Document_Open() and End Sub:
    ThisDocument.RemovePersonalInformation = True
    
    
  8. Close the Visual Basic Editor by clicking Close and Return to Microsoft Word on the File menu.
  9. Save and close the document. When you reopen the document, the document's Open event will execute, setting the RemovePersonalInformation property to True. Personal information will then be removed from the document whenever the user saves it.
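
For reference, the finished event procedure from the steps above would look roughly like the first routine below. The second routine is an illustrative extension, not part of the original procedure: it sets the same property on every document that is currently open, and its name, RemovePersonalInfoFromOpenDocuments, is hypothetical. Both assume a version of Word that exposes RemovePersonalInformation; to my knowledge that means Word 2002 rather than Word 2000.

```vba
' Goes in the ThisDocument module of the document, as in the steps above.
Private Sub Document_Open()
    ThisDocument.RemovePersonalInformation = True
End Sub

' Hypothetical helper: turns the option on for all open documents.
' The setting is per document, so documents opened later are unaffected.
Sub RemovePersonalInfoFromOpenDocuments()
    Dim objDoc As Document

    For Each objDoc In Documents
        objDoc.RemovePersonalInformation = True
    Next objDoc
End Sub
```

The personal information is only stripped when each document is next saved, so save the documents after running the helper.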

Displaying Hidden Items

Sometimes, to protect your personal information, you must display the information before you can decide whether or not to remove it. The following sections explain how to display various items that may contain hidden information.

Display Tracked Changes and Comments

Markup items in a Word document consist of comments and tracked changes, such as insertions, deletions, and formatting changes, which are used by writers and editors to annotate a document during the editing process. When you choose to display all markup, all types of markup and all reviewers' names will be selected on the Show menu.

Note   Before deleting markup, it is a good idea to print the document with the markup showing so that you keep a record of the changes made to it.

To display tracked changes or comments, click Markup on the View menu.

Note   You can also choose to display a warning if you print, save, or send a document that contains tracked changes by clicking the option Warn before printing, saving, or sending a file that contains tracked changes or comments, which is available on the Security tab in the Options dialog box on the Tools menu.
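
A quick way to check whether a document carries any markup at all is to count it. The following sketch, with the hypothetical name ReportMarkup, reports the number of tracked changes and comments in the active document; it only reports and does not remove anything.

```vba
Sub ReportMarkup()
    ' Hypothetical macro name. Shows how much markup the
    ' active document currently contains.
    With ActiveDocument
        MsgBox "Tracked changes: " & .Revisions.Count & vbCr & _
               "Comments: " & .Comments.Count
    End With
End Sub
```

If either count is greater than zero, display the markup as described above and review it before you share the document.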

Display Hidden Text

Hidden text in a Word document is character formatting that allows you to show or hide specified text. For example, while researching and writing a document, you might include, as hidden text in the document, notes to yourself to recheck a reference.

To view hidden text, click Options on the Tools menu, click the View tab, and then select the Hidden text check box in the Formatting marks area. Word indicates the hidden text with a dotted underline.

To remove hidden text from a printed document, click Options on the Tools menu, click the Print tab, and then clear the Hidden text check box in the Include with document area. If you plan to distribute the document online, just delete the hidden text as you would delete any other text.
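
The two option settings described above can also be applied from a macro. This is a small sketch with a hypothetical macro name; it shows hidden text in the active window and excludes hidden text from printed output, but it does not delete any hidden text from the document.

```vba
Sub ShowHiddenTextAndSkipPrinting()
    ' Same effect as selecting the Hidden text check box on the View tab.
    ActiveWindow.View.ShowHiddenText = True

    ' Same effect as clearing the Hidden text check box on the Print tab.
    Options.PrintHiddenText = False
End Sub
```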

Remove Previous Versions of a Document

You can specify that you want Word to save one or more versions of your document in the same file. Those versions are then saved as hidden information in the document so that you can retrieve them later. Because these hidden versions are available to others and because they do not remain hidden if the document is saved in another format, you may want to remove these versions before you share the document. There are a couple of ways to do this:

To keep the previous versions, the following steps allow you to save the current version as a separate document and then distribute only that document:

  1. On the File menu, click Versions.
  2. Click the version of the document you want to save as a separate file.
  3. Click Open.
  4. On the File menu, click Save As.
  5. In the File name box, type a name, and then click Save.

To delete the unwanted versions and then distribute the document, do the following:

  1. On the File menu, click Versions.
  2. Click the version of the document you want to delete.
  3. To select more than one version, press and hold CTRL as you click each version.
  4. Click Delete.
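
Before deciding which of the two approaches above to use, it can help to see what versions a document actually contains. The following sketch, with the hypothetical name ListVersions, writes the date, author, and comment of each saved version to the Immediate window of the Visual Basic Editor and then shows the total. It assumes a version of Word that supports the Versions feature described in this article.

```vba
Sub ListVersions()
    ' Hypothetical macro name. Lists the versions stored in the
    ' active document without changing or deleting them.
    Dim objVersion As Version

    For Each objVersion In ActiveDocument.Versions
        Debug.Print objVersion.Date, objVersion.SavedBy, objVersion.Comment
    Next objVersion

    MsgBox ActiveDocument.Versions.Count & " saved version(s)"
End Sub
```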

Features That Store Hidden Information

Some features in Word store metadata by default. Disabling these features can remove unwanted metadata from your documents.

Fast Save Option

If you save a document with the Allow fast saves check box selected, and then open the document as a text file, the document may contain information that you previously deleted. This happens because a fast save appends the changes you make to the end of the document; it doesn't incorporate the changes (including deleted information) into the document itself.

To completely remove the deleted information from the document, do the following:

  1. If you opened the document as a text file, close the text file and open the document as a regular Word document.
  2. On the Tools menu, click Options, click the Save tab, and then clear the Allow fast saves check box.
  3. On the File menu, click Save.
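
The steps above can be sketched as a macro. The macro name is hypothetical, and marking the document as changed before saving is an assumption on my part, intended to make sure Word writes the file again even if nothing has been edited since the last save.

```vba
Sub SaveWithoutFastSave()
    ' Turn off fast saves so the next save rewrites the whole file.
    Options.AllowFastSave = False

    ' Assumption: flag the document as changed so that Save is not skipped.
    ActiveDocument.Saved = False
    ActiveDocument.Save
End Sub
```

Note that AllowFastSave is an application-wide option, so it stays off for other documents until you turn it back on.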

Random Numbers Used When Merging Documents

When you compare and merge documents, Word uses randomly generated numbers to help keep track of related documents. Although these numbers are hidden, they could potentially be used to demonstrate that two documents are related. To stop storing random numbers during the merge process, perform the following:

  1. On the Tools menu, click Options, and then click the Security tab.
  2. Clear the Store random number to improve merge accuracy check box.
Note   If you choose not to store these numbers, the results of merged documents will be less than optimal, meaning that it may be difficult for Word to determine whether two or more documents are related.

Routing Slip Information

If you send a document through e-mail by using a routing slip, routing information may be attached to the document. To remove this information from the document, you must save the document in a format that does not retain routing slip information, such as Rich Text Format (RTF) or HTML format.

You can also use the following procedure to remove routing slip information:

  1. Turn off the Allow fast saves option by using the steps in the "Fast Save Option" section of this article.
  2. On the File menu, point to Send to, and then click Other Routing Recipient.
  3. Click Clear to remove the routing slip, and then click OK.
  4. On the File menu, click Save.

The document is now saved without any routing slip information.
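
The same procedure can be approximated in VBA. The sketch below uses the hypothetical name ClearRoutingSlip and relies on the HasRoutingSlip property of the Document object, which deletes the routing slip when it is set to False. Treat it as a starting point and confirm the result on a copy of your document.

```vba
Sub ClearRoutingSlip()
    ' Hypothetical macro name. Removes the routing slip, if any,
    ' from the active document and saves the result.
    Options.AllowFastSave = False

    With ActiveDocument
        If .HasRoutingSlip Then
            .HasRoutingSlip = False
            .Save
        End If
    End With
End Sub
```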

Hyperlinks

Documents may contain hyperlinks to other documents or Web pages on an intranet or the Internet. This information is contained within the document and stays with the document if it is shared or copied.

Note   Hyperlinked text typically appears as blue and underlined.

To manually delete a single hyperlink from a document, right-click the hyperlink, point to Hyperlink, and then click Remove Hyperlink.

To delete all hyperlinks in a document, you can use a Microsoft Visual Basic® for Applications (VBA) macro. In the following procedure, you create a new macro for the current document, add code to remove all hyperlinks in the document, and then execute the macro.

  1. On the Tools menu, point to Macro, and then click Macros.
  2. In the Macros in list, click the name of the current document.
  3. In the Macro name box, type a name for the macro. For this example, type the name RemoveHyperlinks.
  4. Click Create to open the Visual Basic Editor.
  5. Insert the following code between Sub RemoveHyperlinks and End Sub.
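    Dim objStory As Range
    Dim lngIndex As Long
    
    ' Walk each story (main text, headers, footnotes, and so on).
    For Each objStory In ActiveDocument.StoryRanges
        ' Count down so that deleting a link does not shift the
        ' positions of the links that are still to be processed.
        For lngIndex = objStory.Hyperlinks.Count To 1 Step -1
            objStory.Hyperlinks(lngIndex).Delete
        Next lngIndex
    Next objStory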
    
    
  6. Close the Visual Basic Editor.
  7. To run the macro, point to Macro on the Tools menu, click Macros, click the RemoveHyperlinks macro, and then click Run.

After running this macro, only the link is removed. The text of the hyperlink remains in the document.

To remove all traces of both the hyperlink and the text of the hyperlink from the document, follow the steps above, but name the macro RemoveAllHyperlinks and insert the following code between Sub RemoveAllHyperlinks and End Sub:

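Dim objStory As Range
Dim lngIndex As Long

' Walk each story (main text, headers, footnotes, and so on).
For Each objStory In ActiveDocument.StoryRanges
    ' Count down so that deleting a link does not shift the
    ' positions of the links that are still to be processed.
    For lngIndex = objStory.Hyperlinks.Count To 1 Step -1
        ' Deleting the range removes the link and its text.
        objStory.Hyperlinks(lngIndex).Range.Delete
    Next lngIndex
Next objStory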

Executing this macro will remove both the hyperlink and the text of the hyperlink from the document.
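
Because both macros are destructive, you may want to review the links first. The following sketch, with the hypothetical name ListHyperlinks, writes the address of each hyperlink to the Immediate window of the Visual Basic Editor without changing the document. It also follows linked stories through NextStoryRange, on the assumption that some hyperlinks may sit in additional headers, footers, or text boxes that a single pass over StoryRanges would not reach.

```vba
Sub ListHyperlinks()
    ' Hypothetical macro name. Lists hyperlink targets in every
    ' story of the active document without modifying anything.
    Dim objStory As Range
    Dim objLinked As Range
    Dim objHlink As Hyperlink

    For Each objStory In ActiveDocument.StoryRanges
        Set objLinked = objStory
        ' Follow the chain of linked stories of the same type.
        Do While Not objLinked Is Nothing
            For Each objHlink In objLinked.Hyperlinks
                Debug.Print objHlink.Address & " " & objHlink.SubAddress
            Next objHlink
            Set objLinked = objLinked.NextStoryRange
        Loop
    Next objStory
End Sub
```

To see the output, open the Visual Basic Editor and click Immediate Window on the View menu before you run the macro.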

Removing Your Name from Macros

When you record a VBA macro in Word, the recorded macro begins with a header similar to the following:

' Macro1 Macro
' Macro recorded 3/11/1999 by <User Name> 

To remove your name from any macros that you record, perform the following:

  1. Open the document that contains the macros.
  2. On the Tools menu, point to Macro, and then click Visual Basic Editor.
  3. In the Project Explorer window, double-click the module that contains the macros.
  4. Remove your name from the recorded macro code by highlighting the text and pressing DELETE.
  5. Close the Visual Basic Editor, and then click Save on the File menu to save the document.

Document Variables

Document variables are used to store information in a document. For example, document variables can be used to preserve macro settings between macro sessions. They can also contain metadata.

In the following procedure, you create a new macro, insert code in the macro that displays a message box containing the number of document variables in a document named "MyDoc.doc," and then execute the macro:

  1. On the Tools menu, point to Macro, and then click Macros.
  2. In the Macros in list, click the name of the current document.
  3. In the Macro name box, type a name for the macro. For this example, use the name CountofDocVariables.
  4. Click Create to open the Visual Basic Editor.
  5. Insert the following code between Sub CountofDocVariables and End Sub.
    MsgBox Documents("MyDoc.doc").Variables.Count & " variables"
    
    
  6. Close the Visual Basic Editor.
  7. To run the macro, point to Macro on the Tools menu, click Macros, click the CountofDocVariables macro, and then click Run. A message box is displayed with the count of document variables in the document.

Using the previous steps, you can create other macros that work with document variables. For example, you can create a new macro and then insert the following procedure to display the name and value of each document variable in the active document:

...
For Each myVar In ActiveDocument.Variables
    MsgBox "Name = " & myVar.Name & vbCr & "Value = " & myVar.Value
Next myVar
...

You can use the following statement to delete a particular document variable from a document:

...
ActiveDocument.Variables.Item("MyVar").Delete
...
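
If you want to clear every document variable rather than one named variable, the statement above can be extended into a loop. This is a minimal sketch with a hypothetical macro name. Note that it removes all variables, including any that other macros or add-ins rely on to store their settings, so use it on a copy of the document first.

```vba
Sub DeleteAllDocVariables()
    ' Hypothetical macro name. Deletes every document variable
    ' in the active document.
    Dim lngIndex As Long

    With ActiveDocument.Variables
        ' Count down so that each deletion does not shift the
        ' positions of the variables still to be processed.
        For lngIndex = .Count To 1 Step -1
            .Item(lngIndex).Delete
        Next lngIndex
    End With
End Sub
```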

In this article, we have covered just a few of the ways of dealing with metadata in your document. For more information on other ways to manage this information, see the following references.

See Also

HOW TO: Minimize Metadata in Microsoft Word 97 (technical article)

HOW TO: Minimize Metadata in Microsoft Word 2000 (technical article)

HOW TO: Minimize Metadata in Microsoft Word 2002 (technical article)

OFF: How to Minimize Metadata in Microsoft Office Documents
