Manipulating Word 2007 Files with the Open XML Format API (Part 3 of 3)

Office 2007

Summary: This is the third in a series of three articles that describes the Open XML Application Programming Interface (API) code that you can use to access and manipulate Microsoft Office Word 2007 files. (16 printed pages)

Frank Rice, Microsoft Corporation

September 2007 (Revised August 2008)

Applies to: Microsoft Office Word 2007

The 2007 Microsoft Office system introduces new XML-based file formats called Open XML Formats. Microsoft Office Word 2007, Microsoft Office Excel 2007, and Microsoft Office PowerPoint 2007 all use these formats as their default file format. Open XML Formats are useful because they are an open standard and are based on well-known technologies: ZIP and XML. Microsoft provides a library for accessing these files, built on the .NET Framework 3.0 technologies, in the DocumentFormat.OpenXml namespace of the Open XML Format SDK 1.0. The members of the DocumentFormat.OpenXml API provide strongly-typed part classes to manipulate Open XML documents. The SDK encapsulates many common tasks that developers perform on Open XML Format packages, so you can perform complex operations with just a few lines of code.

In the following code, you set the value of a custom property in a document. A document may or may not include custom properties that reside in the custom.xml part. Therefore, the procedure does the following:

  • If the custom.xml part does not exist in the document, it adds it.

  • If the custom.xml part is there, but the property is not, it adds the property.

  • If the property exists and is of the same type, it replaces the value.

  • If the property exists and is of a different type, it deletes the existing property and adds a new one with the requested type and value.

When completed, the procedure returns a Boolean value indicating whether the operation succeeded or not.

public enum PropertyTypes
{
   YesNo,
   Text,
   DateTime,
   NumberInteger,
   NumberDouble,
}

public static bool WDSetCustomProperty(string docName, string propertyName, object propertyValue, PropertyTypes propertyType)
{
   // Given a document name, a property name/value, and the property type, add or update
   // a custom property in the document. Returns true if the property was added or updated.
   // Throws InvalidDataException if the value does not match the requested type.

   const string customPropertiesSchema = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties";
   const string customVTypesSchema = "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes";

   bool retVal = false;
   string propertyTypeName = "vt:lpwstr";
   string propertyValueString = null;

   //  Calculate the correct type.
   switch (propertyType)
   {
      case PropertyTypes.DateTime:
         propertyTypeName = "vt:filetime";
         // Make sure you were passed a real date, 
         // and if so, format in the correct way. The date/time 
         // value passed in should represent a UTC date/time.
         if (propertyValue.GetType() == typeof(System.DateTime))
         {
            propertyValueString = string.Format("{0:s}Z", Convert.ToDateTime(propertyValue));
         }
         break;
      case PropertyTypes.NumberInteger:
         propertyTypeName = "vt:i4";
         if (propertyValue.GetType() == typeof(System.Int32))
         {
            propertyValueString = Convert.ToInt32(propertyValue).ToString();
         }

         break;
      case PropertyTypes.NumberDouble:
         propertyTypeName = "vt:r8";
         if (propertyValue.GetType() == typeof(System.Double))
         {
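            // Use the invariant culture so the decimal separator is always ".".
            propertyValueString = Convert.ToDouble(propertyValue).ToString(System.Globalization.CultureInfo.InvariantCulture);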
         }

         break;
      case PropertyTypes.Text:
         propertyTypeName = "vt:lpwstr";
         propertyValueString = Convert.ToString(propertyValue);

         break;
      case PropertyTypes.YesNo:
         propertyTypeName = "vt:bool";
         if (propertyValue.GetType() == typeof(System.Boolean))
         {
            // IMPORTANT: This value must be lowercase.
            propertyValueString = Convert.ToBoolean(propertyValue).ToString().ToLower();
         }
         break;
   }

   if (propertyValueString == null)
   {
      // If the code cannot convert the 
      // property to a valid value, throw an exception.
      throw new InvalidDataException("Invalid parameter value.");
   }

   using (WordprocessingDocument wdPackage = WordprocessingDocument.Open(docName, true))
   {

      // This part is working with the custom properties part.
      CustomFilePropertiesPart customPropsPart = wdPackage.CustomFilePropertiesPart;

      // You must manage namespaces to perform XML XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("d", customPropertiesSchema);
      nsManager.AddNamespace("vt", customVTypesSchema);

      Uri customPropsUri = new Uri("/docProps/custom.xml", UriKind.Relative);
      XmlDocument customPropsDoc = null;
      XmlNode rootNode = null;

      // There may not be a custom properties part.
      if (customPropsPart == null)
      {
         customPropsDoc = new XmlDocument(nt);

         // The part does not exist. Create it now.
         customPropsPart = wdPackage.AddCustomFilePropertiesPart();

         // Set up the rudimentary custom part.
         rootNode = customPropsDoc.CreateElement("Properties", customPropertiesSchema);
         rootNode.Attributes.Append(customPropsDoc.CreateAttribute("xmlns:vt"));
         rootNode.Attributes["xmlns:vt"].Value = customVTypesSchema;

         customPropsDoc.AppendChild(rootNode);
      }
      else
      {
         // Load the contents of the custom properties part into an XML document.
         customPropsDoc = new XmlDocument(nt);
         customPropsDoc.Load(customPropsPart.GetStream());
         rootNode = customPropsDoc.DocumentElement;
      }

      // Now that you have a reference to an XmlDocument object that 
      // corresponds to the custom properties part, 
      // check to see if the required property is already there.
      string searchString = string.Format("d:Properties/d:property[@name='{0}']", propertyName);
      XmlNode node = customPropsDoc.SelectSingleNode(searchString, nsManager);

      XmlNode valueNode = null;
      if (node != null)
      {
         // You found the node. Now check its type.
         if (node.HasChildNodes)
         {
            valueNode = node.ChildNodes[0];
            if (valueNode != null)
            {
               string typeName = valueNode.Name;
               if (propertyTypeName == typeName)
               {
                  // The types are the same. 
                  // Replace the value of the node.
                  valueNode.InnerText = propertyValueString;
                  // If the property existed, and its type
                  // has not changed, you are finished.
                  retVal = true;
               }
               else
               {
                  // Types are different. Delete the node
                  // and clear the node variable.
                  node.ParentNode.RemoveChild(node);
                  node = null;
               }
            }
         }
      }

      // The previous block of code may have cleared the value in the 
      // variable named node.
      if (node == null)
      {
         // Either you did not find the node, or you 
         // found it, its type was incorrect, and you deleted it.
         // Either way, you need to create the new property node now.

         // Base the new "pid" on the last existing property, which this code
         // assumes carries the highest "pid" value.
         // The default value for the "pid" attribute is "2".
         string pidValue = "2";

         XmlNode propertiesNode = customPropsDoc.DocumentElement;
         if (propertiesNode.HasChildNodes)
         {
            XmlNode lastNode = propertiesNode.LastChild;
            if (lastNode != null)
            {
               XmlAttribute pidAttr = lastNode.Attributes["pid"];
               if (!(pidAttr == null))
               {
                  pidValue = pidAttr.Value;
                  // Increment pidValue, so that the new property
                  // gets a pid value one higher. This value should be 
                  // numeric, but you should confirm that.
                  int value = 0;
                  if (int.TryParse(pidValue, out value))
                  {
                     pidValue = Convert.ToString(value + 1);
                  }
               }
            }
         }

         node = customPropsDoc.CreateElement("property", customPropertiesSchema);
         node.Attributes.Append(customPropsDoc.CreateAttribute("name"));
         node.Attributes["name"].Value = propertyName;

         node.Attributes.Append(customPropsDoc.CreateAttribute("fmtid"));
         node.Attributes["fmtid"].Value = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";

         node.Attributes.Append(customPropsDoc.CreateAttribute("pid"));
         node.Attributes["pid"].Value = pidValue;

         valueNode = customPropsDoc.CreateElement(propertyTypeName, customVTypesSchema);
         valueNode.InnerText = propertyValueString;
         node.AppendChild(valueNode);
         rootNode.AppendChild(node);
         retVal = true;
      }

      // Save the properties XML back to its part. FileMode.Create replaces the
      // existing content, so a shorter document leaves no stale bytes behind.
      customPropsDoc.Save(customPropsPart.GetStream(FileMode.Create, FileAccess.Write));
   }
   return retVal;
}

The code first defines an enumeration of possible custom property types.

public enum PropertyTypes
{
   YesNo,
   Text,
   DateTime,
   NumberInteger,
   NumberDouble,
}

Next, the code example calls the WDSetCustomProperty method, passing in the path to the Word 2007 document, the name of the custom property, the new value to which you want to set the property, and the property type from the enumerated values.

string propertyTypeName = "vt:lpwstr";

Then you set the propertyTypeName variable to a default element name (vt:lpwstr), which represents a Text value in the markup of the custom properties part (CustomFilePropertiesPart), where the document's custom properties reside. This variable holds the name of the element that stores the property value you want to set.

Next, a switch statement tests the type of the property you want to update. Depending on the property type, the code sets propertyTypeName to the name of the element that holds a value of that type, and then formats the value as the string to store in that element. If the value passed in does not match the requested type, propertyValueString stays null and the procedure throws an exception.

switch (propertyType)
{
   case PropertyTypes.DateTime:
      propertyTypeName = "vt:filetime";
      // Make sure you were passed a real date, 
      // and if so, format in the correct way. The date/time 
      // value passed in should represent a UTC date/time.
      if (propertyValue.GetType() == typeof(System.DateTime))
      {
         propertyValueString = string.Format("{0:s}Z", Convert.ToDateTime(propertyValue));
      }
      break;
   case PropertyTypes.NumberInteger:
      propertyTypeName = "vt:i4";
      if (propertyValue.GetType() == typeof(System.Int32))
      {
         propertyValueString = Convert.ToInt32(propertyValue).ToString();
      }

      break;
.......
.......
}
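The value formatting that the switch statement performs can be tried in isolation. The following is a minimal sketch, not part of the original procedure: the class and method names (CustomPropertyFormatting, FormatDate, FormatBool, FormatDouble) are hypothetical, and the use of the invariant culture is an assumption made so that the output does not vary with regional settings.

```csharp
using System;
using System.Globalization;

public static class CustomPropertyFormatting
{
    // Hypothetical helper: returns the text for a vt:filetime element,
    // or null when the value is not a DateTime.
    public static string FormatDate(object value)
    {
        if (value is DateTime)
        {
            // "s" is the sortable pattern yyyy-MM-ddTHH:mm:ss; the trailing Z marks UTC.
            return string.Format(CultureInfo.InvariantCulture, "{0:s}Z", (DateTime)value);
        }
        return null;
    }

    // Hypothetical helper: returns the text for a vt:bool element.
    public static string FormatBool(object value)
    {
        // Boolean.ToString() yields "True" or "False"; the markup needs lowercase.
        return (value is bool) ? ((bool)value).ToString().ToLowerInvariant() : null;
    }

    // Hypothetical helper: returns the text for a vt:r8 element.
    public static string FormatDouble(object value)
    {
        // The invariant culture keeps "." as the decimal separator on every machine.
        return (value is double) ? ((double)value).ToString(CultureInfo.InvariantCulture) : null;
    }

    public static void Main()
    {
        DateTime utc = new DateTime(2008, 8, 15, 9, 30, 0, DateTimeKind.Utc);
        Console.WriteLine(FormatDate(utc));                   // 2008-08-15T09:30:00Z
        Console.WriteLine(FormatBool(true));                  // true
        Console.WriteLine(FormatDouble(1.5));                 // 1.5
        Console.WriteLine(FormatDate("not a date") == null);  // True
    }
}
```

A null result corresponds to the case in which the procedure throws InvalidDataException.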

Then you create a WordprocessingDocument object from the input document, representing the Office Open XML Format package. Next, the code retrieves the CustomFilePropertiesPart. Then the code creates a namespace manager to set up the XPath query.

The next section of code determines if the custom property part exists.

if (customPropsPart == null)
{
   customPropsDoc = new XmlDocument(nt);

   // The part does not exist. Create it now.
   customPropsPart = wdPackage.AddCustomFilePropertiesPart();

   // Set up the rudimentary custom part.
   rootNode = customPropsDoc.CreateElement("Properties", customPropertiesSchema);
   rootNode.Attributes.Append(customPropsDoc.CreateAttribute("xmlns:vt"));
   rootNode.Attributes["xmlns:vt"].Value = customVTypesSchema;

   customPropsDoc.AppendChild(rootNode);
}
else
{
   // Load the contents of the custom properties part into an XML document.
   customPropsDoc = new XmlDocument(nt);
   customPropsDoc.Load(customPropsPart.GetStream());
   rootNode = customPropsDoc.DocumentElement;
}

If the part does not exist, the code creates it and builds a minimal XML document for it: a Properties root element that declares the vt namespace. If the part does exist, the code loads its contents into a memory-resident XML document. Either way, the code then builds an XPath query (d:Properties/d:property[@name='...']) that searches for a property element whose name attribute matches the requested property name.
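To see the shape of the XML involved without opening a package, the following sketch builds the same minimal Properties document in memory and runs the same XPath query against it. It uses only System.Xml. The class and method names (CustomPropsSketch, CreateEmptyPart, AddTextProperty, FindProperty) and the sample property are hypothetical, and the fmtid and pid attributes are left out for brevity.

```csharp
using System;
using System.Xml;

public static class CustomPropsSketch
{
    const string CustomPropertiesSchema = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties";
    const string CustomVTypesSchema = "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes";

    // Builds the empty root element that the procedure creates for a new part.
    public static XmlDocument CreateEmptyPart()
    {
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("Properties", CustomPropertiesSchema);
        XmlAttribute vtDeclaration = doc.CreateAttribute("xmlns:vt");
        vtDeclaration.Value = CustomVTypesSchema;
        root.Attributes.Append(vtDeclaration);
        doc.AppendChild(root);
        return doc;
    }

    // Appends a text property (illustration only: no fmtid or pid attributes).
    public static void AddTextProperty(XmlDocument doc, string name, string text)
    {
        XmlElement property = doc.CreateElement("property", CustomPropertiesSchema);
        property.SetAttribute("name", name);
        XmlElement value = doc.CreateElement("vt:lpwstr", CustomVTypesSchema);
        value.InnerText = text;
        property.AppendChild(value);
        doc.DocumentElement.AppendChild(property);
    }

    // Runs the same XPath query as the procedure; returns null when nothing matches.
    public static XmlNode FindProperty(XmlDocument doc, string propertyName)
    {
        XmlNamespaceManager nsManager = new XmlNamespaceManager(doc.NameTable);
        nsManager.AddNamespace("d", CustomPropertiesSchema);
        nsManager.AddNamespace("vt", CustomVTypesSchema);
        string query = string.Format("d:Properties/d:property[@name='{0}']", propertyName);
        return doc.SelectSingleNode(query, nsManager);
    }

    public static void Main()
    {
        XmlDocument doc = CreateEmptyPart();
        Console.WriteLine(FindProperty(doc, "Reviewer") == null);          // True
        AddTextProperty(doc, "Reviewer", "Kim");
        Console.WriteLine(FindProperty(doc, "Reviewer").FirstChild.Name);  // vt:lpwstr
    }
}
```

Because the property name is pasted directly into the query, a name that contains an apostrophe would produce an invalid XPath expression; callers may want to validate names first.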

if (node != null)
{
   // You found the node. Now check its type.
   if (node.HasChildNodes)
   {
      valueNode = node.ChildNodes[0];
      if (valueNode != null)
      {
         string typeName = valueNode.Name;
         if (propertyTypeName == typeName)
         {
            // The types are the same. 
            // Replace the value of the node.
            valueNode.InnerText = propertyValueString;
            // If the property existed, and its type
            // did not change, you are finished.
            retVal = true;
         }
         else
         {
            // Types are different. Delete the node
            // and clear the node variable.
            node.ParentNode.RemoveChild(node);
            node = null;
         }
      }
   }
}

Together with the code that follows, this produces one of the following outcomes:

  • If the property node exists and its type matches the new property type, the code replaces the property value.

  • If the property node exists but its type differs from the new property type, the code deletes the node and clears the node variable.

  • If the property node was not found, or it was found with a different type and deleted, the code creates a new property node with the new value and type.

string pidValue = "2";

XmlNode propertiesNode = customPropsDoc.DocumentElement;
if (propertiesNode.HasChildNodes)
{
   XmlNode lastNode = propertiesNode.LastChild;
   if (lastNode != null)
   {
      XmlAttribute pidAttr = lastNode.Attributes["pid"];
      if (!(pidAttr == null))
      {
         pidValue = pidAttr.Value;
         // Increment pidValue, so that the new property
         // gets a pid value one higher. This value should be 
         // numeric, but you should confirm that.
         int value = 0;
         if (int.TryParse(pidValue, out value))
         {
            pidValue = Convert.ToString(value + 1);
         }
      }
   }
}

Each property has an ID (the pid attribute); the first custom property uses the default value of 2. The ID of a new property must be unique, so the code reads the pid of the last existing property node (if any) and assigns the new property a value one higher. This assumes that the last property in the part has the highest pid.
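If the properties in a part are not stored in ascending pid order, reading only the last node can produce a duplicate. The following sketch shows one possible alternative that scans every child element for the highest pid. It is an illustration rather than the article's method, and the names PidSketch and NextPid are hypothetical.

```csharp
using System;
using System.Xml;

public static class PidSketch
{
    // Hypothetical helper: returns one more than the highest numeric pid
    // found on any child element, or 2 when no child carries a pid.
    public static int NextPid(XmlNode propertiesNode)
    {
        int highest = 1; // so that the first property receives pid 2
        foreach (XmlNode child in propertiesNode.ChildNodes)
        {
            XmlElement element = child as XmlElement;
            if (element == null)
            {
                continue; // skip comments and whitespace nodes
            }

            int value;
            // GetAttribute returns an empty string when pid is missing,
            // which TryParse rejects.
            if (int.TryParse(element.GetAttribute("pid"), out value) && value > highest)
            {
                highest = value;
            }
        }
        return highest + 1;
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<Properties><property pid=\"4\"/><property pid=\"3\"/></Properties>");
        Console.WriteLine(NextPid(doc.DocumentElement));  // 5

        doc.LoadXml("<Properties/>");
        Console.WriteLine(NextPid(doc.DocumentElement));  // 2
    }
}
```

For the first sample document, reading only the last node would yield 4, which is already in use.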

node = customPropsDoc.CreateElement("property", customPropertiesSchema);
node.Attributes.Append(customPropsDoc.CreateAttribute("name"));
node.Attributes["name"].Value = propertyName;

node.Attributes.Append(customPropsDoc.CreateAttribute("fmtid"));
node.Attributes["fmtid"].Value = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";
node.Attributes.Append(customPropsDoc.CreateAttribute("pid"));
node.Attributes["pid"].Value = pidValue;
valueNode = customPropsDoc.CreateElement(propertyTypeName, customVTypesSchema);
valueNode.InnerText = propertyValueString;
node.AppendChild(valueNode);
rootNode.AppendChild(node);
retVal = true;

The remaining code creates the property element, adds its attributes, and then appends the node to the root node. Finally, the procedure saves the XML back to the custom properties part and returns the Boolean value indicating whether the operation succeeded.

The following code changes the print orientation of a document.

public enum PrintOrientation
{
   Landscape,
   Portrait,
}

public static void WDSetPrintOrientation(string docName, PrintOrientation newOrientation)
{
   // Given a document name, set the print orientation for all the sections of the document.

   // Example:
   // WDSetPrintOrientation(@"C:\Samples\SetOrientation.docx", PrintOrientation.Landscape); 

   const string wordmlNamespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

   using (WordprocessingDocument wdPackage = WordprocessingDocument.Open(docName, true))
   {
      // Get the main document part.
      MainDocumentPart documentPart = wdPackage.MainDocumentPart;

      // Load the main document part into an XML document.
      XmlDocument doc = new XmlDocument();
      doc.Load(documentPart.GetStream());

      // Manage namespaces to perform XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("w", wordmlNamespace);

      XmlNodeList nodes = doc.SelectNodes("//w:sectPr/w:pgSz", nsManager);
      foreach (System.Xml.XmlNode node in nodes)
      {
         // Retrieve the current orientation for the section.
         // Assume the orientation is portrait.
         PrintOrientation orientation = PrintOrientation.Portrait;
         XmlAttribute attr = node.Attributes["w:orient"];
         if (attr != null)
         {
            switch (attr.Value)
            {
               case "portrait":
                  orientation = PrintOrientation.Portrait;
                  break;
               case "landscape":
                  orientation = PrintOrientation.Landscape;
                  break;
            }
         }

         // Compare the current orientation to the requested orientation.
         // If it is the same, skip this section. Otherwise, make the changes.
         if (newOrientation != orientation)
         {
            if (attr == null)
            {
               // Create the attribute. Although this is not necessary
               // when there is no change in orientation, 
               // setting it has no negative effect.
               attr = node.Attributes.Append(doc.CreateAttribute("w:orient", wordmlNamespace));
            }
            switch (newOrientation)
            {
               case PrintOrientation.Landscape:
                  attr.Value = "landscape";
                  break;
               case PrintOrientation.Portrait:
                  attr.Value = "portrait";
                  break;
            }

            // Swap the page dimensions stored on the w:pgSz element.
            XmlAttribute widthAttr = node.Attributes["w:w"];
            XmlAttribute heightAttr = node.Attributes["w:h"];
            if (widthAttr != null && heightAttr != null)
            {
               string width = widthAttr.Value;
               widthAttr.Value = heightAttr.Value;
               heightAttr.Value = width;
            }

            XmlNode pageMarginNode = node.ParentNode.SelectSingleNode("w:pgMar", nsManager);
            if (pageMarginNode != null)
            {
               // Rotate margins. Printer settings determine how far you 
               // rotate when switching to landscape mode. Not having those
               // settings, this code rotates 90 degrees. You can 
               // modify this behavior, or make it a parameter for the 
               // procedure.
               XmlAttribute topAttr = pageMarginNode.Attributes["w:top"];
               XmlAttribute leftAttr = pageMarginNode.Attributes["w:left"];
               XmlAttribute bottomAttr = pageMarginNode.Attributes["w:bottom"];
               XmlAttribute rightAttr = pageMarginNode.Attributes["w:right"];

               // Read all four values before overwriting any of them.
               string top = (topAttr != null) ? topAttr.Value : null;
               string left = (leftAttr != null) ? leftAttr.Value : null;
               string bottom = (bottomAttr != null) ? bottomAttr.Value : null;
               string right = (rightAttr != null) ? rightAttr.Value : null;

               if (topAttr != null && left != null)
               {
                  topAttr.Value = left;
               }
               if (leftAttr != null && bottom != null)
               {
                  leftAttr.Value = bottom;
               }
               if (rightAttr != null && top != null)
               {
                  rightAttr.Value = top;
               }
               if (bottomAttr != null && right != null)
               {
                  bottomAttr.Value = right;
               }
            }
         }
      }

      // Save the document XML back to its part, replacing the existing content.
      doc.Save(documentPart.GetStream(FileMode.Create, FileAccess.Write));
   }
}

The code first defines an enumeration of the two print options.

public enum PrintOrientation
{
   Landscape,
   Portrait,
}

Next, the code calls the WDSetPrintOrientation method, passing in the path to the Word 2007 document and the desired print orientation, either landscape or portrait. Then you set up the WordprocessingDocument object representing the Office Open XML Format package and set a reference to the MainDocumentPart part. You create a memory-resident XML document and load in the contents of the main document part.

Next, you set up a namespace manager by using the XmlNamespaceManager object and map the w prefix to the WordprocessingML namespace. Then you select the page size (w:pgSz) element of each section by using the following XPath expression.

XmlNodeList nodes = doc.SelectNodes("//w:sectPr/w:pgSz", nsManager);

Next, you test the w:orient attribute to determine the current setting. If the attribute is missing, the procedure assumes portrait orientation.

XmlAttribute attr = node.Attributes["w:orient"];
if (attr != null)
{
   switch (attr.Value)
   {
      case "portrait":
         orientation = PrintOrientation.Portrait;
         break;
      case "landscape":
         orientation = PrintOrientation.Landscape;
         break;
   }
}

The procedure then tests whether the requested orientation is the same as the section's current orientation; if so, it leaves that section unchanged and moves on to the next one.

if (newOrientation != orientation)

Otherwise, the remainder of the code changes the print attributes necessary to change the document's orientation.
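The effect on a single section can be illustrated on an in-memory fragment, without opening a package. The following sketch uses only System.Xml. The class and method names (OrientationSketch, SetOrientation) and the sample markup are hypothetical, the sketch assumes that w:pgSz carries both w:w and w:h, and it leaves the margin rotation out.

```csharp
using System;
using System.Xml;

public static class OrientationSketch
{
    public const string WordmlNamespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: switches one w:pgSz element to the requested
    // orientation ("portrait" or "landscape") by swapping width and height.
    public static void SetOrientation(XmlElement pgSz, string newOrientation)
    {
        // A missing w:orient attribute is treated as portrait.
        string current = pgSz.GetAttribute("orient", WordmlNamespace);
        if (current.Length == 0)
        {
            current = "portrait";
        }
        if (current == newOrientation)
        {
            return; // nothing to change for this section
        }

        if (pgSz.HasAttribute("w", WordmlNamespace) && pgSz.HasAttribute("h", WordmlNamespace))
        {
            string width = pgSz.GetAttribute("w", WordmlNamespace);
            string height = pgSz.GetAttribute("h", WordmlNamespace);
            pgSz.SetAttribute("w", WordmlNamespace, height);
            pgSz.SetAttribute("h", WordmlNamespace, width);
        }

        XmlAttribute orientAttr = pgSz.GetAttributeNode("orient", WordmlNamespace);
        if (orientAttr == null)
        {
            orientAttr = pgSz.OwnerDocument.CreateAttribute("w", "orient", WordmlNamespace);
            pgSz.SetAttributeNode(orientAttr);
        }
        orientAttr.Value = newOrientation;
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<w:document xmlns:w=\"" + WordmlNamespace + "\"><w:body><w:sectPr>" +
                    "<w:pgSz w:w=\"12240\" w:h=\"15840\"/></w:sectPr></w:body></w:document>");

        XmlNamespaceManager nsManager = new XmlNamespaceManager(doc.NameTable);
        nsManager.AddNamespace("w", WordmlNamespace);

        foreach (XmlNode node in doc.SelectNodes("//w:sectPr/w:pgSz", nsManager))
        {
            XmlElement pgSz = (XmlElement)node;
            SetOrientation(pgSz, "landscape");
            Console.WriteLine(pgSz.GetAttribute("w", WordmlNamespace));       // 15840
            Console.WriteLine(pgSz.GetAttribute("h", WordmlNamespace));       // 12240
            Console.WriteLine(pgSz.GetAttribute("orient", WordmlNamespace));  // landscape
        }
    }
}
```

The sample dimensions, 12240 by 15840, are a US Letter page expressed in twentieths of a point.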

As this article demonstrates, working with Word 2007 files is much easier with the Open XML Format SDK 1.0. I encourage you to experiment with the code in this series of articles to solve your own programming problems by using the Office Open XML API.
