Creating Documents by Using the Open XML Format SDK 2.0 CTP (Part 3 of 3)

Summary: See how to accomplish common scenarios by using the Open XML Format SDK APIs. Create Microsoft Office PowerPoint 2007 presentations from data in a database, assemble Microsoft Office Word 2007 documents from smaller documents, and bind content controls to custom XML, all by using the Open XML Format SDK 2.0 (CTP) APIs (12 printed pages)

Zeyad Rajabi, Microsoft Corporation

Frank Rice, Microsoft Corporation

February 2009

Applies to: Microsoft Office Excel 2007, Microsoft Office PowerPoint 2007, Microsoft Office Word 2007

Contents

  • Creating a Presentation Report Based on Data

  • Merging Multiple Word Documents

  • Binding Content Controls to Custom XML

  • Summary

  • Additional Resources

Introduction to This Series of Articles

This series of articles explores the overall design of the Open XML Format SDK 2.0 Community Technical Preview (CTP) with respect to goals and scenarios. Creating Documents by Using the Open XML Format SDK Version 2.0 CTP (Part 1 of 3) examines the architecture of the Open XML Format SDK in depth. Creating Documents by Using the Open XML Format SDK 2.0 CTP (Part 2 of 3) and Creating Documents by Using the Open XML Format SDK 2.0 (Part 3 of 3) present common scenarios and walk through sample code for each.

You can access the Open XML Format SDK 2.0 (CTP) from the following locations:

Download sample: Sample Code downloads for the Open XML Format SDK 2.0 (CTP).

Creating a Presentation Report Based on Data

The Open XML Format SDK provides a set of Microsoft .NET Framework application programming interfaces (APIs) that allows you to create and manipulate documents in the Open XML Formats in both client and server environments without requiring the Microsoft Office client applications.

The following solutions use the Open XML Format SDK 2.0 (CTP).

In the following section, you go through the steps to create a rich Microsoft Office PowerPoint 2007 presentation report from data in a database.

Scenario: Document Assembly

Imagine a scenario where you are a developer working for a fictional company called Adventure Works. In this company, a database is used to store all data pertaining to its sales force. The company needs to track the contact information, territories they own, total sales, and bonuses for the sales team. You will build a report generation tool that can take this data and create a Microsoft Office PowerPoint 2007 presentation. The sales team wants this solution to run on the server, so the PowerPoint client application cannot be used.

Solution

Before discussing the details of the solution, there are two prerequisites:

  • The solution is based on the Adventure Works database built for Microsoft SQL Server 2005. The database is available for download at Adventure Works database.

  • The solution requires that you update the Adventure Works database to include contact photos.

Step 1 – Create a Template

In this scenario, you read sales person data and create a PowerPoint presentation report. First, you need to create a presentation template for the solution. In this instance, the presentation template should contain a slide with placeholder regions for the following information:

  • Sales person contact information, such as name, e-mail address and photo

  • Sales summary information, such as territory owned by sales person, total sales made by sales person, sales quota

  • Extra information, such as bonus amount and commission percentage

The slide template looks like the following figure.

Figure 1. The sales person contact template

Step 2 – Clone the Slide Template

For each sales person in Adventure Works, clone the slide template shown in Figure 1 and fill in the necessary information with data from the database. The code does the following:

  • Creates a slide for each sales person

  • Copies the content from the slide template to a new slide

  • Ensures that the new slide references the slide template's slide layout

  • Adds the new slide to the end of the presentation (similar to adding a node to the end of a linked list)

SlidePart CloneSlidePart(PresentationPart presentationPart, SlidePart slideTemplate) 
{ 
   //Create a new slide part in the presentation. 
   //The counter i is declared outside this method and keeps each relationship id unique. 
   SlidePart newSlidePart = presentationPart.AddNewPart<SlidePart>("newSlide" + i); 
   i++; 
   //Copy the slide template content into the new slide, and close the stream afterward. 
   using (Stream templateStream = slideTemplate.GetStream(FileMode.Open)) 
   { 
      newSlidePart.FeedData(templateStream); 
   } 
   //Make sure the new slide references the proper slide layout. 
   newSlidePart.AddPart(slideTemplate.SlideLayoutPart); 
   //Get the list of slide ids. 
   SlideIdList slideIdList = presentationPart.Presentation.SlideIdList; 
   //Determine where to add the next slide (find the largest slide id in use). 
   uint maxSlideId = 1; 
   SlideId prevSlideId = null; 
   foreach (SlideId slideId in slideIdList.ChildElements) 
   { 
      if (slideId.Id > maxSlideId) 
      { 
         maxSlideId = slideId.Id; 
         prevSlideId = slideId; 
      } 
   } 
   maxSlideId++; 
   //Add the new slide after the slide that has the largest id (the end of the deck in this sample). 
   SlideId newSlideId = slideIdList.InsertAfter(new SlideId(), prevSlideId); 
   //Make sure the id and the relationship id are set appropriately. 
   newSlideId.Id = maxSlideId; 
   newSlideId.RelationshipId = presentationPart.GetIdOfPart(newSlidePart); 
   return newSlidePart; 
}
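
The id selection in CloneSlidePart can also be looked at in isolation. The following is a minimal sketch, not part of the sample download, that picks the next slide id from a plain list of existing ids. The class and method names (SlideIds, NextSlideId) are hypothetical. The sketch relies on the PresentationML rule that slide ids start at 256, so it falls back to that value when the list is empty.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the Open XML Format SDK or the sample download).
public static class SlideIds
{
    // PresentationML requires slide ids to be 256 or greater.
    public const uint MinSlideId = 256;

    // Returns one more than the largest existing id,
    // or MinSlideId when no slide ids exist yet.
    public static uint NextSlideId(IEnumerable<uint> existingIds)
    {
        uint max = existingIds.DefaultIfEmpty(MinSlideId - 1).Max();
        return max + 1;
    }
}
```

For example, SlideIds.NextSlideId(new uint[] { 256, 257, 300 }) yields 301. In CloneSlidePart, the input would be the Id values of the SlideId elements in the slide id list.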

Step 3 – Swap the Placeholder Text

At this point, you have cloned the slide template and added the copy to the presentation. Next, replace the placeholder text in the new slide with the appropriate data. The following method locates every text element in a given slide part whose text matches the placeholder exactly and replaces it with the supplied value.

void SwapPlaceholderText(SlidePart slidePart, string placeholder, string value) 
{ 
   //Find and get all the placeholder text locations. 
   List<Drawing.Text> textList = slidePart.Slide.Descendants<Drawing.Text>().Where(t => t.Text.Equals(placeholder)).ToList(); 
   //Swap the placeholder text with the text from DB 
   foreach (Drawing.Text text in textList) text.Text = value; 
}

Step 4 – Swap out Placeholder Photo

The slide template includes one placeholder picture that is intended to represent the image of a sales person. To replace this photo, the code:

  • Adds an image part to the new slide based on the slide template

  • Inserts the image data into the newly-added image part

  • Points the image reference of the placeholder picture at the new image part

//Add an image to the new slide. 
ImagePart imagePart = newSlide.AddImagePart(ImagePartType.Gif, imgId); 
//Add image data to new image part. 
imagePart.FeedData(new MemoryStream(item.Employee.Contact.Photo.ToArray())); 
...
//Swap photo reference and save 
SwapPhoto(newSlide, imgId);

Replacing the photo is simple because a picture refers to its image by relationship ID. In this instance, find the image reference of the placeholder picture and replace it with the ID of the new image part. The following code accomplishes this.

void SwapPhoto(SlidePart slidePart, string imgId) 
{ 
   //Find the placeholder image. 
   Drawing.Blip blip = slidePart.Slide.Descendants<Drawing.Blip>().First(); 
   //Swap the placeholder image with the image from the database. 
   blip.Embed = imgId; 
   //Save the part. 
   slidePart.Slide.Save(); 
}

Step 5 – Delete Template Slide

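To complete this solution, delete the slide template. The presentation part can report the relationship ID of any part it owns, so the following code looks up the relationship ID of the template, removes the matching slide reference, and then deletes the slide part.

void DeleteTemplateSlide(PresentationPart presentationPart, SlidePart slideTemplate) 
{ 
   //Get the list of slide ids. 
   SlideIdList slideIdList = presentationPart.Presentation.SlideIdList; 
   //Look up the relationship id of the template slide ("rId3" in the sample deck). 
   string templateRelId = presentationPart.GetIdOfPart(slideTemplate); 
   //Find the template slide reference first, so that the list is not modified while it is enumerated. 
   SlideId templateSlideId = slideIdList.Elements<SlideId>().FirstOrDefault(s => s.RelationshipId.Value.Equals(templateRelId)); 
   //Delete the template slide reference. 
   if (templateSlideId != null) slideIdList.RemoveChild(templateSlideId); 
   //Delete the template slide. 
   presentationPart.DeletePart(slideTemplate); 
}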

The Result

Running the code produces the output in the following figure.

Figure 2. The resulting slide deck

Merging Multiple Word Documents

A common request for word-processing documents is to merge multiple documents into a single document. In this section, you use altChunks and the Open XML Format SDK 2.0 to create a robust document assembly solution.

Scenario: Document Assembly

In this scenario, you are a developer for a book publisher that specializes in educational books. The company typically has one or more authors write the chapters for a given book, and each chapter is written as a separate document. In this scenario, the company commissioned a book on the solar system, with a separate chapter for each element of the solar system, such as the planets and the sun. You need to write a solution that merges all these documents into a single document.

Solution

Before looking at the details of this solution, note that there are two approaches to solving this problem:

  • Use altChunks to merge the documents. For more information about altChunks, see Leveraging Content in Other Formats.

  • Manually merge the documents, in effect copying and pasting their content.

Using altChunks is the easier of the two choices. Not only can you use altChunks for WordprocessingML documents, you can also use them with HTML, XML, RTF, and plain text.

Manually merging multiple documents is feasible but requires you to handle a number of issues. For example, you need to deal with conflicts related to different styles, bullets, numbering, comments, headers, and footers.

Note

Eric White wrote a blog post on how to use altChunks for document assembly using the Open XML Format SDK 1.0: How to Use AltChunks for Document Assembly.

Step 1 – Create a Template

The first step is to create a template for the book. In the template, you merge chapters into specific locations within the template by using content controls. Content controls allow you to uniquely identify a specific region within a document. For more information about content controls, see Meet the Controls.

Each content control you add to the template has the name of the chapter for that location. For example, as shown in the following figure, there is a content control named Earth.

Figure 3. The solar-system template document

Step 2 – Find Specific Content Controls

With the template complete, you can programmatically locate content controls based on the chapter titles. This task is easy with the Open XML Format SDK 2.0. The following code opens the document and finds all content controls, represented as SdtBlock objects, whose alias appears in the name of the source file to merge.

void MergeSourceDocument(string sourceFile, string destinationFile) 
{ 
   using (WordprocessingDocument myDoc = WordprocessingDocument.Open(destinationFile, true)) 
   { 
      MainDocumentPart mainPart = myDoc.MainDocumentPart; 
      //Find content controls whose alias value appears in the name of the source file. 
      List<SdtBlock> sdtList = mainPart.Document.Descendants<SdtBlock>().Where(s =>
      { 
         //Skip content controls that have no properties or no alias. 
         Alias alias = s.SdtProperties == null ? null : s.SdtProperties.GetFirstChild<Alias>(); 
         return alias != null && sourceFile.Contains(alias.Val.Value); 
      }).ToList(); 
      ... 
   } 
}
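
The listing above matches a content control when its alias appears anywhere in the source file path, so an alias such as Earth would also match a file whose name merely starts with Earth. The following is a minimal sketch of a stricter comparison, assuming that each chapter file is named after its content control (for example, Earth.docx, which the article does not state). The ChapterMatcher class and MatchesAlias method are hypothetical names.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the sample download).
public static class ChapterMatcher
{
    // Returns true only when the file name, without directory or extension,
    // equals the content control alias (ignoring case).
    public static bool MatchesAlias(string sourceFile, string alias)
    {
        if (string.IsNullOrEmpty(sourceFile) || string.IsNullOrEmpty(alias))
        {
            return false;
        }

        string chapterName = Path.GetFileNameWithoutExtension(sourceFile);
        return string.Equals(chapterName, alias, StringComparison.OrdinalIgnoreCase);
    }
}
```

With this helper, the lambda in the query could call ChapterMatcher.MatchesAlias(sourceFile, alias.Val.Value) in place of sourceFile.Contains(alias.Val.Value). Whether exact matching is preferable depends on how the chapter files are named.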

Step 3 – Add altChunk and Replace the Content Control

In this step, you replace the content controls with the document text by using altChunks. You can merge the documents with altChunks by doing the following tasks:

  • Adding the altChunk part to the document package

  • Importing data from the subdocument into the altChunk part

  • Adding a reference to the altChunk part in the main document part

The following code accomplishes these tasks.

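if (sdtList.Count != 0) 
{ 
   //The counter id is declared outside this method and keeps each altChunk id unique. 
   string altChunkId = "AltChunkId" + id; 
   id++; 
   AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId); 
   //Copy the source document into the altChunk part, and close the file afterward. 
   using (FileStream sourceStream = File.Open(sourceFile, FileMode.Open)) 
   { 
      chunk.FeedData(sourceStream); 
   } 
   //Replace each content control with its own altChunk element, 
   //because an element can have only one parent. 
   foreach (SdtBlock sdt in sdtList) 
   { 
      AltChunk altChunk = new AltChunk(); 
      altChunk.Id = altChunkId; 
      OpenXmlElement parent = sdt.Parent; 
      parent.InsertAfter(altChunk, sdt); 
      sdt.Remove(); 
   } 
   ... 
}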

The Result

Running the code produces the solar-system book, with a chapter for each component of the solar system. To summarize, using altChunks automatically ensures the following:

  • The final document has consistent styles applied.

  • Images, comments, and tracked changes are all included as part of the merged document.

  • Bullets and numbering work as expected.

The following figure shows the final document.

Figure 4. The assembled solar-system document

Binding Content Controls to Custom XML

In the previous section, you used content controls to provide semantic structure within a document. In this section, you use content controls to bind to custom XML.

Scenario: Generate Sales Contracts on the Server

In this scenario, you are a developer for a law firm that specializes in writing legal contracts for the sale of various properties. The company uses the same template for all property contracts. The only difference between the contracts is the data used. For example, the contract displays who is selling the property and the address of the property. You need to write a solution that allows lawyers to automatically insert the data into the template. Additionally, the solution should generate the document on the server.

Solution

This solution is based on concepts described in a post from the Word team blog: Separate Yet Equal. This solution uses content controls to bind to custom XML. By using bound content controls, you separate the presentation of the document from the data, which is stored in a separate custom XML part. Binding content controls gives the following functionality:

  • When a user types into a bound content control, the data in the custom XML is also updated.

  • When the data in the custom XML part is updated, the content in the bound content control is also updated.

To accomplish the server requirement, the solution is built with Microsoft ASP.NET. The solution’s Web site contains several form fields representing the data to insert into the document. To insert the data into the document, you generate a custom XML file based on this data and insert the file into the WordprocessingML package.

Step 1 – Create a Template

The first step is to set up the template. In this instance, the template is a sales contract with content controls located where data is to be inserted. The template looks like the following figure.

Figure 5. The sales contract template

These content controls delineate semantic regions and bind to the custom XML. The binding is specified by the namespace of the custom XML file and an XPath expression that identifies the element to bind to. The following is an example of XML markup that specifies this binding.

<w:dataBinding w:prefixMappings="xmlns:ns0='http://contoso.com/2005/contracts/commercialSale'"
   w:xpath="/ns0:contract[1]/ns0:dateExecuted[1]"
   w:storeItemID="{ABB284D9-2C5E-41BD-A2F2-B5FC934955A9}"/>

There are three ways to bind content controls to custom XML:

  • By using the Content Control Toolkit. This tool makes binding content controls to XML easy and intuitive.

  • By using the Word object model. You can use the ContentControl.XMLMapping.SetMapping() method.

  • By directly manipulating the underlying XML. You can use the Open XML Format SDK 2.0 for this.
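
As an illustration of the third option, the following minimal sketch builds a w:dataBinding element with LINQ to XML instead of the SDK's typed classes, so that it depends only on the .NET Framework. The DataBindingMarkup class and its Create method are hypothetical names. The sketch only produces the element; placing it inside the properties (w:sdtPr) of a content control in the main document part is left out.

```csharp
using System.Xml.Linq;

// Hypothetical helper (not part of the Open XML Format SDK or the sample download).
public static class DataBindingMarkup
{
    // The WordprocessingML main namespace, conventionally mapped to the "w" prefix.
    public static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds a w:dataBinding element that maps a prefix to the custom XML namespace
    // and points at one element of the custom XML part by using an XPath expression.
    public static XElement Create(string prefix, string dataNamespace, string xpath, string storeItemId)
    {
        string prefixMappings = string.Format("xmlns:{0}='{1}'", prefix, dataNamespace);

        return new XElement(W + "dataBinding",
            new XAttribute(XNamespace.Xmlns + "w", W.NamespaceName),
            new XAttribute(W + "prefixMappings", prefixMappings),
            new XAttribute(W + "xpath", xpath),
            new XAttribute(W + "storeItemID", storeItemId));
    }
}
```

Calling Create with the prefix ns0, the namespace http://contoso.com/2005/contracts/commercialSale, the XPath expression /ns0:contract[1]/ns0:dateExecuted[1], and the store item ID from the markup above produces an element with the same attributes as that markup.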

In this template, you bind the content controls to an empty custom XML file. This custom XML file contains no data. Binding to an empty custom XML file ensures that just the placeholder text of the content controls is shown.

Step 2 – Create the ASP.NET Front-End Web site

The next step is to create a front-end Web site that allows users to insert data into the document. For this step, you create a simple Microsoft ASP.NET site that looks similar to the following figure.

Figure 6. The ASP.NET Web site

In the back end of the Web site, a custom XML file containing the data is inserted into the WordprocessingML document.
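
The article does not show how the custom XML file is produced from the form fields. The following is a minimal sketch of one way to do it with LINQ to XML. The namespace and the contract and dateExecuted elements come from the binding markup shown earlier. The sellerName and propertyAddress elements, the date format, and the ContractXml class itself are assumptions made for illustration.

```csharp
using System;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical helper (not part of the sample download).
public static class ContractXml
{
    // Namespace taken from the w:dataBinding prefix mapping in the template.
    public static readonly XNamespace Ns =
        "http://contoso.com/2005/contracts/commercialSale";

    // Builds the custom XML as a string, ready to be written into a custom XML part.
    // sellerName and propertyAddress are assumed element names.
    public static string Build(DateTime dateExecuted, string sellerName, string propertyAddress)
    {
        XDocument doc = new XDocument(
            new XElement(Ns + "contract",
                new XElement(Ns + "dateExecuted",
                    dateExecuted.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)),
                new XElement(Ns + "sellerName", sellerName),
                new XElement(Ns + "propertyAddress", propertyAddress)));

        return doc.ToString();
    }
}
```

Because the elements are created through XElement, characters such as & and < in the form values are escaped automatically, which is harder to guarantee when the XML is assembled by string concatenation.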

Step 3 – Replacing Custom XML

After the custom XML file is created, it is inserted into the WordprocessingML document. The following code opens the document, deletes any existing custom XML parts, and adds the XML as a new custom XML part.

protected void ReplaceCustomXML(string fileName, string customXML) 
{ 
   using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(fileName, true)) 
   { 
      MainDocumentPart mainPart = wordDoc.MainDocumentPart; 
      mainPart.DeleteParts<CustomXmlPart>(mainPart.CustomXmlParts); 
      //Add a new customXML part and then add the content. 
      CustomXmlPart customXmlPart = mainPart.AddNewPart<CustomXmlPart>(); 
      //Copy the XML into the new part. 
      using (StreamWriter ts = new StreamWriter(customXmlPart.GetStream())) ts.Write(customXML); 
   } 
}

Step 4 – Generating the Resulting Document

After the document contains the data, you return the file to the client, as shown in the following code.

protected void GenerateContractButton_Click(object sender, EventArgs e) 
{ 
   string strTemp = Environment.GetEnvironmentVariable("temp"); 
   string strFileName = String.Format("{0}\\{1}.dotx", strTemp, Guid.NewGuid().ToString()); 
   File.Copy(Server.MapPath(@"App_Data/Contract of Sale.dotx"), strFileName); 
   GetData(); 
   string customXml = File.ReadAllText(Server.MapPath(@"App_Data/datatemp.xml")); 
   ReplaceCustomXML(strFileName, customXml); 
   //Return it to the client - the file at strFileName has been updated, so return it. 
   Response.ClearContent(); 
   Response.ClearHeaders(); 
   Response.AddHeader("content-disposition", "attachment; filename=Contract of Sale.dotx"); 
   Response.ContentEncoding = System.Text.Encoding.UTF8; 
   Response.TransmitFile(strFileName); 
   Response.Flush(); 
   Response.Close(); 
   //Delete the temp file. 
   File.Delete(strFileName); 
   File.Delete(Server.MapPath(@"App_Data/datatemp.xml")); 
}

You generated the document with content controls bound to the XML data. The resulting file looks similar to the following figure.

Figure 7. The document with data-bound content controls

Note

In this solution, you only manipulated the parts within the WordprocessingML document. This means this code is also fully functional with the Open XML Format SDK 1.0.

To give a sense of how fast this solution is, creating 100 documents on a server with this code took just 1.166 seconds.

Summary

In Creating Documents by Using the Open XML Format SDK Version 2.0 CTP (Part 1 of 3), Creating Documents by Using the Open XML Format SDK 2.0 CTP (Part 2 of 3), and Creating Documents by Using the Open XML Format SDK 2.0 (Part 3 of 3), you saw just a sample of the ways that the Open XML Format SDK APIs can simplify integrating various data sources with programs in the 2007 version of the Microsoft Office system. With a little imagination, you can modify these applications to fit your own requirements.

Additional Resources

For more information about the Open XML Format SDK see the following resources: