How to: Get all the text in a slide in a presentation (Open XML SDK)
Last modified: July 27, 2012
Applies to: Office 2013 | Open XML
In this article
Getting a PresentationDocument object
Basic Presentation Document Structure
How the Sample Code Works
Sample Code
This topic shows how to use the classes in the Open XML SDK 2.5 for Office to get all the text in a slide in a presentation programmatically.
The following assembly directives are required to compile the code in this topic.
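The directives themselves are not reproduced in this copy of the topic. A likely set for C#, inferred from the types the topic mentions (PresentationDocument, StringBuilder, linked lists and arrays of strings), is sketched below; treat it as an assumption rather than the original listing.

```csharp
// Assumed directives for the C# code discussed in this topic.
using System;                                // ArgumentNullException, Console
using System.Collections.Generic;            // LinkedList<string>
using System.Linq;                           // ToArray() on LinkedList<string>
using System.Text;                           // StringBuilder
using DocumentFormat.OpenXml.Packaging;      // PresentationDocument, SlidePart
using DocumentFormat.OpenXml.Presentation;   // Presentation, SlideId, SlideIdList
```

The project also needs a reference to the Open XML SDK assembly (DocumentFormat.OpenXml).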
In the Open XML SDK, the PresentationDocument class represents a presentation document package. To work with a presentation document, first create an instance of the PresentationDocument class, and then work with that instance. To create the class instance from the document, call the PresentationDocument.Open(String, Boolean) method, which takes a file path as its first parameter and a Boolean value as its second parameter that specifies whether the document is editable. To open a document for read/write access, pass true for this parameter; for read-only access, pass false, as shown in the following using statement. In this code, the file parameter is a string that represents the path of the file from which you want to open the document.
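A minimal sketch of such a using statement follows. The wrapper class OpenPresentationSketch and the method HasPresentationPart are hypothetical names added only so that the fragment compiles on its own; the call to PresentationDocument.Open is the part that matters.

```csharp
using DocumentFormat.OpenXml.Packaging;

public static class OpenPresentationSketch
{
    // Hypothetical helper: opens the package read-only and reports whether
    // it contains a main presentation part.
    public static bool HasPresentationPart(string file)
    {
        // false opens the document read-only; true would open it read/write.
        using (PresentationDocument presentationDocument =
            PresentationDocument.Open(file, false))
        {
            // Other code that works with the document would go here.
            return presentationDocument.PresentationPart != null;
        }
        // Dispose is called automatically when the using block ends.
    }
}
```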
The using statement provides a recommended alternative to the typical .Open, .Save, .Close sequence. It ensures that the Dispose method (internal method used by the Open XML SDK to clean up resources) is automatically called when the closing brace is reached. The block that follows the using statement establishes a scope for the object that is created or named in the using statement, in this case presentationDocument.
The basic document structure of a PresentationML document consists of the main part that contains the presentation definition. The following text from the ISO/IEC 29500 specification introduces the overall form of a PresentationML package.
The following XML code segment represents a presentation that contains two slides, denoted by the IDs 267 and 256.
    <p:presentation xmlns:p="…" … >
      <p:sldMasterIdLst>
        <p:sldMasterId
          xmlns:rel="http://…/relationships" rel:id="rId1"/>
      </p:sldMasterIdLst>
      <p:notesMasterIdLst>
        <p:notesMasterId
          xmlns:rel="http://…/relationships" rel:id="rId4"/>
      </p:notesMasterIdLst>
      <p:handoutMasterIdLst>
        <p:handoutMasterId
          xmlns:rel="http://…/relationships" rel:id="rId5"/>
      </p:handoutMasterIdLst>
      <p:sldIdLst>
        <p:sldId id="267"
          xmlns:rel="http://…/relationships" rel:id="rId2"/>
        <p:sldId id="256"
          xmlns:rel="http://…/relationships" rel:id="rId3"/>
      </p:sldIdLst>
      <p:sldSz cx="9144000" cy="6858000"/>
      <p:notesSz cx="6858000" cy="9144000"/>
    </p:presentation>
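The slide ID list in the segment above can also be read through the SDK. The following is a minimal sketch, not part of the original sample; SlideIdListSketch and PrintSlideIds are hypothetical names. It lists each slide's ID together with the relationship ID that points to its slide part.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Presentation;

public static class SlideIdListSketch
{
    // Hypothetical helper: prints one line per <p:sldId> element, in the
    // order in which the elements appear in <p:sldIdLst>.
    public static void PrintSlideIds(string file)
    {
        using (PresentationDocument document = PresentationDocument.Open(file, false))
        {
            PresentationPart presentationPart = document.PresentationPart;
            if (presentationPart == null || presentationPart.Presentation == null)
            {
                return;
            }

            SlideIdList slideIdList = presentationPart.Presentation.SlideIdList;
            if (slideIdList == null)
            {
                return;
            }

            foreach (SlideId slideId in slideIdList.Elements<SlideId>())
            {
                // Id and RelationshipId are required attributes of sldId.
                Console.WriteLine("id={0}, relationship={1}",
                    slideId.Id.Value, slideId.RelationshipId.Value);
            }
        }
    }
}
```

For a package whose presentation part matches the segment above, this would list ID 267 with rId2 and ID 256 with rId3.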
Using the Open XML SDK 2.5, you can create document structure and content using strongly-typed classes that correspond to PresentationML elements. You can find these classes in the DocumentFormat.OpenXml.Presentation namespace. The following table lists the class names of the classes that correspond to the sld, sldLayout, sldMaster, and notesMaster elements.
| PresentationML Element | Open XML SDK 2.5 Class | Description |
|---|---|---|
| sld | Slide | Presentation Slide. It is the root element of SlidePart. |
| sldLayout | SlideLayout | Slide Layout. It is the root element of SlideLayoutPart. |
| sldMaster | SlideMaster | Slide Master. It is the root element of SlideMasterPart. |
| notesMaster | NotesMaster | Notes Master (or handoutMaster). It is the root element of NotesMasterPart. |
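
As an illustration of how parts expose these root elements, the sketch below walks from a slide part to its layout and master and prints each root element's local name. It is an assumption-laden example rather than part of the original sample: RootElementSketch and PrintRootElements are hypothetical names, and Slide, SlideLayout, SlideMaster, and NotesMaster are the classes in the DocumentFormat.OpenXml.Presentation namespace.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Presentation;

public static class RootElementSketch
{
    // Hypothetical helper: prints the local name of each root element
    // that exists in the package, such as "sld" for a Slide object.
    public static void PrintRootElements(string file)
    {
        using (PresentationDocument document = PresentationDocument.Open(file, false))
        {
            PresentationPart presentationPart = document.PresentationPart;
            if (presentationPart == null)
            {
                return;
            }

            // SlideParts may not be in presentation order; any slide part
            // is good enough for this illustration.
            SlidePart slidePart = presentationPart.SlideParts.FirstOrDefault();
            if (slidePart != null && slidePart.Slide != null)
            {
                Slide slide = slidePart.Slide;
                Console.WriteLine(slide.LocalName);

                SlideLayoutPart layoutPart = slidePart.SlideLayoutPart;
                if (layoutPart != null && layoutPart.SlideLayout != null)
                {
                    SlideLayout layout = layoutPart.SlideLayout;
                    Console.WriteLine(layout.LocalName);

                    SlideMasterPart masterPart = layoutPart.SlideMasterPart;
                    if (masterPart != null && masterPart.SlideMaster != null)
                    {
                        SlideMaster master = masterPart.SlideMaster;
                        Console.WriteLine(master.LocalName);
                    }
                }
            }

            // The notes master is optional in a presentation package.
            NotesMasterPart notesMasterPart = presentationPart.NotesMasterPart;
            if (notesMasterPart != null && notesMasterPart.NotesMaster != null)
            {
                NotesMaster notesMaster = notesMasterPart.NotesMaster;
                Console.WriteLine(notesMaster.LocalName);
            }
        }
    }
}
```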
The sample code consists of three overloads of the GetAllTextInSlide method. In the following segment, the first overloaded method opens the source presentation that contains the slide with text to get, and passes the presentation to the second overloaded method, which gets the slide part. This method returns the array of strings that the second method returns to it, each of which represents a paragraph of text in the specified slide.
The second overloaded method takes the presentation document passed in and gets a slide part to pass to the third overloaded method. It returns to the first overloaded method the array of strings that the third overloaded method returns to it, each of which represents a paragraph of text in the specified slide.
The following code segment shows the third overloaded method, which takes the slide part passed in and returns to the second overloaded method a string array of text paragraphs. It starts by verifying that the slide part passed in exists, and then it creates a linked list of strings. It iterates through the paragraphs in the slide, uses a StringBuilder object to concatenate all the lines of text in each paragraph, and adds each paragraph to the linked list as a string. It then returns to the second overloaded method an array of strings that represents all the text in the specified slide in the presentation.
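The code segments for the three overloads are not reproduced in this copy of the topic. The class below is a reconstruction sketch that follows the descriptions above. The class name SlideTextSketch, the zero-based slide index, and the choice to return null when nothing is found are assumptions, not confirmed details of the original sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Presentation;
using A = DocumentFormat.OpenXml.Drawing;

public static class SlideTextSketch
{
    // First overload: opens the file read-only and delegates to the second.
    public static string[] GetAllTextInSlide(string presentationFile, int slideIndex)
    {
        using (PresentationDocument presentationDocument =
            PresentationDocument.Open(presentationFile, false))
        {
            return GetAllTextInSlide(presentationDocument, slideIndex);
        }
    }

    // Second overload: resolves the (assumed zero-based) index to a slide
    // part through the slide ID list, then delegates to the third.
    public static string[] GetAllTextInSlide(
        PresentationDocument presentationDocument, int slideIndex)
    {
        if (presentationDocument == null)
        {
            throw new ArgumentNullException("presentationDocument");
        }
        if (slideIndex < 0)
        {
            throw new ArgumentOutOfRangeException("slideIndex");
        }

        PresentationPart presentationPart = presentationDocument.PresentationPart;
        if (presentationPart == null || presentationPart.Presentation == null)
        {
            return null;
        }

        SlideIdList slideIdList = presentationPart.Presentation.SlideIdList;
        if (slideIdList == null)
        {
            return null;
        }

        List<SlideId> slideIds = slideIdList.Elements<SlideId>().ToList();
        if (slideIndex >= slideIds.Count)
        {
            return null; // Index past the last slide (sketch design choice).
        }

        // The relationship ID links the sldId element to its slide part.
        string relationshipId = slideIds[slideIndex].RelationshipId;
        SlidePart slidePart = (SlidePart)presentationPart.GetPartById(relationshipId);
        return GetAllTextInSlide(slidePart);
    }

    // Third overload: collects one string per non-empty paragraph.
    public static string[] GetAllTextInSlide(SlidePart slidePart)
    {
        if (slidePart == null)
        {
            throw new ArgumentNullException("slidePart");
        }

        LinkedList<string> texts = new LinkedList<string>();

        if (slidePart.Slide != null)
        {
            foreach (A.Paragraph paragraph in slidePart.Slide.Descendants<A.Paragraph>())
            {
                // Concatenate every text run in the paragraph.
                StringBuilder paragraphText = new StringBuilder();
                foreach (A.Text text in paragraph.Descendants<A.Text>())
                {
                    paragraphText.Append(text.Text);
                }

                if (paragraphText.Length > 0)
                {
                    texts.AddLast(paragraphText.ToString());
                }
            }
        }

        return texts.Count > 0 ? texts.ToArray() : null;
    }
}
```

The alias A is used for the DocumentFormat.OpenXml.Drawing namespace because slide text is stored in DrawingML paragraph and text elements, and the alias avoids a clash with the Text class in the Presentation namespace.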
You can use the sample code to get all the text in a specific slide of a presentation file. For example, you can use the following foreach loop in your program to get the array of strings returned by the GetAllTextInSlide method, which represents the text in the second slide of the presentation file "Myppt8.pptx".
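The loop itself is missing from this copy of the topic. A minimal sketch follows, wrapped in a hypothetical helper, PrintParagraphs, so that it compiles on its own. The null check is an added precaution, in case the method returns null for a slide without text.

```csharp
using System;

public static class SlideTextUsageSketch
{
    // Hypothetical helper: writes each paragraph on its own line.
    public static void PrintParagraphs(string[] paragraphs)
    {
        if (paragraphs == null)
        {
            Console.WriteLine("No text found.");
            return;
        }

        foreach (string s in paragraphs)
        {
            Console.WriteLine(s);
        }
    }
}
```

With a GetAllTextInSlide(string, int) method in scope, a call such as PrintParagraphs(GetAllTextInSlide("Myppt8.pptx", 1)) would print the text of the second slide, assuming the slide index is zero-based and the file is in the current directory.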
Following is the complete sample code in both C# and Visual Basic.