OfficeTalk: Working with In-Memory Open XML Documents

Office 2010

This content is no longer actively maintained. It is provided as is, for anyone who may still be using these technologies, with no warranties or claims of accuracy with regard to the most recent product version or service release.

Summary: Working with Open XML documents without loading from a file or saving to a file is important when you build applications that work with Microsoft SharePoint Server 2010 or Microsoft ASP.NET Web applications. In addition, some interesting scenarios benefit from creating an in-memory copy of an existing document. Learn how to create and work with in-memory copies of Open XML documents.

The typical way to work with Open XML documents by using the Open XML SDK 2.0 for Microsoft Office is to open a file from the disk or a file share, modify or query the document, and then, if you modified the document, save it back to the file system. However, sometimes you want to work with Open XML documents in memory. There are four main scenarios where this is useful:

  • When working with document libraries by using Microsoft Office SharePoint Server 2007 or SharePoint Server 2010, you retrieve a document from the document library as a byte array. You can then modify it as necessary, and then put it back into the document library, either as a new document, or replacing the original.

  • When working with document libraries by using the SharePoint Foundation 2010 Managed Client Object Model, you access a document from the document library as a stream. After retrieving the document, you can modify it as necessary, and then stream it back into the document library.

  • When building an ASP.NET application, you may want to create Open XML documents dynamically and serve them up to remote users. You do not want to serialize such temporary documents to the file system. After creating them, you want to send them directly to the user of the Web application.

  • In more traditional Open XML applications, you may want to open an Open XML document, do some temporary transformations on the document, such as accept all revisions, and then query the transformed document. In this case, you do not want to serialize the modified document back to the file system, as the temporary modifications to the document are for your convenience in querying the document, and not intended to be permanent. In this case, you want to make an in-memory copy of the document, transform the in-memory copy, query it, and then close the document without serializing the modified document.

This article describes the various approaches to working with documents in memory. It also presents some code that uses the Open XML SDK 2.0 to show how to implement these scenarios.

One overload of the WordprocessingDocument.Open method opens an Open XML document from a stream. One point about this overload is important: if you intend to modify the document, you must supply a resizable memory stream, because the stream may have to grow as changes are written to it. In practice, you often have to create a resizable memory stream from a byte array or from a non-resizable memory stream.
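The difference between the two kinds of memory stream can be seen without the Open XML SDK at all. The following is a minimal sketch that uses only System.IO; the class name ResizableStreamDemo and the helper methods CanGrow and ToResizable are hypothetical names chosen for this illustration, and the four-byte array merely stands in for the bytes of a real document.

```csharp
using System;
using System.IO;

static class ResizableStreamDemo
{
    // Returns true if appending one byte past the end of the stream succeeds.
    public static bool CanGrow(MemoryStream stream)
    {
        try
        {
            stream.Seek(0, SeekOrigin.End);
            stream.WriteByte(0);
            return true;
        }
        catch (NotSupportedException)
        {
            // Thrown when the stream wraps a fixed-size byte array.
            return false;
        }
    }

    // Copies a byte array into a memory stream that is able to grow.
    public static MemoryStream ToResizable(byte[] bytes)
    {
        MemoryStream stream = new MemoryStream();
        stream.Write(bytes, 0, bytes.Length);
        return stream;
    }

    static void Main()
    {
        byte[] bytes = { 1, 2, 3, 4 };   // placeholder for document bytes
        using (MemoryStream fixedSize = new MemoryStream(bytes))
        using (MemoryStream resizable = ToResizable(bytes))
        {
            Console.WriteLine(CanGrow(fixedSize));   // False
            Console.WriteLine(CanGrow(resizable));   // True
        }
    }
}
```

A stream created directly over a byte array cannot grow beyond the length of that array, whereas a stream created with the parameterless constructor grows as needed.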

When working with the SharePoint Products and Technologies object models, you retrieve documents from a document library as a byte array, and there is a MemoryStream constructor that can create a memory stream from a byte array. However, this constructor creates a non-resizable memory stream. Therefore, you cannot use that constructor to create a memory stream for use with the Open XML SDK 2.0. Instead, you must create a MemoryStream object using a constructor that takes no arguments, and then write the byte array to the newly created memory stream. The following example shows how to create a resizable memory stream from a byte array.

using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class Program
{
    static void Main(string[] args)
    {
        byte[] byteArray = File.ReadAllBytes("Test.docx");
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(byteArray, 0, byteArray.Length);
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(mem, true))
            {
                // Modify the document as necessary.
                // For this example, we insert a new paragraph at the
                // beginning of the document.
                wordDoc.MainDocumentPart.Document.Body.InsertAt(
                    new Paragraph(
                        new Run(
                            new Text("Newly inserted paragraph."))), 0);
            }
            // At this point, the memory stream contains the modified document.
            // We could write it back to a SharePoint document library or serve
            // it from a web server.

            // In this example, we serialize back to the file system to verify
            // that the code worked properly.
            using (FileStream fileStream = new FileStream("Test2.docx",
                System.IO.FileMode.CreateNew))
            {
                mem.WriteTo(fileStream);
            }
        }
    }
}

Imports System.IO
Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Wordprocessing

Module Module1
    Sub Main()
        Dim byteArray As Byte() = File.ReadAllBytes("Test.docx")
        Using mem As MemoryStream = New MemoryStream
            mem.Write(byteArray, 0, byteArray.Length)
            Using wordDoc As WordprocessingDocument = WordprocessingDocument.Open(mem, True)
                ' Modify the document as necessary.
                ' For this example, we insert a new paragraph at the
                ' beginning of the document.
                wordDoc.MainDocumentPart.Document.Body.InsertAt( _
                    New Paragraph( _
                        New Run( _
                            New Text("Newly inserted paragraph."))), 0)
            End Using
            ' At this point, the memory stream contains the modified document.
            ' We could write it back to a SharePoint document library or serve
            ' it from a web server.

            ' In this example, we serialize back to the file system to verify
            ' that the code worked properly.
            Using fileStream As FileStream = New FileStream("Test2.docx", _
                System.IO.FileMode.CreateNew)
                mem.WriteTo(fileStream)
            End Using
        End Using
    End Sub
End Module

The blog post Modifying Open XML Documents using the SharePoint Object Model shows how to modify an Open XML document that is stored in a SharePoint document library by using the SharePoint object model. It creates a resizable memory stream from a byte array, and then opens the document from the memory stream by using the Open XML SDK 2.0.
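When the document was retrieved as a byte array, the modified document usually has to go back to the document library as a byte array too. The following is a minimal sketch of that last step, using only System.IO. The class name StreamToBytes and the method GetDocumentBytes are hypothetical, and the single WriteByte call stands in for whatever edits the Open XML SDK would make. It uses MemoryStream.ToArray, which returns exactly the bytes written, rather than MemoryStream.GetBuffer, which returns the whole internal buffer and may include unused bytes at the end.

```csharp
using System;
using System.IO;

static class StreamToBytes
{
    // Returns exactly the bytes written to the stream, regardless of the
    // current position or the capacity of the internal buffer.
    public static byte[] GetDocumentBytes(MemoryStream stream)
    {
        return stream.ToArray();
    }

    static void Main()
    {
        byte[] original = { 10, 20, 30 };   // placeholder for document bytes
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(original, 0, original.Length);
            mem.WriteByte(40);   // stands in for an edit that grows the document

            byte[] result = GetDocumentBytes(mem);
            Console.WriteLine(result.Length);   // 4

            // The internal buffer is at least as large as the content.
            Console.WriteLine(mem.GetBuffer().Length >= result.Length);   // True
        }
    }
}
```

Passing the result of GetBuffer to a document library could append padding bytes to the package, so ToArray is the safer choice in this sketch.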

When working with the SharePoint Foundation 2010 Managed Client object model, you retrieve documents from a document library as a non-resizable memory stream. To create a resizable memory stream, you create a MemoryStream object by using the constructor that takes no arguments, and then copy the non-resizable memory stream to the newly created memory stream. The following example shows how to create a resizable memory stream from a non-resizable memory stream.

using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class Program
{
    static private void CopyStream(Stream source, Stream destination)
    {
        byte[] buffer = new byte[32768];
        int bytesRead;
        do
        {
            bytesRead = source.Read(buffer, 0, buffer.Length);
            destination.Write(buffer, 0, bytesRead);
        } while (bytesRead != 0);
    }

    static void Main(string[] args)
    {
        // Dispose the source file stream as well as the memory stream.
        using (FileStream fileStream = new FileStream("Test.docx", FileMode.Open))
        using (MemoryStream memoryStream = new MemoryStream())
        {
            CopyStream(fileStream, memoryStream);
            using (WordprocessingDocument doc =
                WordprocessingDocument.Open(memoryStream, true))
            {
                // Insert a new paragraph at the beginning of the
                // document.
                doc.MainDocumentPart.Document.Body.InsertAt(
                    new Paragraph(
                        new Run(
                            new Text("Newly inserted paragraph."))), 0);
            }
            // At this point, the memory stream contains the modified document.
            // We could write it back to a SharePoint document library or serve
            // it from a web server.

            // In this example, we serialize back to the file system to verify
            // that the code worked properly.
            using (FileStream fileStream2 = new FileStream("Test2.docx",
                System.IO.FileMode.CreateNew))
            {
                memoryStream.WriteTo(fileStream2);
            }
        }
    }
}

Imports System.IO
Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Wordprocessing

Module Module1
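    Private Sub CopyStream(ByVal source As Stream, ByVal destination As Stream)
        ' In Visual Basic the array bound is the upper index, so this
        ' declares 32768 elements, matching the C# example.
        Dim buffer(32767) As Byte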
        Dim bytesRead As Integer = -1
        Do While bytesRead <> 0
            bytesRead = source.Read(buffer, 0, buffer.Length)
            destination.Write(buffer, 0, bytesRead)
        Loop
    End Sub

    Sub Main()
        Using fileStream As FileStream = New FileStream("Test.docx", FileMode.Open)
            Using memoryStream As MemoryStream = New MemoryStream()
                CopyStream(fileStream, memoryStream)
                Using doc As WordprocessingDocument = WordprocessingDocument.Open(memoryStream, True)
                    ' Insert a new paragraph at the beginning of the
                    ' document.
                    doc.MainDocumentPart.Document.Body.InsertAt( _
                        New Paragraph( _
                            New Run( _
                                New Text("Newly inserted paragraph."))), 0)
                End Using
                ' At this point, the memory stream object contains the modified document.
                ' We could write it back to a SharePoint document library or serve
                ' it from a web server.

                ' In this example, we serialize back to the file system to verify
                ' that the code worked properly.
                Using fileStream2 As FileStream = New FileStream("Test2.docx", _
                    System.IO.FileMode.CreateNew)
                    memoryStream.WriteTo(fileStream2)
                End Using
            End Using
        End Using
    End Sub
End Module

The article Using the SharePoint Foundation 2010 Managed Client Object Model shows how to modify an Open XML document that is stored in a SharePoint document library by using the SharePoint Foundation 2010 Managed Client Object Model. It creates a resizable memory stream from a non-resizable memory stream, and then opens the document from the memory stream by using the Open XML SDK 2.0.
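If the project targets the .NET Framework 4 or later, the hand-written CopyStream helper can be replaced by the built-in Stream.CopyTo method. The following is a minimal sketch under that assumption, using only System.IO; the class name StreamCopyDemo and the method ToResizable are hypothetical, and a fixed-size memory stream stands in for the stream that a document library would return.

```csharp
using System;
using System.IO;

static class StreamCopyDemo
{
    // Copies any readable stream into a new resizable MemoryStream.
    public static MemoryStream ToResizable(Stream source)
    {
        MemoryStream destination = new MemoryStream();
        source.CopyTo(destination);   // available in .NET Framework 4 and later
        destination.Position = 0;     // rewind so callers read from the start
        return destination;
    }

    static void Main()
    {
        byte[] bytes = { 1, 2, 3 };   // placeholder for document bytes
        using (MemoryStream fixedSize = new MemoryStream(bytes))
        using (MemoryStream resizable = ToResizable(fixedSize))
        {
            resizable.Seek(0, SeekOrigin.End);
            resizable.WriteByte(4);   // succeeds because the copy can grow
            Console.WriteLine(resizable.Length);   // 4
        }
    }
}
```

The caller owns the returned stream and is responsible for disposing it, which is why the sketch wraps it in a using statement.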

There are several circumstances where you want to make an in-memory copy of an Open XML document, make some temporary modifications to the in-memory copy, query the in-memory copy, and then close the document without making any modifications to the original document. For example, before querying a document, you may want to first accept all tracked revisions in the document.

Important

Processing the contents of a document without first accepting tracked revisions is a very difficult programming task. There are more than 40 elements and attributes that are used in WordprocessingML to track revisions, and some of these elements and attributes have fairly involved semantics. By first accepting tracked revisions, you greatly simplify your programming tasks. However, you may not want to modify the original document. This section shows a simple approach that enables you to make an in-memory copy of a document, accept tracked revisions, and then query the document without modifying the original.

The simplest way to accept tracked revisions is to read the document into a byte array, write the byte array to a resizable memory stream, accept tracked revisions for the document that is open in the memory stream, and then query the document per your requirements.

This example uses the RevisionAccepter class that is part of the PowerTools for Open XML project, which is an open source project on CodePlex. PowerTools for Open XML is licensed under the Microsoft Public License (Ms-PL), which gives you wide latitude in how you use the code. To download the RevisionAccepter class, which the following example uses, see PowerTools for Open XML, click the Downloads tab, and download AcceptRevisions.zip.

The following example shows using the RevisionAccepter class to accept revisions. It then writes the XML for the first paragraph to the console by using the strongly-typed object model of the Open XML SDK 2.0.

using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class Program
{
    static void Main(string[] args)
    {
        byte[] byteArray = File.ReadAllBytes("Test.docx");
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(byteArray, 0, byteArray.Length);
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(mem, true))
            {
                RevisionAccepter.AcceptRevisions(wordDoc);
                Paragraph p = wordDoc.MainDocumentPart.Document.Body
                    .Elements<Paragraph>().FirstOrDefault();
                // FirstOrDefault returns null if the body has no paragraphs.
                if (p != null)
                {
                    XElement e = XElement.Parse(p.OuterXml);
                    Console.WriteLine(e);
                }
            }
        }
    }
}

To test this code, create a document named Test.docx in the directory where the example will run. Insert a paragraph into the document, turn on revision tracking, and make revisions in the first paragraph. When I run this for the sample document that I created (that included tracked revisions in the first paragraph), the example produces the following output.

<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:r>
    <w:t xml:space="preserve">This is a </w:t>
  </w:r>
  <w:r>
    <w:t xml:space="preserve">simple </w:t>
  </w:r>
  <w:r>
    <w:t>test.</w:t>
  </w:r>
</w:p>

As you can see, the output contains no tracked revisions. For detailed information about accepting revisions in an Open XML document, see Accepting Revisions in Open XML Word-Processing Documents.

A number of Open XML development scenarios require working with in-memory documents. A few simple techniques enable you to work with in-memory documents as easily as working with documents stored in file systems.

To get started with Open XML, see Open XML Developer Center on MSDN. It offers a large amount of content, including articles, how-to videos, and links to many blog posts. In particular, the following links provide important information for getting started with the Open XML SDK 2.0:

  1. Download: Open XML SDK 2.0

  2. Article: Creating Documents by Using the Open XML Format SDK 2.0 (Part 1 of 3)

  3. Article: Creating Documents by Using the Open XML Format SDK 2.0 (Part 2 of 3)

  4. Article: Creating Documents by Using the Open XML Format SDK 2.0 (Part 3 of 3)
