用 Open XML 对象模型处理 Excel 2007 和 PowerPoint 2007 文件(第 1 部分,共 2 部分)

摘要: Microsoft SDK for Open XML Formats 技术预览是 Open XML Format 文件的访问库。本文是两个系列中的第一个,描述了可用于访问和处理 Microsoft Office Excel 2007 和 Microsoft Office PowerPoint 2007 文件的 Open XML 对象模型。(12 个打印页面)

Frank Rice,Microsoft 公司

2007 年 8 月

适用于:Microsoft Office Excel 2007、Microsoft Office PowerPoint 2007

目录

查看第 2 部分:用 Open XML 对象模型处理 Excel 2007 和 PowerPoint 2007 文件(第 2 部分,共 2 部分)

概述

2007 Microsoft Office 系统引入了基于 XML、名为 Open XML Formats 的新文件格式。Microsoft Office Word 2007、Microsoft Office Excel 2007 和 Microsoft Office PowerPoint 2007 都将这些格式用作默认的文件格式。Open XML formats 非常有用,因为它们是开放的标准并基于众所周知的技术:ZIP 和 XML。作为 Microsoft SDK for Open XML Formats 技术预览System.IO.Packaging 命名空间内 .NET Framework 3.0 技术的一部分,Microsoft 提供了一个库,可以访问这些文件。Open XML 对象模型基于 System.IO.Packaging API 构建,并提供强类型化的部件类来处理 Open XML 文档。SDK 简化了处理 Open XML 包这一任务。Open XML 对象模型封装了许多开发人员在 Open XML 包上执行的通用任务,因此可用很少的几行代码执行复杂的操作。

注意注意:

可以找到处理 Open XML Format 文件的其他示例以及 Open XML 对象模型中所包含的每个成员的引用,该对象模型在 2007 Office System:Microsoft SDK for Open XML Formats 中。

开放包约定标准定义了一组 XML 文件,它们包含内容并为存储在单个包中的所有文档部件定义关系。这些包为支持 Open XML Format 的 2007 Microsoft Office 程序合并了组成文档文件的部件。本文中讨论的 Open XML 对象模型允许创建包并处理组成包的文件。在这一系列文章中,描述了在 Excel 2007 和 PowerPoint 2007 中访问和处理 Open XML 包的代码。

按用户名删除工作簿中的注释

在下列代码中,删除特定用户名的文档注释。

Visual Basic
Public Sub XLDeleteCommentsByUser(ByVal fileName As String, ByVal userName As String)
   Const commentsSchema As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
   Dim doc As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, True)
   '  Manage namespaces to perform XML XPath queries.
   Dim nt As NameTable = New NameTable
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
   nsManager.AddNamespace("sh", commentsSchema)
   For Each part As WorksheetPart In doc.WorkbookPart.WorksheetParts
      If (Not (part.WorksheetCommentsPart) Is Nothing) Then
         Dim commentDoc As XmlDocument = New XmlDocument
         commentDoc.Load(part.WorksheetCommentsPart.GetStream)
         Dim searchString As String = String.Format("//sh:authors/sh:author[text()='{0}']", userName)
         Dim node As XmlNode = commentDoc.SelectSingleNode(searchString, nsManager)
         If (Not (node) Is Nothing) Then
            Dim nodes As XmlNodeList = node.SelectNodes("preceding-sibling::sh:author", nsManager)
            Dim authorID As Integer = nodes.Count

            Dim commentListNode As XmlNode = commentDoc.SelectSingleNode("sh:comments/sh:commentList", nsManager)
            If (Not (commentListNode) Is Nothing) Then
               searchString = String.Format("./sh:comment[@authorId='{0}']", authorID)
               Dim comments As XmlNodeList = commentListNode.SelectNodes(searchString, nsManager)
               For Each commentNode As System.Xml.XmlNode In comments
                  commentListNode.RemoveChild(commentNode)
               Next
            End If
         End If
         '  Save the comment XML back to its part.
         commentDoc.Save(part.WorksheetCommentsPart.GetStream(FileMode.Create))
      End If
   Next
End Sub
C#
public static void XLDeleteCommentsByUser(string fileName, string userName)
{
   const string commentsSchema = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

   using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fileName, true))
   {
      //  Manage namespaces to perform XML XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("sh", commentsSchema);

      foreach (WorksheetPart part in doc.WorkbookPart.WorksheetParts)
      {
         if (part.WorksheetCommentsPart != null)
         {
            XmlDocument commentDoc = new XmlDocument();
            commentDoc.Load(part.WorksheetCommentsPart.GetStream());
            string searchString = string.Format("//sh:authors/sh:author[text()='{0}']", userName);
            XmlNode node = commentDoc.SelectSingleNode(searchString, nsManager);
            if (node != null)
            {
                XmlNodeList nodes = node.SelectNodes("preceding-sibling::sh:author", nsManager);
                int authorID = nodes.Count;

                XmlNode commentListNode = commentDoc.SelectSingleNode("sh:comments/sh:commentList", nsManager);
                if (commentListNode != null)
                {
                   searchString = string.Format("./sh:comment[@authorId='{0}']", authorID);
                   XmlNodeList comments = commentListNode.SelectNodes(searchString, nsManager);
                   foreach (System.Xml.XmlNode commentNode in comments)
                   {
                      commentListNode.RemoveChild(commentNode);
                   }
                }
            }
            //  Save the comment XML back to its part.
       commentDoc.Save(part.WorksheetCommentsPart.GetStream(FileMode.Create));
         }
      }
   }
}

在此程序中传递两个参数:文件的完整路径和要删除其注释的人员的用户名。然后使用 SpreadsheetDocument 对象的 Open 方法,以 Open XML 包的形式打开输入文件。接着,使用 XmlNamespaceManager 对象并通过 sh 限制符设置对默认 SpreadsheetML 命名空间的引用,来设置命名空间管理器。

程序随后检查包中是否存在注释部分。注释部分的内容载入驻留内存的 XML 文档中。然后使用 XPath 表达式 sh:authors/sh:author,按指定的用户名搜索工作薄中的注释。如果找到用户名,则找到用户名的序数值。然后检索选定节点的索引。为此,通过检索这些节点的引用计算前面的节点数。

最后,将更新的文档注释标识流保存回注释部分中。

删除工作表

在下列代码中,从电子表格文档删除一个工作表。

Visual Basic
Public Function XLDeleteSheet(ByVal fileName As String, ByVal sheetToDelete As String) As Boolean
   Dim returnValue As Boolean = False
   '  Open the package with read/write access.
   Dim xlDoc As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, True)
   Dim doc As XmlDocument = New XmlDocument
   doc.Load(xlDoc.WorkbookPart.GetStream)
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(doc.NameTable)
   nsManager.AddNamespace("d", doc.DocumentElement.NamespaceURI)
   Dim searchString As String = String.Format("//d:sheet[@name='{0}']", sheetToDelete)
   Dim node As XmlNode = doc.SelectSingleNode(searchString, nsManager)
   If (Not (node) Is Nothing) Then
      Dim relationAttribute As XmlAttribute = node.Attributes("r:id")
      If (Not (relationAttribute) Is Nothing) Then
         Dim relId As String = relationAttribute.Value
         xlDoc.WorkbookPart.DeletePart(relId)
         node.ParentNode.RemoveChild(node)
         doc.Save(xlDoc.WorkbookPart.GetStream(FileMode.Create))
         returnValue = True
      End If
   End If
   Return returnValue
End Function
C#
public static bool XLDeleteSheet(string fileName, string sheetToDelete)
{
   bool returnValue = false;
   using (SpreadsheetDocument xlDoc = SpreadsheetDocument.Open(fileName, true))
   {
      XmlDocument doc = new XmlDocument();
      doc.Load(xlDoc.WorkbookPart.GetStream());

      XmlNamespaceManager nsManager = new XmlNamespaceManager(doc.NameTable);
      nsManager.AddNamespace("d", doc.DocumentElement.NamespaceURI);

      string searchString = string.Format("//d:sheet[@name='{0}']", sheetToDelete);
      XmlNode node = doc.SelectSingleNode(searchString, nsManager);
      if (node != null)
      {
         XmlAttribute relationAttribute = node.Attributes["r:id"];
         if (relationAttribute != null)
         {
            string relId = relationAttribute.Value;
            xlDoc.WorkbookPart.DeletePart(relId);
            node.ParentNode.RemoveChild(node);
            doc.Save(xlDoc.WorkbookPart.GetStream(FileMode.Create));
            returnValue = true;
          }
      }
   }
   return returnValue;
}

在此程序中传递两个参数:工作薄的完整路径和要删除的工作表的名称。然后使用 SpreadsheetDocument 对象的 Open 方法,以 Open XML 包的形式打开输入文件。接着,将工作薄中的内容载入 XML DOM 文档。接着,使用 XmlNamespaceManager 对象并通过 d 限制符设置对默认 SpreadsheetML 命名空间的引用,来设置命名空间管理器。然后使用 //d:sheet 节点的名称属性搜索指定工作表的文档。

对于所有匹配的节点(如果存在),检索关系 Id 并删除与该 Id 对应的工作表。

最后,将更新的 SpreadsheetML 标识保存回主工作薄部件。

删除工作表中隐藏的行和列

给定一个工作薄和工作表名称,下列代码返回隐藏的行号列表,或返回隐藏的列号列表。

Visual Basic
Public Function XLDetectHiddenRowsOrCols(ByVal fileName As String, ByVal sheetName As String, ByVal detectRows As Boolean) As List(Of Integer)
   Const worksheetSchema As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
   '  Set up the return value list.
   Dim itemList As List(Of Integer) = New System.Collections.Generic.List(Of Integer)
   Dim xlDoc As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, False)
   '  Load the contents of the workbook.
   Dim doc As XmlDocument = New XmlDocument
   doc.Load(xlDoc.WorkbookPart.GetStream)
   '  Create a namespace manager, so you can search.
   '  Add a prefix for the default namespace.
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(doc.NameTable)
   nsManager.AddNamespace("d", worksheetSchema)
   Dim searchString As String = String.Format("//d:sheet[@name='{0}']", sheetName)
   Dim sheetNode As XmlNode = doc.SelectSingleNode(searchString, nsManager)
   If (Not (sheetNode) Is Nothing) Then
      '  Get the relId attribute.
      Dim relationAttribute As XmlAttribute = sheetNode.Attributes("r:id")
      If (Not (relationAttribute) Is Nothing) Then
         Dim relId As String = relationAttribute.Value
         Dim sheetPart As WorksheetPart = CType(xlDoc.WorkbookPart.GetPartById(relId), WorksheetPart)
         '  First, get the relationship between the document and the sheet.
         '  Load the contents of the workbook.
         Dim sheetDoc As XmlDocument = New XmlDocument
         sheetDoc.Load(sheetPart.GetStream(FileMode.Open))
         If detectRows Then
             '  Retrieve the list of hidden rows.
             For Each node As System.Xml.XmlNode In sheetDoc.SelectNodes("//d:row[@hidden='1']", nsManager)
                  '  For each hidden row, add information to the output list.
                  Dim rowAttr As XmlAttribute = node.Attributes("r")
                  If (Not (rowAttr) Is Nothing) Then
                      itemList.Add(Convert.ToInt32(rowAttr.Value))
                  End If
          Next
      Else
          '  Retrieve the list of hidden columns
          For Each node As System.Xml.XmlNode In sheetDoc.SelectNodes("//d:cols/d:col", nsManager)
            Dim hiddenAttr As XmlAttribute = node.Attributes("hidden")
            If (Not (hiddenAttr) Is Nothing) Then
                If (hiddenAttr.Value = "1") Then
                   '  Get the range of columns that are hidden.
                   Dim minAttr As XmlAttribute = node.Attributes("min")
                   Dim maxAttr As XmlAttribute = node.Attributes("max")
                   If ((Not (minAttr) Is Nothing) AndAlso (Not (maxAttr) Is Nothing)) Then
                      '  Finally, add the range of values to the list.
                      Dim minValue As Integer = Convert.ToInt32(minAttr.Value)
                      Dim maxValue As Integer = Convert.ToInt32(maxAttr.Value)
                      Dim i As Integer = minValue
                      Do While (i <= maxValue)
                         itemList.Add(i)
                         i = (i + 1)
                      Loop
                   End If
                 End If
              End If
            Next
         End If
      End If
   End If
   Return itemList
End Function
C#
public static List<int> XLDetectHiddenRowsOrCols(string fileName, string sheetName, bool detectRows)
{
   const string worksheetSchema = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

   //  Set up the return value list.
   List<int> itemList = new System.Collections.Generic.List<int>();

   using (SpreadsheetDocument xlDoc = SpreadsheetDocument.Open(fileName, false))
   {
      //  Load the contents of the workbook.
      XmlDocument doc = new XmlDocument();
      doc.Load(xlDoc.WorkbookPart.GetStream());

      //  Create a namespace manager, so you can search.
      //  Add a prefix for the default namespace.
      XmlNamespaceManager nsManager = new XmlNamespaceManager(doc.NameTable);
      nsManager.AddNamespace("d", worksheetSchema);

      string searchString = string.Format("//d:sheet[@name='{0}']", sheetName);
      XmlNode sheetNode = doc.SelectSingleNode(searchString, nsManager);
      if (sheetNode != null)
      {
          //  Get the relId attribute.
          XmlAttribute relationAttribute = sheetNode.Attributes["r:id"];
          if (relationAttribute != null)
          {
            string relId = relationAttribute.Value;

            WorksheetPart sheetPart = (WorksheetPart)xlDoc.WorkbookPart.GetPartById(relId);
            //  First, get the relationship between the document and the sheet.

            //  Load the contents of the workbook.
            XmlDocument sheetDoc = new XmlDocument();
            sheetDoc.Load(sheetPart.GetStream(FileMode.Open));

            if (detectRows)
            {
                //  Retrieve the list of hidden rows.
                foreach (System.Xml.XmlNode node in sheetDoc.SelectNodes("//d:row[@hidden='1']", nsManager))
                {
                  //  For each hidden row, add information to the output list.
                  XmlAttribute rowAttr = node.Attributes["r"];
                  if (rowAttr != null)
                  {
                      itemList.Add(Convert.ToInt32(rowAttr.Value));
                  }
                }
            }
            else
            {
               //  Retrieve the list of hidden columns.
               foreach (System.Xml.XmlNode node in sheetDoc.SelectNodes("//d:cols/d:col", nsManager))
               {
                  XmlAttribute hiddenAttr = node.Attributes["hidden"];
                  if (hiddenAttr != null)
                  {
                      if (hiddenAttr.Value == "1")
                      {
                          //  Get the range of columns that are hidden.
                          XmlAttribute minAttr = node.Attributes["min"];
                          XmlAttribute maxAttr = node.Attributes["max"];
                          if (minAttr != null && maxAttr != null)
                          {
                             //  Finally, add the range of values to the list.
                             int minValue = Convert.ToInt32(minAttr.Value);
                             int maxValue = Convert.ToInt32(maxAttr.Value);
                             for (int i = minValue; i <= maxValue; i++)
                             {
                                itemList.Add(i);
                             }
                          }
                      }
                   }
                }
             }
          }
       }
   }
   return itemList;
}

在此程序中传递三个参数:Excel 2007 工作薄的完整路径、工作表的名称和指定是否搜索隐藏行的布尔值。如果 detectRowstrue,则程序返回隐藏行的列表。如果 detectRowsfalse,则返回隐藏列的列表。然后使用 SpreadsheetDocument 对象的 Open 方法,以 Open XML 包的形式打开输入文件。接着,将工作薄中的内容载入 XML DOM 文档。然后使用 XmlNamespaceManager 对象并通过 d 限制符设置对默认 SpreadsheetML 命名空间的引用,来设置命名空间管理器。然后使用 //d:sheet 节点的名称属性搜索代表指定工作表节点的文档。

如果找到匹配节点,即会获得工作薄和工作表之间的关系。首先,加载工作表的内容。如果 detectRowstrue则使用 //d:row 节点的隐藏属性检索隐藏行的列表。

注意注意:

行从 1 开始编号,列从零 (0) 开始编号。

如果 detectRowsfalse,则假定要使用 //d:cols/d:col 节点的隐藏属性搜索隐藏的列。

最后,项的列表返回给调用程序。

从工作表导出图表

给定一个 Excel 2007 工作薄和图表的标题,创建一个 Word 2007 文件并导出图表。

Visual Basic
Public Sub XLExportChart(ByVal inputFileName As String, ByVal outputFileName As String, ByVal chartTitle As String)
   Const drawingMLSchema As String = "http://schemas.openxmlformats.org/drawingml/2006/main"
   Const chartMLSchema As String = "http://schemas.openxmlformats.org/drawingml/2006/chart"
   Dim chartFound As Boolean = False
   Dim xlDoc As SpreadsheetDocument = SpreadsheetDocument.Open(inputFileName, False)
   For Each sheetPart As WorksheetPart In xlDoc.WorkbookPart.WorksheetParts
      If (sheetPart.DrawingsPart Is Nothing) Then
         Continue For
      End If
      '  Loop through all the parts related to the worksheet.
      For Each chartPart As ChartPart In sheetPart.DrawingsPart.ChartParts
         '  A chart is found. Does it have the correct title?
         '  Create a namespace manager, so you can search.
         Dim nt As NameTable = New NameTable
         Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
         nsManager.AddNamespace("c", chartMLSchema)
         nsManager.AddNamespace("a", drawingMLSchema)
         '  Load the chart document from the part's stream, and 
         '  search for the requested title.
         Dim chartDoc As XmlDocument = New XmlDocument(nt)
         chartDoc.Load(chartPart.GetStream)
         Dim titleNode As XmlNode = chartDoc.SelectSingleNode("//c:chart//c:title//a:t", nsManager)
         If (Not (titleNode) Is Nothing) Then
            If (String.Compare(titleNode.InnerText, chartTitle, True) = 0) Then
               Dim newDoc As WordprocessingDocument = WordprocessingDocument.Create(outputFileName, WordprocessingDocumentType.Document)
               newDoc.AddMainDocumentPart()
            newDoc.MainDocumentPart.AddPart(Of ChartPart)(chartPart)
               '  Tell the outer loops that you are finished.
               chartFound = True
             End If
         End If
      Next
      '  Because you need to export only a single chart, exit.
      If chartFound Then
         Exit Sub
      End If
   Next
End Sub
C#
public static void XLExportChart(string inputFileName, string outputFileName, string chartTitle)
{
   const string drawingMLSchema = "http://schemas.openxmlformats.org/drawingml/2006/main";
   const string chartMLSchema = "http://schemas.openxmlformats.org/drawingml/2006/chart";

   bool chartFound = false;

   using (SpreadsheetDocument xlDoc = SpreadsheetDocument.Open(inputFileName, false))
   {
      foreach (WorksheetPart sheetPart in xlDoc.WorkbookPart.WorksheetParts)
      {
          if (sheetPart.DrawingsPart == null)
            continue;
          //  Loop through all the parts related to the worksheet.
          foreach (ChartPart chartPart in sheetPart.DrawingsPart.ChartParts)
          {
              //  A chart is found. Does it have the correct title?      

              //  Create a namespace manager, so you can search.
              NameTable nt = new NameTable();
              XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
              nsManager.AddNamespace("c", chartMLSchema);
              nsManager.AddNamespace("a", drawingMLSchema);

              //  Load the chart document from the part's stream, and 
              //  search for the requested title.
              XmlDocument chartDoc = new XmlDocument(nt);
              chartDoc.Load(chartPart.GetStream());

              XmlNode titleNode = chartDoc.SelectSingleNode("//c:chart//c:title//a:t", nsManager);
              if (titleNode != null)
              {
                 if (string.Compare(titleNode.InnerText, chartTitle, true) == 0)
                 {
                    using (WordprocessingDocument newDoc = WordprocessingDocument.Create(outputFileName, WordprocessingDocumentType.Document))
                    {
                        newDoc.AddMainDocumentPart();
                        newDoc.MainDocumentPart.AddPart<ChartPart>(chartPart);
                    }

                    //  Tell the outer loops that you are finished.
                    chartFound = true;
                 }
            }
          }
          //  You need to export only a single chart, so get out now.
          if (chartFound)
          {
            break;
          }
      }
   }
}

在此程序中传递三个参数:工作薄的完整路径、输出文件的完整路径和图表的名称。

注意注意:

在此程序中,Word 2007 文档用作图表部件的临时存储库,这样就没有定义任何关系。因此,文档不能在 Word 中打开。应将此程序中的代码替换为能正确将部件添加到文档包中的代码。有关此类代码的示例,请参阅 2007 Office System:Microsoft SDK for Open XML Formats

首先使用 SpreadsheetDocument 对象的 Open 方法,以 Open XML 包的形式打开输入文件。接着,使用 XmlNamespaceManager 对象并通过 c 限制符设置对默认 chartML 命名空间的引用,通过 a 限制符设置对 drawingML 命名空间的引用,来设置命名空间管理器。

然后程序循环通过每个 chartPart 部件,使用 //c:chart//c:title//a:t XPath 表达式搜索图表标题。如果找到,则创建一个 Word 2007 文档,将 chartPart 部件导出到包中。

获得隐藏工作表的列表

在下列代码示例中,会创建一个工作薄中所有隐藏工作表的列表。

Visual Basic
Public Function XLGetHiddenSheets(ByVal fileName As String) As List(Of String)
   Dim sheets As List(Of String) = New List(Of String)()
   Dim xlDoc As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, False)
   '  Get the main document part (workbook.xml).
   Dim doc As XmlDocument = New XmlDocument
   doc.Load(xlDoc.WorkbookPart.GetStream)
   '  Create a NamespaceManager to handle the default namespace, 
   '  and create a prefix for the default namespace.
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(doc.NameTable)
   nsManager.AddNamespace("default", doc.DocumentElement.NamespaceURI)
   For Each node As System.Xml.XmlNode In doc.SelectNodes("//default:sheets//default:sheet[@state='hidden']", nsManager)
      Dim sheetName As String = node.Attributes("name").Value
      sheets.Add(sheetName)
   Next
   Return sheets
End Function
C#
public static List<string> XLGetHiddenSheets(string fileName)
{
   List<string> sheets = new List<string>();

   using (SpreadsheetDocument xlDoc = SpreadsheetDocument.Open(fileName, false))
   {
      //  Get the main document part (workbook.xml).
      XmlDocument doc = new XmlDocument();
      doc.Load(xlDoc.WorkbookPart.GetStream());

      //  Create a NamespaceManager to handle the default namespace, 
      //  and create a prefix for the default namespace.
      XmlNamespaceManager nsManager = new XmlNamespaceManager(doc.NameTable);
      nsManager.AddNamespace("default", doc.DocumentElement.NamespaceURI);

      foreach (System.Xml.XmlNode node in doc.SelectNodes("//default:sheets//default:sheet[@state='hidden']", nsManager))
      {
         string sheetName = node.Attributes["name"].Value;
         sheets.Add(sheetName);
      }
   }
   return sheets;
}

在此程序中传递一个参数:工作薄的完整路径。然后使用 SpreadsheetDocument 对象的 Open 方法,以 Open XML 包的形式打开输入文件,并将数据载入一个 XML 文档。接着,使用 XmlNamespaceManager 对象并设置对默认 SpreadsheetML 命名空间的引用,来设置命名空间管理器,搜索隐藏的工作表。

然后使用 //default:sheets//default:sheet 节点的状态属性搜索文档的隐藏工作表。每个隐藏工作表的名称被添加到表单列表中。搜索完所有工作表后,表单列表返回调用程序。

检索工作表中的单元格值

在下列代码中,检索工作表中单元格的值。

Visual Basic
Public Function XLGetCellValue(ByVal fileName As String, ByVal sheetName As String, ByVal addressName As String) As String
   Const worksheetSchema As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
   Const sharedStringSchema As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
   Dim cellValue As String = Nothing
   '  Retrieve the stream containing the requested
   '  worksheet's information.
   Dim xlDoc As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, False)
   '  Get the main document part (workbook.xml).
   Dim doc As XmlDocument = New XmlDocument
   doc.Load(xlDoc.WorkbookPart.GetStream)
   '  Create a namespace manager, so you can search.
   '  Add a prefix (d) for the default namespace.
   Dim nt As NameTable = New NameTable
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
   nsManager.AddNamespace("d", worksheetSchema)
   nsManager.AddNamespace("s", sharedStringSchema)
   Dim searchString As String = String.Format("//d:sheet[@name='{0}']", sheetName)
   Dim sheetNode As XmlNode = doc.SelectSingleNode(searchString, nsManager)
   If (Not (sheetNode) Is Nothing) Then
      '  Get the relId attribute.
      Dim relationAttribute As XmlAttribute = sheetNode.Attributes("r:id")
      If (Not (relationAttribute) Is Nothing) Then
         Dim relId As String = relationAttribute.Value
         '  Load the contents of the workbook.
         Dim sheetDoc As XmlDocument = New XmlDocument(nt)
         sheetDoc.Load(xlDoc.WorkbookPart.GetPartById(relId).GetStream)
         Dim cellNode As XmlNode = sheetDoc.SelectSingleNode(String.Format("//d:sheetData/d:row/d:c[@r='{0}']", addressName), nsManager)
         If (Not (cellNode) Is Nothing) Then
            Dim typeAttr As XmlAttribute = cellNode.Attributes("t")
            Dim cellType As String = String.Empty
            If (Not (typeAttr) Is Nothing) Then
               cellType = typeAttr.Value
            End If
            Dim valueNode As XmlNode = cellNode.SelectSingleNode("d:v", nsManager)
            If (Not (valueNode) Is Nothing) Then
               cellValue = valueNode.InnerText
            End If
            If (cellType = "b") Then
               If (cellValue = "1") Then
                  cellValue = "TRUE"
               Else
                  cellValue = "FALSE"
               End If
            ElseIf (cellType = "s") Then
               If (Not (xlDoc.WorkbookPart.SharedStringTablePart) Is Nothing) Then
                   Dim stringDoc As XmlDocument = New XmlDocument(nt)
                   stringDoc.Load(xlDoc.WorkbookPart.SharedStringTablePart.GetStream)
                   '  Add the string schema to the namespace manager.
                   nsManager.AddNamespace("s", sharedStringSchema)
                   Dim requestedString As Integer = Convert.ToInt32(cellValue)
                   Dim strSearch As String = String.Format("//s:sst/s:si[{0}]", (requestedString + 1))
                   Dim stringNode As XmlNode = stringDoc.SelectSingleNode(strSearch, nsManager)
                   If (Not (stringNode) Is Nothing) Then
                      cellValue = stringNode.InnerText
                   End If
                End If
            End If
         End If
      End If
   End If
   Return cellValue
End Function
C#
public static string XLGetCellValue(string fileName, string sheetName, string addressName)
{
   const string worksheetSchema = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
   const string sharedStringSchema = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

   string cellValue = null;

   //  Retrieve the stream containing the requested
   //  worksheet's info.
   using (SpreadsheetDocument xlDoc = SpreadsheetDocument.Open(fileName, false))
   {
      //  Get the main document part (workbook.xml).
      XmlDocument doc = new XmlDocument();
      doc.Load(xlDoc.WorkbookPart.GetStream());

      //  Create a namespace manager, so you can search.
      //  Add a prefix (d) for the default namespace.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("d", worksheetSchema);
      nsManager.AddNamespace("s", sharedStringSchema);

      string searchString = string.Format("//d:sheet[@name='{0}']", sheetName);
      XmlNode sheetNode = doc.SelectSingleNode(searchString, nsManager);
      if (sheetNode != null)
      {
         //  Get the relId attribute.
          XmlAttribute relationAttribute = sheetNode.Attributes["r:id"];
         if (relationAttribute != null)
         {
            string relId = relationAttribute.Value;
            //  Load the contents of the workbook.
            XmlDocument sheetDoc = new XmlDocument(nt);
            sheetDoc.Load(xlDoc.WorkbookPart.GetPartById(relId).GetStream());

            XmlNode cellNode = sheetDoc.SelectSingleNode(string.Format("//d:sheetData/d:row/d:c[@r='{0}']", addressName), nsManager);
            if (cellNode != null)
            {
               XmlAttribute typeAttr = cellNode.Attributes["t"];
               string cellType = string.Empty;
               if (typeAttr != null)
               {
                  cellType = typeAttr.Value;
               }

               XmlNode valueNode = cellNode.SelectSingleNode("d:v", nsManager);
               if (valueNode != null)
               {
                  cellValue = valueNode.InnerText;
               }
               if (cellType == "b")
               {
                  if (cellValue == "1")
                  {
                     cellValue = "TRUE";
                  }
                  else
                  {
                     cellValue = "FALSE";
                  }
               }
               else if (cellType == "s")
               {
                   if (xlDoc.WorkbookPart.SharedStringTablePart != null)
                   {
                      XmlDocument stringDoc = new XmlDocument(nt);
                      stringDoc.Load(xlDoc.WorkbookPart.SharedStringTablePart.GetStream());
                      //  Add the string schema to the namespace manager.
                      nsManager.AddNamespace("s", sharedStringSchema);

                      int requestedString = Convert.ToInt32(cellValue);
                      string strSearch = string.Format("//s:sst/s:si[{0}]", requestedString + 1);
                      XmlNode stringNode = stringDoc.SelectSingleNode(strSearch, nsManager);
                      if (stringNode != null)
                      {
                          cellValue = stringNode.InnerText;
                      }
                   }
                }
            }
         }
       }
   }
   return cellValue;
}

在此程序中传递三个参数:工作薄的完整路径、工作表的名称和包含要检索值的单元格地址。然后使用 SpreadsheetDocument 对象的 Open 方法,以 Open XML 包的形式打开输入文件,并将数据载入一个 XML 文档。接着,使用 XmlNamespaceManager 对象并通过 d 限制符设置对默认 worksheetSchema 命名空间的引用,通过 s 限制符设置对 sharedStringSchema 的引用,来设置命名空间管理器。sharedStringSchema 命名空间引用 SharedStringTablePart 部件,该部件包含在多个单元格内共享的字符串。

然后,通过选择 //d:sheet 节点的名称属性,在主工作簿部件中检索代表指定工作表的节点。如果找到节点,则检索工作表的关系 Id,并用其将工作表载入 XML 文档。然后检索值。在节点中,如果 t 属性包含 s,则表示是一个共享的字符串,必须在 SharedStringTablePart 部件中查找。否则,就可以直接从节点检索值。

注意注意:

此代码只单独检查布尔字符串值。

最后,程序返回单元格的值或一个布尔值,指定搜索是否成功

结束语

如本文中所示,借助 Microsoft SDK for Open XML Formats 技术预览,可以更为方便地使用 Excel 2007 和 PowerPoint 2007 文件。在本系列文章的第二部分中(即将发表),我说明了使用 Excel 2007 和 PowerPoint 2007 文件时可执行的其他通用任务。

其他资源

标记 :


Page view tracker