Generating Excel 2010 Workbooks by using the Open XML SDK 2.0

Summary: Learn how to use the Open XML SDK 2.0 to manipulate a Microsoft Excel 2010 workbook.

Applies to: Excel 2010 | Office 2010 | Open XML | SharePoint Server 2010 | VBA

Published:  April 2011

Provided by:  Steve Hansen, Grid Logic

Contents

  • Introduction to the Open XML File Format

  • Peeking Inside an Excel File

  • Manipulating Open XML Files Programmatically

  • Manipulating Workbooks using the Open XML SDK 2.0

  • Conclusion

  • Additional Resources

  • About the Author

Download the sample code

Introduction to the Open XML File Format

Open XML is an open file format for the core document-oriented Office applications. Open XML is designed to be a faithful replacement for existing word-processing documents, presentations, and spreadsheets that are encoded in binary formats that are defined by the Microsoft Office applications. Open XML file formats offer several benefits. One benefit is that the Open XML file formats ensure that data that is contained in documents is can be accessed by any program that understands the file format. This helps ensure organizations that the documents they create today will be available in the future. Another benefit is that it facilitates document creation and manipulation in server environments or other environments where it is not possible to install the Office client applications.

True to its moniker, Open XML files are represented by using XML. However, instead of representing a document by using a single, large XML file, an Open XML document is actually represented by using a collection of related files, called parts, that are stored in a package and then compressed in a ZIP archive. An Open XML document package complies with the Open Packaging Conventions (OPC) specification, a container-file technology to store a combination of XML and non-XML files that collectively form a single entity.

Peeking Inside an Excel File

One of the best ways to gain an initial understanding of how everything works together is to open a workbook file and take a look at the pieces. To examine the parts of an Microsoft Excel 2010 workbook package, merely change the file name extension from .xlsx to .zip. As an example, consider the workbook shown in Figures 1 and 2.

Figure 1. Simple workbook

Simple Workbook

This workbook contains two worksheets: Figure 1 shows a worksheet containing sales by year while the worksheet shown in Figure 2 contains a simple chart.

Figure 2. Basic chart in a workbook

Basic chart in a workbook

By changing the name of this workbook from Simple Sales Example.xlsx to Simple Sales Example.zip, you can inspect the structure of parts within the file container or package using Windows Explorer.

Figure 3. Part structure of a simple workbook

Part structure of a simple workbook

Figure 3 shows the primary folders inside the package along with the parts stored in the worksheets folder. Digging a bit deeper, Figure 4 provides a peek at the XML encountered in the part named sheet1.xml.

Figure 4. Example of the XML inside a worksheet part

Example of the XML inside a worksheet part

The XML shown in Figure 4 provides the necessary information that Excel needs to represent the worksheet shown in Figure 1. For example, within the sheetData node there are row nodes. There is a row node for every row that has at least one non-empty cell. Then, within each row, there is a node for each non-empty cell.

Notice that cell C3 shown in Figure 1 contains the value 2008 in bold font. Cell C4, meanwhile, contains the value 182, but uses default formatting and does not contain bold font. The XML representation for each of these cells is shown in the Figure 4. In particular, the XML for cell C3 is shown in the following example.

      <c r="C3" s="1">
        <v>2008</v>
      </c>

To keep the size of Open XML files as compact as possible, many of the XML nodes and attributes have very short names. In the previous fragment, the c represents a cell. This particular cell specifies two attributes: r (Reference) and s (Style Index). The reference attribute specifies a location reference for the cell.

The style index is a reference to the style that is used to format the cell. Styles are defined in the styles part (styles.xml) which is found in the xl folder (see the xl folder in Figure 3). Compare cell C3’s XML with cell C4’s XML shown in the following example.

      <c r="C4">
        <v>182</v>
      </c>

Because cell C4 uses default formatting, you do not have to specify a value for the style index attribute. Later in this article, you learn a little more about how to use style indexes in an Open XML document.

Although it is very helpful to learn more about the nuances of the Open XML file formats, the real purpose of this article is to show how to use the Open XML SDK 2.0 for Microsoft Office to programmatically manipulate Open XML documents, specifically Excel workbooks.

Manipulating Open XML Files Programmatically

One way to programmatically create or manipulate Open XML documents is to use the following high-level pattern:

  1. Open/create an Open XML package

  2. Open/create package parts

  3. Parse the XML in the parts that you need to manipulate

  4. Manipulate the XML as required

  5. Save the part

  6. Repackage the document

Everything except steps three and four can be achieved fairly easily using the classes found in the System.IO.Packaging namespace. These classes are designed to make it easy to handle working with Open XML packages and tasks associated with high-level part manipulation.

The hardest part of this process is step four, manipulating the XML. For this part it is critically necessary for the developer to have a high degree of understanding of the many tedious details required to successfully work with the many nuances of the Open XML file formats. For example, previously you learned that formatting information for a cell is not stored with a cell. Instead, the formatting details are defined as a style in a different document part and the style index associated with the style is what Excel stores inside a cell.

Even with a generous knowledge of the Open XML specification, the thought of manipulating so much raw XML programmatically is not a task that many developers look forward too. That is where the Open XML SDK 2.0 comes in.

The Open XML SDK 2.0 was developed to simplify manipulating Open XML packages and the underlying Open XML schema elements inside a package. The Open XML SDK 2.0 encapsulates many common tasks that developers perform on Open XML packages so that instead of working with raw XML, you can use .NET classes that give you many design-time advantages such as IntelliSense support and a type-safe development experience.

Note

Download the Open XML SDK 2.0 from Microsoft Download Center.

Manipulating Workbooks using the Open XML SDK 2.0

In order to show you the process of manipulating an Excel workbook using the Open XML SDK 2.0, this article walks through building a report generator. Envision that you work for a stock brokerage firm named Contoso. Contoso’s ASP.NET website enables clients to log on and view various portfolio reports online. However, a common user request is the ability to view or download reports in Excel so that they may perform additional ad hoc portfolio analysis.

Note

To simplify trying this code on your own, the following example builds a console-based application. That said, the techniques that are used in this example would be 100% compatible with an ASP.NET website. There is absolutely no requirement for Microsoft Excel in this example.

The desired result is a process that, given a client, generates an Excel portfolio report. There are two general approaches to this kind of process. One approach is to generate all of the document from scratch. For simple workbooks with little or no formatting, this approach is appropriate. The second approach, creating documents that use a template, is generally the preferred method. Note that the use of the Word template here refers not to actual Excel templates (*.xltx). Instead, it refers to the use of a workbook (*.xlsx) that contains all the desired formatting, charts, and so on that are desired in the final workbook. To use the template, the first step of the process is to make a copy of the template file. Then, you add the data associated with the client you are building a report for.

Figure 5. Sample of the Portfolio report

Sample of the Portfolio Report

Setting up the Project

To create a portfolio report generator, open up Microsoft Visual Studio 2010 and create a new Console application named PortfolioReportGenerator.

Note

To download the sample C# and Visual Basic .NET projects, click Download the Code Sample.

Figure 6. Create the Portfolio Report Generator Solution

Create the Portfolio Report Generator Solution

Next, add two classes to the project: PortfolioReport and Portfolio. The PortfolioReport class is the key class that performs all of the document manipulation using the Open XML SDK 2.0. The Portfolio class is basically a data structure that contains the necessary properties to represent a client portfolio.

Note

The Portfolio class is detailed in this change. It is a data container together with some test data and has no code related to Open XML or the Open XML SDK 2.0.

Before you write any code, the first step in any project involving Open XML and the Open XML SDK 2.0 is to add the necessary references to the project. Two specific references are needed: DocumentFormat.OpenXml and WindowsBase.

DocumentFormat.OpenXml contains the classes that are installed with the Open XML SDK 2.0. If you cannot find this reference after you install the Open XML SDK 2.0, you can browse for it. By default it is located at C:\Program Files (x86)\Open XML SDK\V2.0\lib\. This reference is required only if you plan to use the Open XML SDK 2.0. If you would rather manipulate Open XML documents by tweaking raw XML, you do not need this reference.

WindowsBase includes the classes in the System.IO.Packaging namespace. This reference is required for all Open XML projects whether you are using the Open XML SDK 2.0 or not. The classes in the System.IO.Packaging namespace provide functionality to open Open XML packages. In addition, there are classes that enable you to manipulate (add, remove, edit) parts inside an Open XML package.

At this point, your project should resemble Figure 7.

Figure 7. Portfolio Report Generator project after initial project setup

Portfolio Report Generator project after setup

Initializing the Portfolio Report

As mentioned earlier, the report generation process works by creating a copy of the report template and then adding data to the report. The report template is a pre-formatted Excel workbook named PortfolioReport.xlsx. Add a constructor to the PortfolioReport class that performs this process. In order to copy the file, you must also have to import the System.IO namespace. While adding the System.IO namespace, add the namespaces related to the Open XML SDK 2.0.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using DocumentFormat.OpenXml;

namespace PortfolioReportGenerator
{
    class PortfolioReport
    {
        string path = "c:\\example\\";
        string templateName = "PortfolioReport.xlsx";

        public PortfolioReport(string client)
        {
            string newFileName = path + client + ".xlsx";
            CopyFile(path + templateName, newFileName);
        }

        private string CopyFile(string source, string dest)
        {
            string result = "Copied file";
            try
            {
                // Overwrites existing files
                File.Copy(source, dest, true);
            }
            catch (Exception ex)
            {
                result = ex.Message;
            }
            return result;
        }
    }
}

Notice that the PortfolioReport constructor requires a single parameter that represents the client the report is being generated for.

To avoid the need to pass parameters into methods or constantly re-open the document and extract the workbook part, add two class-scoped private variables to the PortfolioReport class. Likewise, add a class scoped private variable to hold a reference to the current Portfolio object whose data is being used to generate the report. By using these variables in place, you can then initialize them inside the PortfolioReport constructor as shown in the following example.

        string path = "c:\\example\\";
        string templateName = "PortfolioReport.xlsx";

        WorkbookPart wbPart = null;
        SpreadsheetDocument document = null;
        Portfolio portfolio = null;

        public PortfolioReport(string client)
        {
            string newFileName = path + client + ".xlsx";
            CopyFile(path + templateName, newFileName);
            document = SpreadsheetDocument.Open(newFileName, true);
            wbPart = document.WorkbookPart;
            portfolio = new Portfolio(client);
        }

This code segment highlights how easy it is to open a document and extract a part using the Open XML SDK 2.0. In the PortfolioReport constructor, the workbook file is opened by using the Open method of the SpreadsheetDocument class. SpreadsheetDocument is part of the DocumentFormat.OpenXml.Packaging namespace. SpreadsheetDocument provides convenient access to the workbook part within the document package via the property named WorkbookPart. At this point in the process, the report generator has:

  1. Created a copy of the PortfolioReport.xlsx file

  2. Named the copy after the name of the client

  3. Opened the client report for editing

  4. Extracted the workbook part

Modifying Worksheet Cell Values using the Open XML SDK

The main task that needs to be solved in order to complete the report generator is to figure out how to modify values inside an Excel workbook by using the Open XML SDK 2.0. When using Excel’s object model with Microsoft Visual Basic for Applications (VBA) or .NET, changing a cell’s value is easy. To change the value of a cell (which is a Range object in Excel’s object model), you modify the value of the Value property. For example, to change the value of cell B4 on a worksheet named Sales to the value of 250, you could use this statement:

ThisWorkbook.Worksheets("Sales").Range("B4").Value = 250

The Open XML SDK 2.0 works a bit differently. One big difference is that using the Excel object model that you can manipulate any cell on a worksheet regardless of whether it has anything in it. In other words, as far as the object model is concerned, all of the cells on a worksheet exist. When working with Open XML, objects do not exist. This is by default. If a cell does not have a value, it does not exist. This makes perfect sense when you think about it from the perspective of specifying a file format. In order to keep the size of a file as small as possible, only relevant information is saved. For example, revisit Figure 4 and observe the first row node underneath sheetData. The first row starts at 3 and skips rows 1 and 2. This is because all of the cells in the first two rows are empty. Likewise, notice that within the first row node (row 3), the address of the first cell is C3. This is because A3 and B3 are empty.

Because you cannot assume that a cell exists in an Open XML document, you must first check whether it exists and then, if it does not, add it to the file. The following example shows a method named InsertCellInWorksheet that performs this function, along with the other methods in the listing. Add these methods to the PortfolioReport class.

Note

Microsoft has code samples for many common Open XML SDK 2.0 tasks. Better yet, these samples are available as code examples that you can use within Visual Studio 2010. Some of the code in this article is based on these code examples. You can Download the sample code here.

        // Given a Worksheet and an address (like "AZ254"), either return a 
        // cell reference, or create the cell reference and return it.
        private Cell InsertCellInWorksheet(Worksheet ws, string addressName)
        {
            SheetData sheetData = ws.GetFirstChild<SheetData>();
            Cell cell = null;

            UInt32 rowNumber = GetRowIndex(addressName);
            Row row = GetRow(sheetData, rowNumber);

            // If the cell you need already exists, return it.
            // If there is not a cell with the specified column name, insert one.  
            Cell refCell = row.Elements<Cell>().
                Where(c => c.CellReference.Value == addressName).FirstOrDefault();
            if (refCell != null)
            {
                cell = refCell;
            }
            else
            {
                cell = CreateCell(row, addressName);
            }
            return cell;
        }
        
        // Add a cell with the specified address to a row.
        private Cell CreateCell(Row row, String address)
        {
            Cell cellResult;
            Cell refCell = null;

            // Cells must be in sequential order according to CellReference. 
            // Determine where to insert the new cell.
            foreach (Cell cell in row.Elements<Cell>())
            {
                if (string.Compare(cell.CellReference.Value, address, true) > 0)
                {
                    refCell = cell;
                    break;
                }
            }

            cellResult = new Cell();
            cellResult.CellReference = address;

            row.InsertBefore(cellResult, refCell);
            return cellResult;
        }

        // Return the row at the specified rowIndex located within
        // the sheet data passed in via wsData. If the row does not
        // exist, create it.
        private Row GetRow(SheetData wsData, UInt32 rowIndex)
        {
            var row = wsData.Elements<Row>().
            Where(r => r.RowIndex.Value == rowIndex).FirstOrDefault();
            if (row == null)
            {
                row = new Row();
                row.RowIndex = rowIndex;
                wsData.Append(row);
            }
            return row;
        }

        // Given an Excel address such as E5 or AB128, GetRowIndex
        // parses the address and returns the row index.
        private UInt32 GetRowIndex(string address)
        {
            string rowPart;
            UInt32 l;
            UInt32 result = 0;

            for (int i = 0; i < address.Length; i++)
            {
                if (UInt32.TryParse(address.Substring(i, 1), out l))
                {
                    rowPart = address.Substring(i, address.Length - i);
                    if (UInt32.TryParse(rowPart, out l))
                    {
                        result = l;
                        break;
                    }
                }
            }
            return result;
        }

Another difference between using Excel’s object model and manipulating an Open XML document is that when you use the Excel object model, the data kind of the value that you supply to the cell or range is irrelevant. When changing the value of a cell using Open XML however, the process varies depending on the data kind of the value. For numeric values, the process is somewhat similar to using Excel’s object model. There is a property associated with a Cell object in the Open XML SDK 2.0 named CellValue. You can use this property to assign numeric values to a cell.

Storing strings, or text, in a cell works differently. Rather than storing text directly in a cell, Excel stores it in something called a shared string table. The shared string table is merely a listing of all the unique strings within the workbook where each unique string is associated with an index. To associate a cell with a string, the cell holds a reference to the string index instead of in the string itself. When you change a cell’s value to a string, you first need to see whether the string is in the shared string table. If it is in the table, you look up the shared string index and store that in the cell. If the string is not in the shared string table, you need to add it, retrieve its string index, and then store the string index in the cell. The following example shows a method named UpdateValue used to change a cell’s values along InsertSharedStringItem to update the shared string table.

        public bool UpdateValue(string sheetName, string addressName, string value, 
                                UInt32Value styleIndex, bool isString)
        {
            // Assume failure.
            bool updated = false;

            Sheet sheet = wbPart.Workbook.Descendants<Sheet>().Where(
                (s) => s.Name == sheetName).FirstOrDefault();

            if (sheet != null)
            {
                Worksheet ws = ((WorksheetPart)(wbPart.GetPartById(sheet.Id))).Worksheet;
                Cell cell = InsertCellInWorksheet(ws, addressName);

                if (isString)
                {
                    // Either retrieve the index of an existing string,
                    // or insert the string into the shared string table
                    // and get the index of the new item.
                    int stringIndex = InsertSharedStringItem(wbPart, value);

                    cell.CellValue = new CellValue(stringIndex.ToString());
                    cell.DataType = new EnumValue<CellValues>(CellValues.SharedString);
                }
                else
                {
                    cell.CellValue = new CellValue(value);
                    cell.DataType = new EnumValue<CellValues>(CellValues.Number);
                }

                if (styleIndex > 0)
                    cell.StyleIndex = styleIndex;
                
                // Save the worksheet.
                ws.Save();
                updated = true;
            }

            return updated;
        }

        // Given the main workbook part, and a text value, insert the text into 
        // the shared string table. Create the table if necessary. If the value 
        // already exists, return its index. If it doesn't exist, insert it and 
        // return its new index.
        private int InsertSharedStringItem(WorkbookPart wbPart, string value)
        {
            int index = 0;
            bool found = false;
            var stringTablePart = wbPart
                .GetPartsOfType<SharedStringTablePart>().FirstOrDefault();

            // If the shared string table is missing, something's wrong.
            // Just return the index that you found in the cell.
            // Otherwise, look up the correct text in the table.
            if (stringTablePart == null)
            {
                // Create it.
                stringTablePart = wbPart.AddNewPart<SharedStringTablePart>();
            }

            var stringTable = stringTablePart.SharedStringTable;
            if (stringTable == null)
            {
                stringTable = new SharedStringTable();
            }

            // Iterate through all the items in the SharedStringTable. 
            // If the text already exists, return its index.
            foreach (SharedStringItem item in stringTable.Elements<SharedStringItem>())
            {
                if (item.InnerText == value)
                {
                    found = true;
                    break;
                }
                index += 1;
            }

            if (!found)
            {
                stringTable.AppendChild(new SharedStringItem(new Text(value)));
                stringTable.Save();
            }

            return index;
        }

One area of interest in the previous code example deals with formatting a cell. As mentioned earlier in this article, a cell’s format is not stored within the cell node. Instead, a cell stores a style index that points to a style that is defined in a different part (styles.xml). When using the template pattern demonstrated in this document and Excel’s object model via VBA or .NET, you typically apply formatting that you want to a range of one or more cells. As you add data to the workbook programmatically, any formatting that you applied within the range is faithfully applied.

Because Open XML files only contain information related to cells that contain data, any time that you add a new cell to the file, if the cell requires any formatting, you must update the style index. Consequently, the UpdateValue method accepts a styleIndex parameter that indicates which style index to apply to the cell. If you pass in a value of zero, no style index is set and the cell uses Excel’s default formatting.

One simple method for determining the appropriate style index for each cell is to format the workbook template file as you want and then open up the appropriate workbook parts in XML mode (shown in Figure 4) and observe the style index of the cells that you formatted.

With the methods from the previous code listing in place, generating the report is now a process of getting the portfolio data and repeatedly calling UpdateValue to create the report. Indeed, if you add the necessary code to do this, things seem to work fine except for one problem - any cell that contains a formula that refers to a cell whose value was changed via Open XML manipulation does not show the correct result. This is because Excel caches the result of a formula within the cell. Because Excel thinks it has the correct value cached, it does not recalculate the cell. Even if you have auto calculation turned on or if you press F9 to force a manual recalculation, Excel does not recalculate the cell.

The solution to this is to remove the cached value from these cells so that Excel recalculates the value as soon as the file is opened in Excel. Add the RemoveCellValue method shown in the following example to the PortfolioReport class to provide this functionality.

        // This method is used to force a recalculation of cells containing formulas. The
        // CellValue has a cached value of the evaluated formula. This
        // prevents Excel from recalculating the cell even if 
        // calculation is set to automatic.
        private bool RemoveCellValue(string sheetName, string addressName)
        {
            bool returnValue = false;

            Sheet sheet = wbPart.Workbook.Descendants<Sheet>().
                Where(s => s.Name == sheetName).FirstOrDefault();
            if (sheet != null)
            {
                Worksheet ws = ((WorksheetPart)(wbPart.GetPartById(sheet.Id))).Worksheet;
                Cell cell = InsertCellInWorksheet(ws, addressName);

                // If there is a cell value, remove it to force a recalculation
                // on this cell.
                if (cell.CellValue != null)
                {
                    cell.CellValue.Remove();
                }
                
                // Save the worksheet.
                ws.Save();
                returnValue = true;
            }

            return returnValue;
        }

To complete the PortfolioReport class, add the CreateReport method shown in the following example to the PortfolioReport class. It uses the CreateReport method UpdateValue to put portfolio information into the desired cells. After updating all of the necessary cells, it calls RemoveCellValue on each cell that needs to be recalculated. Finally, CreateReport calls the Close method on the SpreadsheetDocument to save all the changes and close the file.

        // Create a new Portfolio report
        public void CreateReport()
        {
            string wsName = "Portfolio Summary";

            UpdateValue(wsName, "J2", "Prepared for " + portfolio.Name, 0, true);
            UpdateValue(wsName, "J3", "Account # " + 
                        portfolio.AccountNumber.ToString(), 0, true);
            UpdateValue(wsName, "D9", portfolio.BeginningValueQTR.ToString(), 0, false);
            UpdateValue(wsName, "E9", portfolio.BeginningValueYTD.ToString(), 0, false);
            UpdateValue(wsName, "D11", portfolio.ContributionsQTR.ToString(), 0, false);
            UpdateValue(wsName, "E11", portfolio.ContributionsYTD.ToString(), 0, false);
            UpdateValue(wsName, "D12", portfolio.WithdrawalsQTR.ToString(), 0, false);
            UpdateValue(wsName, "E12", portfolio.WithdrawalsYTD.ToString(), 0, false);
            UpdateValue(wsName, "D13", portfolio.DistributionsQTR.ToString(), 0, false);
            UpdateValue(wsName, "E13", portfolio.DistributionsYTD.ToString(), 0, false);
            UpdateValue(wsName, "D14", portfolio.FeesQTR.ToString(), 0, false);
            UpdateValue(wsName, "E14", portfolio.FeesYTD.ToString(), 0, false);
            UpdateValue(wsName, "D15", portfolio.GainLossQTR.ToString(), 0, false);
            UpdateValue(wsName, "E15", portfolio.GainLossYTD.ToString(), 0, false);

            int row = 7;
            wsName = "Portfolio Holdings";

            UpdateValue(wsName, "J2", "Prepared for " + portfolio.Name, 0, true);
            UpdateValue(wsName, "J3", "Account # " + 
                        portfolio.AccountNumber.ToString(), 0, true);
            foreach (PortfolioItem item in portfolio.Holdings)
            {
                UpdateValue(wsName, "B" + row.ToString(), item.Description, 3, true);
                UpdateValue(wsName, "D" + row.ToString(), 
                            item.CurrentPrice.ToString(), 24, false);
                UpdateValue(wsName, "E" + row.ToString(), 
                            item.SharesHeld.ToString(), 27, false);
                UpdateValue(wsName, "F" + row.ToString(), 
                            item.MarketValue.ToString(), 24, false);
                UpdateValue(wsName, "G" + row.ToString(), 
                            item.Cost.ToString(), 24, false);
                UpdateValue(wsName, "H" + row.ToString(), 
                            item.High52Week.ToString(), 28, false);
                UpdateValue(wsName, "I" + row.ToString(), 
                            item.Low52Week.ToString(), 28, false);
                UpdateValue(wsName, "J" + row.ToString(), item.Ticker, 11, true);
                row++;
            }

            // Force re-calc when the workbook is opened
            this.RemoveCellValue("Portfolio Summary", "D17");
            this.RemoveCellValue("Portfolio Summary", "E17");

            // All done! Close and save the document.
            document.Close();
        }

Using the PortfolioReport Class

The final step (assuming you copied the source for the Portfolio class), is to add some code to the Main method in the Program class. Modify the Main method so that it contains the code shown in the following example. Note that the source for the Portfolio class includes sample data for two clients: Steve and Kelly.

        static void Main(string[] args)
        {
            PortfolioReport report = new PortfolioReport("Steve");
            report.CreateReport();
            report = new PortfolioReport("Kelly");
            report.CreateReport();
            Console.WriteLine("Reports created!");
            Console.WriteLine("Press ENTER to quit.");
            Console.ReadLine();
        }

One of the things that you notice when you run this is how fast the files are generated. This is ideal in a high-volume server scenario. The performance versus similar code that uses the Excel object model to achieve the same results is not even close - the Open XML method is much, much faster.

Conclusion

Beginning with the 2007 Microsoft Office system, the core document-centric Microsoft Office applications switched from proprietary binary file formats to Open XML file formats. The Open XML file formats are open, standards-based file formats based on XML. The switch to Open XML file formats opens up several new development opportunities for developers. That said, taking advantage of these opportunities involved investing lots of time and effort understanding the Open XML specifications and lots of tedious raw XML manipulation.

The Open XML SDK 2.0 helps reduce the learning curve with development technique by encapsulating many of the details of the Open XML specification in an easy-to-use class library for working with Open XML documents. In addition to reducing the learning curve, the Open XML SDK 2.0 lets developers be more productive by providing design-time capabilities such as IntelliSense support and a type-safe development experience.

This article demonstrated how to use the Open XML SDK 2.0 to build a portfolio report generator. This exercise demonstrated a common solution pattern and an approach for common Excel oriented tasks such as opening workbooks, referring to worksheets, retrieving cells, and updating a cell’s value.

Additional Resources

To find more information about the subjects discussed in this article, see the following resources.

About the Author

Steve Hansen is the founder of Grid Logic, a Minnesota-based consulting firm specializing in business intelligence and information worker solutions. A developer, frequent author, and speaker at technical conferences, Steve is a Microsoft MVP for his work with Visual Studio's tools for Office. One part code jockey, one part finance geek; Steve also has an MBA from the University of Minnesota with a concentration in finance.