Adding Value to PowerPivot Data in Excel 2010
Summary: Learn how to add value to your Microsoft PowerPivot for Excel 2010 data by using the data management features and calculated columns in PowerPivot.
Last modified: September 12, 2012
Applies to: Excel 2010 | Office 2010 | VBA
Published: March 2011
Provided by: Steve Hansen, Microsoft Visual Studio MVP and founder of Grid Logic.
The datasets that you import using Microsoft PowerPivot for Excel 2010 are often incomplete or not organized in a way that makes it easy to analyze efficiently. PowerPivot has features that can help you enhance the value of data by removing or filtering unwanted data, creating calculated columns, establishing relationships between tables, and combining data from multiple sources.
The following section discusses some strategies that you can use to improve the quality or usefulness of your data.
Cleaning data is the process of removing unwanted or incorrect data from a dataset. Cleansing data is easy in PowerPivot and is best to do when you import the data. During the import process, you can choose which columns to bring into PowerPivot, and by using filters, you can also limit which rows you import.
Once the data is in PowerPivot, you can delete a column by selecting it and pressing the Delete key. Although you can filter rows as soon as the data is in PowerPivot, any filtered data will still appear in any PivotTables. In other words, filters that you apply within the PowerPivot window only apply to the PowerPivot window.
To remove rows from a table after you import it into PowerPivot, use linked tables in PowerPivot together with the native functionality in Excel.
To use linked Excel tables in PowerPivot
Use this method to display your data in its own tab in PowerPivot. Remember, because this is a linked table, if you delete the data from the Excel worksheet, the data will no longer be available to PowerPivot.
Creating Calculated Columns
PowerPivot calculated columns are useful when you want to enhance the value of a table. For example, consider a column that contains city and state values in each row. Without separating the city and state values into separate columns, you cannot perform an analysis to sum or filter values by state. By using a calculated column, you could extract the state value and store it in a separate column.
To create a calculated column
Combining Tables in PowerPivot
For those occasions where the source data is not in a single location, you can import each source separately and then combine each piece in a single, unified table in PowerPivot. For example, consider a case in which you have two source text files. One file represents sales data from the year 2009 and the other contains sales data from the year 2010. After you import each file into PowerPivot, you can merge the tables into a single table by following these steps.
To combine tables in PowerPivot
Creating Relationships between Tables
You can also create relationships between tables by using PowerPivot. When you import data from a relational database, PowerPivot automatically detects and recreates existing relationships. You can also create relationships manually. This is a great way to combine data from multiple sources. For example, consider a table named FACTS that contains expense data by account ID. It is much more intuitive to show account names instead of account IDs. If there is a table of accounts named ACCOUNTS that contains account IDs and account names, you can create a relationship by performing the following steps.
To create a relationship between tables
The data is now linked through a relationship, and the lookup table will be used to retrieve additional data for the PivotTable or PivotChart.
Arguably one of PowerPivot's most compelling features is its ability to easily import and manipulate data from a wide array of data sources. Oftentimes, data acquisition is followed by the tedious process of manipulating data to your exact needs. This may involve filtering out unnecessary items, rearranging columns, renaming data elements, merging datasets, creating calculated fields, and creating relationships. PowerPivot includes features that help streamline all of these chores.
With a few clicks of the mouse, you can remove unwanted or incorrect columns or rows, rename columns, or add new calculated columns. Likewise, it is very easy to merge tables or create relationships between tables, even if the data for each table comes from different sources.
You can create calculated columns in PowerPivot by using a language called Data Analysis Expressions (DAX). DAX includes many of the functions that are used in Excel formulas and uses a syntax that is nearly identical to the one in Excel, so most Excel users will feel right at home using DAX. Excel formulas operate on Excel ranges; DAX formulas operate on tables and columns.
The ability to create relationships in PowerPivot is extremely useful. A relationship is simply a connection, between two tables of data, based on one or more columns in each table. When importing multiple related tables from a database, PowerPivot can automatically detect and reproduce the relationships in the PowerPivot window. If you import tables from multiple sources, you can also create relationships between tables manually.
Combined, all of these features shorten your data manipulation and preparation phase so that you can proceed to your final objective: data analysis.
About the Author
Steve Hansen is the founder of Grid Logic, a Minnesota-based consulting firm specializing in business intelligence and information worker solutions. A developer, frequent author and speaker at technical conferences, Steve is a Microsoft MVP for his work with Visual Studio. One part code jockey, one part finance geek; Steve also has an MBA from the University of Minnesota with a concentration in finance.
Steve would like to thank Grid Logic's Sean Van Dyk for his assistance with this article.