I love Pivot Tables. You can do some amazing things with them. But some of the limitations annoy the heck out of me. Thankfully, these limitations can be overcome by using Power Pivot and The Data Model.
This post will cover many of the points that are raised and the questions that are asked when I talk about The Data Model in my Excel training sessions.
Questions such as What is the Data Model? Is there is one single Data Model per Excel installation or one Data Model per file? Why would you store data in the Data Model? What does the data stored in the Data Model look like? How does data get into the Data Model and how do you access the data in the Data Model?
To answer the first question, the Data Model is part of an Excel workbook. It’s purpose is to store one or more sets of tabular (i.e. row-and-column-structured) data that can then be used as the source for Pivot Tables. Each Excel file has its own Data Model.
When you open an Excel file, whereas you can see the tabular data stored in a worksheet (assuming the sheet has not been hidden), data that is stored in the Data Model is hidden. To view this data, you use PowerPivot. More on Power Pivot later.
So back to these annoying limitations…
Limitation 1: Number of Worksheet Rows
With a traditional (or Classic) Pivot Table, the data source is usually a table or range in an Excel worksheet. This limits you to just over 1 million rows of data (an Excel worksheet contains 1,048,576 rows).
That is probably enough for many users, however imagine being asked to create a Pivot Table to summarise weekly sales at Amazon. With over 1.6 million orders per day (I Googled that stat) there would be over 11 million rows of data.
According to the Data Model Specifications and Limits Page, the Data Model can hold 2,147,483,647 tables with each table having a maximum of 2,147,483,647 columns and 1,999,999,997 rows.
The data stored in the Data Model is stored in a special compressed format (you don’t have to uncompress It to view it) resulting in a smaller, faster file and a better experience for the end user.
Limitation 2: Single Table Data Source
With a traditional Pivot Table, the data source is usually a single table or a single worksheet range. So what happens when your data is split across more than one table?
In the screenshot above there are 2 tables, both in the worksheet. One contains information about employees. The other contains information about salaries. An employee’s salary is based on their career grade.
To create a Pivot Table displaying total cost of salaries per grade would require an extra column, containing each employee’s salary, to be added to the Employees table. That single table (columns A:D) would then become the source data for the Pivot Table.
However, if both tables are stored in the Data Model instead of the worksheet, and the Data Model then specified as the Pivot Table source, the Pivot Table can be created without the need to add the extra column to the Employees table.
Or to put it another way, if you want to create a Pivot Table from multiple, related tables/datasets, those tables/datasets must reside in the Data Model.
Limitation 3: Calculated Fields
Calculated Fields, a little-known feature of Pivot Tables, allows you to add extra columns into a Pivot Table.
In the Pivot Table above, the Annual Cost of Salaries is based on data stored in a worksheet-based table. The Monthly Cost of Salaries data has been generated using a Calculated Field (Annual Salary divided by 12).
I don’t want to get too specific but there are limitations as to what calculations and functions are supported by the Calculated Fields feature (as an example the IF function is supported but VLOOKUP is not).
When the source data for a Pivot Table is stored in the Data Model, the Calculated Fields option is unavailable (in the Pivot Table Ribbon). However there is an alternative…DAX, which is the formula language of The Data Model.
DAX is way more powerful than the Calculated Fields feature and the good news for Excel users is that DAX formulas are very similar to Excel formulas and many of the functions that are built into Excel (for example, SUM, COUNT, IF etc etc) are available in DAX, although DAX does have functions of its own that you won’t find in Excel. Think of DAX as Excel formulas on steroids!
Originally a free add-in from Microsoft that you had to download and install, Power Pivot is now built into Excel, although only in specific versions. See this Microsoft page for more information.
Power Pivot provides the functionality that you need to be able to view the data in the Data Model (although it actually does more than that).
To use Power Pivot, assuming it is installed on your computer, you have to first “enable it” in Excel’s Options > Add-ins. Once you have done that, the Power Pivot tab appears on the Excel Ribbon.
Note that not only is Power Pivot only available in certain versions of Excel, you will only see the Power Pivot tab on The Ribbon in Excel for Windows. Power Pivot is not available in Excel for Mac, Excel Online or the mobile versions of Excel.
That said, you can open files in any version of Excel on any platform where the data is stored in the workbook’s Data Model. You just won’t be able to view the raw data and take advantage of the other functionality provided by Power Pivot on unsupported platforms or versions.
Loading Data Into The Data Model
In a brand new Excel workbook, although there is a Data Model, it is empty.
You can’t create tables and enter data directly into the Data Model. You have to load the data in from other sources.
If the data is stored in an Excel table in the current workbook, select a single cell in that table and click the Add to Data Model button on the Power Pivot tab on The Ribbon. This copies the data from the worksheet-based table and creates a table with the same name in the Data Model.
If the data is stored externally, for example in another Excel file or a CSV or Text file or a database, select Power Pivot > Manage from the Excel Ribbon to open Power Pivot in a separate window. From the Home tab on the PowerPivot Ribbon, use the appropriate option in the Get External Data section.
The recommended way to load data into the Data Model, however, is to use Power Query, which like Power Pivot, was originally a Microsoft-authored add-in but which, since Excel 2016 has been part of Excel under the not-as-sexy name of “Get Data” in the Get and Transform section of the Data Ribbon.
Power Query / Get Data supports many more data sources compared to loading data from inside Power Pivot and, through the built-in Query Editor, provides the opportunity to clean up and “transform” the data before it gets loaded into the Data Model.
The Data Model – The Way Forward?
Even if your requirements are simple, storing the data in the Data Model future-proofs your workbooks.
Say you have a small table of data in a worksheet and you build a few Pivot Tables based on that data. Six months later you add a second, related table of data to the worksheet which needs to be included in the Pivot Table and you also need to perform some calculations that are outside the scope of the Calculated Fields feature.
You have no choice but to add the 2 tables of data to the Data Model. However, the Pivot Tables that you already have, that are based on a single range of worksheet-based data, would no longer work (in terms of not being refresh-able) so you’d have to recreate them based on the Data Model.
My opinion is that you should be using the Data Model to store your data regardless. It’s the way forward. It’s what Microsoft recommend too.