Unpivot
The Unpivot transform is an incredibly powerful transform that allows you to reorganize your dataset into a more structured format for Business Intelligence. Let's discuss this by visualizing a practical example to help understand the purpose of Unpivot. Imagine you are provided a file that has the last three years of population by US States, and looks like this:
The problem with data stored like this is you cannot very easily answer simple questions. For example, how would you answer questions like, What was the total population for all states in the US in 2018 or What was the average state population in 2016? With the data stored in this format, simple reports are made rather difficult to design. This is where the Unpivot transform can be a lifesaver. Using Unpivot, you can change this dataset into something more acceptable for a BI project, like this:
Data stored in this format can now easily answer the questions posed earlier. To accomplish this in other programming languages can often require fairly complex logic, while the Power Query Editor does it in just a few clicks.
There are three different methods for selecting the Unpivot transform that you should be aware of, and include the following options:
- Unpivot Columns: Turns any selected columns headers into row values and the data in those columns into a corresponding row. With this selection, any new columns that may get added to the data source will automatically be included in the Unpivot transform.
- Unpivot Other Columns: Turns all column headers that are not selected into row values and the data in those columns into a corresponding row. With this selection, any new columns that may get added to the data source will automatically be included in the Unpivot transform.
- Unpivot Only Selected Columns: Turns any selected columns headers into row values and the data in those columns into a corresponding row. With this selection, any new columns that may get added to the data source will not be included in the Unpivot transform.
Let's walk through two examples of using the Unpivot transform to show you a few of these methods, and provide an understanding of how this complex problem can be solved with little effort in Power BI.
- Launch a new instance of the Power BI Desktop, and use the Excel connector to import the workbook called Income Per Person.xlsx found in the book source files. Once you select this workbook, choose the spreadsheet called Data in the Navigator window, and then select Edit to launch the Power Query Editor.
- Now, make the first row of data column headers by selecting the transform called Use First Row as Headers under the Home Ribbon.
- Rename the column GDP per capita PPP, with projections column to Country.
- If you look closely at the column headers, you can tell that most of the column names are actually years and the values inside those columns are the income for those years. This is not the ideal way to store this data because it would be incredibly difficult to answer the question, What is the average income per person for Belgium? To make it easier to answer this type of question, right-click on the Country column and select Unpivot Other Columns.
- Rename the columns Attribute and Value to Year and Income, respectively.
- To finish this first example, you should also rename this query Income.
This first method walked you through what can often be the fastest method for performing an Unpivot transform, which is by using the Unpivot Other Columns option. In this next example, you will learn how to use the Unpivot Columns method.
- Remain in the Power Query Editor, and select New Source from the Home Ribbon to use the Excel connector to import the workbook called Total Population.xlsx found in the book source files. Once you select this workbook, choose the spreadsheet called Data in the Navigator window, and then select OK.
- Like the last example, you will again need to make the first row of data column headers by selecting the transform called Use First Row as Headers under the Home Ribbon.
- Then, rename the column Total population to Country.
- This time, multi-select all the columns except Country, then right-click on one of the selected columns and choose Unpivot Columns. The easiest way to multi-select these columns is to select the first column then hold Shift before clicking the last column.
- Rename the columns Attribute and Value to Year and Population, respectively.
- To finish this first example, you should also rename this query Population.
In this section, you learned about two different methods for performing an Unpivot. To complete the data cleansing process on these two datasets, it's recommended that you continue through the next section on Merging Queries.