Once connected to data most people go right into trying to build their visualizations; however, they get frustrated because the data can be overwhelming and sometimes not even make sense.
Cleaning up your data can seem insignificant, but it is the most important thing you will do with your Power BI report. It serves three main purposes:
- Makes creating visualizations easier because you don’t have to filter through unnecessary data.
Raw tables from SQL or software systems can include anywhere from 50 to over 300 columns. A lot of these columns may be unnecessary for the reports you are looking to create.
- Simplifies end user experience.
Raw tables come over with schema names which may not make sense to end users. It is important to change the names of columns to match the verbiage used within your organization, so that when an end user searches or looks at data they understand what they are looking at.
- Improves report performance.
Pulling over 300+ columns and all the rows of data you have entered over the life of your system can really slow down report loading. Cleaning up your data by removing unnecessary columns and filtering down rows helps to speed up the loading when refreshing or updating your data.
Cleaning up data is done within the Power Query Editor which can be accessed two different ways.
When you first connect to your tables you can choose edit your table versus load it.
From the home tab, in the toolbar select Edit Queries.
The Power Query Editor has a lot of different features, in this post we are going to cover three of the features to help clean up most of your data.
Removing Unnecessary Columns
Columns can be removed one at a time or in bulk. To remove one column, either right click on the column itself and choose remove or with the column highlighted select remove columns from the home tab on the toolbar.
To remove multiple columns, you can use CTRL + Select to highlight multiple columns at once and remove as described above, or you can choose which columns to keep by selecting choose columns from the home tab on the toolbar. This will pop up a box that will allow you to uncheck the columns you want to remove.
Renaming Columns
This is a simple step, but so important to make sure your data reflects verbiage used in your organization. Columns can be renamed by either right clicking on the column and selecting rename or by selecting rename from the transform tab on the toolbar with the column highlighted.
Transform Column Data
Clean, consistent data is important to make sure your charts provide beneficial data. You can transform your data either by right clicking on the column and choosing transform or the options are available under the transform tab on the toolbar. There are different transformations available depending on the data type of the column.
Text Transformations:
Change the text within the cell to all lowercase, all uppercase or capitalize each word.
Trim or clean the text to remove extra spaces or non-printable text.
Length calculates the number of characters within a cell.
Date/Time Transformations:
Change to date or time only.
Pull out year, quarter, month, week or day data. This can be the actual time period or it can calculate the start or end of the time period.
For time you can pull out the hour, minute or second.
Number Transformations:
There are several number transformations available, such as rounding, absolute value, base-10, powers or standard mathematics.
If you make a mistake you can undo your editing under the applied steps section on the right hand side of the screen. To delete click the X next to the step. If you have chosen select columns you can click on the settings gear to the right of the step and it will bring the box back up allowing you to reselect the columns you want included.
Once you have completed all the edits close and apply the updates to your Power BI report.
These are just a few of the many functions available within the Power Query Editor, but they will make a big impact when using Power BI. We will dive into some of the other function in more details in future blog posts.
-Jerica Coleman, CRM and Power BI Consultant
Interested in learning more? Contact us!