The Use-Case for Microsoft Power BI
Previously we presented an introduction to data visualization tools and talked about why you might choose one over another. We then looked at a price comparison of the tools as this may be a limiting factor for your organization. In our last blog post we did a deep dive into a Google Data Studio use-case. This time around we'll take a close look at Microsoft Power BI and describe an ideal use-case for it.
Screenshot of Microsoft Power BI Report View
REASONS TO USE POWER BI
Why choose to use Microsoft Power BI? Perhaps, like in our example use-case we’ll be digging into:
Your organization already uses Office 365 providing both you and your audience cheap and easy access to Power BI
Your skill set and experience lean more toward database design and spreadsheets than toward graphic design
You're familiar with tables and rows and columns and queries and formulas, but not so much with fonts and text alignment and affordance and preattentive attributes
Your data comes from multiple sources and you don't have control over how it is organized, packaged up and delivered to you
If you find yourself relating to one or more of the these statements, Power BI could be a great fit for you.
DATA MODELING VS. DESIGN
Tableau is designed to be a self-service data visualization tool that can largely insulate the visualization developer from the data organization and modeling. If you can connect to a view or spreadsheet that has all the data you need, the design capabilities of Tableau are unmatched. Microsoft Power BI, on the other hand, is a self-service data visualization and data modeling tool that does not insulate the visualization designer from the data organization at all. The Power BI interface is broken down into three views that can be toggled between: the Report view, the Data view, and the Model view.
There is a big upside to this. Power BI has a great feature set built in for transforming messy data and modeling multiple data sources. As a data visualization developer, you are not limited by the data strategy (or lack thereof) of your organization. This is the critical determining factor of our example use-case. While the actual data visualization can often be an afterthought in Power BI and the design limitations can hinder the user experience, the data modeling capabilities can make it the clear choice in some situations. When there is no data, the data visualization capabilities are irrelevant.
Some of the data preparation and modeling features of Microsoft Power BI are connecting to source files, keeping and removing rows, removing empty rows, appending queries, replacing values, formatting data, handling formatting errors, pivoting and unpivoting data, splitting columns, creating groups, creating DIM and FACT tables, creating tables manually, creating conditional columns, managing table relationships and cardinality, cross filter direction, and more.
A MOCK USE-CASE
Let’s say that we sell tins of cookies on the internet. Yummy.
Our customers go through the familiar online shopping experience of browsing to our website, picking a small, medium or large tin of our delicious cookies, adding it to the shopping cart, picking how many tins they want, and then entering their shipping and billing information in order to complete their purchase. Somewhere in the cloud in the database backend of our web server, this sales transaction is recorded.
As a business, we'd like to improve our online cookie business performance. We believe we can do this by visualizing how our sales transaction data is trending relative to our web analytics in order to improve our conversions from web visits to completed sales. In order to accomplish this, we need to supply Power BI with two different data sources:
Sales transaction data from the database backend of our web server
Web analytics data source
In order to build the data visualizations needed to address our business questions, we need to get the data cleaned, organized, and relate the sources to one another. Like with many self-service data visualization projects, we have little control over how the supplied source data is available to us. In an ideal situation for quick and easy data visualization, our sales transaction data could be saved in one large flat table in a cloud database. It would be the equivalent of a 30+ column spreadsheet containing every piece of information associated with a user purchasing a tin of cookies. From left to right the columns would include a transaction number, a user name, a shipping street number, city, state and zip code, a billing street number, city, state and zip code, as well as a credit card number, customer name, expiration date, secret code, and any other details captured with the purchase. Then finally, all the way over to the right, we'd have our last three columns for our small, medium and large cookie tins with the number of tins ordered. This flat sales transaction table would look like the following:
Sample flat sales transaction table
While that looks great to someone looking for a single source with all the data, a flat table is likely not how the actual sales transaction database is organized. Database designers don't typically use just one table, rather they divide the tables up and relate them one to another in order to improve the manageability and performance of the database for a particular activity. Splitting the data up into related tables is referred to as "normalization". So for our cookie business, it's likely that we'd have the transaction data in multiple different tables related one to the other using a unique identifier like customer name or customer ID:
a customer table
an address table
a credit card table
a sales transaction table
We may be able to get this data in a single flat file before we connect to it in Power BI. However, the data may be provided to us in separate tables, like an Excel file with a different sheet for each data set. In this case, we will need to relate the data in each sheet in order to visualize it. In comes the major value of the Model view in Power BI. We can connect up to each individual source and manage the relationships directly in the visualization platform. For those of you more familiar with data modeling, this includes the cardinality and cross filter direction.
With this functionality, we can continue to join in additional data sources to enhance our analysis. For our use-case, we didn’t just want our sales transaction data. We also wanted our website analytics data to know how many people have visited our site and when. So what do we need to do in order to bring our sales transaction data and our web site analytics data sources together? We’ll build off of our existing model. We can connect to our website analytics data source and create a relationship between it and our existing orders table.
IN SUMMARY
Microsoft Power BI is a great tool for the right user and the right use-case. Its strength and weakness both stem from its tight integration of data modeling with data visualization. It doesn’t have the design power of Tableau or the ease of use of Google Data Studio, but it’s the clear winner when it comes to fixing up and mashing up multiple data sources.