How to make dynamic waffle charts in Excel

Waffle charts can add visual appeal to a report. They seem to be popping up everywhere lately! I recently used them in a report and I wanted to share my process in order to show you how easy they are.

finalproduct

Once you break these charts down you can see that they are just 10×10 grids. And we all know that Excel, er, excels at working with grids.

First off, type out your data in two columns. You can see that I have values for Group A, Group B, and Group C. Then you are going to make some 10×10 grids (since I have three groups, I made three grids). These grids will contain the values 1-100%.

step1

Next we are going to resize the grids. Highlight all of the columns in your grid and drag your mouse to resize. The default line height in Excel is 20 pixels so I like to make my cells 20 pixels wide so that things are perfectly square.

step2

At this point I make my number font really tiny (as in size 6) so that it fits into the cells.

Change your background and text color on the grids to whatever you want your default color to be. I chose a light grey.

step3.PNG

Now we are going to change the color of the borders. I find white too harsh of a contrast so I like using a very light grey. Changing border color is a bit quirky. First highlight the columns with your grids. Go to the border button and go down to line color and select the color that you would like.

step4

Your cursor will turn into a little pencil and you will see little black dots in your grids. Higlight the columns that contain the grids again and press the border button. At this point the border colors should be changed.

step5

Next we will use the magic of conditional formatting to fill in our grids.  Highlight the first grid and go to Conditional Formatting and click New Rule.

step6

Next select “Format only cells that contain” and select “less than or equal to” and then select the cell that contains the actual value for Group A. This is telling Excel that you want to change the color of every cell in the grid that is less than or equal to the actual score.

step7

Go to “Format” and change the background fill color AND the font color to whatever color you would like for your group.

Repeat these steps for all of your grids. When you are finished you will have something like this:

step8

Alright, now we’re getting somewhere.

Next we’re going to do some extra Excel kung fu to make pictures from these grids that we can paste anywhere in our workbook (such as a front sheet that you are using to summarize your results).  Not only this, the picture will automatically update if you change your data.

Highlight all of the cells in your first grid and copy. Right click wherever you want your waffle chart to be. Right click, go to paste special, and then go to the little picture with a link on it. This creates something called a linked picture.

step9.PNG

You can easily move this picture around on the worksheet and resize it. If you change your data in the previous worksheet, where we set up our grid, the picture will automatically update. Neat, right?

Once you have created linked pictures for all of your waffle charts let’s add some labels so that we can easily tell what scores the charts are representing.

Go to the first waffle chart and insert a text box. Change the fill and outline of the text box to ‘none’. Make sure the (empty) text box is selected. Go up to the formula bar, type = and then navigate to where the score for that group is stored, click that specific cell, and hit enter. Your text box should now contain that group’s score.

step10.PNG

Like the linked picture, if you change the group’s score, the text box will automatically update to reflect this.

Repeat the steps above to label your other waffle charts. Tweak the formatting to make the text boxes easy to read. Add a headline that tells the reader the main insight from the charts and there you go, you’re done!

finalproduct

Before you get too excited there is an important downside to waffle charts: They take up a lot of space. Each waffle grid is essentially showing one data point. That is a lot of real estate for one data point!

Let me know your thoughts on waffle charts – Love them? Sick of them? Other thoughts?

Non-linear relationships: The importance of examining distributions

Recently I was analyzing some data to help answer the question “what are the demographic differences between program graduates and program drop outs?” I did some modelling and found a few predictors, one of which was age.

I compared the average age between the groups and saw that the drop outs had a lower average age (42 years) than graduates (44 years). Simple enough. But this simplistic explanation didn’t jive with anecdotal information the program staff had given me. I wondered if the relationship between age and program completion was linear (i.e., does a change in age always produce a chance in the likelihood of graduating).

As I mentioned in my last post, I’ve been playing around with R. I recently came across something called a violin plot and I wanted to try it out. A violin plot is kind of like a box plot, except that instead of a plain old box it shows you the distribution of your data.

Here is an example of a box plot:

boxplot

The main thing that I immediately see from this chart is that on average, the drop outs were younger than the graduates.

Here is an example of a violin plot:

violin

I get a different takeaway from this plot. You can see from the violin plot that the distribution of age for the drop outs looks a lot different than the distribution of age for the graduates. The bottom of the drop out violin is wider, indicating that the drop outs skew a lot younger than the graduates. This indicates that we should be exploring the relationship between age and graduation more closely.

But what if you don’t use R and can’t create a violin plot? Histograms are standard tools to show distributions and are much more common. A histogram is essentially a column chart that show the frequency of values in your distribution (so for this example, it would show how many participants were 20 years old, 21 years old, 22 years old, you get the idea). Excel actually has a built in feature to create histograms (click here for instructions). The tool bugs me a lot and it isn’t super intuitive to use, but it gets the job done.

Here is the distribution for age for both the drop outs and graduates. Yes, yes, I know that my x-axes aren’t labelled and that my y-axes use different scales but these choices were intentional because I want you to focus on the shape of the distributions, not the content.

histograms

Again, you can see that the age of the drop outs skews to the left (meaning that there is a higher proportion of younger participants than older). The histogram for the graduated group looks quite different.

All of this evidence points to a non-linear relationship, meaning that age has an effect on whether or not a participant graduates for participants in different age groups.

To take a closer look at this relationship, I calculated the drop out rate for different age groupings and put them on a line chart. Aha! If the relationship between age and program completion was linear, we would expect this line to be straight. But it’s not. You can see that the drop-out rate declines with age until we hit age 40 or so. After that it’s more or less flat until age 70, and then goes down again.

dropouts.PNG

This is an important piece of knowledge for program staff to target retention efforts and something that we wouldn’t have uncovered if we simply had stopped at comparing the average age between the drop-outs and the graduates.

data viz tools

Awhile ago I posted about the data viz catalogue. It’s a neat resource that helps you choose a visualization that best tells the story of your data. The creator has recently posted a roundup of the 20 best tools for data visualization. It includes tools that have no coding required as well as tools for developers. There were definitely a couple that were new to me and I look forward to checking them out.

On my 2016 to-do list: learn enough coding that I can play around with the dev tools.

Data Viz Catalogue

I just came across this great data visualization resource through BetterEvaluation – the Data Visualization Catalogue. Choosing a chart or other visualization type that best tells the story of your findings is the most fundamental part of data visualization. This site helps you by allowing you to search data visualizations by function.

Capture

 

Once you choose a function, it will give you some suggested visualization types:

Capture

 

Very neat!

The site was created by Severino Ribecca and will be added to over time.