Showing two main points on one chart

It’s (usually) fairly straightforward to choose a chart type when you know what main point you’re trying to get across. Is your message that there has been a change over time? Do you want to show a difference between groups? There are all kinds of online chart choosers to help you do this (here is one of my favourites). But what about when you have two main points to make?

I was recently working on a chart where I wanted to make the following two points:

  1. 2016 was the only year that participants had a statistically significant increase in health ratings; and
  2. participants had lower pre-program health ratings in 2016 than in other years.

I started with the chart below. Here the different colour used for 2016 really highlights that something different happened that year (half of point #2), but it is difficult to see the change over time (point #1 and the other half of point #2):

[chart 1]

Alright then, let’s change to a line graph. It is much easier to see the change over time. However, the statistically significant change between pre- and post-test scores was important to the program staff, and they wanted to highlight it. That piece of information isn’t easy to see here:

[chart 2]

I added a transparent rectangle to highlight the difference between pre- and post-test scores and this is the result:

[chart 3]
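If you’d like to recreate this kind of highlight yourself, here is a minimal sketch in Python with matplotlib. The numbers, labels, and rating scale are all made up for illustration (the real results are fictionalized anyway):

```python
# Minimal sketch: pre/post health ratings by year as a line graph, with a
# transparent rectangle highlighting the 2016 pre/post gap. All numbers
# and labels below are made up for illustration.
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

years = [2013, 2014, 2015, 2016]
pre = [6.1, 6.0, 6.2, 5.1]    # fictional pre-program ratings
post = [6.2, 6.1, 6.3, 6.4]   # fictional post-program ratings

fig, ax = plt.subplots()
ax.plot(years, pre, marker="o", label="Pre-program")
ax.plot(years, post, marker="o", label="Post-program")

# Transparent rectangle spanning the gap between the 2016 pre and post scores
ax.add_patch(Rectangle((2015.9, 5.1), width=0.2, height=6.4 - 5.1,
                       alpha=0.2, color="grey"))

ax.set_xticks(years)
ax.set_xlabel("Year")
ax.set_ylabel("Average health rating")
ax.legend()
plt.show()
```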

I think that this chart nicely conveys the two main points that I wanted to make and is a vast improvement over the first chart. It also goes to show that it’s worthwhile to play around with different chart types while working on reporting!

Note: I have changed the results to fictional data to keep things anonymous.

Thinking geospatially

Lately I’ve been having a lot of fun making maps in Tableau (what? doesn’t everybody make data visualizations for fun?). Tableau is a pricey piece of software, but you can use Tableau Public for free (and if you are a charity in Canada you can get the desktop version at a very, very discounted rate through TechSoup).

Mapping data is a skill that I’ve been wanting to build for a while. Lately I’ve been working with community health data, and bar charts can only tell me so much. Seeing the data on a map has made a world of difference.

To try out working with maps, I downloaded data from Toronto’s open data catalogue. The first map I made was a schematic of Toronto’s subway (the TTC). I adjusted the size of the circles representing stations to show the number of daily riders and added a filter so that the viewer could drill down into a specific subway line. Much more interesting than simply looking at a bar chart that lists each stop, right?

[TTC map – click to go to the interactive data viz]
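Tableau handles the circle sizing through its size mark, but the same idea is easy to sketch outside Tableau too. Here is a rough Python equivalent; the file and column names are hypothetical, so adjust them to whatever the open data file actually uses:

```python
# Rough non-Tableau equivalent: plot each station at its coordinates and
# size the circle by daily ridership. File and column names are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

stations = pd.read_csv("ttc_stations.csv")  # hypothetical file name

fig, ax = plt.subplots()
ax.scatter(stations["longitude"], stations["latitude"],
           s=stations["daily_riders"] / 500,  # scale rider counts to point sizes
           alpha=0.6)
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title("TTC stations sized by daily ridership")
plt.show()
```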

Next up I took a look at service calls placed to 311 (the customer service department of Toronto). You can see that some areas of the city have quite a high call volume whereas other areas have relatively few. If you click on a specific area of the map, the bar chart below will automatically filter to show you the top 10 reasons for service calls originating in that area. What strikes me most is that, throughout the city, by far the most common reason for calling 311 is issues surrounding garbage, recycling, and compost bins.

[311 call volume map – click to go to the interactive data viz]
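Behind the interactivity, that drill-down is really just a group-filter-count: restrict the calls to one area, count the request types, and keep the top 10. A sketch in pandas, again with hypothetical file and column names:

```python
# Sketch of the drill-down logic: top 10 service request types for one area.
# File and column names are hypothetical.
import pandas as pd

calls = pd.read_csv("311_service_requests.csv")  # hypothetical file name

def top_reasons(calls, area, n=10):
    """Return the n most common request types for calls from one area."""
    in_area = calls[calls["area"] == area]
    return in_area["request_type"].value_counts().head(n)

print(top_reasons(calls, area="Downtown"))  # hypothetical area name
```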

The third map that I want to share shows neighbourhood safety. I took a look at major crimes and other safety-related incidents by Toronto neighbourhood. The data is a little old (2011), but you can instantly see that the majority of incidents are concentrated in a few neighbourhoods. You can filter the map by incident type on the right; changing the incident type changes the map pretty drastically. For example, filter on murder and you can see that the red areas change. Like the 311 map, clicking on a neighbourhood will filter the bar chart below.

[neighbourhood safety map – click to go to the interactive data viz]

So far I’ve been having a lot of success with these interactive maps. It is much easier for people to instantly see and understand the data than it is to look at a chart and then convert the words into geography in their heads.

What about you – what has been your experience with presenting geospatial data?

The Importance of Context

Recently I was looking at some data and I noticed a trend in the neighbourhood surrounding a community centre that was evaluating the effectiveness of its poverty reduction work. The number of families classified as having a low income had decreased in recent years (Neighbourhood A). Two nearby neighbourhoods (Neighbourhoods B and C) had definitely not seen this decrease.

[small multiples: low-income families over time in Neighbourhoods A, B, and C]

(Shout out to Stephanie Evergreen for forever changing my life with small multiples)
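If you want to try small multiples yourself, here is a minimal matplotlib sketch: one panel per neighbourhood with a shared y-axis so the trends are directly comparable. The counts below are made up:

```python
# Minimal small-multiples sketch: one panel per neighbourhood, shared
# y-axis so trends can be compared at a glance. All counts are made up.
import matplotlib.pyplot as plt

years = [2011, 2012, 2013, 2014, 2015]
low_income_families = {
    "Neighbourhood A": [420, 400, 370, 330, 290],  # fictional decrease
    "Neighbourhood B": [410, 405, 415, 400, 410],  # fictional flat trend
    "Neighbourhood C": [380, 390, 385, 395, 390],  # fictional flat trend
}

fig, axes = plt.subplots(1, 3, figsize=(9, 3), sharey=True)
for ax, (name, counts) in zip(axes, low_income_families.items()):
    ax.plot(years, counts, marker="o")
    ax.set_title(name)
    ax.set_xticks([2011, 2013, 2015])
axes[0].set_ylabel("Low-income families")
fig.tight_layout()
plt.show()
```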

At first glance this looked promising – had the poverty reduction campaign contributed to this? People were excited but I had my reservations about claiming success so quickly.

If you’ve recently visited Toronto you know that there are building cranes everywhere. Neighbourhoods are changing (read: gentrifying) very, very quickly as luxury condos go up and lower income families are driven further and further out of the core. It was possible that residents’ incomes hadn’t actually changed – perhaps the low income residents had moved out and more affluent residents had moved in. First piece of evidence: Neighbourhood A had four condominium projects completed in that time frame, whereas Neighbourhood B had one and Neighbourhood C had zero.

Next we looked at demographics. Canada completes a census every five years, so we could compare 2006 and 2011 data (the 2016 data was not yet available). Second piece of evidence: Neighbourhood A had decreases in children, youth, and seniors (and families overall) but an increase in working-age adults. The change wasn’t nearly as drastic in Neighbourhoods B and C.

Fortunately we had a lot of other data to look at in order to evaluate the program but I thought that this was a nice illustration of why it’s really important to look at the context behind the data and examine other possible explanations before claiming success.


Yet another reason why theory should guide evaluation

The “replication crisis” has been a hot button issue in science for a while now. Simply put, many experiments are difficult or impossible to replicate. I’m a social psychologist, and so that is where I have been following the discourse. For example, many of the “classic” social psychology experiments that you may have learned about in Psychology 101 have failed to replicate. This study suggests that perhaps we should discount two-thirds of published findings in social psychology! This is especially disheartening when I think about how many studies I read in the course of 9 years studying social psychology in university.

Roger Peng (who teaches great courses over on Coursera by the way, which is how I found his blog) recently wrote a super interesting post about this topic. Peng talks about how in fields with a strong background theory (as well as in fields that do not rely on experimental design) there isn’t a crisis.

This led me to think about evaluation and the importance of having a solid theory of change guide your work. If we evaluate a program and we don’t have a theory of change we call this a “black box evaluation.” Our results can tell us whether or not a program had an effect…but we have no idea why. Was it due to a particular component of the program? Effective staff? Something about the participants? And if we can’t answer why a program did or did not have an effect we certainly can’t replicate the program in other places.

Until today I had mostly thought of the replication crisis as a research problem (one I think about when I wear my “researcher hat”), but I found it super interesting to see how it can also be an evaluation problem (and I will certainly incorporate it into my “evaluator hat” thinking!).

Upcoming free course on Bayesian analysis

The bulk of my statistical training is based on null hypothesis significance testing (NHST – for the non-stats geeks out there, I’m talking about the tests that return p values, among other things). This knowledge has served me well over the past decade; however, more and more organizations and publications are moving beyond NHST (here is a statement from the American Statistical Association, and here is an example of a publication banning p values).

Bayesian analysis is an alternative to NHST that updates the probability of a hypothesis as you collect more information. I’m not well versed in it so I’m going to steer you to a definition from stata.com. I’ve been curious about how Bayesian analysis can be applied to evaluation and more and more examples of it being used are popping up every day (here is one such example that I have recently read).
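To make the “updating as you collect more information” idea concrete, here is the textbook beta-binomial example in Python (not from any of the linked posts, just a standard illustration): we track our belief about a program’s “success rate,” starting from a flat prior and updating after each cohort of fictional outcomes.

```python
# Standard beta-binomial illustration of Bayesian updating: the posterior
# over a success rate sharpens as each cohort of outcomes arrives.
# All numbers are made up.
alpha, beta = 1, 1  # flat Beta(1, 1) prior over the success rate

cohorts = [(12, 8), (15, 5), (18, 2)]  # (successes, failures) per fictional cohort
for successes, failures in cohorts:
    alpha += successes  # conjugate update: successes accrue to alpha...
    beta += failures    # ...and failures to beta
    mean = alpha / (alpha + beta)  # posterior mean of Beta(alpha, beta)
    print(f"after this cohort: Beta({alpha}, {beta}), posterior mean = {mean:.2f}")
```

Each pass through the loop is one round of “collecting more information”: the same prior-becomes-posterior step, repeated as data arrives.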

I’ve been wanting to learn about Bayesian methods and apply them to a current project but just haven’t had the time to delve into a textbook on my own. I was quite happy to see that Coursera will be offering a course starting Aug 29th and wanted to share it with others who may be interested. You can use either R or Excel for the coursework.

In the meantime, please share any examples of Bayesian analysis in evaluations below! I would love to check them out!

Data viz tools

A while ago I posted about the data viz catalogue. It’s a neat resource that helps you choose a visualization that best tells the story of your data. The creator has recently posted a roundup of the 20 best tools for data visualization. It includes tools that require no coding as well as tools for developers. There were definitely a couple that were new to me, and I look forward to checking them out.

On my 2016 to-do list: learn enough coding that I can play around with the dev tools.

Recap (and downloads) from the Recreation Connections Manitoba conference

I had a lot of fun yesterday presenting at the Recreation Connections Manitoba conference. I presented a two hour workshop designed to give a “crash course” in developing a program theory and measuring program impact. If you attended the workshop and are looking for the handout, you can download it here.

It was so interesting to hear the wide variety of programs that the attendees were working on…everything from composting to an after-school program with children. I also really enjoyed talking about the different challenges that were faced when it came to measurement, such as response rates, adapting measurement tools for children, juggling limited resources, and survey bias. Rest assured that these are issues that most (all?) evaluators face! We talked about some ideas in the workshop, but I want to expand on these in future blog posts.

Thanks for the great time, Winnipeg!