Thinking geospatially

Lately I’ve been having a lot of fun making maps in Tableau (What? Doesn’t everybody make data visualizations for fun?). Tableau is a pricey piece of software, but you can use Tableau Public for free (and if you are a charity in Canada, you can get the desktop version at a very, very discounted rate through TechSoup).

Mapping data is a skill I’ve been wanting to build for a while. Lately I’ve been working with community health data, and bar charts can only tell me so much. Seeing the data on a map has made a world of difference.

To try my hand at mapping, I downloaded data from Toronto’s open data catalogue. The first map I made was a schematic of Toronto’s subway (the TTC). I sized the circles representing stations by the number of daily riders and added a filter so that the viewer can drill down to a specific subway line. Much more interesting than simply looking at a bar chart that lists each stop, right?

Click to go to data viz
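
The workbook itself is built in Tableau, but if you ever want to script a proportional-symbol map like this instead, here is a rough Python sketch of the same idea. The file and column names (`ttc_stations.csv`, `latitude`, `longitude`, `daily_riders`, `line`) are hypothetical placeholders, not the actual fields in the open data file.

```python
# A rough sketch of a proportional-symbol station map in Python (pandas + matplotlib).
# File and column names are hypothetical placeholders; adjust to match the real data.
import pandas as pd
import matplotlib.pyplot as plt

stations = pd.read_csv("ttc_stations.csv")   # hypothetical file name
line = "Line 1"                              # stands in for the "filter by line" control
subset = stations[stations["line"] == line]

plt.scatter(
    subset["longitude"], subset["latitude"],
    s=subset["daily_riders"] / 100,          # circle size scaled by daily ridership
    alpha=0.6,
)
plt.title(f"TTC stations on {line}, sized by daily riders")
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.show()
```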

Next up, I took a look at service calls placed to 311 (the customer service department of Toronto). You can see that some areas of the city have quite a high call volume whereas other areas are relatively quiet. If you click on a specific area of the map, the bar chart below will automatically filter to show you the top 10 reasons for service calls originating in that area. What strikes me the most is that, throughout the city, the most common reason for calling 311 by far is issues with garbage, recycling, and compost bins.

Click to go to data viz

The third map I want to share looks at neighbourhood safety. I took a look at major crimes and other safety-related incidents by Toronto neighbourhood. The data is a little old (2011), but you can instantly see that the majority of incidents are concentrated in a few neighbourhoods. You can filter the map by incident type on the right, and changing the incident type changes the map pretty drastically. For example, filter on murder and you can see that the red areas shift. Like the 311 map, clicking on a neighbourhood will filter the bar chart below.

Click to go to data viz

So far I’ve been having a lot of success with these interactive maps. It is much easier for people to instantly see and understand the data than to look at a chart and then mentally convert the words into geography.

What about you – what has been your experience with presenting geospatial data?

The Importance of Context

Recently I was looking at some data for a community centre that was evaluating the effectiveness of its poverty reduction work, and I noticed a trend in the surrounding neighbourhood: the number of families classified as having a low income had decreased in recent years (Neighbourhood A). Several nearby neighbourhoods (Neighbourhoods B and C) had definitely not seen this decrease.


(Shout out to Stephanie Evergreen for forever changing my life with small multiples)
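
If you want to build a small-multiples view like this in code rather than a BI tool, here is a rough Python sketch. The years and counts below are purely illustrative placeholders, not the actual neighbourhood data.

```python
# A rough small-multiples sketch in Python (matplotlib).
# All values are illustrative placeholders, not real low-income counts.
import matplotlib.pyplot as plt

years = [2011, 2012, 2013, 2014, 2015]
trends = {
    "Neighbourhood A": [420, 400, 370, 340, 310],   # hypothetical decline
    "Neighbourhood B": [390, 395, 400, 398, 402],   # hypothetical flat trend
    "Neighbourhood C": [450, 455, 460, 470, 468],   # hypothetical flat/rising trend
}

# One small panel per neighbourhood, sharing the y-axis so trends are comparable.
fig, axes = plt.subplots(1, len(trends), sharey=True, figsize=(9, 3))
for ax, (name, values) in zip(axes, trends.items()):
    ax.plot(years, values)
    ax.set_title(name)
axes[0].set_ylabel("Low-income families (illustrative)")
fig.suptitle("Small multiples: one panel per neighbourhood")
plt.tight_layout()
plt.show()
```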

At first glance this looked promising – had the poverty reduction campaign contributed to this? People were excited but I had my reservations about claiming success so quickly.

If you’ve recently visited Toronto you know that there are construction cranes everywhere. Neighbourhoods are changing (read: gentrifying) very, very quickly as luxury condos go up and lower-income families are driven further and further out of the core. It was possible that residents’ income levels hadn’t changed at all – perhaps low-income residents had moved out and more affluent residents had moved in. First piece of evidence: Neighbourhood A had four condominium projects completed in that time frame, whereas Neighbourhood B had one and Neighbourhood C had zero.

Next we looked at demographics. Canada completes a census every five years, so we could compare 2006 and 2011 data (the 2016 data is not yet available). Second piece of evidence: Neighbourhood A had decreases in children, youth, and seniors (and families overall) but an increase in working-age adults. The change wasn’t nearly as drastic in Neighbourhoods B and C.

Fortunately we had a lot of other data to draw on to evaluate the program, but I thought this was a nice illustration of why it’s so important to look at the context behind the data and examine other possible explanations before claiming success.


Yet another reason why theory should guide evaluation

The “replication crisis” has been a hot-button issue in science for a while now. Simply put, many experiments are difficult or impossible to replicate. I’m a social psychologist, so that is where I have been following the discourse most closely. For example, many of the “classic” social psychology experiments that you may have learned about in Psychology 101 have failed to replicate. This study suggests that perhaps we should discount two-thirds of published findings in social psychology! This is especially disheartening when I think about how many studies I read over the course of nine years studying social psychology in university.

Roger Peng (who teaches great courses over on Coursera, by the way, which is how I found his blog) recently wrote a super interesting post about this topic. Peng points out that fields with a strong background theory (as well as fields that do not rely on experimental design) aren’t facing a crisis.

This led me to think about evaluation and the importance of having a solid theory of change guiding your work. If we evaluate a program without a theory of change, we call this a “black box evaluation.” Our results can tell us whether or not a program had an effect… but we have no idea why. Was it due to a particular component of the program? Effective staff? Something about the participants? And if we can’t answer why a program did or did not have an effect, we certainly can’t replicate the program in other places.

Until now I had mostly thought of the replication crisis as a research problem (one I think about when I wear my “researcher hat”), but I found it super interesting to see how it can also be an evaluation problem (and I will certainly incorporate it into my “evaluator hat” thinking!).

Upcoming free course on Bayesian analysis

The bulk of my statistical training is based on null hypothesis significance testing (NHST – for the non-stats-geeks out there, I’m talking about the tests that return p values, among other things). This knowledge has served me well over the past decade; however, more and more organizations and publications are moving beyond NHST (here is a statement from the American Statistical Association and here is an example of a publication banning p values).
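
For the curious, here is a minimal example of what I mean by “tests that return p values” – a two-sample t-test in Python, using made-up numbers purely for illustration:

```python
# Minimal NHST example: a two-sample t-test that returns a p value.
# The group scores below are made up purely for illustration.
from scipy import stats

program_group = [72, 68, 75, 80, 77, 74, 79]
comparison_group = [70, 65, 69, 72, 68, 71, 66]

t_stat, p_value = stats.ttest_ind(program_group, comparison_group)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# Under NHST we compare p to a cutoff (conventionally 0.05) and declare the
# difference "statistically significant" or not.
```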

Bayesian analysis is an alternative to NHST that updates the probability of a hypothesis as you collect more information. I’m not well versed in it myself, so I won’t attempt a full definition here. I’ve been curious about how Bayesian analysis can be applied to evaluation, and more and more examples of it being used are popping up every day (here is one such example that I have recently read).
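
To make “updating the probability of a hypothesis as you collect more information” concrete, here is a tiny beta-binomial sketch in Python. The counts are made up for illustration; imagine tracking how many program participants reach some outcome.

```python
# A tiny Bayesian updating sketch (beta-binomial model) with made-up counts.
from scipy.stats import beta

# Start with a flat Beta(1, 1) prior on the success rate.
prior_a, prior_b = 1, 1

# Hypothetical first batch of data: 12 of 20 participants reached the outcome.
successes, failures = 12, 8
post_a, post_b = prior_a + successes, prior_b + failures

# Hypothetical second batch: 18 of 30 reached the outcome. Just keep updating.
post_a, post_b = post_a + 18, post_b + 12

posterior_mean = post_a / (post_a + post_b)
ci_low, ci_high = beta.ppf([0.025, 0.975], post_a, post_b)
print(f"Posterior mean: {posterior_mean:.2f}, "
      f"95% credible interval: ({ci_low:.2f}, {ci_high:.2f})")
```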

I’ve been wanting to learn about Bayesian methods and apply them to a current project but just haven’t had the time to delve into a textbook on my own. I was quite happy to see that Coursera will be offering a course starting Aug 29th and wanted to share it with others who may be interested. You can use either R or Excel for the coursework.

In the meantime, please share any examples of Bayesian analysis in evaluations below! I would love to check them out!

Data viz tools

A while ago I posted about the data viz catalogue. It’s a neat resource that helps you choose a visualization that best tells the story of your data. The creator has recently posted a roundup of the 20 best tools for data visualization. It includes tools that require no coding as well as tools for developers. There were definitely a couple that were new to me and I look forward to checking them out.

On my 2016 to-do list: learn enough coding that I can play around with the dev tools.

Recap (and downloads) from the Recreation Connections Manitoba conference

I had a lot of fun yesterday presenting at the Recreation Connections Manitoba conference. I presented a two hour workshop designed to give a “crash course” in developing a program theory and measuring program impact. If you attended the workshop and are looking for the handout, you can download it here.

It was so interesting to hear the wide variety of programs that the attendees were working on…everything from composting to an after-school program with children. I also really enjoyed talking about the different challenges that were faced when it came to measurement, such as response rates, adapting measurement tools for children, juggling limited resources, and survey bias. Rest assured that these are issues that most (all?) evaluators face! We talked about some ideas in the workshop, but I want to expand on these in future blog posts.

Thanks for the great time, Winnipeg!

Developing valid self-report measures

Self-report measures (i.e., measures where respondents read the questions and select responses themselves) are pretty common in evaluation. They are relatively cheap and easy to administer to a large group of people. It’s a lot easier to email a survey link than it is to hire and train a team of research assistants to follow and observe your participants and record their observations.

Some purists are quick to dismiss self-reported data. Studies have shown that people are not very honest when it comes to self-reporting their college grades, height and weight, or seat belt usage, among other things. Some problems with self-report data include:

Social desirability bias: Self-report measures rely on the honesty of your participants. “Social desirability bias” is a fancy way of saying that, generally speaking, people want to present themselves in the best light possible. If your survey is asking about a sensitive topic, such as exercise frequency, eating habits, or alcohol consumption, participants might not be truthful in their responses. One way to combat this is to make questionnaires anonymous.

Understanding and interpretation: Self-report measures also rely on participants understanding your questions and the available response options. If your survey item is being misunderstood, your resulting data isn’t going to tell you much.

Memory: Even if participants are being honest and they perfectly understand your survey questions, the quality of your data also depends on participants accurately remembering pertinent details. Human memory is a lot worse than people generally realize.

Response bias: Several other factors can influence how a participant responds to a question. If you are in a good mood, you may be more likely to answer the question positively. The reverse is true as well – a bad mood can predispose you to answer a question negatively. Even your personality can influence how you answer a question!

Yikes, those are some serious problems. So what does this mean for evaluators? Given their many advantages, self-report measures are not going anywhere anytime soon. Thankfully there are some steps we can take to increase the validity of our surveys:

1. Pilot test your measures: Before you “go live” with your survey, you should pilot test your questionnaire with a small number of people (in a perfect world, this small group would be similar to your actual participants – so if your survey is designed for youth, pilot test it with youth, and so on). As part of your pilot test, you should conduct interviews to ensure your items and response options are being interpreted correctly.

2. Make your survey anonymous: Anonymity can encourage participants to be honest. It can also help if the evaluator leaves the room and participants are given privacy while completing the survey. Of course, sometimes we need a way to track surveys, as is the case in a traditional “pre-post” design (you will be matching a participant’s survey from before the program with a survey completed after the program). In this case, a random ID number can be used (the sketch after this list shows one simple way to generate these), although this adds a layer of complexity to your data management.

3. Counterbalance your measures: Counterbalancing means randomizing the order in which survey questions appear. It could mean the order of every single section is randomized (in which case you would have a lot of different versions of the survey), or it could be as simple as splitting the survey in half and reversing the order, with participants randomly receiving either the first or the second version (see the sketch below). You might use the two-version method with a paper survey, but many of the main online survey providers offer ways to randomize question order, making it easy to have many different versions.
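
If you’re working with paper surveys (or a tool without built-in randomization), both of these steps are easy to script yourself. Here is a minimal Python sketch that assigns a random ID for pre-post matching and randomly assigns each participant one of two counterbalanced versions; the participant roster and section names are hypothetical placeholders.

```python
# Minimal sketch: random IDs for pre-post matching plus simple two-version
# counterbalancing. Roster and section names are hypothetical placeholders.
import random
import secrets

participants = ["participant_" + str(i) for i in range(1, 21)]  # stand-in roster

first_half = ["Section A: program satisfaction", "Section B: skills"]
second_half = ["Section C: confidence", "Section D: demographics"]
versions = {
    "Version 1": first_half + second_half,
    "Version 2": second_half + first_half,   # same questions, reversed order
}

assignments = {}
for person in participants:
    random_id = secrets.token_hex(4)          # printed on both the pre and post survey
    version = random.choice(list(versions))   # randomly pick which order they receive
    assignments[person] = {"id": random_id, "version": version}

# Keep the roster-to-ID mapping stored separately and securely so the completed
# surveys themselves stay anonymous to whoever analyzes the data.
print(assignments["participant_1"])
```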

How about you – do you ever worry about the validity of self-reported data? If so, what are some techniques you use to increase the quality of your measures?

As a side note, I wonder if self-report measures will become less common in the future, particularly in the realm of health and exercise. ‘Wearable tech’ devices are becoming quite common (check out how these devices are used by Disney) and keep coming down in price. These devices can capture a tremendous amount of data, and it will be fascinating to see exactly how they can be used in evaluation. If anyone has an example of an evaluation that used wearable tech data I’d love to see it!