Self-reported measures (i.e., respondents read the question and select a response by themselves) are pretty common in evaluation. They are relatively cheap and easy to administer to a large group of people. It’s a lot easier to email a survey link than it is to hire and train a team of research assistants to follow your participants around and record what they observe.
Some purists are quick to dismiss self-reported data. Studies have shown that people are not very honest when it comes to self-reporting their college grades, height and weight, or seat belt usage, among other things. Some problems with self-report data include:
Social desirability bias: Self-report measures rely on the honesty of your participants. “Social desirability bias” is a fancy way of saying, generally speaking, people want to present themselves in the best light possible. If your survey is asking about a sensitive topic, such as exercise frequency, eating habits, or alcohol consumption, participants might not be truthful in their responses. One way to combat this is to make questionnaires anonymous.
Understanding and interpretation: Self-report measures also rely on participants understanding your questions and the available response options. If your survey item is being misunderstood, your resulting data isn’t going to tell you much.
Memory: Even if participants are being honest and perfectly understand your survey questions, the quality of your data still depends on them accurately remembering the pertinent details. Human memory is a lot worse than people generally realize.
Response bias: Several other factors can influence how a participant responds to a question. If you are in a good mood, you may be more likely to answer the question positively. The reverse is true as well – a bad mood can predispose you to answer a question negatively. Even your personality can influence how you answer a question!
Yikes, those are some serious problems. So what does this mean for evaluators? Given their many advantages, self-report measures are not going anywhere anytime soon. Thankfully there are some steps we can take to increase the validity of our surveys:
1. Pilot test your measures: Before you “go live” with your survey, you should pilot test your questionnaire with a small number of people. In a perfect world, this small group would be similar to your actual participants, so if your survey is designed for youth, you should be pilot testing it with youth, and so on. As part of your pilot test, you should interview the people who completed it to make sure your items and response options are being interpreted the way you intended.
2. Make your survey anonymous: Anonymity can encourage participants to be honest. It can also help if the evaluator leaves the room so participants have privacy while completing the survey. Of course, sometimes we need a way to track surveys, as is the case with a traditional “pre-post” design (where you match a participant’s survey from before the program with the one completed after the program). In this case, a random ID number can be used, although this adds a layer of complexity to your data management.
3. Counterbalance your measures: Counterbalancing means varying the order in which survey questions appear from one participant to the next. It could mean the order of every single section is randomized (in which case you would have a lot of different versions of the survey), or it could be as simple as splitting the survey in half and reversing the order of the halves, with some participants randomly receiving the first version and the others receiving the second. You might use the two-version method with a paper survey, but if you are using an online survey, many of the main online survey providers offer ways to randomize question order, which makes it easy to have many different versions.
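For anyone managing a paper survey by hand, here is a minimal sketch of steps 2 and 3 in Python: it generates random ID numbers for matching pre and post surveys and randomly assigns each ID to one of two survey versions. Everything in it is an assumption for illustration only, including the function names (make_random_ids, assign_versions, build_version), the six-digit ID length, the seed, and the placeholder question labels. Treat it as one possible approach rather than a recommended tool.

```python
import random


def make_random_ids(n, rng, digits=6):
    """Return n unique random ID numbers, each with the given number of digits."""
    low, high = 10 ** (digits - 1), 10 ** digits - 1
    return rng.sample(range(low, high + 1), n)


def assign_versions(ids, rng):
    """Randomly split the IDs into two groups (as equal as possible): A and B."""
    shuffled = list(ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {pid: ("A" if i < half else "B") for i, pid in enumerate(shuffled)}


def build_version(first_half, second_half, version):
    """Version A keeps the original order; version B swaps the two halves."""
    if version == "A":
        return first_half + second_half
    return second_half + first_half


# Hypothetical example: 10 participants and a 4-question survey.
rng = random.Random(2024)  # fixed seed so the assignment can be reproduced
ids = make_random_ids(10, rng)
assignment = assign_versions(ids, rng)

first_half = ["Q1", "Q2"]    # placeholder question labels
second_half = ["Q3", "Q4"]

for pid in ids:
    version = assignment[pid]
    print(pid, version, build_version(first_half, second_half, version))
```

Keep the list linking IDs to versions (but not to names) with your data, so you can match each pre survey to its post survey and check later whether question order made a difference.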
How about you? Do you ever worry about the validity of self-reported data? If so, what are some techniques you use to increase the quality of your measures?
As a side note, I wonder if self-report measures will be less common in the future, particularly in the realm of health and exercise. ‘Wearable tech’ devices are becoming quite common (check out how these devices are used by Disney) and keep coming down in price. The devices can capture a tremendous amount of data, and it will be fascinating to see exactly how they can be used in evaluation. If anyone has an example of an evaluation that used wearable tech data, I’d love to see it!