The decision of how to select cases to observe may present a long list of options, but deciding what specific types of data to collect presents us with infinite options. It seems to me, though, that the kinds of data collection we do in empirical social research all fall into one of three broad categories: asking questions, making direct observations, and collecting secondary data.
Volumes have been written about the craft of asking people questions for research purposes, but we can sum up the main points briefly. Researchers ask people questions face-to-face, by telephone, using self-administered written questionnaires, and in web-based surveys. Each of these modes of administration has its advantages and disadvantages. It’s tempting to think that face-to-face interviewing is always the best option, and often, it is a good option. Talking to respondents face-to-face makes it hard for them to stop midway through the interview, gives them the chance to ask questions if something needs clarifying, and lets you read their body language and facial expressions so you can help if they look confused. A face-to-face interview gives you a chance to build rapport with respondents, so they’re more likely to give good, thorough answers because they want to help you out. That’s a double-edged sword, though: Having you staring a respondent in the face might tempt him to give answers that he thinks you want to hear or that make him seem like a nice, smart, witty guy—the problem of social desirability bias.
Combating bias is one of the most important tasks when designing a research project. Bias is any systematic distortion of findings due to the way that the research is conducted, and it takes many forms. Imagine interviewing strangers about their opinions of a particular political candidate. How might their answers be different if the candidate is African-American and the interviewer is white? What if the respondent is interviewed at her huge fancy house and the interviewer is wearing tattered shoes? The human tendencies to want to be liked, to just get along, and to avoid embarrassment are very strong, and they can certainly affect how people answer questions asked by strangers. To the extent that respondents are affected similarly from interview to interview, the way the research is being conducted has introduced bias.
So, then, asking questions face-to-face may be a good option sometimes, but it may be the inferior option if social desirability bias is a potential problem. In those situations, maybe having respondents answer questions using a self-administered written questionnaire would be better. Completing a questionnaire in private goes a long way toward avoiding social desirability bias, but it introduces other problems. Mail is easier to ignore than someone knocking at your door or making an appointment to meet with you in your office. You have to count more on the respondent’s own motivation to complete the questionnaire, and if motivated respondents’ answers are systematically different from the answers unmotivated nonrespondents would have given, your research plan has introduced self-selection bias. You’re not there to answer questions the respondent may have, which pretty much rules out complicated questionnaire design (such as questionnaires with a lot of skip patterns, the “If ‘Yes,’ go to Question 38; if ‘No,’ go to Question 40” kind of stuff). On the plus side, it’s much easier and cheaper to mail questionnaires to every state’s director of human services than to visit them all in person.
You can think through how these various pluses and minuses would play out with surveys administered by telephone. If you’re trying to talk to a representative sample of the population, though, telephone surveys have another problem. Think about everyone you know under the age of 30. How many of them have telephones—actual land lines? How many of their parents have land lines? Most telephone polling is limited to calling land lines, so you can imagine how that could introduce sampling bias—bias introduced when some members of the population are more likely to be included in a study than others. When cell phones are included, you can imagine that there are systematic differences between people who are likely to answer the call and those who are likely to ignore the unfamiliar Caller ID—another source of sampling bias. If you are a counseling center administrator calling all of your clients, this may not be a problem; if you are calling a randomly selected sample of the general population, the bias could be severe.
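To see how unequal chances of inclusion distort an estimate, it can help to work through the arithmetic. The sketch below is only an illustration: the group sizes, support rates, and land line coverage rates are all invented, and the function name `expected_support` is hypothetical. It compares the share of a population holding an opinion with the share we would expect to find if we could reach only people with land lines.

```python
def expected_support(groups, reach_key=None):
    """Expected share holding an opinion.

    With reach_key=None every person counts equally (the true population value).
    With a reach_key, each group is weighted by its probability of being reached.
    """
    total = 0.0
    supporters = 0.0
    for group in groups:
        weight = group["size"] * (group[reach_key] if reach_key else 1.0)
        total += weight
        supporters += weight * group["support"]
    return supporters / total


# All figures below are made up for illustration, not drawn from any survey.
GROUPS = [
    {"name": "under 30", "size": 300, "support": 0.70, "landline": 0.20},
    {"name": "30 and over", "size": 700, "support": 0.40, "landline": 0.80},
]

population_value = expected_support(GROUPS)
landline_value = expected_support(GROUPS, reach_key="landline")
print(round(population_value, 3), round(landline_value, 3))
```

Under these assumed numbers, the land-line-only figure comes out lower than the population figure because the group that is harder to reach is also the group more likely to hold the opinion. That is the pattern that makes sampling bias systematic rather than random.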
Web-based surveys have become a very appealing option for researchers. They are incredibly cheap, allow complex skip patterns to be carried out unbeknownst to respondents, face no geographic boundaries, and automate many otherwise tedious and error-prone data entry tasks. For some populations, this is a great option. I once conducted a survey of other professors, a population with nearly universal internet access. For other populations, though—low-income persons, homeless persons, disabled persons, the elderly, and young children—web-based surveys are often unrealistic.
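Here is a rough sketch of how a web-based survey might carry out a skip pattern behind the scenes. The question numbers echo the "go to Question 38 / go to Question 40" example from earlier, but the question texts, the `QUESTIONS` table, and the `route` function are all hypothetical; real survey platforms have their own ways of configuring this.

```python
# Hypothetical questionnaire fragment. "next" is either the number of the next
# question, None at the end, or a dict that routes by the answer given.
QUESTIONS = {
    37: {"text": "Do you use public transportation?", "next": {"Yes": 38, "No": 40}},
    38: {"text": "(follow-up asked only of riders)", "next": 39},
    39: {"text": "(second follow-up asked only of riders)", "next": 40},
    40: {"text": "(question asked of everyone)", "next": None},
}


def route(questions, start, answers):
    """Return the ordered list of question numbers one respondent would see."""
    path = []
    current = start
    while current is not None:
        path.append(current)
        nxt = questions[current]["next"]
        if isinstance(nxt, dict):
            # Branch on this respondent's answer to the current question.
            nxt = nxt[answers[current]]
        current = nxt
    return path


rider_path = route(QUESTIONS, 37, {37: "Yes"})
non_rider_path = route(QUESTIONS, 37, {37: "No"})
```

The respondent who answers "No" never sees Questions 38 and 39 and never has to follow a "go to" instruction, which is why complicated routing is so much less risky online than on paper.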
Deciding what medium to use when asking questions is probably easier than deciding what wording to use. Crafting useful questions and combining them into a useful data collection instrument take time and attention to details easily overlooked by novice researchers. Sadly, plentiful examples of truly horribly designed surveys are easy to come by. Well-crafted questions elicit unbiased responses that are useful for answering research questions; poorly crafted questions do not.
So, what can we do to make sure we’re asking useful questions? There are many good textbooks and manuals devoted to just this topic, and you should definitely consult one if you’re going to tackle this kind of research project yourself. Tips for designing good data collection instruments for asking questions, whether questionnaires, web-based surveys, interview schedules, or focus group protocols, boil down to a few basics.
Perhaps most important is paying careful attention to the wording of the questions themselves. Let’s assume that respondents want to give us accurate, honest answers. For them to do this, we need to word questions so that respondents will interpret them in the way we want them to, so we have to avoid ambiguous language. (What does “often” mean? What is “sometimes”?) If we’re providing the answer choices for them, we also have to provide a way for respondents to answer accurately and honestly. I bet you’ve taken a survey and gotten frustrated that you couldn’t answer the way you wanted to.
I was once asked to take a survey about teaching online. One of the questions went something like this:
Do you think that teaching online is as good as teaching face-to-face?
❑ Yes
❑ No
❑ I think they’re about the same
I’ve taught online a lot, I’ve read a lot about online pedagogy, I’ve participated in training about teaching online, and this was a frustrating question for me. Why? Well, if I answer “no,” my guess is that the researchers would infer that I think online teaching is inferior to face-to-face teaching. What if I, or one of my fellow respondents, am an online teaching zealot? By “no,” I may mean that I think online teaching is superior to face-to-face! There’s a huge potential for disconnect between the meaning the respondent attaches to this answer and the meaning the researcher attaches to it. That’s my main problem with this question, but it’s not the only one. What is meant, exactly, by “as good as”? As good as in terms of what? In terms of student learning? For transmitting knowledge? My own convenience? My students’ convenience? A respondent could attach any of these meanings to that phrase, regardless of what the researcher has in mind. Even if I ignore this, I don’t have the option of giving the answer I want to give, the one that most accurately represents my opinion: “it depends.” What conclusions could the researcher draw from responses to this question? Not many, but uncritical researchers would probably report the results as filtered through their own preconceptions about the meanings of the question and answer wording. That introduces a pernicious sort of bias, one that is difficult to detect (particularly if you’re just casually reading a report based on this study) and that can distort the findings so much as to actually convey the opposite of what respondents intended. (I was so frustrated by this question and fearful of the misguided decisions that could be based on it that I contacted the researcher, who agreed and graciously issued a revised survey.) Question wording must facilitate unambiguous, fully accurate communication between the researcher and respondent.
Just as with mode of administration, question wording can also introduce social desirability bias. Leading questions are the most obvious culprit. A question like “Don’t you think public school teachers are underpaid?” makes you almost fall over yourself to say “Yes!” A less leading question would be “Do you think public school teachers are paid too much, paid too little, or paid about the right amount?” To the ear of someone who doesn’t want to give a bad impression by saying the “wrong” answer, all of the answers sound acceptable. If we’re particularly worried about potential social desirability bias, we can use normalizing statements: “Some people like to follow politics closely and others aren’t as interested in politics. How closely do you like to follow politics?” would probably get fewer trying-to-sound-like-a-good-citizen responses than “Do you stay well informed about politics?”
Closed-ended questions—questions that give answers for respondents to select from—are susceptible to another form of bias, response set bias. When respondents look at a range of choices, there’s subconscious pressure to select the “normal” response. Imagine if I were to survey my students, asking them:
How many hours per week do you study?
❑ Less than 10
❑ 10 – 20
❑ More than 20
That middle category just looks like it’s the “normal” answer, doesn’t it? The respondent’s subconscious whispers “Lazy students must study less than 10 hours per week; more than 20 must be excessive.” This pressure is hard to avoid completely, but we can minimize the bias by anticipating this problem and constructing response sets that represent a reasonable distribution.
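One possible reading of "a reasonable distribution" is to base the category boundaries on what respondents actually report, for instance in an open-ended pretest, so that no single category is artificially the "normal" one. The sketch below takes that approach under stated assumptions: the pilot data are invented, and splitting into three similar-sized groups is just one choice among many.

```python
from statistics import quantiles

# Hypothetical pilot data: weekly study hours reported in an open-ended pretest.
pilot_hours = [2, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10, 12, 14, 15, 20]


def tercile_cut_points(values):
    """Return two cut points that split the values into three similar-sized groups."""
    return quantiles(values, n=3)


def bin_counts(values, cuts):
    """Count how many values fall at or below, between, and above the cut points."""
    low, high = cuts
    counts = [0, 0, 0]
    for value in values:
        if value <= low:
            counts[0] += 1
        elif value <= high:
            counts[1] += 1
        else:
            counts[2] += 1
    return counts


cuts = tercile_cut_points(pilot_hours)
counts = bin_counts(pilot_hours, cuts)
print(cuts, counts)
```

In practice you would round the cut points to values that read naturally on a questionnaire, and you would still want to check that the resulting categories make sense for the population you're surveying.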
Response sets must be exhaustive—be sure you offer the full range of possible answers—and the responses must be mutually exclusive. How not to write a response set:
How often do you use public transportation?
❑ Every day
❑ Several times per week
❑ 5 – 6 times per week
❑ More than 10 times per week
(Yes, I’ve seen stuff this bad.)
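If the answer choices can be read as numeric ranges, checking for these two problems can be automated. The sketch below is a minimal illustration, not a standard tool: the function name `audit_response_set` is hypothetical, the numeric readings of vague labels such as "Several times per week" are my own assumptions, and I've assumed nobody rides more than 21 times a week.

```python
from itertools import combinations


def audit_response_set(options, domain):
    """Return (gaps, overlaps) for a closed-ended response set.

    options: dict mapping each answer label to an inclusive (low, high) range.
    domain:  every value a respondent could truthfully report.
    """
    values = list(domain)
    # Exhaustive: every possible value is covered by at least one option.
    gaps = [v for v in values
            if not any(low <= v <= high for low, high in options.values())]
    # Mutually exclusive: no value is covered by two different options.
    overlaps = [
        (label_a, label_b)
        for (label_a, (a_low, a_high)), (label_b, (b_low, b_high))
        in combinations(options.items(), 2)
        if any(a_low <= v <= a_high and b_low <= v <= b_high for v in values)
    ]
    return gaps, overlaps


# Numeric interpretations of the labels are assumptions made for illustration.
flawed = {
    "Every day": (7, 7),
    "Several times per week": (2, 6),
    "5 - 6 times per week": (5, 6),
    "More than 10 times per week": (11, 21),
}
gaps, overlaps = audit_response_set(flawed, range(0, 22))

# One possible repair, again with hypothetical category boundaries.
repaired = {
    "Never": (0, 0),
    "1 - 4 times per week": (1, 4),
    "5 - 9 times per week": (5, 9),
    "10 or more times per week": (10, 21),
}
repaired_gaps, repaired_overlaps = audit_response_set(repaired, range(0, 22))
```

Under those assumptions, the flawed set leaves someone who never rides, or who rides eight times a week, with no answer to pick, and it gives someone who rides five times a week two answers to pick from. The repaired set produces no gaps and no overlaps.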
Of course, you could avoid problems with response sets by asking open-ended questions. They’re no panacea, though; closed- and open-ended questions each have their advantages and disadvantages. Open-ended questions give respondents freedom to answer how they choose, they remove any potential for response set bias, and they allow for rich, in-depth responses if a respondent is motivated enough. However, respondents can be shockingly ambiguous themselves, they can give responses that obviously indicate the question was misunderstood, or they can just plain answer with total nonsense. The researcher is then left with a quandary: what to do with these responses? Throw them out? Is that honest? Try to make sense of them? Is that honest? Closed-ended questions do have their problems, but the answers are unambiguous, and the data they generate are easy to manage. It’s a tradeoff: With closed-ended questions, the researcher is structuring the data, which keeps things nice and tidy; with open-ended questions, the researcher is giving power to respondents to structure the data, which can be awfully messy, but it can also yield rich, unanticipated results.
Choosing open-ended and closed-ended questions to different degrees gives us a continuum of approaches to asking individuals questions, from loosely structured, conversational-style interviews, to highly standardized interviews, to fill-in-the-bubble questionnaires. When we conduct interviews, it is usually in a semi-structured interview style, with the same mostly open-ended questions asked, but with variations in wording, order, and follow-ups to make the most of the organic nature of human interaction.
When we interview a small group of people at once, it’s called a focus group. Focus groups are not undertaken for the sake of efficiency—it’s not just a way to get a lot of interviews done at once. Why do we conduct focus groups, then? When you go see a movie with a group of friends, you leave the theater with a general opinion of the movie—you liked it, you hated it, you thought it was funny, you thought it meant .... When you go out for dessert afterward and start talking with your friends about the movie, though, you find that your opinion is refined as it emerges in the course of that conversation. It’s not that your opinion didn’t exist before or, necessarily, that the discussion changed your opinion. Rather, it’s in the course of social interaction that we uncover and use words to express our opinions, attitudes, and values that would have otherwise lain dormant. It’s this kind of emergent opinion that we use focus groups to learn about. We gather a group of people who have something in common—a common workplace, single parenthood, Medicaid eligibility—and engage them in a guided conversation so that the researcher and participants alike can learn about their opinions, values, and attitudes.
Asking questions is central to much empirical social research, but we also collect data by directly observing the phenomena we’re studying, an approach called field research or simply (and more precisely, I think) direct observation. We can learn about political rallies by attending them, about public health departments by sitting in them, about public transportation by riding it, and about judicial confirmation hearings by watching them. In the conduct of empirical social research, such attending, sitting, riding, and watching aren’t passive or unstructured. To prepare for our direct observations, we construct a direct observation tool (or protocol), which acts like a questionnaire that we “ask” of what we’re observing. Classroom observation tools, for example, might prompt the researcher to record the number of students, learning materials available in the classroom, student-teacher interactions, and so on.
The advice for developing useful observation tools isn’t unlike the advice for developing useful instruments for asking questions; the tool must enable an accurate, thorough, unbiased description of what’s observed. Likewise, a potential pitfall of direct observation is not unlike social desirability bias: When people are being observed, their knowledge of being observed may affect their behavior in ways that bias the observations. This is the problem of participant reactivity. Surely the teacher subjected to the principal’s surprise visit is a bit more on his game than he would have been otherwise. The problem isn’t insurmountable. Reactivity usually tapers off after a while, so we can counter this problem by giving people being observed enough time to get used to it. We can just try to be unobtrusive, we can make observations as participants ourselves (participant observation), or, sometimes, we can keep the purpose of the study a mystery so that subjects wouldn’t know how to play to our expectations even if they wanted to.
Finally, we can let other people do our data collection for us. If we’re using data that were collected by someone else, our data collection strategy is using secondary data. Social science researchers are fortunate to have access to multiple online data warehouses that store datasets related to an incredibly broad range of social phenomena. In political science, for example, we can download and analyze general public opinion datasets, results of surveys about specific public policy issues, voting data from federal and state legislative bodies, social indicators for every country, and on and on. Popular data warehouses include the Inter-University Consortium for Political and Social Research (ICPSR), the University of Michigan’s American National Election Studies, the Roper Center for Public Opinion Research, the United Nations Common Database, the World Bank’s World Development Indicators, and the U.S. Bureau of the Census. Such secondary data sources present research opportunities that would otherwise outstrip the resources of many researchers, including students.
A particular kind of secondary data, administrative data, are commonly used across the social sciences, but are of special interest to those of us who do research related to public policy, public administration, and other kinds of organizational behavior. Administrative data are the data collected in the course of administering just about every agency, policy, and program. For public agencies, policies, and programs, they’re legally accessible thanks to freedom of information statutes, and they’re frequently available online. Since the 1990s, these datasets have become increasingly sophisticated due to escalating requirements for performance measurement and program evaluation. Still, beware: Administrative datasets are notoriously messy—these data usually weren’t collected with researchers in mind, so the datasets require a lot of cleaning, organizing, and careful scrutiny before they can be analyzed.