1.6.3: Formal research designs
Simply collecting data is insufficient to answer research questions; we must have a plan, a research design, to enable us to draw conclusions from our observations. Different methodologists divvy up the world of research designs in different ways; we’ll use five categories: cross-sectional, longitudinal, experimental, quasi-experimental, and case study.
Cross-sectional research design is the simplest. Researchers following this design are making observations at a single point in time; they’re taking a “snapshot” of whatever they’re observing. Now, we can’t take this too literally. A cross-sectional survey may take place over the course of several weeks. The researcher won’t, however, care to distinguish between responses collected on day 1 versus day 2 versus day 28; it’s all treated as having been collected in one wave of data collection. Cross-sectional research design is well suited to descriptive research, and it’s commonly used to make cross-case comparisons, like comparing the responses of men to the responses of women or the responses of Republicans to the responses of Democrats. If we’re interested in establishing causality with this research design, though, we have to be more careful, because we must be sure that cause comes before effect. Sometimes it’s not a problem. If you’re interested in determining whether respondents’ region of birth influences their parenting styles, you can be sure that the respondents were born wherever they were born before they developed any parenting style, so it’s OK that you’re asking them questions about all that at once. However, if you’re interested in determining whether interest in politics influences college students’ choice of major, a cross-sectional design might leave you with a chicken-and-egg problem: Which came first, a respondent’s enthusiasm for following politics or her first political science course? Exploring causal research questions using cross-sectional design isn’t verboten, then, but we do have to be cautious.
Longitudinal research design involves data collection over time, permitting us to measure change over time. If a different set of cases is observed every time, it’s a time series research design; if the same cases are followed over time, with changes tracked at the case level, it’s a panel design.
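The practical difference between the two longitudinal designs shows up in what the data let us compute. The following is a minimal Python sketch using made-up data; the respondent IDs, scores, and variable names are all hypothetical and do not come from any study described here. With panel data, each case can be matched to itself across waves, so change is measured case by case. With a different set of cases in each wave, only group-level summaries can be compared.

```python
from statistics import mean

# Hypothetical panel data: the SAME respondents (keyed by ID) answer in both waves.
panel_wave1 = {"r1": 3, "r2": 5, "r3": 4}
panel_wave2 = {"r1": 4, "r2": 5, "r3": 6}

# Case-level change is available because every ID appears in both waves.
panel_changes = {rid: panel_wave2[rid] - panel_wave1[rid] for rid in panel_wave1}

# Hypothetical data with DIFFERENT respondents in each wave
# (what this section calls a time series design). There are no IDs to match.
series_wave1 = [3, 5, 4]
series_wave2 = [4, 5, 6]

# Only the change in the group average can be computed.
aggregate_change = mean(series_wave2) - mean(series_wave1)

print("Change per respondent (panel):", panel_changes)
print("Change in the wave average (time series):", aggregate_change)
```

Both designs show the same average change in this toy example, but only the panel data reveal that one respondent did not change at all while another changed a lot.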
Experimental research design is considered by most to be the gold standard for establishing causality. (This is actually a somewhat controversial statement. We’ll ignore the controversy here except to say that most who would take exception to this claim are really critical of the misapplication of this design, not the design itself. If you want to delve into the controversy, do an internet search for federally required randomized controlled trial program evaluation designs.) Let’s imagine an experimental-design study of whether listening to conservative talk radio affects college students’ intention to vote in an upcoming election. I could recruit a bunch of students (with whichever sampling plan I choose) and then have them all sit in a classroom listening to MP3 players through earbuds. I would have randomly given half of them MP3 players with four hours of conservative talk radio excerpts and given the other half MP3 players with four hours of muzak. Before they start listening, I’ll have them respond to a questionnaire item about their likelihood of voting in the upcoming election. After the four hours of listening, I’ll ask them about their likelihood of voting again. I’ll compare those results, and if the talk radio group is now saying they’re more likely to vote while the muzak group’s intentions stayed the same, I’ll be very confident in attributing that difference to the talk radio.
My talk radio experiment demonstrates the three essential features of experimental design: random assignment to experimental and control groups, control of the experimental setting, and manipulation of the independent variable. Control refers to the features of the research design that rule out competing explanations for the effects we observe. The most important way we achieve control is by the use of a control group. The students were randomly assigned to a control group and an experimental group. The experimental group gets the “treatment” (in this case, the talk radio), and the control group gets the status quo (in this case, listening to innocuous muzak). Everything else about the experimental conditions, like the time of day and the room they were sitting in, was controlled as well, meaning that the only difference in the conditions surrounding the experimental and control groups was what they listened to. This experimental control let me attribute the effects I observed (increases in the experimental group’s intention to vote) to the cause I introduced (the talk radio).
The third essential feature of experimental design, manipulation of the independent variable, simply means the researcher determines which cases get which values of the independent variable. This is simple with MP3 players, but, as we’ll see, it can be impossible with the kinds of phenomena many social researchers are interested in.
Experimental methods are such strong designs for exploring questions of cause and effect because they enable researchers to achieve the three criteria for making causal claims—the standards we use to assess the validity of causal claims: time order, association, and nonspuriousness. Time order is the easy one (unless you’re on the starship Enterprise). We can usually establish that cause preceded effect without a problem. Association is also fairly easy. If we’re working with quantitative data (as is usually the case in experimental research designs), we have a whole arsenal of statistical tools for demonstrating whether and in what way two variables are related to each other. If we’re working with qualitative data, good qualitative data analysis techniques can convincingly establish association, too.
Meeting the third criterion for making causal claims, nonspuriousness, is trickier. A spurious relationship is a phony relationship. It looks like a cause-and-effect relationship, but it isn’t. Nonspuriousness, then, requires that we establish that a cause-and-effect relationship is the real thing: that the effect is, indeed, due to the cause and not something else. Imagine conducting a survey of college freshmen. Based on our survey, we claim that being from a farther-away hometown makes students more likely to prefer early morning classes. Do we meet the first criterion? Yes, the freshmen were from close by or far away before they ever registered for classes. Do we meet the second criterion? Well, it’s a hypothetical survey, so we’ll say yes, in spades: Distance from home to campus and average class start time are strongly and inversely correlated.
What about nonspuriousness, though? To establish nonspuriousness, we need to think of any competing explanations for this alleged cause-and-effect relationship and rule them out. After running our ideas past the admissions office folks, we learn that incoming students from close by usually attend earlier orientation sessions, those from far away usually attend later orientation sessions, and (uh-oh) they register for classes during orientation. We now have a potential competing explanation: Maybe freshmen who registered for classes later are more likely to end up in early morning classes because classes that start later are already full. The students’ registration date, then, becomes a potentially important control variable. It’s potentially important because it’s quite plausibly related to both the independent variable (distance from home to campus) and the dependent variable (average class start time). If the control variable, in fact, is related to both the independent variable and dependent variable, then that alone could explain why the independent and dependent variables appear to be related to each other when they’re actually not. When we do the additional analysis of our data, we confirm that freshmen from farther away did, indeed, tend to register later than freshmen from close by, that students who register later tend to end up in classes with earlier start times, and that, when we control for registration date, there’s no actual relationship between distance from home and average class start time. Our initial causal claim does not achieve the standard of nonspuriousness.
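One simple way to “control for” a variable is to split the cases by that variable and look at the original relationship within each subgroup. The following is a minimal Python sketch of that idea for the hometown example. The eight student records are invented purely for illustration, and the function and variable names are hypothetical; real analyses would typically use many more cases and a statistical model.

```python
from statistics import mean

# Invented records: (distance from home, registration timing, average class start hour)
students = [
    ("near", "early", 10), ("near", "early", 11), ("near", "early", 12),
    ("near", "late", 9),
    ("far", "early", 11),
    ("far", "late", 8), ("far", "late", 9), ("far", "late", 10),
]

def mean_start(distance, registration=None):
    """Average start hour for one distance group, optionally within one registration group."""
    hours = [hour for dist, reg, hour in students
             if dist == distance and (registration is None or reg == registration)]
    return mean(hours)

# Ignoring registration timing: nearby students appear to start later than faraway students.
overall_gap = mean_start("near") - mean_start("far")

# Controlling for registration timing: compare near and far only among students
# who registered at the same time.
gap_by_registration = {
    reg: mean_start("near", reg) - mean_start("far", reg)
    for reg in ("early", "late")
}

print("Overall gap in start hour (near minus far):", overall_gap)
print("Gap within each registration group:", gap_by_registration)
```

In this invented dataset, nearby students start an hour later on average overall, yet the gap is zero among early registrants and zero among late registrants. The apparent effect of distance is entirely accounted for by registration timing, which is the pattern a spurious relationship produces.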
The beauty of experimental design—and this is the crux of why it’s the gold standard for causal research—is in its ability to establish nonspuriousness. When conducting an experiment, we don’t even have to think of potential control variables that might serve as competing explanations for the causal relationship we’re studying. By randomly assigning (enough) cases to experimental and control groups and then maintaining control of the experimental setting, we can assume that the two groups and their experience in the course of the study are alike in every important way except one—the value of the independent variable. Random assignment takes care of potential competing explanations we can think of and competing explanations that never even occur to us. In a tightly controlled experiment, any difference observed in the dependent variable at the conclusion of the experiment can confidently be attributed to the independent variable alone.
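The claim that random assignment handles even the competing explanations we never think of can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 10,000 participants, the “hidden trait” scored between 0 and 1, the seed, and all variable names are hypothetical. The point is that the researcher never looks at the trait when assigning groups, yet the groups end up nearly identical on it.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Each simulated participant has a trait the researcher never measured.
hidden_trait = [rng.random() for _ in range(10_000)]

# Random assignment: shuffle participant indices, then split them in half.
indices = list(range(len(hidden_trait)))
rng.shuffle(indices)
half = len(indices) // 2
experimental = [hidden_trait[i] for i in indices[:half]]
control = [hidden_trait[i] for i in indices[half:]]

# The assignment ignored the trait, but the group averages should be very close.
trait_gap = abs(mean(experimental) - mean(control))
print(f"Experimental mean: {mean(experimental):.3f}")
print(f"Control mean:      {mean(control):.3f}")
print(f"Absolute gap:      {trait_gap:.3f}")
```

Rerunning the sketch with a much smaller number of participants tends to produce larger gaps, which is why the text specifies assigning “enough” cases.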
“Tightly controlled experiments,” as it turns out, really aren’t that common in social research, though. Too much of what we study is important only when it’s out in the real world, and if we try to stuff it into the confines of a tightly controlled experiment, we can’t be sure that what we learn applies to the real thing. Still, experimental design is something we can aspire to, and the closer we can get to this ideal, the more confident we can be in our causal research. Whenever we have a research design that mimics experimental design but is missing any of its key features (random assignment to experimental and control groups, control of the experimental setting, and manipulation of the independent variable), we have a quasi-experimental design.
Often, randomly assigning cases to experimental and control groups is prohibitively difficult or downright impossible. We can’t assign school children to public schools and private schools, we can’t assign future criminals to zero tolerance states and more lax states, and we can’t assign pregnant women to smoking and nonsmoking households. We often don’t have the power to manipulate the independent variable, like deciding which states will have motor-voter laws and which won’t, to test its effects on voting behaviors. Very rarely do we have the ability to control the experimental setting; even if we could randomly assign children to two different kindergarten classrooms to compare curricula, how can other factors—the teachers’ personalities, for instance—truly be the same?
Quasi-experimental designs adapt to such research realities by getting as close to true experimental design as possible. There are dozens of variations on quasi-experimental design with curious names like regression discontinuity and switching replications with nonequivalent groups, but they can all be understood as creative responses to the challenge of approximating experimental design. When we divide our cases into two groups by some means other than random assignment, we no longer use the term control group; we use comparison group instead. The closer our comparison group is to what a control group would have been, the stronger our quasi-experimental design. To construct a comparison group, we usually try to select a group of cases similar to the cases in our experimental group. So, we might compare one kindergarten classroom enjoying some pedagogical innovation to an adjacent kindergarten classroom with the same old curriculum, or compare Alabama drivers after a new DUI law to Mississippi drivers not bound by it.
If we’re comparing these two groups of drivers, we’re also conducting a natural experiment. In a natural experiment, the researcher isn’t able to manipulate values of the independent variable; we can’t decide who drives in Mississippi or Alabama, and we can’t decide whether or not a state would adopt a new DUI law. Instead, we take advantage of “natural” variation in the independent variable. Alabama did adopt a new DUI law, and Mississippi did not, and people were driving around in Alabama and Mississippi before and after the new law. We have the opportunity for before-and-after comparisons between two groups. We just didn’t introduce the variation in the independent variable ourselves; it was already out there.
Social researchers also conduct field experiments. In a field experiment, the researcher randomly assigns cases to experimental and comparison groups, but the experiment is carried out in a real-life setting, so experimental control is very weak. I once conducted a field experiment to evaluate the effectiveness of an afterschool program in keeping kids off drugs and such. Kids volunteered for the program (with their parents’ permission). There were too many volunteers to participate all at once, so I randomly assigned half of them to participate during fall semester and half to participate during spring semester. The fall semester kids served as my experimental group and, during the fall semester, the rest of the kids served as my comparison group. At the beginning of the fall semester, I had all of them complete a questionnaire about their attitudes toward drug use, etc. Then the experimental group participated in the program while the comparison group did whatever they normally did, and at the end of the semester, all the kids completed a similar questionnaire again. Sure enough, the experimental group kids’ attitudes changed for the better, while the comparison group kids’ attitudes stayed about the same (or even changed a bit for the worse). All throughout the program, the experimental group and comparison group kids went about their lives; I certainly couldn’t maintain experimental control to ensure that the only difference between the two groups was the program.
Very strong research designs can be developed by combining one of the longitudinal designs (time series or panel) with either experimental or quasi-experimental design. With such a design, we observe values of the dependent variable for both the experimental and control (or comparison) groups at multiple points in time, then we change (or observe the change of) the independent variable for the experimental group, and then we observe values of the dependent variable for both groups at multiple points in time again.
That’s a bit confusing, but an example will clarify: Imagine inner-city pharmacies agree to begin stocking fresh fruits and vegetables, which people living nearby otherwise don’t have easy access to. We might want to know whether this will affect area residents’ eating habits. There are lots of ways we could go about this study, but probably the strongest design would be an interrupted time series quasi-experimental design. Here’s how it might work: Before the pharmacies begin stocking fresh produce, we could conduct door-to-door surveys of people in two inner-city neighborhoods—one without a pharmacy and one with a pharmacy. We could survey households once a month for four months before the produce is stocked, asking folks about how much fresh produce they eat at home.
(A quick aside: We’d probably want to talk to different people each time since, otherwise, the mere fact that we keep asking them about their eating habits might change what they eat. That would be an example of a measurement artifact, which we try to avoid. We want to measure changes in our dependent variable, eating habits, that are due to change in the independent variable, availability of produce at pharmacies, not due to respondents’ participation in the study itself.)
After the pharmacies begin stocking fresh produce, we would then conduct our door-to-door surveys in both neighborhoods again, perhaps repeating them once a month for another four months. Once we’re done, we’d have a very rich dataset for estimating the effect of available produce on eating habits. We could compare the two neighborhoods before the produce was available to establish just how similar their eating habits were before, and then we could compare the two neighborhoods afterward. We might see little difference one month after the produce became available as people became aware of it, then maybe a big difference in the second month in response to the novelty of having produce easily available, and then maybe a more moderate, steady difference in the third and fourth months as some people returned to their old eating habits and others continued to purchase the produce. With this design, we can provide very persuasive evidence that the experimental and comparison groups were initially about the same in terms of the dependent variable, which increases our confidence that any changes we see later are indeed due to the change in the independent variable. We can also capture change over time, which is frequently very important when we’re measuring behavioral changes, which tend to diminish over time.
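The comparisons described above can be sketched with a few lines of Python. The numbers below are invented monthly survey averages (servings of fresh produce per household per week), and the variable names are hypothetical. The final summary figure is a difference-in-differences calculation, a common way to summarize this kind of data that the text itself does not name; it is offered here as one reasonable option, not as the required analysis.

```python
from statistics import mean

# Invented monthly averages: servings of fresh produce per household per week.
# Four monthly surveys before the pharmacies stocked produce, four after.
pharmacy_before = [5, 5, 6, 4]     # neighborhood with a pharmacy
pharmacy_after = [5, 9, 7, 7]
comparison_before = [5, 4, 6, 5]   # neighborhood without a pharmacy
comparison_after = [5, 5, 5, 5]

# How similar were the two neighborhoods before the change?
baseline_gap = mean(pharmacy_before) - mean(comparison_before)

# How did the gap evolve month by month after the change?
monthly_gaps_after = [p - c for p, c in zip(pharmacy_after, comparison_after)]

# Difference-in-differences: the change in the pharmacy neighborhood
# minus the change in the comparison neighborhood over the same period.
pharmacy_change = mean(pharmacy_after) - mean(pharmacy_before)
comparison_change = mean(comparison_after) - mean(comparison_before)
diff_in_diff = pharmacy_change - comparison_change

print("Baseline gap:", baseline_gap)
print("Monthly gaps after stocking:", monthly_gaps_after)
print("Difference-in-differences:", diff_in_diff)
```

The invented figures follow the pattern described in the text: no gap at baseline, little difference in the first month, a spike in the second, and a smaller, steady difference afterward. Subtracting the comparison neighborhood’s change guards against crediting the pharmacies with a shift that happened everywhere.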
Case study research design is the oddball of the formal research designs. Many researchers who feel comfortable with all the other designs would feel ill equipped to undertake a case study. A case study is the systematic, in-depth, and holistic study of a complex case. Unlike the other designs, we’re just studying a single case, which is usually something like an event, such as a presidential election, or a program, such as a needle exchange program. With the other designs, we usually rely on a single data collection method, but with case study research design, we use multiple data collection methods, with a heavy emphasis on collecting qualitative data. In the course of a single case study, we might conduct interviews, conduct focus groups, administer questionnaires, review administrative records, and conduct extensive direct observations. We make enough observations in as many different ways as necessary to enable us to write a rich, detailed description of our case. This written report is, itself, called a case study.
The richness of case studies highlights another key difference between this and the other research designs. The contrast with experimental design is sharpest: If you think about experimental design, its beauty lies in ignoring complexity. If I were to randomly assign a bunch of teenagers to experimental and control groups, my express intention would be to ignore all their pimply, hormonal, awkward, exuberant complexity and the group dynamics that would undoubtedly emerge in the two groups. I count on random assignment and experimental control to make all differences between the two groups a complete wash except the difference in the independent variable. With case studies, though, we embrace this complexity. The whole point is to describe this rich complexity, bringing only enough organization to it to make it understandable to people who can’t observe it directly—those people who will ultimately read our written case studies.
There are many elaborations on these formal research designs. A few more, along with a system of notation for depicting research designs, are presented in Appendix B.