2.3: Case Selection (Or, How to Use Cases in Your Comparative Analysis)


Learning Objectives

By the end of this section, you will be able to:

Discuss the importance of case selection in case studies.
Consider the implications of poor case selection.

Introduction

Case selection is an important part of any research design. Deciding how many cases to include, and which ones, clearly helps determine our results. Large-N research is research in which the number of observations or cases is large enough that we need mathematical, usually statistical, techniques to discover and interpret any correlation or causation. In order for a large-N analysis to yield any relevant findings, a number of conventions need to be observed.

First, the sample needs to be representative of the studied population. Thus, if we wanted to understand the long-term effects of COVID, we would need to know the approximate details of those who contracted the virus. Once the parameters of the population are known, we can then determine a sample that represents the larger population. For example, women make up 55% of all long-term COVID survivors. Thus, any sample we generate should be roughly 55% women.
Second, some kind of randomization technique needs to be involved. In other words, the people within the sample must be randomly selected. Randomization helps to reduce bias in the study. Also, when cases (people with long-term COVID) are randomly chosen, they tend to give a fairer representation of the studied population.
Third, the sample needs to be large enough, hence the large-N designation, for any conclusions to have external validity. Generally speaking, the larger the number of observations/cases in the sample, the more validity in the study. There is no magic number. However, the sample of long-term COVID patients should be at least 750 people, with an aim of around 1,200 to 1,500 people.
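
The first two conventions can be combined in what statisticians call stratified random sampling: fix each group's share of the sample to match the population, then pick people at random within each group. The following is a minimal Python sketch under stated assumptions. The population of 10,000 patients, the `stratified_sample` function, and the field names are all hypothetical and used only for illustration.

```python
import random

def stratified_sample(population, key, shares, n, seed=0):
    """Draw about n units at random so each stratum matches its target share."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = []
    for stratum, share in shares.items():
        members = [p for p in population if p[key] == stratum]
        k = round(n * share)  # number of units to draw from this stratum
        sample.extend(rng.sample(members, k))  # random draw without replacement
    return sample

# Hypothetical population: 10,000 long-term COVID patients, 55% of them women.
population = (
    [{"id": i, "sex": "F"} for i in range(5500)]
    + [{"id": i, "sex": "M"} for i in range(5500, 10000)]
)

sample = stratified_sample(population, "sex", {"F": 0.55, "M": 0.45}, n=1200)
share_women = sum(p["sex"] == "F" for p in sample) / len(sample)
print(len(sample), share_women)
```

Within each group the individuals are chosen at random, while the overall sample still mirrors the population's share of women.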

When it comes to comparative politics, we rarely ever reach the numbers typically used in large-N research. There are approximately 195 fully recognized countries, a dozen partially recognized countries, and even fewer areas or regions of study, such as Europe or Latin America. Given this, what is the strategy when one case, or a few cases, is being studied? What happens if we only want to know the COVID-19 response in the United States, and not the rest of the world? How do we randomize to ensure the results are not biased? These questions are legitimate issues that many comparativist scholars face when completing research.

Does randomization work with case studies? Gerring suggests that it does not, as “any given sample may be wildly unrepresentative” (pg. 87). Thus, random sampling is not a reliable approach when it comes to case studies. And even if the randomized sample is representative, there is no guarantee that the gathered evidence would be reliable.

In large-N research, potential errors and/or biases may be ameliorated (made better or more tolerable), especially if the sample is large enough. Incorrect or biased inferences are less of a worry when we have 1,500 cases versus 15 cases. In small-N research, case selection simply matters much more.
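
One way to see why sample size matters is the textbook 95% margin of error for a proportion estimated from a simple random sample, roughly 1.96 × sqrt(p(1 − p)/n). The sketch below assumes that formula and the worst case p = 0.5. The function name is hypothetical, and the calculation captures only random sampling error, not bias from how cases were chosen.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (15, 750, 1500):
    print(n, round(margin_of_error(n), 3))
# 15 0.253
# 750 0.036
# 1500 0.025
```

With 15 cases, an estimated share could be off by about 25 percentage points through chance alone. With 1,500 cases, that shrinks to about 2.5 points.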

According to Blatter and Haverland (2012), “case studies are ‘case-centered’, whereas large-N studies are ‘variable-centered’”. In large-N studies, the concern is with conceptualization and operationalization of variables. So which data should be included in the analysis of long-term COVID patients? A survey might be an option, with appropriately constructed questions. Why? For almost all survey-based large-N research, the question responses become the coded variables used in the statistical analysis.

Case selection can be driven by a number of factors in comparative politics.

First, it can derive from the interests of the researcher(s). For example, if the researcher lives in Germany, they may want to research the spread of COVID-19 within the country, possibly using a subnational approach comparing infection rates among German states.
Second, case selection may be driven by area studies. Researchers may pick areas of study due to their personal interests. For example, a European researcher may study COVID-19 infection rates among European Union member-states.
Finally, the selection of cases may be driven by the type of case study that is utilized. Researchers may choose cases in order to:
- Compare their similarities or their differences.
- Compare the typical or the atypical (cases that deviate from the norm).

Types of Case Studies: Descriptive vs. Causal

John Gerring (2017) suggests that the central question posed by the researcher dictates the aim of the case study. Is the study meant to be descriptive? If so, what is the researcher looking to describe? How many cases (countries, incidents, events) are there? Or is the study meant to be causal, where the researcher is looking for a cause and effect? Given this, Gerring categorizes case studies into two types: descriptive and causal.

Descriptive case studies are “not organized around a central, overarching causal hypothesis or theory” (pg. 56). Researchers simply seek to describe what they observe. Such studies are useful for transmitting information regarding the studied political phenomenon. For a descriptive case study, a scholar might choose a case that is considered typical of the population. For example, to study the effects of the pandemic on medium-sized cities in the US, the scholar might select one such city. This city would have to exhibit the tendencies of medium-sized cities throughout the entire country.

First, we would have to conceptualize what we mean by ‘a medium-sized city’.

Second, we would then have to establish the characteristics of medium-sized US cities, so that our case selection is appropriate. Alternatively, cases could be chosen for their diversity. In keeping with our example, maybe we want to look at the effects of the pandemic on a range of US cities, from small, rural towns, to medium-sized suburban cities to large-sized urban areas.

Causal case studies are “organized around a central hypothesis about how X affects Y” (pg. 63). The context around a specific political phenomenon allows researchers to identify the aspects that set up the conditions, and the mechanisms, for that outcome to occur. Scholars refer to this as the causal mechanism. Remember, causality is when a change in one variable verifiably causes an effect or change in another variable. Gerring divides causal case studies into three categories. The differences revolve around how the central hypothesis is utilized in the study.

Exploratory case studies are used to identify a potential causal hypothesis. Researchers will single out the independent variables that seem to affect the outcome, or dependent variable. Context is more about hypothesis generating as opposed to hypothesis testing. Case selection can vary widely depending on the goal of the researcher. For example, if the scholar is looking to develop an ‘ideal-type’, they might seek out an extreme case. Thus, if we want to understand the ideal-type capitalist system, we would investigate a country that practices a pure or ‘extreme’ form of the economic system.
Estimating case studies start with a hypothesis already in place. The goal is to test the hypothesis through collected data/evidence. Researchers seek to estimate the ‘causal effect’. In other words, is the relationship between the independent and dependent variables positive, negative, or nonexistent?
Diagnostic case studies help to “confirm, disconfirm, or refine a hypothesis” (Gerring 2017). Case selection can vary. For example, scholars can choose a least-likely case, or a case where the hypothesis is confirmed even though the context would suggest otherwise. A good example would be looking at Indian democracy, which has existed for over 70 years. India has a high level of ethnolinguistic diversity, is relatively underdeveloped economically, and has a low level of modernization through large swaths of the country. All of these factors strongly suggest that India should not have democratized, should have failed to stay a democracy in the long term, or should have disintegrated as a country.

Most Similar/Most Different Systems Approach

Single case studies are valuable as they provide an opportunity for in-depth research on a topic that requires it. However, in comparative politics, our approach is to compare. Given this, we are required to select more than one case. Challenges quickly emerge. First, how many cases do we pick? Second, how do we apply the case selection techniques, descriptive vs. causal? Do we pick two extreme cases if using an exploratory approach, or two least-likely cases if choosing a diagnostic case approach?

English scholar John Stuart Mill developed several approaches to comparison with the explicit goal of isolating a cause within a complex environment. Two of these methods, the 'method of agreement' and the 'method of difference', have influenced comparative politics.

In the 'method of agreement', two or more cases are compared for their commonalities. The scholar looks to isolate the common characteristic, or variable, which is then established as the cause for their similarities.
In the 'method of difference', two or more cases are compared for their differences. The scholar looks to isolate the characteristic, or variable, that the cases do not have in common.
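
As a rough illustration, Mill's two methods can be expressed as simple comparisons over a table of cases. The sketch below is only a toy: the two countries, their characteristics, and the function names are hypothetical, and real comparative work involves far more judgment than matching values in a table.

```python
def method_of_agreement(cases):
    """Return the characteristics that have the same value in every case."""
    first, *rest = cases.values()
    return {k: v for k, v in first.items() if all(c.get(k) == v for c in rest)}

def method_of_difference(cases):
    """Return the names of characteristics on which the cases do not all agree."""
    first, *rest = cases.values()
    return sorted(k for k, v in first.items() if any(c.get(k) != v for c in rest))

# Hypothetical cases: two countries described by three yes/no characteristics.
cases = {
    "Country A": {"democracy": True, "high_income": True, "national_health_system": True},
    "Country B": {"democracy": True, "high_income": True, "national_health_system": False},
}

print(method_of_agreement(cases))   # characteristics the cases share
print(method_of_difference(cases))  # characteristics on which they differ
```

Here the shared characteristics are democracy and high income, while the national health system is the one characteristic that sets the two cases apart.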

From these two methods, comparativists have developed two approaches.

What Is the Most Similar Systems Design (MSSD)?

Derived from Mill’s ‘method of difference’, the Most Similar Systems Design (MSSD) compares cases that are similar in most respects but differ in their outcomes. In this approach, an attempt is made to keep as many of the variables as possible the same across the selected cases. Remember, the independent variable (cause) is the factor that doesn’t depend on changes in other variables. The dependent variable (effect) is affected by, or dependent on, the independent variable. In a most similar systems approach, the background variables should remain the same across cases, so that the difference in outcome can be traced to the few factors that do vary.

For example, there is no national healthcare system in the United States. Meanwhile, New Zealand, Australia, Ireland, the UK, and Canada have robust, publicly accessible national health systems. All these countries have similar systems: English heritage and language use, liberal market economies, strong democratic institutions, and high levels of wealth and education. Yet, despite these similarities, the end results vary. The US does not look like its peer countries.

Just for fun! Try your hand at some cause-and-effect scenarios:

[Figure: cause-and-effect scenarios. Source: Upperelementary Snapshots]

What Is the Most Different Systems Design (MDSD)?

In a Most Different Systems Design (MDSD), the cases selected are different from each other, but result in the same outcome. Thus, the dependent variable is the same. Different independent variables exist between the cases, such as democratic v. authoritarian regime, or liberal market economy v. non-liberal market economy. Or it could include other variables such as societal homogeneity (uniformity) vs. societal heterogeneity (diversity), where a country may find itself unified ethnically/religiously/racially, or fragmented along those same lines.

An example would be countries that are classified as economically liberal. The Heritage Foundation lists Singapore, Taiwan, Estonia, Australia, New Zealand, Switzerland, Chile, and Malaysia as either free or mostly free. Yet, these countries differ greatly from one another. Singapore and Malaysia are considered flawed or illiberal democracies (see chapter 5 for more discussion), whereas Estonia is still classified as a developing country. Australia and New Zealand are wealthy; Malaysia is not. Chile and Taiwan became economically free countries under authoritarian military regimes, which is not the case for Switzerland. In other words, why do such different systems produce the same outcome?