# 4.1: Correlation and Causation


Learning Objectives

By the end of this section, you will be able to:

• Remember the definition of theory
• Understand how a theory is generated
• Apply a model theory
• Analyze increasingly complex theories
• Evaluate statements to determine if they are theories or not
• Create a theory

Before diving into theories, hypotheses, variables, and units, it’s important to highlight two broader concepts: correlation and causation. Correlation can be defined as a “process of establishing a relationship or connection between two or more measures” (“Correlation - Google Search” n.d.). For example, imagine a car waiting at a road intersection. When the traffic light turns green, we observe the car move forward. It can be argued that there is a correlation between the color displayed on the traffic light and the movement of the vehicle. The traffic light–car example is relatively clear, but the question is: does the traffic light color cause the car to move? This question brings forward the concept of causation. Causation can be defined “as the action of causing or producing” (“Definition of Causation | Dictionary.com” n.d.). While the movement of the car corresponds to the color of the traffic light, what causes the movement of the car is the driver pressing down on the accelerator pedal. Doing so releases fuel into the engine, which powers the turning of the wheels.

Why are correlation and causation important to political science? Correlation is important because it lets us establish connections between political ideas, actors, institutions, and processes. When we observe the world, our minds are primed to make connections between things. Doing so helps us give meaning to the world and develop our understanding of it.

For example, let’s explore the relationship between demographics and congressional representation. Below is a map of the United States. Each state is shaded in sky blue, and the shade denotes the percentage of the state’s residents who are women. Using the legend in the bottom left corner of the map, we see that the lightest shade of sky blue means that women account for 47.9% to 50% of a state’s population. The darkest shade means that women account for 51.5% to 52.6% of a state’s population. In other words, lighter shades mean a lower percentage of women and darker shades mean a higher percentage of women.

The next map of the United States displays information about the representation of women in the 116th Congress. In reviewing the map, we see variation in the number of women who represent different states. For example, we see that California has 20 women representing it in Congress. While this map doesn’t differentiate between the Senate and the House of Representatives, we know that California has two female senators and eighteen Congresswomen. You will notice that the following states have no female representation: Idaho, Montana, North Dakota, South Dakota, Utah, Arkansas, Louisiana, Kentucky, South Carolina, Vermont, Rhode Island, and Maryland.

Seeing these two maps lets us establish a connection between the two concepts they represent. The question we ask ourselves is: does there appear to be a correlation between the percentage of women living in a state and the number of women representing that state in Congress? In reviewing both maps, it would be fair to suggest that there does appear to be a correlation between the two. For example, we see that women make up 50% or less of the population in Idaho, Montana, and the Dakotas. Then, when we look at the congressional map, we see that those states have no women representing them in Congress. Therefore, we have some evidence to suggest that there is a relationship.
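One common way to put a number on how closely two measures move together is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below is only an illustration: the function name `pearson_r` and the six data points are hypothetical, invented for this example, and are not read from the two maps.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-movements around the two means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# HYPOTHETICAL data for six imaginary states (not taken from the maps)
pct_women = [49.0, 49.8, 50.4, 50.9, 51.3, 51.8]   # percent of population
women_in_congress = [0, 1, 1, 3, 2, 5]             # number of members

r = pearson_r(pct_women, women_in_congress)
print(f"r = {r:.2f}")
```

A value near 1 would mean the two measures rise together, a value near -1 would mean one rises as the other falls, and a value near 0 would mean little linear relationship. Even a large value would only describe co-movement; it would not by itself show that one measure causes the other.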

In political science, we are interested in exploring this relationship further. A question we can ask ourselves is: as the percentage of women increases in a state, do we see an increase in the number of women representing that state in Congress? And using the language of causation, we could ask: does a higher percentage of women in a state cause an increase in the number of women representatives? The figure below is a visualization of a correlation between our two concepts. As we will explore later in this chapter, this is an example of what we call a causal model.

There is a commonly repeated adage that correlation does not equal causation. In political science, we take this adage to heart because it is important to be critical of what we perceive to be connections between two concepts and not to make the inferential leap that one is caused by the other. Unlike our peers in the natural sciences, we study individuals, institutions, and processes that are inherently complex and intertwined. We, like most others, can be susceptible to presuming that there is a causal relationship between the things we are observing. Therefore, it is important to remember that correlation is a prerequisite to causation, but there are other conditions that need to be satisfied before we can make an inference of causality.

## Four Conditions of Causality

There are four conditions of causality: logical time ordering, correlation, mechanism, and non-spuriousness. Logical time ordering refers to the idea that one variable needs to precede another variable in time for the first variable to influence the second variable. For example, throughout the world, people are protesting their governments. In some countries, governments respond with a metaphorical yawn. However, in other countries, governments may respond with repressive tactics. The question is: do the protests precede the government response? On its face, the answer is yes, because why would the government respond to silence?

The second condition of causality is correlation. As we explored above, correlation is a connection between two variables. Correlation is a prerequisite to establishing a causal relationship because if two variables do not move together, then it is difficult to suggest that one influences the other. Maintaining our example of public protest and government response, we often see that when people protest, the government pays attention. This is due to mainstream media coverage and social media activity of the protest. Since governments typically have responsibility for maintaining peace and security, anytime there are activities that may disrupt peace, the government will likely pay attention to what the media is covering and decide whether to respond.

Our third condition of causality is mechanism. A causal mechanism is an explanation for how one variable influences the other. Explanations can vary from relatively straightforward to exhaustively complex. There is utility in employing both types of explanations to describe the influence of one variable on the next. The reason is that it may seem straightforward to some why the government responds to protesters. However, underlying this interaction, there may be other actors, decisions, and actions that shape the engagement between the government and protesters. For example, the Arab Spring, starting in 2010, provides a contemporary example in which people throughout countries in the Middle East publicly protested for changes in their political leadership and government systems. How did these protesters come together? Some researchers point to social media, like Facebook and Twitter, which helped people collectively organize their protesting efforts. Thus, we have a mechanism that shows how protests formed and how they initiated a reaction from governments.

The final condition of causality is non-spuriousness. Non-spuriousness means that the relationship between two variables is not actually the product of another variable’s influence. With our example of protest and government response, we must be careful to consider that other factors may influence this relationship. What else could influence a government’s response to a protest within its country? A government may be hesitant to respond with lethal force if it knows it’s being observed by an international media outlet. An international media outlet serves as a third-party observer of the activities within a country. As the media records events through video and first-hand accounts, it can begin to share that information with the rest of the world. A government’s use of lethal weapons on people who are peacefully protesting could result in an outcry from the international community. Thus, are protests the only thing influencing the government’s response? Or is there a spurious factor, such as the international media outlet, that is making the government question how it should respond?
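A small simulation can make the idea of a spurious relationship concrete. The sketch below is a generic illustration, not a model of protest and government response: the variables `x`, `y`, and `z` are hypothetical simulated data in which a third variable `z` drives both `x` and `y`, while `x` has no influence on `y` at all. It compares the raw correlation of `x` and `y` with their partial correlation once `z` is held constant, using the standard first-order partial correlation formula.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable

# HYPOTHETICAL simulated data: z drives both x and y; x never affects y.
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))

print(f"raw correlation of x and y:      {r_xy:.2f}")
print(f"correlation after controlling z: {r_xy_given_z:.2f}")
```

In this setup the raw correlation is strong even though neither variable influences the other, and it largely disappears once the third variable is taken into account. Real political data are rarely this tidy, so a check like this is a starting point for reasoning about other influences rather than proof that none exist.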

As you can see from our running example of public protest and government response, establishing a causal relationship between two variables is difficult. That difficulty does not mean we should skip these four conditions. Rather, working through them with both reason and evidence is a rigorous way to determine whether a causal relationship exists.

This page titled 4.1: Correlation and Causation is shared under a CC BY-NC license and was authored, remixed, and/or curated by Josue Franco.