Learning Objectives
By the end of this section, you will be able to:
- Consider the process by which concepts are operationalized to begin collecting relevant data in the “real” world
- Understand aspects of data collection: what, why, and how
Operationalize a concept
After putting a name to observations of the world – creating concepts – the next step is to “operationalize” those concepts. Operationalization is the process by which a researcher defines a concept in measurable terms. In other words, “to operationalize a concept means to put it in a form that permits some kind of measurement of variation.”4 Variation implies that the measure selected will take on different values. For example, one operationalization of the concept “regime” might be to focus on the number of leaders in power, measured by counting the individuals who hold power. Across real-world country cases, this number ranges from a single leader (such as Zimbabwe’s Robert Mugabe, who was either prime minister or president from 1980 to 2017) to many (such as China’s Politburo Standing Committee, which has varied from five to eleven decision-makers since 1949).
Note the importance of variation when operationalizing a concept. Without variation, it is difficult to identify patterns of association such as correlation and causation. If “regime” were operationalized more broadly (and poorly) as “presence of a government,” then there would be no variation on this measure in the contemporary world. It would then be difficult to ascertain the causal effect of regime type on some outcome of interest (i.e., dependent variable), such as interstate war, if the operationalization of that concept did not vary.
A constant – the presence of government – cannot therefore explain something that varies, which in this example is the presence or absence of interstate war. This problem also arises if we treat this operationalization of regime as the outcome of interest. Again, an absence of variation makes it difficult to ascertain determinants of that constant. Imagine asking whether levels of economic growth have some effect on regime type. Economic growth varies by country, but if regime type is operationalized as the presence of a government, this constant cannot be explained by other social phenomena which vary.
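The point about variation can be checked mechanically before any analysis begins. The following is a minimal Python sketch using invented values for five hypothetical countries (not real observations); the helper name `varies` and the variable names are assumptions made for illustration.

```python
from statistics import pvariance


def varies(values):
    """Return True if the measure takes on more than one distinct value."""
    return len(set(values)) > 1


# Invented codings for five hypothetical countries (illustration only).
measures = {
    "presence_of_government": [1, 1, 1, 1, 1],  # a constant: every case scores 1
    "number_of_leaders": [1, 1, 7, 9, 1],       # takes on different values
    "interstate_war": [0, 1, 0, 1, 1],          # outcome of interest (1 = war)
}

for name, values in measures.items():
    # Population variance is zero exactly when every case has the same value.
    print(f"{name}: varies={varies(values)}, variance={pvariance(values)}")
```

A measure that fails this check, such as the presence of a government, cannot be associated with anything else in the data, whether it is treated as a cause or as an outcome.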
Operationalizing a concept must be done with some additional considerations in mind, specifically identifying valid and reliable measures of that concept. These considerations will be taken up in section 5.3 of this chapter. At the moment, the important thing is to think about ways to measure a concept and be sure that there is variation on that measure. Returning to the example of Aristotle, he first conceptualizes something we refer to today as “regime,” then operationalizes regime by suggesting two measures: how many leaders are in power and in whose interest they rule. For the first measure, Aristotle offers “one, few, [and] the many [rulers]” as three categories for measuring this concept. For the second measure, Aristotle offers two categories, whether a ruler is ruling in the name of “private” or “common” interests. A third measure that is commonly used today to operationalize the concept of regime is the presence of free and fair elections. This is a binary measure: does a country hold competitive elections or not? With these three measures as starting points, a researcher can embark on the process of data collection.
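The three measures described above can be written down as explicit categories before any data is gathered. The sketch below is one hypothetical way to do so in Python; the class name `RegimeCoding`, the field names, and the example case are assumptions for illustration, not a standard coding scheme.

```python
from dataclasses import dataclass
from itertools import product

# Categories taken from the two Aristotelian measures in the text.
NUMBER_OF_RULERS = ("one", "few", "many")
RULING_INTEREST = ("private", "common")

# Crossing the two measures yields every combination a case could fall into.
regime_categories = list(product(NUMBER_OF_RULERS, RULING_INTEREST))


@dataclass(frozen=True)
class RegimeCoding:
    """One coded case; all three fields are measures of the concept 'regime'."""
    case: str
    number_of_rulers: str
    ruling_interest: str
    competitive_elections: bool  # the binary measure: elections held or not

    def __post_init__(self):
        # Reject values that fall outside the categories defined above.
        if self.number_of_rulers not in NUMBER_OF_RULERS:
            raise ValueError(f"unknown category: {self.number_of_rulers!r}")
        if self.ruling_interest not in RULING_INTEREST:
            raise ValueError(f"unknown category: {self.ruling_interest!r}")


# A made-up case, coded on all three measures.
example = RegimeCoding("Country A", "few", "private", competitive_elections=False)
print(len(regime_categories), "combinations of the first two measures")
print(example)
```

Fixing the permitted categories in advance means that a coding mistake surfaces as an error instead of quietly becoming a new category in the dataset.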
Collecting data
Data collection is the gathering of relevant information to inform a research topic or question. Ideally, collected data will help answer a research question, but the process of data collection may entail learning about many aspects of a research topic before a question crystallizes. Chapters 7 and 8 will explore qualitative and quantitative methods for data collection in more depth. For our purposes here, the central questions are:
- What kind of data should I collect?
- Why am I collecting this data?
- How can I collect this data?
Determining what kind of data to collect hinges on the operationalization of a concept. There are also practical scope considerations to resolve before embarking on data collection. These usually have to do with time and space: which period of time and which parts of the world (if not the entire world) to focus on. For beginning researchers, the best strategy for answering these questions is to ask two things: What am I interested in? And do I have any prior knowledge that I can bring to bear on these questions of research scope? The first question is the more important one, and reflecting on personal interest and taste is a good start.
Research and especially data collection require sustained effort and often present unexpected challenges, hence a genuine interest can help motivate a researcher through rough patches. The second question can also help relieve some of the challenges with data collection (e.g., overcoming linguistic constraints, knowledge of existing data sources, contextual expertise) but is of secondary importance. Research and data collection can certainly be about creating new knowledge on entirely unfamiliar topics, and unbridled curiosity is encouraged.
A second set of considerations hinges on whether a researcher is interested in quantitative, qualitative, or mixed sources of data. Chapters 7 and 8 take up qualitative and quantitative research methods, respectively, and here the focus is on which methods to pursue. The method often hinges on how a concept has been operationalized. If we operationalize regime as a simple count of how many leaders are in power in a country, then this lends itself to building a quantitative dataset. If we are interested in collecting the titles of those political offices, this suggests a more qualitative approach is needed. But perhaps both the number of leaders and their titles might be useful, which suggests collecting a mix of quantitative and qualitative data.
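To make the distinction concrete, here is a small Python sketch of a mixed dataset in which each record holds a quantitative field (a count of leaders) and a qualitative field (the titles of their offices). The records, field names, and titles are invented for illustration.

```python
from collections import Counter
from statistics import mean

# Invented records for three hypothetical countries.
records = [
    {"case": "Country A", "leader_count": 1, "titles": ["President"]},
    {"case": "Country B", "leader_count": 2, "titles": ["President", "Prime Minister"]},
    {"case": "Country C", "leader_count": 1, "titles": ["Prime Minister"]},
]

# Quantitative use: the counts can be summarized numerically.
average_leaders = mean(r["leader_count"] for r in records)

# Qualitative use: the titles are text to be categorized and interpreted;
# tallying them is only a first descriptive step.
title_tally = Counter(title for r in records for title in r["titles"])

print(f"Average number of leaders: {average_leaders:.2f}")
print("Titles observed:", dict(title_tally))
```

The same record supports both kinds of work: the count feeds numerical summaries, while the titles call for reading and interpretation.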
Taking up the second question, “Why am I collecting this data?”, a researcher might return to first principles. What is the underlying concept of interest in this research project? How has that concept been operationalized, and does the proposed measure (or measures) vary in value? Data collection always demands resources, be it time or money or carbon emissions or all of the above, hence it is important to question from the outset what kind of data might be ideal for understanding underlying concepts. Having a research question formulated can also help with this, as the proposed data collection can be evaluated more sharply by asking whether the ideal data would help to answer a central question of interest.
Finally, the third question a researcher might ask is, “How can I collect this data?” An important first step is conducting a literature review. As the saying goes, “Don’t reinvent the wheel.” A literature review is the process of reading relevant scholarly work on a research topic or research question of interest. This is often conducted with the assistance of other experts, for example professors, librarians, and colleagues. When reviewing relevant literature, a researcher can ascertain whether relevant data has already been collected and exists in an accessible dataset.
They might also identify related research, with accompanying datasets, that could be used in part to build a new dataset. Many quantitative datasets are publicly available for download from the internet. Governments and international organizations such as the United Nations and World Bank are also common repositories of useful social science data. Librarians are excellent resources as well and often know where to locate data within a library’s holdings. Figure 5.2 offers a starting point for locating common social science statistical datasets.
Some common sources of data for research in the social sciences
- Government Statistics: National governments are often the only institutions with the resources (and authority) to collect comprehensive social statistics, and thus publish the overwhelming majority of social statistics available. Most countries have a national statistical agency that collects and publishes statistics, and simply perusing that agency's website or publications catalog is often the best way to find their statistics. The US is more complicated, since responsibility for statistics is spread among many federal agencies. Wikipedia has a list of the principal federal statistical agencies. The United Nations and other international government organizations collate and publish comparative statistical data from their member nations. Most state, provincial, and municipal governments also collect and publish some statistics.
- Public Opinion Polls: News and political organizations routinely conduct or commission opinion polls on a variety of topics. Many of those poll results can be found at the ICPSR or other poll archives for which university libraries often have subscriptions.
- Academic Research: Social science researchers often gather data as part of their studies. The results are usually presented in the published academic literature. Search any of the major article databases to find these articles. Most articles will only contain summary data, but the complete datasets can often be obtained from the original researchers.
- Commercial Market and Business Research: Many corporations and trade organizations collect economic statistics and sell them for profit. Often a very hefty profit, which means university libraries purchase only a limited number of these data products.
Source: UCLA Library. Research Guides, “Social Statistics and Data,” Available online at http://guides.library.ucla.edu/data/sources
Statistical datasets are often available for download from the internet or via subscription from a university or college library. Qualitative datasets are generally more difficult to come by. In the course of conducting a literature review, a scholar may cite a qualitative dataset (typically their own), and these are sometimes available on scholars’ personal webpages or the webpages of affiliated research centers. It also doesn’t hurt to contact a scholar directly if you are interested in their data; the scholarly spirit is to share knowledge, after all.
4 Hoover, Kenneth and Todd Donovan. 2011. The Elements of Social Scientific Thinking. Tenth Edition. Wadsworth Cengage Learning, p. 42.