# Appendix E: Research Methods Glossary

This glossary provides definitions for the research methods jargon found in this summary and for some other terms you might encounter as you learn more about research methods.

**Accuracy, level of** (in sampling): The breadth of the interval in which parameters can be estimated using statistics with a given level of confidence

**Administrative data**: Data collected in the course of implementing a policy or program or operating an organization

**Alternative hypothesis**: See *hypothesis testing*

**Analytic generalizability**: The extent to which a theory applies (“generalizes”) to a given case; demonstrating analytic generalizability is held by some researchers as a goal for qualitative research

**Antecedent variable**: An independent variable that causes changes in the key independent variable, which, in turn, causes change in the dependent variable

**Association**: A probabilistic relationship between two or more variables

**Axial coding**: Organizing the themes that emerge from open coding, frequently by combining them into general themes subdivided into more specific themes and identifying additional relationships among codes, resulting in an organized set of codes that can be used in subsequent analysis of qualitative data

**Bias**: The systematic distortion of findings due to a shortcoming of the research design

**Case study comparison research design**: Research design in which multiple case studies are conducted and compared

**Case study research design**: Systematic, in-depth, holistic study of a complex case (such as an event, a program, or a policy) using multiple data sources, methods, and collection techniques

**Case**: An object of systematic observations; an entity to which we assign values for variables

**Census**: (1) A sample comprised of the entire population; (2) a study in which the sample is comprised of the entire population

**Chunking**: Identifying short segments of meaningful qualitative data to be coded and analyzed

**Closed-ended question**: A survey or interview question that requires respondents to select from a set of predetermined responses

**Cluster sampling**: A probability sampling design in which successively narrower aggregates of cases are selected before ultimately selecting cases for inclusion in the sample

**Coding**: See *axial coding*, *open coding*, *selective coding*

**Concept**: An abstraction derived from what many instances of it have in common

**Concurrent validity**: A type of criterion validity describing the extent to which a variable (or set of variables intended to operationalize a single concept) relates to another variable measured at the same time as would be expected if the variable accurately measures what it is intended to measure

**Confidence, level of** (in sampling): The certainty, expressed as a percentage, with which parameters can be estimated using statistics with a given level of accuracy; the percentage of times an estimated parameter would be expected to be within a given range (the level of accuracy) if calculated using data collected from a large number of hypothetical samples

**Confidence interval**: The range of values we estimate a population parameter to fall in at a given level of confidence
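To make the relationship among the point estimate, the level of confidence, and the level of accuracy concrete, the following is a minimal sketch of one common way to compute a confidence interval for a mean. It assumes a simple random sample and uses the normal approximation; the function name `mean_confidence_interval` and the example data are hypothetical and are not part of this glossary's source material.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds for the population mean.

    Assumes a simple random sample and a normal approximation; for small
    samples a t distribution would usually be preferred.
    """
    n = len(sample)
    point_estimate = mean(sample)               # the statistic
    standard_error = stdev(sample) / sqrt(n)    # estimated spread of the statistic
    # z value that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error                 # half the breadth of the interval
    return point_estimate - margin, point_estimate + margin


# Hypothetical data: the interval widens as the level of confidence rises.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
low_95, high_95 = mean_confidence_interval(scores, confidence=0.95)
low_99, high_99 = mean_confidence_interval(scores, confidence=0.99)
```

In this sketch, the breadth of the interval (`high - low`) corresponds to the level of accuracy, and raising `confidence` trades accuracy for certainty.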

**Content validity**: An aspect of operational validity describing the extent to which the operationalization of an abstract concept measures the full breadth of meaning connoted by the concept

**Control variable**: A variable that might threaten nonspuriousness when examining the causal relationship between an independent variable and dependent variable; control variables are plausibly related to both the independent and dependent variables and could thus explain an observed association between them; in an experiment or quasi-experiment, control variables are those variables held constant so that they cannot affect the dependent variable while the independent variable is manipulated

**Convenience sampling**: A nonprobability sampling design in which cases are selected because they are convenient for the researcher

**Conversational interviews**: Interviews conducted following a very flexible protocol that outlines general themes but permits the interview to evolve like a natural conversation between the researcher and respondent

**Criterion validity**: An aspect of operational validity describing the extent to which a variable (or set of variables intended to operationalize a single concept) is associated with another variable as would be expected if the variable accurately measures what it is intended to measure

**Cross-sectional research design**: A formal research design in which data are collected in one “wave” of data collection, with data analysis making no distinction among data collected at different times

**Data analysis**: Systematically finding patterns in data

**Dependent variable**: A variable with values that are dependent on the values of another variable; in a cause-and-effect relationship, the variable representing the effect

**Descriptive data analysis**: Quantitative data analysis that summarizes characteristics of the sample

**Discriminate validity**: An aspect of operational validity describing the extent to which the operationalization of an abstract concept discriminates between the target concept and other concepts

**Disproportionate stratified sampling**: A probability sampling design in which the proportions of cases in the population demonstrating known characteristics are intentionally and strategically different for the cases in the sample, usually to permit comparisons among subsets of the sample that may otherwise have had too few cases

**Dissemination**: Widely sharing the results of a study and how it was conducted, usually by publication

**Double-barreled question**: A question, such as in an interview or survey, that is actually asking two questions at once

**Effect size**: A quantitative measure of the magnitude of a statistical relationship

**Empirical research**: Generating knowledge based on systematic observations

**Empirical**: Based on systematic observation

**Empiricism**: The stance that the only things that are “real” and therefore matter are those things that can be directly observed; not to be confused with *empirical*

**Experimental research design**: A formal research design in which cases are randomly assigned to at least one experimental group and one control group with the researcher determining the values of the independent variables that will be assigned to each group and the dependent variable measured after (and usually before as well) manipulation of the independent variable

**External validity**: The generalizability of claims generated by empirical research beyond cases directly observed

**Face validity**: An aspect of operational validity describing the extent to which a variable (or set of variables intended to operationalize a single concept) appears to measure what it is intended to measure

**Fact-value dichotomy**: The naïve view that *fact* and *value* are always wholly distinct categories

**Focus group**: A group of individuals who share something in common of relevance to the research project who are interviewed together and encouraged to interact to allow themes to emerge from the group discourse

**Generalize**: To make claims beyond what can be claimed based on direct observation, such as making claims about an entire population based on observations of a sample of the population

**Hawthorne effect**: Bias resulting from changes in research participants’ behavior caused by their awareness of being observed

**Hypothesis**: A statement describing the expected relationship between two or more variables

**Hypothesis testing**: A method used in inferential statistics wherein the statistical relationships observed in sample data are compared to a hypothetical distribution of data in which there is no analogous relationship to generate an estimate of how likely or unlikely the observed relationship is; the observed relationship being tested is stated as the *alternative hypothesis*, which is compared to the statement of no relationship, the *null hypothesis*
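One way to see the logic of hypothesis testing is a permutation test, sketched below. This is only one of many testing techniques and is offered as an illustration under stated assumptions, not as the method this glossary has in mind. The function name `permutation_test`, its parameters, and the example data are hypothetical. The sketch builds the "no relationship" distribution by repeatedly shuffling group labels, then estimates how often a difference at least as large as the observed one would arise under the null hypothesis.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=None):
    """Estimate a two-sided p-value for a difference in group means.

    Null hypothesis: group membership is unrelated to the measured values.
    Alternative hypothesis: the group means differ.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any link between group and value
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Share of shuffles in which "no relationship" produced a gap this large
    return at_least_as_extreme / n_permutations


# Hypothetical data: two groups with clearly different values
p_value = permutation_test([1, 2, 3, 4, 5], [101, 102, 103, 104, 105],
                           n_permutations=2_000, seed=0)
```

A small returned value suggests the observed difference would be unusual if the null hypothesis were true.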

**Independent variable**: A variable with values that, at least in part, determine values of another variable; in a cause-and-effect relationship, the variable representing the cause

**Inferential data analysis**: Quantitative data analysis that uses statistics to estimate parameters

**Informed consent**: An individual’s formal agreement to participate in a study after receiving information about the study’s risks and benefits, assurances that participation is voluntary, what participation will entail, confidentiality safeguards, and whom to contact if they have questions or concerns about the study

**Institutional Review Board**: A committee responsible for ensuring compliance with ethical standards for conducting research at an institution, such as a university

**Internal validity**: The truth of causal claims inferred from empirical research

**Interval scale of measurement**: Describes a variable with numeric values but no natural zero

**Intervening variable**: An independent variable that itself is affected by the key independent variable and then, in turn, causes change in the dependent variable

**Interview protocol**: The set of instructions and questions used to guide interviews

**Latent variable**: A variable that cannot be directly observed, such as an abstract concept, attitude, or private behavior

**Literature review**: (1) The process of finding and learning from previous research as one of the early steps in the research process; (2) a paper that summarizes, structures, and evaluates the existing body of knowledge addressing a research question; (3) a section of a larger research report that summarizes, structures, and evaluates the existing body of knowledge being addressed by the research and locates the research being reported in that larger body of knowledge

**Logic model**: A diagram depicting a program’s inputs, activities, outputs, and outcomes

**Manifest variable**: A variable that can be observed and is thought to indicate the values of a latent variable

**Memoing**: Writing notes to document the qualitative researchers’ thought processes associated with every step of qualitative research and their evolving ideas about what is being learned during the course of data analysis

**Meta-analysis**: A method of synthesizing previous research using statistical techniques that combine the results from multiple separate studies; the results of research using this method

**Mixed methods research**: Research using both qualitative and quantitative data

**Natural experiment**: A quasi-experimental design that capitalizes on “naturally” occurring variation in the independent variable

**Nominal scale of measurement**: Describes a variable with categorical values that have no inherent order

**Nonparametric data analysis**: Analysis of quantitative data using statistical techniques suitable because the data do not have an underlying normal distribution, homogeneous variance, and independent error terms

**Nonprobability sampling design**: A strategy for selecting a sample in which the probability of cases being selected is either unknown or not considered when selecting cases for inclusion in the sample, with sample selection made for some other reason (see *convenience sampling*, *purposive sampling*, *quota sampling*, and *snowball sampling*)

**Nonspurious**: Not attributable to any other factor

**Null hypothesis**: See *hypothesis testing*

**Open coding**: Assigning labels/descriptors/tags to “chunks” of qualitative data that note the data’s significance for addressing the research question; a first step in identifying important themes that emerge from qualitative data

**Open-ended question**: A survey or interview question without any predetermined responses

**Operational validity**: The extent to which a variable (or set of variables intended to operationalize a single concept) accurately and thoroughly measures what it is intended to measure

**Operationalize**: To describe how observations will be made so that values can be assigned to variables for cases

**Ordinal scale of measurement**: Describes a variable with categorical values that have an inherent order

**Panel research design**: A formal research design in which data are collected at different points across time from the same sample

**Parameter**: A quantified summary characteristic of a population

**Parametric data analysis**: Analysis of quantitative data using statistical techniques suitable only because the data have an underlying normal distribution, homogeneous variance, and independent error terms

**Peer review**: The process of having a research report (or other form of scholarship) reviewed by scholars in the field, usually as a prerequisite for publication

**Plagiarism**: The written misrepresentation of someone else’s words or ideas as one’s own

**Point estimate**: A statistic calculated from sample data used to estimate the population parameter; usually referred to in distinction to the *confidence interval*

**Policy model**: An explanation of how a policy is supposed to work, including its inputs, how it is intended to be implemented, its intended outcomes, and the assumptions that undergird the intended change process

**Population**: Total set of cases of interest; all cases to which the research is intended to apply

**Predictive validity**: A type of criterion validity describing the extent to which a variable (or set of variables intended to operationalize a single concept) predicts future change in another variable as would be expected if the variable accurately measures what it is intended to measure

**Probability sampling design**: A strategy for selecting a sample in which every case in the population has a known (or knowable) nonzero probability of being included in the sample

**Proportionate stratified sampling**: A probability sampling design in which the proportions of cases in the population demonstrating known characteristics are replicated in the sample
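The idea of replicating population proportions in the sample can be sketched as follows. This is a minimal illustration, assuming the population has already been divided into strata and that cases within each stratum are chosen at random; the function name `proportionate_stratified_sample` and the stratum labels are hypothetical. Because each stratum's size is rounded independently, the total may differ slightly from the requested sample size in some cases.

```python
import random


def proportionate_stratified_sample(strata, n, seed=None):
    """Draw a sample whose strata shares mirror the population's.

    strata: dict mapping a stratum label to the list of cases in that stratum.
    n: desired total sample size (approximate, because of rounding).
    """
    rng = random.Random(seed)
    population_size = sum(len(cases) for cases in strata.values())
    sample = {}
    for label, cases in strata.items():
        # Each stratum's share of the sample equals its share of the population
        stratum_n = round(n * len(cases) / population_size)
        sample[label] = rng.sample(cases, stratum_n)
    return sample


# Hypothetical population: 60% urban cases, 40% rural cases
strata = {"urban": list(range(600)), "rural": list(range(400))}
sample = proportionate_stratified_sample(strata, n=50, seed=0)
```

A disproportionate design would replace the proportional `stratum_n` with sizes chosen strategically by the researcher, for example to guarantee enough cases in a small stratum.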

**Purposive sampling**: A nonprobability sampling design in which cases are selected because they are of interest, typical, or atypical as suits the purposes of the research

**Qualitative data**: Textual data

**Quantitative data**: Numeric data

**Quasi-experimental research design**: A formal research design similar to experimental research design but with assignment to experimental and comparison groups made in a nonrandom fashion

**Quota sampling**: A nonprobability sampling design in which cases are selected as in convenience sampling but such that the sample demonstrates desired proportions of characteristics, either to replicate known population characteristics or permit comparisons of subsets of the sample

**Ratio scale of measurement**: Describes a variable with numeric values and a natural zero

**Reliability**: The extent to which hypothetical repeated measures of variables would generate the same values for the same cases

**Research design**: (1) Generally, a description of the entire research process; (2) more narrowly, the formal research design used to structure the research, including cross-sectional, time series, panel, experimental, quasi-experimental, and case study research designs

**Response set bias**: Bias resulting from a response set that leads respondents to select responses other than more accurate responses

**Response set**: The set of responses that respondents may select from when answering a closed-ended question

**Sample**: Subset of population used to learn about the population; the cases which are observed

**Sampling error**: The difference between a statistic and its corresponding parameter
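The distinction between a statistic, a parameter, and sampling error can be illustrated with a short sketch. It assumes, purely for illustration, that the full population is available so the parameter can be computed directly, which is rarely true in practice. The function name `sampling_error` and the example population are hypothetical.

```python
import random
from statistics import mean


def sampling_error(population, n, seed=None):
    """Return the sample mean minus the population mean for one random sample."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)  # simple random sample without replacement
    statistic = mean(sample)            # summary characteristic of the sample
    parameter = mean(population)        # summary characteristic of the population
    return statistic - parameter


# Hypothetical population of 100 cases with values 1 through 100
population = list(range(1, 101))
error_small_sample = sampling_error(population, n=10, seed=0)
error_census = sampling_error(population, n=100, seed=0)  # a census has no sampling error
```

When the sample is a census, the statistic equals the parameter and the sampling error is zero.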

**Sampling frame**: List of cases from which a sample is selected

**Secondary data**: Data collected by someone other than the researcher, usually without having anticipated how the data would ultimately be used by the researcher

**Selective coding**: Assigning a set of codes (such as a system of codes developed through axial coding) to “chunks” of qualitative data

**Semi-structured interviews**: Interviews conducted following an interview protocol that specifies questions and potential follow-up questions but permits flexibility in the order and specific wording of questions

**Simple random sampling**: A probability sampling design in which every case in the population has an equal probability of being selected for inclusion in the sample

**Snowball sampling**: A nonprobability sampling design in which one case is selected for the sample, which then leads the researcher to another case for inclusion in the sample, then another case, and so on (also called *network sampling* when cases are people)

**Social desirability bias**: The tendency of interviewees to provide responses they think are more socially acceptable than accurate responses

**Standardized interview**: An interview conducted following an interview protocol requiring identical wording and question order for all respondents

**Statistic**: A quantified summary characteristic of a sample

**Systematic sampling**: A probability sampling design in which every kth case in the sampling frame is selected for inclusion in the sample where k equals the number of cases in the population divided by the number of cases desired to be in the sample
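The selection rule in this definition can be sketched in a few lines. Two details below are assumptions added for illustration and are not stated in the definition: the interval k is rounded down to a whole number, and the first case is chosen at random from the first k cases in the frame. The function name `systematic_sample` is hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every kth case from a sampling frame, where k = N // n."""
    population_size = len(frame)
    if n <= 0 or n > population_size:
        raise ValueError("n must be between 1 and the size of the sampling frame")
    k = population_size // n      # sampling interval (assumption: rounded down)
    rng = random.Random(seed)
    start = rng.randrange(k)      # assumption: random start within the first k cases
    return [frame[start + i * k] for i in range(n)]


# Hypothetical sampling frame of 100 case IDs; select 10 cases, so k = 10
frame = list(range(100))
sample = systematic_sample(frame, n=10, seed=1)
```

With 100 cases in the frame and 10 cases desired, every 10th case is selected after the random starting point.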

**Theory**: A set of concepts and relationships among those concepts posited in a formal statement to describe or explain the phenomenon of interest

**Time series research design**: A formal research design in which data are collected at different points across time from independent samples

**Unit of analysis**: The entity—the whom or what—that is being studied; the entity for which observations are being recorded in a study

**Validity**: Truthfulness of claims made based on research; see *operational validity*, *face validity*, *content validity*, *discriminate validity*, *criterion validity*, *concurrent validity*, *predictive validity*, *internal validity*, *external validity*

**Variable**: Logical groupings of attributes; the category to which these attributes belong; a factor/quality/condition that can take on more than one value/state