Cronbach's alpha, often written with the lowercase Greek letter alpha (𝜶), is a common measure of index reliability. It asks whether the variables that make up an index all measure the same overall construct. You will likely come across it when you read quantitative academic studies, though it is used much more frequently in psychology than in sociology. Overall, just know that 0 means not reliable, 1 means perfect reliability, and a value of 0.7 or higher is conventionally treated as good enough.
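For readers who want to see where the number comes from, here is a minimal sketch of the standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score), where k is the number of variables in the index. The function name cronbach_alpha and the example scores are made up for illustration, and the sketch assumes every respondent answered every item.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for an index.

    items: a list of columns, one list of scores per index variable,
    all the same length (one score per respondent). Hypothetical helper.
    """
    k = len(items)
    if k < 2:
        raise ValueError("an index needs at least two variables")
    # Each respondent's total score across all variables in the index.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual variables.
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Two made-up survey items answered by four respondents.
item_a = [1, 2, 3, 4]
item_b = [2, 1, 4, 3]
alpha = cronbach_alpha([item_a, item_b])
print(round(alpha, 2))
```

With these made-up scores the result is 0.75, just above the 0.7 rule of thumb. If the total scores do not vary at all, the formula divides by zero, so real data should be checked first.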
Codebook: A document that tells you the codes for each variable in your dataset and what attributes they represent
Codes: Numbers that represent variable attributes
Data: Information. In statistics, usually the values for variable(s).
Data is a plural word. The singular is datum.
Dataset: A collection of data organized for analysis
Index: A single measure created by combining multiple variables that represent one construct.
Interquartile range (IQR): The distance spanning the middle 50% of the data; that is, the distance between the lower and upper quartile values.
Level of measurement: A variable characteristic classifying the relationship among a variable's values
Nominal: Nominal variables have values that are qualitative categories. They do not have any meaningful rank or order.
Ordinal: Ordinal variables have values that are qualitative categories. They do have a meaningful rank or order.
Ratio: Ratio-level variables have values that are numeric. They have a true zero point, meaningful ratios, consistent intervals between them, and the numbers function arithmetically.
Interval: Interval-level variables have equal intervals between values, but do not meet all the criteria to be classified as ratio-level variables. Some definitions require interval-level variables to have numeric values.
Indicator: Indicator variables, also called dummy variables, are dichotomous/binary, meaning they only have two categories, and these two values are coded as 0 and 1.
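As a small illustration of indicator (dummy) coding, the sketch below turns a nominal variable into a 0/1 variable. The function name to_indicator and the example values are hypothetical.

```python
def to_indicator(values, target):
    """Code each value as 1 if it equals the target category, else 0."""
    return [1 if value == target else 0 for value in values]


# Made-up nominal variable: whether each respondent is employed.
employment = ["employed", "unemployed", "employed", "retired"]
employed_indicator = to_indicator(employment, "employed")
print(employed_indicator)  # [1, 0, 1, 0]
```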
Median: The middle number in a group of ordered numbers.
Outlier: An extreme value; a case with a value that substantively differs from most other values.
Percentile: A value's percentile is the percentage of the data that falls below it. The nth percentile is the value below which n% of observations fall.
Q1, the lower quartile, is the median of the lower half of the data. It is the 25th percentile, greater than 1/4 or 25% of the data (and less than 3/4 or 75% of the data).
Q3, the upper quartile, is the median of the upper half of the data. It is the 75th percentile, greater than 3/4 or 75% of the data (and less than 1/4 or 25% of the data).
Range: The range is the distance between the minimum and maximum values.
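The median, quartiles, interquartile range, and range defined in this glossary can be computed together. The sketch below follows the "median of each half" definition of the quartiles. It assumes that when there is an odd number of values, the middle value is left out of both halves, which is one common convention, so statistical software may give slightly different quartiles. The function name spread_summary and the data are made up.

```python
from statistics import median


def spread_summary(values):
    """Median, quartiles, IQR, and range using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower_half = data[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper_half = data[half:] if n % 2 == 0 else data[half + 1:]
    lower_quartile = median(lower_half)
    upper_quartile = median(upper_half)
    return {
        "median": median(data),
        "lower_quartile": lower_quartile,
        "upper_quartile": upper_quartile,
        "iqr": upper_quartile - lower_quartile,
        "range": data[-1] - data[0],
    }


# Made-up data: eight observations.
summary = spread_summary([2, 4, 4, 5, 7, 9, 10, 12])
print(summary)
```

For these eight values the median is 6.0, the lower quartile is 4.0, the upper quartile is 9.5, the IQR is 5.5, and the range is 10.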
Statistics: Data, or the practice of using data for research
Unit of analysis: Who or what you are analyzing
Variable: Anything that can vary (differ), as opposed to something that must be constant (the same).
Other words you will come across that have a form of the word variable in them (variate) include:
univariate: having to do with one (uni) variable
bivariate: having to do with two (bi) variables
multivariate: having to do with multiple (multi) variables