5.4: Interpreting Results (p-values and NHST)
- Describe inferential statistics, and how they are related to null hypothesis significance testing.
- Interpret the results of null hypothesis significance tests in a research article.
Remember?
Most students reading this textbook have already taken (or are currently taking) a statistics course. This section will remind you of what you learned that is most relevant for reading a quantitative research article. If you'd like a more in-depth refresher, LibreTexts offers a selection of openly-licensed textbooks on statistics (https://stats.libretexts.org/), including textbooks for different social sciences (https://stats.libretexts.org/Bookshe...ied_Statistics).
To understand how to interpret the results of statistical analyses, you should remember what a p-value represents. But to remember that, you might need a refresher on what a null hypothesis is.
Null Hypothesis
In general, the null hypothesis is the idea that nothing is going on: there is no effect of our treatment, no relation between our variables, and no difference in our sample mean from what we expected about the population mean. This is always our baseline starting assumption, and it is what we seek to reject (more on that in the next paragraph on Null Hypothesis Significance Testing). Until we have evidence against it, we must use the null hypothesis as our starting point. In sum, the null hypothesis always states that:
There is no difference between the groups’ means
OR
There is no relationship between the variables.
This is enough background for our purposes, but you are welcome to learn more by reviewing the section of the openly-licensed, online textbook that this information came from (7.3: The Research Hypothesis and the Null Hypothesis in Oja, 2022).
Null Hypothesis Significance Testing
So far, so good? We develop a research hypothesis of what we expect will happen, and we have a null hypothesis that says that nothing will happen. It’s at this point that things get somewhat counterintuitive; the null hypothesis seems to correspond to the opposite of what we expect, and then we focus exclusively on that. It seems like we're neglecting the thing that we're actually interested in (the research hypothesis). Based on Hall et al.'s (2021) research validating the Digital Stress Scale (DSS), we could develop a null hypothesis that says that people who experience higher levels of digital stress will have levels of depression similar to those of people with lower levels of digital stress (no difference between group means), or a null hypothesis could be that there is no linear relationship between scores on the DSS and scores on a depression scale (no relationship between the variables). But we actually do think that digital stress would be related to depression, so the alternative to this null hypothesis is that those with higher scores on the DSS will have higher depression scores than those with lower scores on the DSS (there is a difference between the means), or we predict that as scores on the DSS increase, depression also increases (there is a positive, linear relationship between the variables). The important thing to recognize is that the goal of a hypothesis test is not to show that the research hypothesis is (probably) true; the goal is to show that the null hypothesis is (probably) false. Most people find this pretty weird.
The best way to think about it, in my experience, is to imagine that a hypothesis test is a criminal trial… the trial of the null hypothesis. The null hypothesis is the defendant, the researcher is the prosecutor, and the statistical test itself is the judge. Just like a criminal trial, there is a presumption of innocence: the null hypothesis is deemed to be true unless you, the researcher, can prove beyond a reasonable doubt that it is false. You are free to design your experiment however you like, and your goal when doing so is to maximize the chance that the data will yield a conviction… for the crime of being false. The catch is that the statistical test sets the rules of the trial, and those rules are designed to protect the null hypothesis – specifically to ensure that if the null hypothesis is actually true, the chances of a false conviction are guaranteed to be low. This is pretty important: after all, the null hypothesis doesn’t get a lawyer. And given that the researcher is trying desperately to prove it to be false, someone has to protect it.
In sum, the purpose of null hypothesis significance testing is to be able to reject the expectation that the means of the two groups are the same (or that there is no relationship between the variables).
- Reject the null hypothesis:
  - The means are different.
  - or
  - There is a relationship between the variables.
- Retain the null hypothesis:
  - The means are similar (the means are not different).
  - or
  - There is no relationship between the variables.
Finally, you reject or retain the null hypothesis, and you support or don’t support the research hypothesis.
Why predict that two things are similar?
Because each sample’s mean will vary around the population mean (see the first few sections of this chapter to remind yourself of this), we can’t prove that our sample’s mean falls within the “normal” range of variation. But we can gather data to show that this sample’s mean is different (enough) from the population’s mean. This is rejecting the null hypothesis. We use statistics to determine how probable our results would be if the null hypothesis were true.
Why can’t we prove that the mean of our sample is different from the mean of the population? Since different samples from the same population have different statistical qualities, we can't guarantee that the sample that we're analyzing actually has the qualities of the population. Researchers are a conservative bunch; we don't want to stake our reputation on a sample mean that could be a fluke. What if our sample had a lot of people with very low scores on digital stress? Then we couldn't be sure that what our sample showed us would also happen in the population.
But what we can show is that our sample is so extreme that it is statistically unlikely to be similar to the population. Null hypothesis significance testing is like how courts decide if defendants are Guilty or Not Guilty, not whether they are Guilty or Innocent. Similarly, we decide if the sample is similar to the population or not.
This is a tough concept to grasp, so we'll keep working on it. And if you never get it, that's okay, too, as long as you remember the pattern of rejecting or retaining the null hypothesis, and supporting or not supporting the research hypothesis. You are also welcome to learn more by reviewing the section of the openly-licensed, online textbook that this information came from (7.4: Null Hypothesis Significance Testing in Oja, 2022).
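To make the idea that sample means vary around the population mean more concrete, here is a minimal simulation sketch. The population values (a mean of 50 and a standard deviation of 10 for digital stress scores) and the function name `draw_sample_means` are assumptions made up for illustration; they do not come from Hall et al. (2021) or any other study.

```python
import random
import statistics

# Hypothetical population of digital stress scores (assumed values, not real data):
# normally distributed with a mean of 50 and a standard deviation of 10.
POPULATION_MEAN = 50
POPULATION_SD = 10


def draw_sample_means(n_samples, sample_size, seed=1):
    """Draw several random samples from the population and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the sketch gives the same numbers each run
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(POPULATION_MEAN, POPULATION_SD) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means


# Every sample comes from the SAME population, yet each mean is a little different.
sample_means = draw_sample_means(n_samples=5, sample_size=25)
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i}: mean = {m:.2f}")
```

Each printed mean should land near 50 without matching it exactly, which is why a single sample mean that differs a little from the population mean is not, on its own, evidence that something is going on.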
p-value
Okay, so you've remembered about null hypotheses and a little about Null Hypothesis Significance Testing, but how does that relate to reading the results of a quantitative research article? If you look at a research article's statistical results, you will probably see p-values. A p-value tells us the probability of obtaining a result at least as extreme as the one in our sample if the null hypothesis were true.
What does the null hypothesis say?
- Answer: The null hypothesis says that the means of the groups are similar OR that there is no relationship between the variables.
The p-value helps us judge whether an effect could have happened by chance. It tells us the probability of getting an effect at least this different if the sample really is the same as the population. If the probability (p-value) is small enough (p<.05), then we conclude that the sample probably is from a different population.
Without having to understand everything about probability distributions and the Standard Normal Distribution, what do the p-values tell us?
- A small p-value means that results like ours would be unlikely if the means were similar or if there were no relationship between the variables. This suggests that the means are different or that there is a relationship.
- A large p-value means that results like ours would not be surprising if the means were similar or if there were no relationship between the variables. This suggests that there may be nothing going on.
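One way to see where a p-value comes from, without any probability distributions, is a shuffling (permutation) sketch like the one below. This is a swapped-in illustration, not the test used in any article cited here. The depression scores, the group labels, and the function name `permutation_p_value` are all hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=1):
    """Estimate how often shuffled group labels give a mean difference
    at least as large as the one actually observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Shuffling acts out the null hypothesis: group membership does not matter.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Adding 1 to both counts keeps the estimate from ever being exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)


# Hypothetical depression scores (made-up numbers for illustration only).
high_digital_stress = [14, 16, 15, 18, 17, 19, 16, 15]
low_digital_stress = [9, 10, 8, 11, 10, 9, 12, 10]

p = permutation_p_value(high_digital_stress, low_digital_stress)
print(f"p = {p:.4f}")
```

With these made-up scores, almost no shuffle produces a difference as large as the observed one, so the estimated p-value is small. Comparing a group with itself gives a p-value of 1, because every shuffle is at least as extreme as a difference of zero.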
In other words:
Reject null = p<.05
Retain null = p>.05
If the probability is less than 5% that you would get a sample that is this different from the population if the sample really is from the population, the sample is probably not from that population. If you took 100 samples from a population, fewer than 5 of the samples would be this different from the population if the samples really were from that population. So, the sample is probably not from the population. Probably.
When the null hypothesis is actually true, you have a 5% chance of wrongly rejecting it. In other words, there is a (small) chance that your sample is this different from the population, but the sample is still actually from the population.
Statisticians are okay with being wrong 5% of the time!
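The 5% figure can be checked with a short simulation sketch. It assumes a hypothetical population (mean 50, standard deviation 10) and uses a simple z-test, chosen here only because it needs nothing beyond Python's standard library; the names `z_test_p_value` and `false_alarm_rate` are made up for this example.

```python
import math
import random
from statistics import NormalDist, mean

# Hypothetical population values (assumed for illustration, not real data).
POP_MEAN, POP_SD = 50, 10


def z_test_p_value(sample, pop_mean=POP_MEAN, pop_sd=POP_SD):
    """Two-tailed p-value for a one-sample z-test with a known population SD."""
    standard_error = pop_sd / math.sqrt(len(sample))
    z = (mean(sample) - pop_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_alarm_rate(n_studies=2000, sample_size=25, alpha=0.05, seed=1):
    """Run many simulated studies in which the null hypothesis is TRUE
    and return the proportion that reject it anyway."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rejections = 0
    for _ in range(n_studies):
        # Every sample really does come from the population.
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(sample_size)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / n_studies


print(f"Proportion of true nulls rejected: {false_alarm_rate():.3f}")
```

The proportion should come out close to .05: even when nothing is going on, roughly 1 study in 20 will cross the p<.05 line by chance alone.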
Let's practice!
For each:
- Determine whether to report “p<.05 ” or “p>.05”
- Determine whether to retain or reject the null hypothesis.
- Determine whether something is happening in the sample, or if the sample is from the population.
Hint:
Reject null = p<.05
Retain null = p>.05
1. p < .05
2. p = .138
3. p = .510
Solution
- p < .05
  - “p<.05”
  - Reject the null hypothesis.
  - Something is happening; the sample and population are different.
- p = .138
  - “p>.05” (because 0.138 > 0.05, 0.138 is bigger than 0.05)
  - Retain the null hypothesis.
  - Nothing is happening; the sample and population are similar.
- p = .510
  - “p>.05” (We’re comparing to 0.05, or 5%, not .50, or 50%)
  - Retain the null hypothesis.
  - Nothing is happening; the sample and population are similar.
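The decision rule used in the solution above can also be written as a tiny function. This is only a sketch of the rule from the hint; the function name `interpret_p` is hypothetical, and because the hint does not say what to do when p is exactly .05, the sketch assumes that case is treated as not significant.

```python
ALPHA = 0.05  # the cutoff used throughout this section


def interpret_p(p_value):
    """Return what to report and what to decide for a given p-value."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must be between 0 and 1")
    if p_value < ALPHA:
        return ("p<.05", "reject the null hypothesis")
    # Assumption: a p-value of exactly .05 is not covered by the hint;
    # this sketch treats it the same as p>.05.
    return ("p>.05", "retain the null hypothesis")


# The two numeric p-values from the worked solution.
for p in (0.138, 0.510):
    report, decision = interpret_p(p)
    print(f"p = {p:.3f}: report {report}; {decision}")
```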
Now, try it yourself:
For each:
- Determine whether to report “p<.05 ” or “p>.05”
- Determine whether to retain or reject the null hypothesis.
- Determine whether something is happening in the sample, or if the sample is from the population.
Hint:
Reject null = p<.05
Retain null = p>.05
1. p > .05
2. p = .032
3. p = .049
- Answer
  - p > .05
    - “p>.05”
    - Retain the null hypothesis.
    - Nothing is happening; the sample and population are similar.
  - p = .032
    - “p<.05” (because 0.032 < 0.05, 0.032 is less than 0.05)
    - Reject the null hypothesis.
    - Something is happening; the sample and population are different.
  - p = .049
    - “p<.05” (even though it’s close, .049 is smaller than 0.05)
    - Reject the null hypothesis.
    - Something is happening; the sample and population are different.
Now that you've been reminded of what you probably already learned, let's lock in to see how this relates to reading a research article.
Reading Research Results
Even if you're still a little lost about null hypotheses and p-values, you can still interpret the results of many research articles. The main task right now is to narrow your focus. The results section can be confusing, with lots of tables and graphs and numbers. Rather than reading it all at once, go back to the introduction section to identify the research questions or research hypotheses, then only look for analyses that are answering these questions. The research question or research hypothesis will imply some IVs (groups that the researcher thinks are the cause of changes) and the DV (the outcome that was measured that the researcher wants to improve). Can you identify the results of how the IV affected the DV (or didn't) in the graphs, tables, or written paragraphs? If so, then you have found the most important information in this section! Dr. MO often tries to answer the research question by only looking at the graphs or tables; if she can do that, then she understands what was found.
There may be a ton of other analyses and data provided. These may provide additional context for the results of the main research hypotheses, or they may be interesting results that weren't the main point of the study at all. When first reading the results section, it is perfectly okay to ignore anything that isn't related to the original research questions. Don't get bogged down in all of the results; focus on the results related to why you originally wanted to read the article.
While narrowing your focus on the research questions or research hypotheses, you then narrow your focus to identify which analyses show p<.05.
What does p<.05 tell you about whether or not the intervention worked?
Hint: It's about the null hypothesis.
- Answer: When p<.05, it means that results like these would be unlikely if there were really no difference between those who received an intervention and those who did not. This suggests that there is something happening, that the intervention did work.
Often, tables will identify which analyses have p<.05 with an asterisk; so easy! Once you find which null hypotheses are rejected, you can find which research hypotheses (or parts of a research hypothesis) are supported. At this point, a self-check is in order. The next step after narrowing your focus to the research questions/hypotheses and p<.05 is to see if you can explain the findings to someone else. This can be someone in your class, one of your professors, or a family member. The key is that you should have enough of a grasp of the variables and the statistics that you can simplify the results for someone else.
Answer the following questions to ensure that you can simplify the important information:
- List each DV from the Methods section.
- List one hypothesis from the introduction section.
- Describe the results related to each hypothesis or purpose. Briefly describe in words what the statistical results show.
If you have trouble with any of this, the Discussion section of the article will help clarify the results. However, keep in mind that the purpose of a Discussion section is different from the purpose of a Results section. You will not get the details and specifics from the Discussion that you should be identifying in the statistical results, so only use the Discussion if you get lost in the results. A second option if you get lost is to ask for help. Your research methods professor, any professor in your department, really, has read many, many research articles. Visit them during their office hours with a printed version of your article, or email them with a PDF of your article and what you think the results are. Another resource is your college's learning center. There are probably free tutors available, so find someone who is familiar with behavioral statistics or who has experience reading social science research articles.
Here's a summary of the tips presented in this last section:
- Narrow your focus to:
- Research Questions or Research Hypotheses
- p<.05
- Explain the findings to someone else.
If you get stuck on either of these, you are encouraged to:
- Review the article's Discussion section for clarification.
- Work with your professor, another professor in your department, or a behavioral statistics tutor.
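The narrowing strategy summarized above can be sketched in a few lines of code. The results table below is entirely made up for illustration (the analyses, p-values, and the `main_hypothesis` flag are hypothetical), and `narrow_focus` is an invented helper name. It keeps only the analyses tied to the research hypotheses and marks p<.05 with an asterisk, the way many results tables do.

```python
# Hypothetical results from an imaginary article (made-up numbers, not real findings).
results = [
    {"analysis": "Digital stress and depression", "main_hypothesis": True, "p": 0.003},
    {"analysis": "Digital stress and anxiety", "main_hypothesis": True, "p": 0.210},
    {"analysis": "Age and hours online", "main_hypothesis": False, "p": 0.040},
]


def narrow_focus(rows, alpha=0.05):
    """Keep only analyses tied to the research hypotheses and flag p<.05 with '*'."""
    focused = []
    for row in rows:
        if not row["main_hypothesis"]:
            continue  # ignore side analyses on a first read
        star = "*" if row["p"] < alpha else ""
        focused.append(f'{row["analysis"]}: p = {row["p"]:.3f}{star}')
    return focused


for line in narrow_focus(results):
    print(line)
```

In this made-up table, the third analysis has p<.05 but is skipped on a first read because it is not tied to a research hypothesis, while the second analysis is kept even without an asterisk because it tells you that part of the hypothesis was not supported.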
References
Hall, J. A., Steele, R. G., Christofferson, J. L., & Mihailova, T. (2021). Development and initial evaluation of a multidimensional digital stress scale. Psychological Assessment, 33(3), 230-242. https://doi.org/10.1037/pas0000979
Oja, M. (2022, May 12). Behavioral statistics. LibreTexts. https://stats.libretexts.org/Courses...Sciences_(Oja)


