11.5: Descriptive Statistics (Summary)
Key Takeaways
Key Terms and Concepts
DESCRIPTIVE STATISTICS
Techniques for summarizing and displaying data.
DISTRIBUTION
The set of scores on a variable for a group of individuals.
HISTOGRAM
A bar graph showing the frequency of different scores.
SYMMETRICAL
A distribution where both sides mirror each other.
SKEWED
A distribution with a tail extending more on one side.
OUTLIER
An extreme score that is very different from the others.
CENTRAL TENDENCY
A typical or average score that represents the distribution.
MEAN
The arithmetic average of all scores.
MEDIAN
The middle score when all scores are arranged in order.
MODE
The most frequently occurring score.
VARIABILITY
How spread out or dispersed the scores are.
RANGE
The difference between the highest and lowest scores.
STANDARD DEVIATION
The square root of the variance; roughly the typical distance of scores from the mean.
VARIANCE
The average of squared deviations from the mean.
PERCENTILE RANK
The percentage of scores at or below a given score.
Z SCORE
The difference between an individual score and the mean of the distribution, divided by the standard deviation of the distribution.
EFFECT SIZE
A measure of the magnitude of a relationship or difference.
COHEN’S d
A measure of effect size for the difference between two group means, expressed in standard deviation units.
LINEAR RELATIONSHIPS
Relationships that form a straight line on a scatterplot.
NONLINEAR RELATIONSHIPS
Relationships that form a curved pattern.
RESTRICTION OF RANGE
When one or more variables have a limited range in the sample, relative to the population.
BAR GRAPHS
Graphs using bars to compare groups or categories.
ERROR BARS
Visual representations of variability of each group or condition on graphs.
STANDARD ERROR
The standard deviation of a sampling distribution.
LINE GRAPHS
Graphs showing trends with connected points.
SCATTERPLOTS
Graphs displaying individual data points for two variables.
CORRELATION MATRIX
A table displaying correlations among multiple variables.
RAW DATA
Original, unprocessed measurements as collected.
DATA FILE
An organized dataset with variables as columns and participants as rows.
PLANNED ANALYSIS
Statistical analyses decided upon before data collection.
EXPLORATORY ANALYSIS
Examining data for unexpected patterns not specified in advance.
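As a rough illustration of the central tendency, variability, percentile rank, and z score definitions above, here is a minimal Python sketch using the standard `statistics` module. The `scores` list is hypothetical data invented for this example, and `percentile_rank` is an illustrative helper, not a library function.

```python
import statistics

# Hypothetical exam scores, invented for illustration only.
scores = [72, 85, 85, 90, 68, 85, 77, 94]

# Central tendency
mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle score once sorted
mode = statistics.mode(scores)      # most frequently occurring score

# Variability
score_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)  # average squared deviation (divides by N)
sd = statistics.pstdev(scores)           # square root of that variance

# z score: (score - mean) / standard deviation
z_scores = [(x - mean) / sd for x in scores]


def percentile_rank(score, data):
    """Percentage of scores in data that are at or below the given score."""
    return 100 * sum(x <= score for x in data) / len(data)


print(mean, median, mode, score_range)
print(round(variance, 2), round(sd, 2))
print(percentile_rank(85, scores))
```

The sketch uses `pvariance` and `pstdev`, which divide by N and so match the "average of squared deviations" wording in the glossary. When the scores are a sample used to estimate a population, `statistics.variance` and `statistics.stdev` (which divide by N - 1) are the usual choice.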
Test Your Knowledge (answers at end of section)
1. What does a correlation coefficient (r) of -0.85 indicate?
A) A weak negative relationship
B) A strong negative relationship - as one variable increases, the other tends to decrease
C) No relationship
D) A positive relationship
2. Ollendick and colleagues found that children receiving exposure treatment had a mean phobia rating of 3.47 (SD = 1.77) while those receiving education treatment had a mean of 4.83 (SD = 1.52). Cohen's d was 0.82. According to Cohen's guidelines, what does this effect size tell us that the means alone do not?
A) It shows this is a large effect - the groups differ by 0.82 standard deviations, making results comparable across different measures and studies
B) It proves the difference is statistically significant
C) It means the treatment caused the change
D) It indicates the study should be repeated
3. When reporting statistical results in APA style, what information must be included for a correlation?
A) Only the correlation coefficient
B) The correlation coefficient (r), sample size (n), and p-value
C) Only the p-value
D) Just the variables being correlated
4. What is the primary purpose of using statistical software (like SPSS, R, or Excel) in data analysis?
A) To make graphs look prettier
B) Only for very large datasets
C) To avoid learning statistics
D) To accurately and efficiently compute statistics, reducing calculation errors and allowing for complex analyses
5. A researcher finds that one participant has a z-score of +4.2 (reaction time 4.2 standard deviations above the mean). The chapter notes outliers are sometimes defined as scores beyond ±3.00. The chapter gave an example where adding a 5,000 ms reaction time to scores of 200-280 ms raised the mean from 245 ms to 1,445 ms. What does this example teach about handling outliers?
A) Always automatically remove any score with a z-score beyond ±3.00
B) Never remove outliers because all data are valid
C) Outliers can drastically distort statistics (especially the mean) so they require investigation
D) Replace outliers with the group mean
Answer Key
1. B - A strong negative relationship - as one variable increases, the other tends to decrease
A correlation coefficient of -0.85 indicates a strong negative relationship between two variables. The negative sign gives the direction: as one variable increases, the other tends to decrease. The magnitude (0.85) gives the strength: values closer to -1.0 or +1.0 indicate stronger relationships. A correlation of -0.85 is considered strong because it is close to -1.0, the strongest possible negative correlation. As a hypothetical example, if hours spent watching TV and GPA had a correlation of -0.85, students who watch more TV would tend to have lower GPAs.
2. A - It shows this is a large effect - the groups differ by 0.82 standard deviations, making results comparable across different measures and studies
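Cohen's d = 0.82 represents a large effect size (Cohen's guidelines: ~0.20 = small, ~0.50 = medium, ~0.80 = large). Because d expresses the difference between the means in standard deviation units, it can be compared across studies that use different measures, which the raw means alone cannot.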
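The reported value can be approximately reproduced from the means and standard deviations given in the question. The sketch below assumes the two groups were the same size (the question does not state the group sizes), so the pooled standard deviation is the square root of the average of the two variances. The function name `cohens_d_from_summary` is made up for this example.

```python
import math


def cohens_d_from_summary(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d from two means and SDs, ASSUMING equal group sizes."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Education group: M = 4.83, SD = 1.52; exposure group: M = 3.47, SD = 1.77
d = cohens_d_from_summary(4.83, 1.52, 3.47, 1.77)
print(round(d, 2))  # 0.82
```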
3. B - The correlation coefficient (r), sample size (n), and p-value
APA style calls for reporting the correlation coefficient (r), the sample size (n), and the p-value when presenting correlation results. Example: "There was a significant positive correlation between study time and exam scores, r(48) = .67, p < .001." The number in parentheses is the degrees of freedom, which for a correlation is n - 2, so r(48) corresponds to a sample of 50 and is how the sample size is conveyed. The coefficient shows the strength and direction of the relationship, while the p-value indicates statistical significance.
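The reporting pattern in this example can be sketched as a small formatting helper. `format_apa_correlation` is a hypothetical function written for illustration; it assumes degrees of freedom of n - 2, drops the leading zero (as the example above does), and reports very small p-values as p < .001.

```python
def format_apa_correlation(r, n, p):
    """Build a plain-text APA-style string such as 'r(48) = .67, p < .001'."""
    df = n - 2  # degrees of freedom for a correlation
    r_text = f"{r:.2f}".replace("0.", ".", 1)  # drop the leading zero
    if p < 0.001:
        p_text = "p < .001"
    else:
        p_text = "p = " + f"{p:.3f}".replace("0.", ".", 1)
    return f"r({df}) = {r_text}, {p_text}"


print(format_apa_correlation(0.67, 50, 0.0004))  # r(48) = .67, p < .001
```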
4. D - To accurately and efficiently compute statistics, reducing calculation errors and allowing for complex analyses
Statistical software serves several important purposes: (1) Accuracy - eliminates manual calculation errors, especially for complex statistics. (2) Efficiency - computes statistics instantly that would take hours by hand. (3) Handles complex analyses - enables sophisticated statistical techniques.
5. C - Outliers can drastically distort statistics (especially the mean) so they require investigation
The chapter's 5,000 ms example illustrates how a single outlier can distort the mean: the resulting mean (1,445 ms) is greater than 80% of the scores in the distribution and does not represent the behavior of anyone in the distribution very well. Outliers therefore require investigation, not automatic removal or automatic retention.
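The same effect can be seen in a small sketch that compares the mean and the median before and after adding one extreme score. The reaction times below are hypothetical values chosen for illustration, not the chapter's data.

```python
import statistics

# Hypothetical reaction times in ms (NOT the chapter's data).
reaction_times = [210, 230, 240, 250, 270]
with_outlier = reaction_times + [5000]

mean_before = statistics.mean(reaction_times)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(reaction_times)
median_after = statistics.median(with_outlier)

# Count how many scores fall below the mean once the outlier is included.
below_mean = sum(x < mean_after for x in with_outlier)

print(mean_before, round(mean_after, 1))  # the mean jumps sharply
print(median_before, median_after)        # the median barely moves
print(below_mean, "of", len(with_outlier), "scores are below the new mean")
```

In this sketch the median shifts only slightly while the mean ends up above every score except the outlier, which is one reason the median is often reported alongside the mean when a distribution contains extreme scores.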
References
| Japan | United States |
|---|---|
| 25 | 27 |
| 20 | 30 |
| 24 | 34 |
| 28 | 37 |
| 30 | 26 |
| 32 | 24 |
| 21 | 28 |
| 24 | 35 |
| 20 | 33 |
| 26 | 36 |

| Extraversion | Facebook Friends |
|---|---|
| 8 | 75 |
| 10 | 315 |
| 4 | 28 |
| 6 | 214 |
| 12 | 176 |
| 14 | 95 |
| 10 | 120 |
| 11 | 150 |
| 4 | 32 |
| 13 | 250 |
| 5 | 99 |
| 7 | 136 |
| 8 | 185 |
| 11 | 88 |
| 10 | 144 |
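The two data tables above can be summarized with the statistics from this chapter. The sketch below is one possible approach, not a prescribed method: it computes Cohen's d for the Japan and United States columns using the pooled sample standard deviation (the N - 1 formula, an assumption), and Pearson's r for the Extraversion and Facebook Friends columns. The tables do not say what the scores measure, so the variable names simply follow the column headers, and `cohens_d` and `pearson_r` are helper names invented for this example.

```python
import math
import statistics

# Data copied from the two tables above.
japan = [25, 20, 24, 28, 30, 32, 21, 24, 20, 26]
united_states = [27, 30, 34, 37, 26, 24, 28, 35, 33, 36]
extraversion = [8, 10, 4, 6, 12, 14, 10, 11, 4, 13, 5, 7, 8, 11, 10]
facebook_friends = [75, 315, 28, 214, 176, 95, 120, 150, 32, 250,
                    99, 136, 185, 88, 144]


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (N - 1 formula)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def pearson_r(x, y):
    """Pearson's r from sums of cross-products and squared deviations."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


d = cohens_d(united_states, japan)
r = pearson_r(extraversion, facebook_friends)

print("Means:", statistics.mean(japan), statistics.mean(united_states))
print("Cohen's d:", round(d, 2))
print("Pearson's r:", round(r, 2))
```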


