Social Sci LibreTexts

12.6: Statistical Learning

    • Todd LaMarr

    Statistical Learning Theory of Language Acquisition

    The natural environment presents infants with multiple streams of information occurring simultaneously. They encounter many objects, some moving, some stationary; they hear sounds, some language-related, others from animals and objects; they see people, some interacting with them, others in the background. Despite all of this information, these experiences are not completely random: there are predictable patterns that occur together. An infant’s surroundings contain statistical regularities that they can use to detect structure in a busy environment. Infants are able to detect regularities and co-occurrences in visual shape sequences and visual scenes (Fiser & Aslin, 2002; Kirkham et al., 2002, 2007). For example, take a look at Figure \(\PageIndex{1}\), which shows three object clusters, each composed of three colored shapes. Can you spot the regularity that co-occurs across the three object clusters? Although the three object clusters are not exactly the same, the regularity that occurs across all three is the pair of connected yellow and pink shapes. In one study (Wu et al., 2011), 9-month-old infants were shown sequences involving these same three-shape object clusters in which two pieces always co-occurred and one piece constantly changed. The researchers found that infants could keep track of which pieces co-occurred, suggesting that infants were able to track the statistical regularities they experienced. [1]

    There were three patterns: A, B, and C. Pattern A is shown here. For a more detailed figure, see Wu et al. (2011). Shapes were shown sequentially during familiarization trials and simultaneously during test trials. The split on the left is consistent with Sequence 1 but inconsistent with Sequence 2. The split on the right is inconsistent with Sequence 1 but consistent with Sequence 2. All stimuli were in full color against a black background.
    Figure \(\PageIndex{1}\): Sample of stimuli from an infant statistical learning task. ([2])

    Just as infants are able to track regularities in the visual environment, they are also able to track the regularities that naturally occur in language. When we talk, we produce a continuous stream of speech sounds. An important task for an infant is to figure out which specific sets of sounds within a speech stream go together to create words. One way infants do this is by tracking the statistical regularities of speech sounds that co-occur. To demonstrate this, a research study (Antovich & Graf Estes, 2018) had 14-month-olds listen to a speech stream consisting of four made-up words: timay, dobu, kuga, pimo. While listening, the children heard each of the four words repeated 120 times, but the order of the words was mixed up. Despite the random presentation of words in a continuous speech stream, there were regularities of sounds that always co-occurred. For example, ‘ti’ was always followed by ‘may’ (the two syllables in the word ‘timay’), and ‘pi’ was always followed by ‘mo’ (the two syllables in the word ‘pimo’). However, because the order of the words was mixed up, the last syllable of one word and the first syllable of the next word occurred together much less frequently: ‘may’ (the last syllable in the word ‘timay’) could be followed by any of the other three words at any time. The results revealed that even though the words were made up, the infants were able to track the statistical regularities and segment the words from the continuous speech stream. This research study, along with many others (for a review, see Romberg & Saffran, 2010), demonstrates that infants are able to track the statistical regularities of the language they are exposed to, and that this helps them in the initial stages of acquiring their native language(s).
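
    The contrast between syllable pairs inside a word and pairs that span a word boundary can be made concrete with a small simulation. The sketch below is only an illustration, not the procedure or analysis used by Antovich and Graf Estes (2018). It assumes the syllable splits do-bu and ku-ga by analogy with ti-may and pi-mo, orders the words with a plain shuffle (so a word may occasionally follow itself, a simplification), and quantifies regularity with forward transitional probability, one common measure in this literature. The function names and the 0.5 boundary threshold are hypothetical choices made for the example.

```python
import random
from collections import Counter

# Splits for 'timay' and 'pimo' follow the text; the splits for
# 'dobu' and 'kuga' are assumed by analogy.
WORDS = {
    "timay": ("ti", "may"),
    "dobu": ("do", "bu"),
    "kuga": ("ku", "ga"),
    "pimo": ("pi", "mo"),
}


def build_stream(repetitions=120, seed=0):
    """Return a flat syllable list: every word appears `repetitions` times in shuffled order."""
    order = [word for word in WORDS for _ in range(repetitions)]
    random.Random(seed).shuffle(order)  # simplification: immediate repeats are allowed
    return [syllable for word in order for syllable in WORDS[word]]


def transitional_probabilities(syllables):
    """Estimate P(next | current) as count(current, next) / count(current)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # the final syllable has no successor
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, tp, threshold=0.5):
    """Start a new word wherever the transitional probability falls below `threshold`."""
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


stream = build_stream()
tp = transitional_probabilities(stream)

print("within word   ti -> may:", tp[("ti", "may")])
print("across words  may -> do:", round(tp.get(("may", "do"), 0.0), 2))
print("recovered words:", sorted(set(segment(stream, tp))))
```

    Within-word pairs such as ti-may occur every time the first syllable appears, so their transitional probability is 1.0, while pairs that span a word boundary are far less predictable. Placing boundaries at those dips is enough to recover the four words from the unbroken syllable stream in this toy setting.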
    There is now a wealth of data documenting the statistical learning abilities of infants and toddlers (Saffran, 2020; Saffran & Kirkham, 2018). The detection of visual statistical regularities has even been documented in newborns (Bulf et al., 2011). Researchers have argued that statistical learning plays an important role in language learning (Saffran, 2001, 2003). These studies suggest that during their first year of language acquisition, before children begin to produce words, they start learning the patterns of the language they hear, tracking the sound combinations that correspond to potential words. Newman et al. (2006) found that infants’ ability to segment the speech stream into words was related to their language proficiency at 24 months and, later in childhood, between 4 and 6 years of age. Additionally, there is a growing body of evidence showing that statistical learning recruits the same brain areas as those used in language processing (de Vries et al., 2011; Folia et al., 2011; Petersson et al., 2012). [3] [4]


    [1] Barry, R. A., Graf Estes, K., & Rivera, S. M. (2015). Domain general learning: Infants use social and non-social cues when learning object statistics. Frontiers in Psychology, 6, 551. CC BY 4.0

    [2] Image adapted from Barry, R. A., Graf Estes, K., & Rivera, S. M. (2015). Domain general learning: Infants use social and non-social cues when learning object statistics. Frontiers in Psychology, 6, 551. CC BY 4.0

    [3] Ellis, E. M., Borovsky, A., Elman, J. L., & Evans, J. L. (2021). Toddlers’ ability to leverage statistical information to support word learning. Frontiers in Psychology, 12, 641. CC BY 4.0 https://www.frontiersin.org/articles/10.3389/fpsyg.2021.600694/full

    [4] Arciuli, J., & Torkildsen, J. V. K. (2012). Advancing our understanding of the link between statistical learning and language acquisition: The need for longitudinal data. Frontiers in Psychology, 3, 324. https://www.frontiersin.org/articles/10.3389/fpsyg.2012.00324/full#h9


    This page titled 12.6: Statistical Learning is shared under a mixed 4.0 license and was authored, remixed, and/or curated by Todd LaMarr.