2.7: Learning Meaning


Why Language Learning Seems Hard

Exercise \(\PageIndex{1}\)

Say a child is presented with a single example of an apple along with the word apple. What would it take for the child to be able to correctly apply the word in the future to other apples and not incorrectly apply it to, say, pears or strawberries or mushrooms?

One of the striking, and very powerful, features of human language is that it is learned. Within the domain of words alone, the fact that language is learned allows people to continually come up with new words for new concepts (by extension or combination of existing words, by borrowing, or by inventing). When words are combined to form sentences, even greater flexibility is possible, as we'll see later on in the book.

In and of itself, learning is not so impressive; after all, some birds learn their songs. Language learning seems amazing for two reasons.

What is learned, a human language, is very complex.
The information that children are provided about the language seems insufficient.

We'll be examining the complexity of language throughout this book. For now, I'll mention just one aspect of language you should already be familiar with. Language is applicable to a potentially infinite range of situations. That is, the learner has to generalize, to go beyond the examples that are seen during learning and apply language to novel situations. Once you know a word like rock, for example, you have the ability to use it to refer to any rock you might encounter and to understand it when someone else does this. Conversely, you know when not to use the word rock; you would not apply it to some sand, for example. And when you hear someone else using the word and there are several possible referents around, you know that the speaker is not referring to a tree or a puddle. The problem is that when you're learning the meaning of the word, you can't possibly be shown all rocks and told that they belong to the semantic category rock and shown everything else and told that it doesn't belong to the category.

Learning What Is and What Isn't a Rock

To see why this would matter, let's consider a very simple "language", consisting of just one word, rock. As a language learner, you have to figure out the meaning of each word. (You also have to figure out how to pronounce each word, but I'll save that kind of complexity for the next chapter.) Let's simplify further by assuming that the information you receive about the language consists of presentations of pairs of objects and words with no distractions of any kind (other objects that could be possible referents for the word, other words that could refer to the object). After lots of these presentations, you should begin to have an idea of what a rock is; that is, you would know how to use the word rock when you see an object that is very similar to the rocks you've been presented with. But what about a potential referent that is not so similar, for example, a rock that is much larger than the ones you've seen so far, or a clump of soil? How would you know to apply rock to the first of these and not to the second? In particular, without ever being told that a clump of soil or some sand is not a rock, what would prevent you from using the word to refer to these?

So it seems that to learn word meanings, you'd need two kinds of examples, examples illustrating where particular words apply (rock in the presence of a rock) and examples illustrating where particular words do not apply (not rock in the presence of a clump of soil). The first kind of information is called positive evidence, the second kind negative evidence. Even with all of this help, you might often find yourself stuck if you were faced with an object quite unlike any of the ones you'd seen (though to be fair people sometimes do have trouble deciding which word applies in a given situation).
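The role of the two kinds of evidence can be pictured with a small Python sketch. It is only a toy under stated assumptions: the three candidate meanings for rock and the features `hard`, `size`, and `on_ground` are invented for illustration and are not claims about how learners actually represent objects. The sketch shows that positive evidence alone leaves several candidate meanings standing, while a single negative example removes the overly broad one.

```python
from typing import Callable, Dict, List, Tuple

Thing = Dict[str, object]
Example = Tuple[Thing, bool]  # (object, True if the word "rock" applies to it)

# Hypothetical candidate meanings for "rock"; the features are invented for illustration.
HYPOTHESES: Dict[str, Callable[[Thing], bool]] = {
    "small hard lump": lambda t: bool(t["hard"]) and t["size"] == "small",
    "hard lump of any size": lambda t: bool(t["hard"]),
    "anything lying on the ground": lambda t: bool(t["on_ground"]),
}


def consistent(examples: List[Example]) -> List[str]:
    """Return the names of the hypotheses that agree with every labeled example."""
    return [
        name
        for name, applies in HYPOTHESES.items()
        if all(applies(thing) == label for thing, label in examples)
    ]


pebble = {"hard": True, "size": "small", "on_ground": True}
stone = {"hard": True, "size": "small", "on_ground": True}
boulder = {"hard": True, "size": "large", "on_ground": True}
soil = {"hard": False, "size": "small", "on_ground": True}

# Positive evidence only: two small rocks labeled "rock".
positive_only = [(pebble, True), (stone, True)]
print(consistent(positive_only))

# Negative evidence: a clump of soil explicitly labeled "not rock".
with_negative = positive_only + [(soil, False)]
print(consistent(with_negative))

# A further positive example (a boulder) removes the overly narrow meaning.
with_boulder = with_negative + [(boulder, True)]
print(consistent(with_boulder))
```

With only the two small rocks, all three candidate meanings survive. The soil example rules out "anything lying on the ground", and the boulder rules out "small hard lump", leaving a single candidate. Real learners face a far larger space of possible meanings, which is what makes the apparent scarcity of negative evidence a puzzle.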

Distractions: Other Objects and Other Words in the Context

So the learning task itself looks very challenging. What about the information that learners actually get about the words in the target language? First, it obviously does not consist of simple pairings of objects and words. Objects do not present themselves in isolation from the rest of the world. For a given presentation of a noun, there will usually be other objects and masses around in addition to the intended referent of the noun. Furthermore, as we'll see beginning in Chapter 5, words don't just refer to things; they also refer to properties of things or relations between things. Say a child has just heard the word tiger. Even if there is a tiger present, and the child is able to pick out the tiger in the scene, how does she know the word refers to that animal and not, say, the trees around the tiger, the ground beneath the tiger, the sky overhead, the tiger's legs, the relation between the tiger and the ground, the relation between the tiger's legs and the tiger's body, or any number of other noticeable aspects of the current situation? This problem, described in 1960 by the philosopher of language W. V. O. Quine in the context of translation rather than language learning, is sometimes called Quine's problem.

Furthermore, nouns usually do not occur in isolation; they occur together with other words. The child has to segment the noun out from the stream of sounds and figure out which aspects of the situation the different words refer to. If the child already knows all of the other words, this may not be too difficult, but the utterance may contain more than one unfamiliar word.

Finally, although we've seen that negative evidence seems crucial for learning, adults apparently do not provide children with negative evidence, at least not directly. That is, they do not say things like "this is not a rock" very often. Of course they may correct the mistakes that children make ("no, that's not a rock; it's dirt"), but children often seem to ignore the corrections, especially if they have made themselves understood in spite of their error.

So How is Language Learned?

These arguments about the complexity of language and the seeming lack of information available to children have led many researchers to look for constraints on how language is learned. The idea is that if the child somehow "knew" that only certain things were possible in human language, this would make learning simpler because the set of possibilities would be smaller. For example, if children knew from the outset that all languages have a category of words (common nouns) that are used to refer to categories of things in the world, then they could focus the learning process by "looking for" nouns in what they hear and not making misguided hypotheses about what the words are for. Words could work a very different way from the way they work in human language. For example, instead of referring to all instances of a category, a noun could refer (in a sort of metonymic way) to things that go together. So hearing cat for one cat, a learner could assume that the word is used for that cat and the things that go with it, its food, its owner, the place where it sleeps. Of course nouns don't work this way, and the point of the constraint would be to prevent learners from even considering that they do. We'll see other examples of possible constraints on language learning later on.

If we can agree on what some of these constraints are, how would the child get them? One common kind of proposal is that the constraints are innate, that is, that they are in the child's genes and not learned at all. The learner "knows" the constraints on what is possible in human language just because the learner is a human being. Another possibility is that the constraints come through experience in the first few months of life and are already in place when the relevant aspect of language is learned. However, this could only apply to constraints that could be learned by an infant through experience with the world. Many people don't distinguish these two positions because they both make the strong claim that constraints on how language is learned are already in place when the process begins in earnest. For the learning of words and their meanings, this means that the constraints are available by the time the child is about nine months old.

Innate Constraints vs. Statistical Learning

This view of language learning as constrained by universal principles that are either innate or learned very early on is quite popular. But it is not the only possibility. An alternative position gives a central role to the child's experience with language and the world. I'll refer to this as the "empiricist position". The empiricists believe that the linguistic information children receive is richer than it appears at first glance, that it contains many sorts of regularities, that is, recurring patterns. In addition to the regularities that are the conventions of language, how words are pronounced and combined in meaningful ways, there are regularities in the way people present language to children. For example, people tend to look at an object they are referring to if it is present, and this can help a child figure out what the word refers to.

This account of language learning requires that children be good statistical learners, that they be very sensitive to the regularities in the input around them, and empiricists have made the case that this is so. If children can learn this way, then they can not only pick up on the regularities that are present in the environment but can also compensate for the lack of negative evidence by noticing what tends not to occur with what as well as what tends to occur with what and by learning about the consequences of their own mistakes. Finally, empiricists argue that children are predisposed (probably innately) to be social creatures, to notice what other people are attending to and to be interested in what they are doing. Being interested in certain things and not in others constrains the space of things that the learner will attend to or will guess that language is about. For all of these reasons, the empiricists hold that people do not start out with innate knowledge of language but rather pick up what they know about language (or at least most of what they know) through experience. Because experience is so important in their view, some empiricists also argue that children learning different sorts of languages may behave somewhat differently, at least early on.
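The idea of noticing what tends to occur with what, and what tends not to, can be made concrete with a small Python sketch. It is a toy under stated assumptions, not a model from the research literature: the scenes, the function names `association` and `best_referent`, and the scoring rule (how often an object is present when the word is heard, minus how often it is present when the word is not heard) are all chosen here for illustration.

```python
from typing import Dict, List, Set, Tuple

# Each scene pairs the words heard with the things present; all data are invented.
Scene = Tuple[Set[str], Set[str]]

SCENES: List[Scene] = [
    ({"tiger"}, {"tiger", "tree", "ground"}),
    ({"tiger"}, {"tiger", "ground", "sky"}),
    ({"tree"}, {"tree", "ground", "sky"}),
    ({"sky"}, {"sky", "ground"}),
]


def association(scenes: List[Scene], word: str) -> Dict[str, float]:
    """Score each thing by P(present | word heard) - P(present | word not heard)."""
    with_word = [things for words, things in scenes if word in words]
    without_word = [things for words, things in scenes if word not in words]
    if not with_word:
        return {}
    everything = set().union(*(things for _, things in scenes))
    scores = {}
    for thing in everything:
        p_with = sum(thing in things for things in with_word) / len(with_word)
        p_without = (
            sum(thing in things for things in without_word) / len(without_word)
            if without_word
            else 0.0
        )
        scores[thing] = p_with - p_without
    return scores


def best_referent(scenes: List[Scene], word: str) -> str:
    """Return the thing most strongly associated with the word."""
    scores = association(scenes, word)
    return max(scores, key=scores.get)


print(best_referent(SCENES, "tiger"))
print(association(SCENES, "tiger")["ground"])
```

In these invented scenes the ground is present every time tiger is heard, but it is present just as often when tiger is not heard, so its score is zero and it drops out as a candidate referent. No one had to say "that is not a tiger"; the absence of the word in scenes that contain the ground plays the role of indirect negative evidence.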

Many linguistics textbooks dismiss the empiricist position, giving the impression that the issue is solved, that language (though obviously not particular languages) is basically innate. But this is not at all the case; if anything, the controversy is more heated now than it was ten or twenty years ago. As someone squarely on the empiricist side, I will try to show how this position makes sense, here and in other chapters, but I will not be claiming that we have all the answers. The question of how language is learned is still one of the great outstanding questions facing science.

The Shape Bias

Exercise \(\PageIndex{2}\)

How would the knowledge that nouns tend to refer to categories defined partly by shape help the child learning the meaning of the word apple?

In this subsection, we'll look at one example of a possible constraint on learning that seems to be reflected in the behavior of learners, and I'll suggest an alternative explanation for the behavior that doesn't require a specific constraint.

Many of the early words of children learning English and many other languages are nouns, both proper nouns (Daddy) and common nouns for concrete things, especially solid objects (dog, cup, book). In learning these kinds of nouns, it has been shown that children tend to generalize on the basis of shape, rather than material, color, or texture, for example. This tendency is called the shape bias. For example, consider the following experiment, which could be performed with a child of two-and-a-half years, say. The experiment begins with a training phase (above the line in the figure below) in which the child is shown an unfamiliar object and hears it labeled with a new word, such as dax. Next, in the test phase (below the line in the figure), the child is shown a set of objects, each of which matches the original object on one or more dimensions, and asked to find the one that the word best applies to ("show me the dax"). In the figure, the first test object matches the training object on shape, the second matches on color, and the third matches on texture pattern. In other words, the child is being asked to generalize on the basis of one example to another. In experiments like this, children tend to pick the object that matches the original object in shape, in the figure, the first object.

The shape bias is just a tendency; children do not exhibit the bias for (non-solid) masses, and they exhibit less of a shape bias for objects that they believe are animals, for example, things that appear to have eyes.

How the Shape Bias Can Help in Noun Learning

The shape bias can help children in the learning of noun meanings because it restricts the possible semantic categories to those in which shape, rather than color or texture, is a relevant dimension. If a child learning the word flute sees an example of the word, say a silver concert flute, the child can later extend the word to other objects that are similar in shape to the original example but avoid extending it to other objects that are similar only on other dimensions, for example, to a silver teapot. But where does the shape bias come from? One view is that it precedes word learning; it is either innate or it is learned on the basis of the child's early experience with objects.

But researchers led by psychologist Linda B. Smith have shown that there is a simpler account and one that agrees better with children's behavior. There are two details of their behavior that are relevant. First, the shape bias seems only to apply in linguistic tasks, that is, when children are labeling objects. When they are asked to group objects on the basis of their similarity, for example, they don't necessarily base their groupings on the shapes of the objects. This implies that the bias is not necessarily a general cognitive bias but only a bias that is relevant for language and that it could not be learned on the basis of pre-linguistic experience. Second, even for language, the shape bias does not appear until children have learned fifty or so words. The implication is that the shape bias is learned on the basis of the language that the children are exposed to. The research has shown that most of the early nouns children learn refer to categories that are defined by shape, categories such as cup and horse and apple. The idea is that as children learn more and more words of this type, they make the generalization that shape matters for nouns and then go on to use this generalization to help them in learning further nouns; shape is what they pay attention to when they are learning new nouns for objects. In other words, for this small aspect of language learning, at least, if children are good statistical learners, special constraints are not required.
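The logic of a learned shape bias can be illustrated with a small Python sketch. It is a toy under stated assumptions and is not the researchers' own model: the vocabulary, the three dimensions, and the rule for turning a vocabulary into attention weights (the fraction of same-word exemplar pairs that share a value on each dimension) are invented here. The test at the end mimics the dax task, with one candidate matching the trained object on each dimension.

```python
from typing import Dict, List

DIMENSIONS = ("shape", "color", "texture")
Thing = Dict[str, str]

# Hypothetical early vocabulary: exemplars of each noun agree in shape but not much else.
VOCABULARY: Dict[str, List[Thing]] = {
    "cup": [
        {"shape": "cup", "color": "red", "texture": "smooth"},
        {"shape": "cup", "color": "blue", "texture": "rough"},
    ],
    "ball": [
        {"shape": "round", "color": "red", "texture": "smooth"},
        {"shape": "round", "color": "green", "texture": "fuzzy"},
    ],
    "apple": [
        {"shape": "apple", "color": "red", "texture": "smooth"},
        {"shape": "apple", "color": "green", "texture": "smooth"},
    ],
}


def dimension_weights(vocabulary: Dict[str, List[Thing]]) -> Dict[str, float]:
    """Weight each dimension by how often two exemplars of the same noun share its value."""
    agree = dict.fromkeys(DIMENSIONS, 0)
    pairs = 0
    for exemplars in vocabulary.values():
        for i in range(len(exemplars)):
            for j in range(i + 1, len(exemplars)):
                pairs += 1
                for dim in DIMENSIONS:
                    agree[dim] += exemplars[i][dim] == exemplars[j][dim]
    if pairs == 0:
        # No words learned yet: no dimension is favored.
        return {dim: 1 / len(DIMENSIONS) for dim in DIMENSIONS}
    return {dim: agree[dim] / pairs for dim in DIMENSIONS}


def pick_referent(trained: Thing, candidates: List[Thing], weights: Dict[str, float]) -> Thing:
    """Choose the candidate whose matching dimensions carry the most weight."""
    def score(candidate: Thing) -> float:
        return sum(weights[dim] for dim in DIMENSIONS if candidate[dim] == trained[dim])
    return max(candidates, key=score)


dax = {"shape": "U", "color": "purple", "texture": "dotted"}
test_objects = [
    {"shape": "square", "color": "purple", "texture": "striped"},    # color match
    {"shape": "triangle", "color": "yellow", "texture": "dotted"},   # texture match
    {"shape": "U", "color": "yellow", "texture": "striped"},         # shape match
]

weights = dimension_weights(VOCABULARY)
print(weights)
print(pick_referent(dax, test_objects, weights))
```

With this invented vocabulary, shape receives the highest weight, so the learner extends dax to the shape-matching object. Before any words are learned, the weights are equal and no dimension is favored, which parallels the observation that the bias appears only after some vocabulary is in place.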

For a more in-depth discussion of the shape bias and related issues, see this 2000 paper by Smith and cognitive scientist Eliana Colunga.
