Evidence for Learning Phonology
Say you're trying to learn about the phonology of an unfamiliar language, but all you have access to is recordings of a set of sentences. You do not know what the sentences mean, nor do you know where the boundaries between the words are. What could you and could you not learn about the phonology of the language given this information?
To gain a better understanding of how phonology works, in this section we'll take the perspective of the learner. What does a learner have to achieve with respect to phonology? The learner has to learn to be both a Speaker and a Hearer, to figure out what the categories (phonemes) are and how they are realized in different contexts. What the learner is presented with is, of course, allophones, not phonemes directly. There are two kinds of evidence that the learner can use to arrive at what the phonemes are.
First, the learner could just pay attention to what phones tend to occur. As I've said a number of times, there are an infinite number of possible phones and, even within a given language, a very wide, if not infinite, set of possibilities. But for a given language, the phones the learner hears will tend to cluster in particular regions within the space of possibilities. For example, if the language is Spanish, there will be many possible vowel phones, but they will tend to cluster around the vowels [i], [u], [e], [o], and [a]. Nothing very close to [æ] or [ey] or to [ɯ] (an unrounded high back vowel) will occur. There will also be many possible consonant phones, but they too will tend to cluster around particular points in the space of possibilities. These will include [d̪] (a voiced dental stop), [ð], [t̪], and [ɾ], but not [r] or [t̪h] (an aspirated dental stop) or [t̪'] (an ejective dental stop). Of course these tendencies would be specific to Spanish. [æ] would occur if the language were English; [t̪'] would occur if the language were Amharic.
Apparently it's Impossible to Learn What the Phonemes in a Language are Without Paying Attention to Meaning
But listening to what phones occur and what phones do not does not provide any direct information on how they are used contrastively, that is, to distinguish words. For example, a child exposed to English will hear a variety of stops in contexts following vowels. Some of these contrast, for example, [t] and [p], but others do not, for example, [t] and [ɾ]. The only way to know for sure that they contrast is to pay attention to meaning as well as to the patterns of phones that occur. For example, if the Learner can tell that [ræt] and [ræp] mean different things, they will know that [t] and [p] contrast.
So there are these two sorts of evidence that the Learner can use: what sounds tend to occur and what different sequences of phones mean.
What There is to be Learned
But what sort of knowledge of phonology is there to be learned? We've seen that knowledge of the phonology of a language includes the following.
- Knowledge of what the contrastive categories of the language are, that is, the basic units that are used to make (and distinguish) the words of the language. In spoken languages these include phonemes and suprasegmental features; in signed languages the contrastive units appear to be syllable types.
- Knowledge of how the contrastive units are realized as particular forms in different contexts. This knowledge needs to be in two forms, one that enables the Speaker to pronounce words and one that enables the Hearer to recognize words.
- Knowledge of how the contrastive units may (and may not) be combined to form words (phonotactics).
In the rest of this section and the next section, we'll be considering how a person would learn the three types of knowledge. To simplify matters, we'll start by looking at a simple imaginary language, the one that a tribe of Lexies has arrived at in an early stage of the evolution of their system of communication. And rather than looking at data on children's production and comprehension, we'll first look at what kinds of information the child might have access to and might be useful in learning about phonology. In fact what I'll be discussing would apply just as well to a linguist who is trying to figure out the phonology of a previously unresearched language.
Of course infants and adult linguists differ in many important ways. For one thing, linguists are conscious of what they are learning about the language; their conclusions will be things they can describe and write down. Infants, on the other hand, are not conscious of any of this and will not even be conscious of the phonological knowledge they have when they grow up. Second, linguists can elicit data; that is, they can ask questions to test their hypotheses. Children obviously could not do this even if they were aware of what they were learning. Still, the task of the linguist and the "task" of the infant bear some interesting similarities.
Learning Phonology is like Learning Meaning in Some Ways
Figuring Out Phonotactics
Let's begin by thinking about the third kind of knowledge, phonotactics, because it will help us figure out the other two. Examining the general structure of the words, we see that they can consist of one or two syllables and that all of the syllables consist of a consonant followed by a vowel. This means that consonants appear in two different contexts, beginning a word and in the middle of a word following a vowel (and preceding another vowel). It also means that vowels appear in two different contexts, at the end of words and in the middle of words preceding a consonant (and following another consonant). So the next question we might ask is whether there are any constraints on which consonants and vowels can appear in which contexts or on which combinations of consonants or vowels occur in two-syllable words. We see that all of the consonants appear in the word-initial context but that only the following ones appear in the third position in two-syllable words: [b], [m], [d], [n], [s], [g], [ŋ]. We also see that all of the vowels can appear in either of the two vowel positions.
In addition, it's hard not to notice a striking regularity to the vowels: in two-syllable words, the first and second vowels are always the same. While languages are usually not this extreme, they often do have constraints on how neighboring phones must agree on some feature. This is true for clusters of final consonants in English, for example. When English syllables end with more than one stop or fricative, these consonants must agree in voicing; that is, either all (or both) must be voiced or all must be voiceless. For example, /kt/, /sk/, /fθs/, /bd/, and /gz/ are possible, but /kd/, /zk/, and /vðs/ are not.
While this language apparently has no stress, if it did we could also look at stressed and unstressed syllables to see if there are phones that can occur in one and not the other type of syllable. In English, for example, unstressed syllables are more constrained than stressed syllables in terms of what can occur.
So we can summarize what we've learned about the phonotactics of the language as follows:
- Syllables always consist of a consonant followed by a vowel (CV).
- Words consist of one or two syllables. In words consisting of two syllables,
  - the second syllable can only begin with one of the following consonants: [b], [m], [d], [n], [s], [g], [ŋ];
  - the vowels of the two syllables must be the same.
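These generalizations are simple enough to check mechanically. What follows is only a rough sketch in Python, not part of the Lexie data itself: it assumes that each phone is written as a single character, it uses only the forms quoted in this section rather than the full word list, and the names (WORDS, MEDIAL_OK, cv_pattern, obeys_phonotactics) are made up for illustration.

```python
# Illustrative sample: only the forms quoted in this section, not the full list.
WORDS = ["pi", "pe", "po", "pa", "mi", "su", "zu", "pobo", "toso",
         "nasa", "ŋasa", "maga", "sama", "zama", "pana"]

VOWELS = set("aeiou")
# Consonants observed at the start of the second syllable.
MEDIAL_OK = set("bmdnsgŋ")


def cv_pattern(word):
    """Return the consonant/vowel skeleton of a word, e.g. 'pobo' -> 'CVCV'."""
    return "".join("V" if phone in VOWELS else "C" for phone in word)


def obeys_phonotactics(word):
    """Check a word against the generalizations summarized above."""
    pattern = cv_pattern(word)
    if pattern == "CV":
        return True
    if pattern != "CVCV":
        return False  # neither one nor two CV syllables
    # Two-syllable words: restricted medial consonant, identical vowels.
    return word[2] in MEDIAL_OK and word[1] == word[3]


if __name__ == "__main__":
    for word in WORDS + ["popo", "pobe", "pab"]:
        print(word, cv_pattern(word), obeys_phonotactics(word))
```

The last three forms in the loop are invented to show what a violation looks like; a learner who later heard one of them would have to revise the generalizations.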
Minimal Pairs and Overlapping Distributions
Say a language learner discovers the following forms in the target language:
- [vam] 'break'; [fam] 'snow'
- [lo] 'picture'; [lu] 'picture'
- [kes] 'lip'; [kes] 'radio'
What does the first pair of words tell us about the status of [v] and [f] in the language? What does the second pair of words tell us about the status of [o] and [u] in the language? What does the third pair of words tell us about the language?
Now we need to figure out what the phonemes of the language are and how they're realized. Obviously it's important to know which phones occur (and which do not). As I noted at the beginning of this section, the phones in a language should tend to cluster around particular prototypical places, places that differ from one language to another. The transcriptions of the words above are meant to represent this. So, based on the words heard, the child has the vowels [a], [i], [e], [u], and [o], and the consonants [p], [b], [m], [t], [d], [s], [z], [n], [k], [g], and [ŋ] to deal with. (Note that a lot of the learning process is being left out here; deciding that there are this many phones, no more or less, is no mean feat, and children may in fact not do anything like that early in phonological learning.)
A particular phone P (really a cluster of phones centered on P) is a phoneme in a language only in the sense that it contrasts with the other phonemes in the language, that is, that the difference between P and those other phonemes can make a difference in meaning. This means that we can only establish what the phonemes are by comparing the different phones with one another. But which pairs should we be comparing and what sorts of comparisons should we be making? There's no point in comparing phones that are very different from one another because changing from one of these to another almost certainly changes the meaning. For example, in English we'd never expect the phones [b] and [s] to belong to the same phoneme. Rather what we're interested in are pairs that are relatively similar. For such pairs it is possible that both phones belong to the same phoneme, or that they belong to different phonemes. "Similar" phones will be phones that differ in only one or two features.
A Minimal Pair is the Clearest Evidence that Two Phones are Separate Phonemes
Let's start with the vowels because there are fewer of them. Of the five vowels, pairs that are somewhat similar include [i,e], [u,o], [a,o], and [e,a] ([a] is a low, central vowel). For each pair, we are interested in whether the difference between the two is enough to make a difference in meaning. The best evidence for this would be two words that differ only in that one has one of the phones, and the other has the other. Such a pair of words is called a minimal pair.
We have a minimal pair for [i] and [e], the words [pi] and [pe]. Both of the forms consist of two phones, the first of which is [p]; clearly the only difference is that one has [i], the other [e] in second position. It is important that we not only have two forms that differ in only one way but also that the two forms have different meanings. Otherwise they would not actually be different words. Since [pi] means 'rock' and [pe] means 'sun', and these two meanings are not obviously related to each other, it's clear that [pi] and [pe] are different words. And since the only difference in the forms is the difference between [i] and [e], we can be fairly sure that [i] and [e] are separate phonemes in the language. Let's tentatively call them /i/ and /e/, where the phoneme labels selected are supposed to represent the prototypical allophones. As far as we know so far, these apparent phonemes have only one allophone each, so this is the one we'll select for the phoneme label.
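The search for minimal pairs can be spelled out as a small procedure. Here is a minimal sketch in Python, assuming one character per phone and a tiny lexicon containing only the forms whose meanings are given in this section; the names LEXICON, differing_position, and minimal_pairs are illustrative.

```python
from itertools import combinations

# (form, meaning) pairs; only glosses given in this section are used.
LEXICON = [("pi", "rock"), ("pe", "sun"), ("po", "father"),
           ("su", "mother"), ("zu", "mother"),
           ("mi", "river"), ("mi", "hawk")]


def differing_position(a, b):
    """Return the index at which two equal-length forms differ in exactly
    one phone, or None if they differ in zero or several positions."""
    if len(a) != len(b):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return diffs[0] if len(diffs) == 1 else None


def minimal_pairs(lexicon):
    """List (form1, form2, phone1, phone2) for every pair of entries that
    differ in one phone AND have different meanings."""
    pairs = []
    for (f1, m1), (f2, m2) in combinations(lexicon, 2):
        i = differing_position(f1, f2)
        if i is not None and m1 != m2:
            pairs.append((f1, f2, f1[i], f2[i]))
    return pairs


if __name__ == "__main__":
    for pair in minimal_pairs(LEXICON):
        print(pair)
```

Note that the check on meaning matters: [su] and [zu] differ in one phone, but because both mean 'mother' they are not reported as a minimal pair.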
What about [u] and [o], the comparable pair of back vowels of different heights? Looking through the list of words, we find no minimal pairs for [u] and [o]. But this does not necessarily mean that these two phones could not be used contrastively, that is, that they are not separate phonemes. We would have evidence that they are contrastive if we could show that they are used in the same contexts, that is, that they can appear next to the same phones. If they're used in the same context, then the difference between [o] and [u] can't be due to assimilation or some other process related to context, because if it were, the contexts would have to be different for the two phones. In fact, it would be enough to show that they are both used in one particular context.
The range of contexts that a phone can appear in is called its distribution. We already know that all vowels can appear in one-syllable words and as either vowel in two-syllable words, so a vowel is always preceded by a consonant and sometimes followed by a consonant. What we'd like to know is which consonants can come before and after [o] and which can come before and after [u]. Looking at the words with these vowels, we find that, among other consonants, [p] and [t] can come before both vowels and that [b] can come after both vowels. So the indication is that [o] and [u] occur in the same contexts, or at least that their distributions overlap. Even though there are no pairs of words distinguished only by the difference between [o] and [u], it appears that there could be. For example, based on everything we know about [o] and [u], we could imagine a word pronounced [pu] that would mean something different than the word pronounced [po] 'father'. In other words, it appears that [o] and [u] are separate phonemes. We'll call them /o/ and /u/ tentatively.
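To make the idea of a distribution concrete, here is a minimal sketch in Python. It treats a context as the pair of phones on either side of the phone in question, with "#" marking a word edge; that representation, and the names WORDS and contexts, are assumptions made for illustration. Only forms quoted in this section are used, so on this partial sample the shared context that turns up is not one of the ones mentioned above (those depend on the full word list).

```python
# Forms quoted in this section that contain [o] or [u].
WORDS = ["po", "pobo", "toso", "su", "zu"]


def contexts(phone, words):
    """Collect (preceding, following) pairs for every occurrence of a phone.
    '#' stands for the beginning or end of a word."""
    found = set()
    for word in words:
        padded = "#" + word + "#"
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                found.add((padded[i - 1], padded[i + 1]))
    return found


# Contexts in which both vowels are attested in this sample.
shared = contexts("o", WORDS) & contexts("u", WORDS)

if __name__ == "__main__":
    print(sorted(contexts("o", WORDS)))
    print(sorted(contexts("u", WORDS)))
    print(shared)
```

Any non-empty intersection counts as overlap; how much overlap is enough to be convincing is a matter of judgement, not something the code decides.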
We can follow the same procedure for the other vowel pairs. The realization rules for the vowels are simple. Since, as far as we can tell, each vowel phoneme has only one allophone, each vowel is always realized as that allophone.
Establishing the Status of Two Phones Involves Looking at their Phonetic Contexts
Now let's consider the consonants. One possible set of pairs is consonants that differ only by voicing: [p,b], [t,d], [k,g], [s,z]. In many languages such pairs of voiced and voiceless consonants are allophones of the same phoneme. There are no minimal pairs in the list for any of these pairs of phones, so we need to see whether they can appear in the same contexts, as we did for the pair [o,u]. For [p,b], we discover that [p] appears only at the beginning of words, whereas [b] appears only in the middle of two-syllable words, that is, between vowels. In other words, there is no overlap at all in the distributions of [p] and [b]; we say that they are in complementary distribution.
Two similar phones that are in complementary distribution cannot be separate phonemes because we can't replace one by another in a form to get a different word. That is, if we're right about the distribution of [p] and [b], we can assume that there could be no form [ba] that would make a minimal pair with the existing form [pa] and no form [popo] that would make a minimal pair with the existing form [pobo].
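Complementary distribution can be tested in the same spirit. The sketch below, in Python, simplifies a consonant's context to its position in the word (initial or medial), which is enough for a language whose words are all CV or CVCV; the names WORDS, positions, and complementary are illustrative, and the word list is again just the forms quoted in this section.

```python
# Forms quoted in this section that contain [p], [b], [s], or [z].
WORDS = ["pi", "pe", "po", "pa", "pobo", "pana",
         "su", "zu", "sama", "zama", "toso", "nasa", "ŋasa"]


def positions(phone, words):
    """Return the set of positions in which a consonant is attested.
    In CV and CVCV words a consonant is either word-initial or medial."""
    found = set()
    for word in words:
        for i, ph in enumerate(word):
            if ph == phone:
                found.add("initial" if i == 0 else "medial")
    return found


def complementary(a, b, words):
    """True if both phones occur and never share a position."""
    pos_a, pos_b = positions(a, words), positions(b, words)
    return bool(pos_a) and bool(pos_b) and pos_a.isdisjoint(pos_b)


if __name__ == "__main__":
    print("p:", positions("p", WORDS), "b:", positions("b", WORDS))
    print("p/b complementary:", complementary("p", "b", WORDS))
    print("s/z complementary:", complementary("s", "z", WORDS))
```

On this sample [p] and [b] come out as complementary, while [s] and [z] do not, since both are attested word-initially.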
We can conclude that [p] and [b] belong to the same phoneme. We'll call it /p/, though we have no way at this point of knowing whether [p] or [b] is the prototypical allophone. The realization rules for /p/ are fairly simple. It is pronounced [p] at the beginning and [b] in the middle of words. With [p] as the default allophone, we can see the [b] allophone as resulting from assimilation. In the middle of words, the consonant is surrounded by vowels, that is, voiced sounds, so voicing it (changing it from [p] to [b]) makes it agree with the context on the voicing dimension. For this reason, it makes sense to choose [p] as the default allophone for this phoneme.
Languages tend to be systematic, so we should not be surprised when we find the same sort of distributions for the other stop pairs [t,d] and [k,g]. That is, the voiceless phones in each case appear only at the beginnings of words, while the voiced phones appear only in the middle of two-syllable words. Again we conclude that each pair represents a single phoneme. We'll call these phonemes /t/ and /k/. The realization rules are the same as for /p/, so at this point, we can make a more general realization rule for all three of the stops in the language: Pronounce the stop voiceless at the beginning of a word, and pronounce it voiced in the middle of a two-syllable word (between vowels).
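The general rule for the stops can be written down in both of the forms mentioned earlier, one for the Speaker and one for the Hearer. This is a minimal sketch in Python, assuming one character per phone and covering only the stop rule (not the other consonants); the function names realize and recognize are made up for illustration, and phonemic forms such as /popo/ are what the analysis above implies rather than anything given in the word list.

```python
VOICED = {"p": "b", "t": "d", "k": "g"}          # voiceless stop -> voiced allophone
VOICELESS = {v: k for k, v in VOICED.items()}    # voiced allophone -> phoneme
VOWELS = set("aeiou")


def realize(phonemic):
    """Speaker direction: voice a stop when it stands between vowels."""
    out = []
    for i, ph in enumerate(phonemic):
        between_vowels = (0 < i < len(phonemic) - 1
                          and phonemic[i - 1] in VOWELS
                          and phonemic[i + 1] in VOWELS)
        out.append(VOICED[ph] if ph in VOICED and between_vowels else ph)
    return "".join(out)


def recognize(phonetic):
    """Hearer direction: map each voiced stop back to its phoneme."""
    return "".join(VOICELESS.get(ph, ph) for ph in phonetic)


if __name__ == "__main__":
    print(realize("popo"), realize("pi"), realize("maka"))
    print(recognize("pobo"), recognize("maga"))
```

The Hearer's direction is simple here only because, as far as we know, [b], [d], and [g] never belong to any other phoneme.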
Sometimes a Phoneme's Realization Depends on Formality, Rate of Speaking, or Degree of Emphasis on the Word
For [s] and [z] we have what at first glance appears to be a minimal pair, [su] and [zu]. But this is not a minimal pair because the two forms have the same meaning, 'mother'. Apparently these are not different words for 'mother', but alternate ways of saying the same word. It is not clear from the list in what situations the different pronunciations are used. One possibility, similar to what we discovered for the pronunciations of the English word "at", is that the pronunciation depends on what precedes the word. Another is that the difference is related to formality, speed, or emphasis. Something like this happens in English with word-final voiceless stops. The /p/ at the end of a word such as "lip" would normally not be aspirated or released. But if the speaker is speaking unusually formally or slowly or with a great deal of emphasis on the word in question, the /p/ can be released and aspirated.
In any case it's clear that for the pair [su] / [zu] the difference between the [s] and the [z] is not contrastive; changing from one to the other makes no change in meaning. We should also notice the same thing going on with the two other forms beginning with [s] and [z]: [sama] and [zama]. Again the difference in the initial consonants makes no difference in meaning.
So far the evidence that we have indicates that [s] and [z] belong to the same phoneme. If we examine the other forms in the list that contain either [s] or [z], we find only three others, all containing [s] at the beginning of the second syllable: [toso], [nasa], and [ŋasa]. There are no words with [z] in this position. So, as far as we can tell from this list of words, there are no positions in words in which [s] and [z] make different words, and we conclude that [s] and [z] are allophones of the same phoneme. Because [s] seems to occur in more contexts than [z], we can consider it to be the prototypical allophone, and we'll refer to the phoneme as /s/. The realization rules for this phoneme only need to specify that it is optionally pronounced as [z] when it begins a word. Note that the rule is different than the ones for the stops, which get voiced when they are in the middle position in words.
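The pattern that gave [s] and [z] away, two forms with the same meaning differing in a single phone, can also be searched for directly. A minimal sketch in Python, using only forms whose glosses are given in this section and an illustrative function name, free_variants:

```python
from itertools import combinations

# (form, meaning) pairs; only glosses given in this section are used.
LEXICON = [("su", "mother"), ("zu", "mother"), ("pi", "rock"), ("pe", "sun")]


def free_variants(lexicon):
    """Find pairs of forms that share a meaning and differ in one phone.
    Returns (form1, form2, (phone1, phone2)) tuples."""
    result = []
    for (f1, m1), (f2, m2) in combinations(lexicon, 2):
        if m1 != m2 or len(f1) != len(f2):
            continue
        diffs = [(x, y) for x, y in zip(f1, f2) if x != y]
        if len(diffs) == 1:
            result.append((f1, f2, diffs[0]))
    return result


if __name__ == "__main__":
    print(free_variants(LEXICON))
```

A hit from this search says only that the two phones are not contrastive in that form; it says nothing about what governs the choice between them.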
There are three other consonants, the three nasals produced in the same three places of articulation as the other consonants, [m], [n], and [ŋ]. There is a minimal pair for [n] and [ŋ], [nasa, ŋasa], indicating that they are separate phonemes. For the other two pairs, [m,n] and [m,ŋ], there are no minimal pairs, but clear evidence of overlapping distributions: [maga], [nasa], [ŋasa]; [sama], [pana]. So we conclude that there are three nasal phonemes: /n/, /m/, /ŋ/.
Based on the evidence in the list of words, we can propose that the Lexie language has five vowel phonemes and seven consonant phonemes. But it's important to note that our baby has heard just this short list of words; there is more to the language. All of the generalizations that we have made about the phonemes and the phonotactics of the language could prove wrong with more evidence. In particular, whenever we concluded that something could not occur, the child could later discover that such a thing could occur. For example, we concluded that words consisted of one CV syllable or two CV syllables, but it is always possible that a word not yet heard could have a different structure, such as CVC. Similarly, we concluded that [p] and [b] are not contrastive because [p] always occurs at the beginning of words and [b] always in the middle. But the baby could later encounter (or become aware of) a previously unknown word with a form like [be] or [napa]. In fact, even our minimal pairs are suspect. Let's see how.
We've seen that languages use phonemes to distinguish words from one another. But this is only the usual case. We haven't yet considered another possibility for different words. Notice that in this Lexie language, [mi] can mean 'river' or 'hawk'. It's hard to see how these two meanings are related to one another, so we have to conclude that [mi] is really two different words that happen to have the same pronunciation. Such words are called homophones. Many (perhaps all) languages have homophones, probably mainly the result of historical accidents, changes that happened to bring the forms of two words together. Homophones represent an example of ambiguity, a situation in which a form has more than one possible interpretation. We will meet more ambiguity later in the book. Ambiguity presents a potential problem for Hearers because by itself the form cannot be interpreted. Hearers can normally solve this problem by using the context of the ambiguous form, either the other words that it appears with or the situation that it refers to. However, languages normally do not have very many homophones because of the burden this would place on Hearers.
Now consider a minimal pair like [pi] and [pe] again. Because these have different meanings, they are apparently different words, and we used this fact to conclude that [i] and [e] are separate phonemes. But what if [i] and [e] are just allophones of a single phoneme that varies considerably in how it's pronounced, even in a context at the end of words? In that case, [pi] and [pe] would be an example of a pair of homophones, both realizations of a phonemic form that we could write as /pi/. But homophones are rare (for good reason), and it is much more likely that the difference between [i] and [e] is contrastive, as we originally concluded. Still, more evidence would help us decide.
How to Figure Out the Phonology of a Language (Within Limits)
Now let's summarize what we've learned in the form of a set of instructions for discovering the phonology of a language. You can use these as a guide when solving phonology problems concerning real languages. But first, remember this caveat: any sample of words is necessarily incomplete, so learners can't be completely sure of their conclusions. It's better to think of the conclusions as hypotheses. The more data there is, the greater the learner's confidence in the hypotheses.
- Learn something about the phonotactics of the language. Begin by looking at the pattern of consonants and vowels that make up the words. This should tell you about syllable structure and maybe about constraints on the form of words. It may also be possible to make more generalizations by examining subcategories within consonants and vowels, for example, voiced consonants, and by looking at suprasegmental features such as stress. In particular, stressed and unstressed syllables may have different structures.
- Within the list of all of the phones you have heard, pick pairs of similar phones. For each, look for evidence that the phones belong to the same phoneme or different phonemes. For each pair, do the following.
  - Look for minimal pairs. If you find them, conclude that the phones are separate phonemes.
  - If you don't find minimal pairs, look at the contexts that the phones occur in to see if the contexts overlap. Can they be followed by the same or similar phones? Can they be preceded by the same or similar phones? If there is considerable overlap (you have to use your judgement here), conclude that the phones are separate phonemes.
  - If the contexts don't overlap, you probably have allophones of the same phoneme. To make sure, and to figure out what the realization rules for the phoneme are, look for the following.
    - Try to see if the distributions of the two phones are complementary (the usual situation for allophones). If so, the realization rules are straightforward: the phoneme is pronounced one way in one context (or set of contexts), the other way in the other context (or set of contexts).
    - Look for cases where you have forms with the same meaning differing only with respect to the two phones. If you find them, it may be that the choice of allophone in this context depends on factors such as formality. Alternatively, if the phone is at the beginning or end of a word, the choice could depend on what phone comes on the other side of the phone.
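The steps for a pair of similar phones can be strung together into a single rough procedure. The Python sketch below is one possible reading of the instructions, not a definitive algorithm. It assumes one character per phone, uses only forms quoted in this section (with None standing in for glosses that are not given here), and treats any shared context as "overlap", where a human analyst would use judgement. It also checks for same-meaning variants before checking for overlap, since forms like [su] and [zu] necessarily share a context. All names are illustrative.

```python
from itertools import combinations

# (form, gloss) pairs; None marks a form whose gloss is not quoted here.
LEXICON = [("pi", "rock"), ("pe", "sun"), ("po", "father"),
           ("su", "mother"), ("zu", "mother"),
           ("pa", None), ("pobo", None), ("toso", None)]


def one_phone_apart(f1, f2, a, b):
    """True if the forms are identical except that one has a where the other has b."""
    if len(f1) != len(f2):
        return False
    diffs = [(x, y) for x, y in zip(f1, f2) if x != y]
    return len(diffs) == 1 and set(diffs[0]) == {a, b}


def contexts(phone, lexicon):
    """(preceding, following) pairs for each occurrence; '#' is a word edge."""
    found = set()
    for form, _ in lexicon:
        padded = "#" + form + "#"
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                found.add((padded[i - 1], padded[i + 1]))
    return found


def classify(a, b, lexicon):
    """Return a tentative hypothesis about the status of phones a and b."""
    glossed = [(f, m) for f, m in lexicon if m is not None]
    pairs = [(m1, m2) for (f1, m1), (f2, m2) in combinations(glossed, 2)
             if one_phone_apart(f1, f2, a, b)]
    if any(m1 != m2 for m1, m2 in pairs):
        return "separate phonemes (minimal pair)"
    if any(m1 == m2 for m1, m2 in pairs):
        return "same phoneme (variants with the same meaning)"
    if contexts(a, lexicon) & contexts(b, lexicon):
        return "separate phonemes (overlapping distributions)"
    return "same phoneme (complementary distribution)"


if __name__ == "__main__":
    for a, b in [("i", "e"), ("o", "u"), ("p", "b"), ("s", "z")]:
        print(a, b, "->", classify(a, b, LEXICON))
```

As with everything else in this section, the output is a hypothesis about the sample, and a larger sample could overturn it.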
But what do we see when we look at actual data from children learning the phonology of their first language and adults learning the phonology of a second language? That's the topic of the next section.