3.3.1 From 3.6 Aspirated Stops in English, in Anderson's Essentials of Linguistics
We know now that we can use the IPA to transcribe speech sounds, and that our transcription can be either broad or narrow. When we make a narrow transcription, we’re including as much detail as possible about how speakers produce sounds, which often means including diacritics. To give an accurate narrow transcription of Canadian English, we would have to include a property that is part of nearly every variety of English – aspiration on voiceless stops.
To illustrate what aspiration is, I’m going to ask you to say a silly sentence: The spy wanted to buy a blueberry pie.
Now say it again, and hold your hand in front of your mouth. The spy wanted to buy a blueberry pie.
Did you feel any differences between the words spy, buy and pie? For native speakers of English, the word pie is produced with a little puff of air as the [p] is released. That puff of air is called aspiration. English speakers systematically produce aspiration on voiceless stops at the beginning of a stressed syllable, but not on voiced stops. To understand why, we have to think about voicing and about the manner of articulation.
Remember that voiced sounds are produced by vibrating the vocal folds, whereas voiceless sounds have the vocal folds held open so air can pass freely between them. Remember also that producing a stop involves closing off the vocal tract completely for a moment, then releasing the obstruction and allowing air to flow freely again.
Think about the voiced stop at the beginning of the word buy. The lips are closed – that’s the stop closure – and the vocal folds start vibrating for the voiced [b]. Then the lips open and the stop is released, and the vocal folds keep vibrating for the diphthong [aɪ].
But in the word pie, things work differently. The lips are closed for the bilabial stop. But because [p] is a voiceless stop, the vocal folds are not vibrating. We open the lips to release the stop, but 30 or 40 milliseconds pass before we start vibrating the vocal folds. That 30-40 milliseconds between when the stop closure is released and the voicing begins is called the voice onset time or VOT. In English, voiceless stops in certain positions have a VOT of 30-40 milliseconds, so we say that they’re aspirated. But voiced stops have a much shorter VOT, of about 0-10 milliseconds. In other words, the vocal folds start vibrating at almost exactly the same time as the stop closure is released, so voiced stops in English are unaspirated. The diacritic to indicate aspiration on a stop is a little superscript h, like so: [pʰ, tʰ, kʰ].
But to make matters even more complicated, it’s not all voiceless stops that get aspirated in English – only voiceless stops at the beginning of a stressed syllable. In words like appear and attack, the voiceless stop isn’t the first sound in the word, but it comes at the beginning of a stressed syllable so it gets aspirated. [əpʰiɹ] [ətʰæk]
But in the words apple and nickel, the voiceless stop comes after a stressed syllable and before an unstressed syllable, so it doesn’t get aspirated. [æpəl] [nɪkəl]
We don’t aspirate voiceless stops at the ends of words, like in brick. [bɹɪk]
And we don’t aspirate voiceless stops following an [s], even if they’re at the beginning of a stressed syllable, as in spy.
Aspiration of voiceless stops is something that native speakers do so regularly and so automatically that it’s very hard for us to perceive it because it’s just always there. To convince you, I’m going to record someone saying this sentence and show you the waveforms. This program is known as a waveform editor. And here’s Kendrick’s voice saying that sentence.
The spy wanted to buy a blueberry pie.
Here’s the waveform: this is a visual representation of the sound waves that Kendrick just produced. See that I can select certain parts of the sentence and play them back. spy, buy, pie
Look first at buy – you can see that there’s a very brief silence: that’s where Kendrick’s lips were closed for the bilabial stop. Then when he releases his lips the waveform gets nice and big for the sonorous vowel [aɪ].
Look over here at pie. You see the same silence where the lips are closed, and the same big waveform for the vowel [aɪ], but before the vowel, there’s this noisy burst of turbulence – that’s the aspiration.
And now look at spy. We see the turbulence at the beginning for the fricative [s], followed by the silence while the lips are closed and the nice sonorous vowel. But there’s no burst of noise following the release of the lips because the [p] in spy was not aspirated. In fact, if I select just the -py portion of spy, what does it sound like? To a native speaker of English, this part sounds like buy, because the [p] is unaspirated.
When you’re transcribing words with the voiceless stops [p t k], your challenge will be to figure out if the stops are aspirated or unaspirated, so you can indicate the aspiration in your narrow transcription. In most varieties of English, aspiration happens in these predictable environments.
- Voiceless stops are aspirated at the beginning of a word, and at the beginning of a stressed syllable.
- Voiceless stops are unaspirated at the beginning of an unstressed syllable. They’re also unaspirated in any other position, like at the end of a syllable or the end of a word.
- And even if a syllable is stressed, a voiceless stop is unaspirated if it follows [s].
- In English, voiced stops are never aspirated. They’re always unaspirated.
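The environments in the list above can be sketched as a small Python function. This is only an illustration under stated assumptions: the word is already split into syllables, each syllable is a list of IPA symbols plus a stress flag, and the function name `add_aspiration` is made up for this example. Because a stop that follows [s] is never the first segment of its syllable in this representation, the [s] case needs no separate check.

```python
# Hypothetical sketch of the English aspiration pattern described above.
# Assumption: the input is pre-syllabified; each syllable is a
# (segments, stressed) pair, where segments is a list of IPA symbols.

VOICELESS_STOPS = {"p", "t", "k"}


def add_aspiration(syllables):
    """Return a narrow transcription with aspiration marked on voiceless stops."""
    out = []
    for index, (segments, stressed) in enumerate(syllables):
        if not segments:
            continue  # skip empty syllables defensively
        new_segments = list(segments)
        first = new_segments[0]
        word_initial = index == 0
        # Aspirate only a syllable-initial voiceless stop that begins the word
        # or begins a stressed syllable. In "spy" the [p] is the second segment,
        # so it is left unaspirated.
        if first in VOICELESS_STOPS and (stressed or word_initial):
            new_segments[0] = first + "ʰ"
        out.extend(new_segments)
    return "[" + "".join(out) + "]"


pie = [(["p", "a", "ɪ"], True)]
spy = [(["s", "p", "a", "ɪ"], True)]
attack = [(["ə"], False), (["t", "æ", "k"], True)]
apple = [(["æ"], True), (["p", "ə", "l"], False)]

for word in (pie, spy, attack, apple):
    print(add_aspiration(word))
```

The final [k] in attack stays plain because only the first segment of a syllable is ever considered, which matches the point that stops at the end of a syllable or word are unaspirated.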
One thing that I want you to remember is that this pattern of aspiration is particular to the grammar of English, but stops behave differently in other languages. In French and Spanish, for example, voiceless stops are almost always unaspirated. And some languages, like Thai, actually have a three-way distinction between voiced, unaspirated voiceless, and aspirated voiceless stops.
Hint: [tʰ] has to occur at the beginning of a stressed syllable.
Hint: [kʰ] has to occur at the beginning of a stressed syllable.
Hint: [pʰ] has to occur at the beginning of a stressed syllable.
3.3.2: From 3.8 Other Articulatory Processes, in Anderson's Essentials of Linguistics
In our last unit, we talked about assimilation, when speech segments become more similar to nearby sounds because of coarticulation. There are other articulatory processes that shape the words that we say. Some of these processes occur simply as a result of speaking quickly and naturally. Some of them make speech more clear for a listener. Some of them happen over time within a dialect, as speakers start unconsciously changing the way they produce sounds.
While we were learning to do IPA transcription we talked about vowel reduction. It’s a very common process in rapid, natural speech. In English, the vowel in an unstressed syllable often gets reduced to the mid-central vowel schwa [ə]. This happens in lots of words. For example, we don’t usually pronounce this word electric as [ilɛktɹɪk]. Instead, because the first syllable is unstressed, the vowel gets reduced, and we say [əlɛktɹɪk]. Likewise, this word today doesn’t get pronounced as [tudeɪ]. The vowel in the first, unstressed syllable gets reduced and we say [tədeɪ].
In fact, sometimes an unstressed vowel gets reduced so much that it disappears altogether! This process is called, obviously, deletion. In some varieties of English, reduced vowels are systematically deleted in certain predictable environments, like in police or garage. Deletion can also occur within consonant clusters. It’s pretty common for speakers to delete the first [ɹ] in surprise or the [d] in Wednesday. Deletion also happens when we borrow words from other languages. For example, take the Greek word pteron, which means “wing”. When we borrow this word and incorporate it into helicopter, we pronounce both the [p] and the [t]. But when it comes at the beginning of a borrowed word, like pterodactyl, we just delete the [p] altogether, since English doesn’t allow two stops in a syllable onset.
Sometimes when we’re speaking, extra segments find their way into our words, as a result of coarticulation. Can you guess what word I’m saying? [pʰɹɪnts] Was it prince or prints? Only one of them is spelled with a “t”, but we pronounce them both the same way. In prince, an alveolar stop appears between the alveolar nasal and the alveolar fricative. The articulatory process that inserts an extra sound is called epenthesis. In English, this tends to happen between nasals and stops or between nasals and fricatives. Another example is in the word something, where we often epenthesize a little bilabial stop [p] between the bilabial nasal [m] and the voiceless fricative [θ]. Or when George W. Bush famously pronounced the word nuclear as [nukjəlɚ], he was epenthesizing a [j] between the [k] and [l].
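The prince and something examples can be sketched in Python as a rule that inserts a voiceless stop with the same place of articulation as a nasal when a voiceless fricative follows. This is a rough illustration, not a full account of English epenthesis: the function name `epenthesize` and the two lookup tables are assumptions made for this example, and the [ŋ] to [k] pairing is an extrapolation that the text does not discuss.

```python
# Hypothetical sketch: stop epenthesis between a nasal and a voiceless fricative.
# Assumption: the input is a list of single IPA symbols (broad transcription).

# Each nasal paired with the voiceless stop made at the same place.
# The ŋ -> k pairing is an extrapolation, not an example from the text.
HOMORGANIC_STOP = {"m": "p", "n": "t", "ŋ": "k"}
VOICELESS_FRICATIVES = {"f", "θ", "s", "ʃ"}


def epenthesize(segments):
    """Insert a homorganic stop after a nasal that precedes a voiceless fricative."""
    out = []
    for i, seg in enumerate(segments):
        out.append(seg)
        following = segments[i + 1] if i + 1 < len(segments) else None
        if seg in HOMORGANIC_STOP and following in VOICELESS_FRICATIVES:
            out.append(HOMORGANIC_STOP[seg])
    return out


print("".join(epenthesize(list("pɹɪns"))))    # prince
print("".join(epenthesize(list("sʌmθɪŋ"))))   # something
```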
Some articulatory processes result from speech errors. Some of these errors are characteristic of children’s speech, and some of them just occur in everyday rapid speech. Children’s speech often includes the process of metathesis, exchanging the position of speech segments. When my niece was little, she used to pronounce the word hospital as [hɑstɪbəl], exchanging the positions of the two stops. Metathesis can also happen when we borrow words from another language. When English speakers want to buy a burrito from the restaurant chain called Chipotle, we often metathesize the [t] and [l] and say, [tʃəpolti], because the “tl” sequence is rare in English.
Many of these articulatory processes are frequent and systematic in natural speech. In the next chapter, we’ll see that they play an important role in our mental grammar.
Hint: There is an extra sound produced or added to the sequence.
Hint: There is a vowel that was deleted.
Hint: The diphthong was reduced to a single vowel sound.
3.3.3 Other Phonological Rules, from Sarah Harmon
Now that we've talked about assimilation and dissimilation, let's talk about other types of phonological rules. To be clear, there are actually quite a few, but we're only going to focus on a certain number of them.
To start off, we're going to do a couple of easy ones, in the sense that these are both types of assimilation rules. The first is neutralization. Those of you who are experienced English speakers, you do not actually say 'writer' and 'rider' differently; you do not clearly pronounce the [t] or the [d] in the middle of those words. Rather, you say both of them exactly the same: [ɹajɾəɹ]. And notice that you do the same with 'latter' and 'ladder'; you frequently just say [læɾəɹ]. What happened to that [t] and that [d]? Well, they neutralized, and specifically in English it happens in the middle of the word when there are vowels on either side of that stop. They're neutralizing to something very similar but lower in the sound hierarchy. Think about it: the highest position on that sound hierarchy was a voiceless stop. A voiced stop was pretty high up as well, and [t] and [d] are otherwise exactly the same sound; it's just a matter of whether your vocal cords are vibrating or not. They have to neutralize to something that's very similar, just lower on the sound hierarchy. A tap is always going to be lower; it's not quite as occlusive, so it doesn't stop the air quite as much. It's also in the same place of articulation; it's an alveolar sound, just like [t] and [d]. They neutralize in that position. This is possibly one of the most common assimilation rules.
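Here is a minimal Python sketch of that neutralization, so you can see [t] and [d] collapsing to the same tap. It is only an approximation: the vowel and glide inventories and the function name `flap` are assumptions for this example, and, like the description above, it looks only at the surrounding sounds and ignores stress.

```python
import re

# Hypothetical sketch: [t] and [d] neutralize to the tap [ɾ] between vowels.
# Assumptions: broad IPA strings as input; the symbol sets below are a small,
# incomplete inventory chosen just to cover the examples.
VOWELS = "aeiouæɑɛɪʊəʌɔ"
GLIDES = "jw"  # offglides of diphthongs such as [aj] count as vowel-like context

# A [t] or [d] preceded by a vowel or glide and followed by a vowel.
FLAP_CONTEXT = re.compile(rf"(?<=[{VOWELS}{GLIDES}])[td](?=[{VOWELS}])")


def flap(ipa: str) -> str:
    """Replace intervocalic [t] and [d] with the tap [ɾ]."""
    return FLAP_CONTEXT.sub("ɾ", ipa)


for underlying in ("ɹajtəɹ", "ɹajdəɹ", "lætəɹ", "lædəɹ"):
    print(underlying, "->", flap(underlying))
```

Both members of each pair come out identical, which is exactly what it means for the contrast to be neutralized in that position.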
The other most common assimilation rule is something called palatalization, and it's exactly what you think it is: the sound is moving to the palatal region of the mouth. Experienced English speakers, 'I bet you' is not how we usually say that, is it? We usually say 'I betcha'. So what's going on? That [t] in 'bet' is getting pulled from the alveolar region to the palatal region because of the [j] that comes right after it. That's a palatal glide or palatal approximant. So, because of that, we're going to pull that sound toward the palate. Why does it happen? Well, there are a lot of theories, although none are really proven. Suffice it to say, think about the natural resting place of your tongue when you're not talking, you're not eating, you're not doing anything with your tongue. Where is your tongue? It's in the middle of your mouth, near your palate. That could be the reason that palatalization is the most used assimilation rule there is.
Let's talk about a few others.
While feature subtraction is fairly rare, feature addition is not; it happens quite often. We saw it when we talked about aspiration in English: [pʰ, tʰ, kʰ] are aspirated, with an extra puff of air, when they come at the beginning of a stressed syllable. That (feature addition) is something fairly common to do: to add aspiration, add nasalization, add some kind of feature.
What is also really common, though, is not just adding a feature but adding a segment (segment addition), an extra sound or two. I’m going to go back to Spanish because almost all of you have some experience with a native Spanish speaker; maybe you yourself are a native Spanish speaker. You know that in Spanish, for anything having to do with writing, the verb root is 'scribir': 'inscribir', to inscribe; 'transcribir', to transcribe; 'suscribir', to subscribe. Notice that we use them in English as well; they're cognates. But the verb meaning to write is not 'scribir'; it is 'escribir'. In Spanish, you have to stick an [e] in front of a certain combination; that combination specifically is an [s] and a consonant. You cannot start a word with an [s] sound and a consonant; it just can't happen in Spanish, so the rule is you epenthesize: you add that [e] in. By the way, French used to have this rule, too, but then the [s] dropped. Notice that the verb in French for writing is also a cognate, but 'to write' is not 'escrire'; it's 'écrire'. It's just that the [s] deleted along the way.
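The Spanish rule is simple enough to write down as a few lines of Python. Treat this as a sketch only: it works on spelling rather than on a phonetic transcription, the vowel set is a simplification, and the function name `add_initial_e` is invented for this example.

```python
# Hypothetical sketch of Spanish word-initial [e] epenthesis.
# Assumption: ordinary spelling stands in for the sounds, so "s" followed by
# any non-vowel letter counts as [s] plus a consonant.
VOWELS = set("aeiouáéíóú")


def add_initial_e(word: str) -> str:
    """Prefix 'e' when a word would otherwise begin with s + consonant."""
    if len(word) >= 2 and word[0] == "s" and word[1] not in VOWELS:
        return "e" + word
    return word


print(add_initial_e("scribir"))   # the bare root from the discussion above
print(add_initial_e("sol"))       # s + vowel, so nothing is added
```

Notice that the rule only cares about the very beginning of the word, which is why 'inscribir' and 'transcribir' keep the bare root after their prefixes.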
Speaking of French, we cannot talk about French without talking about apocope. Apocope is another word for segment subtraction, so getting rid of something. If you have studied French or you are a native speaker of French, you know that you drop consonants at the end of a word regularly. In fact, one of the jokes of learning French is to look at how the word is spelled and then not pronounce half the consonants. There's some truth to that joke, and it's because of apocope. 'Petite' is the term for small, like petite in English. 'Rose' is the word for 'rose', and 'agneau' is the word for 'lamb'. But how you pronounce the word for small is going to change, depending on whether the word that comes afterwards starts with a consonant or not. If it does, you delete the final consonant, so you do not say [pɛtit ʀoz]; you say [pɛti ʀoz]. You just don't say that second [t]. Notice that with 'agneau' you do: [pɛtit aɲø]. Apocope is an interesting phenomenon, and if epenthesis is common, apocope is almost as common.
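The French pattern can be sketched the same way: drop the final [t] only when the next word starts with a consonant. This is a toy version under clear assumptions: it handles only a final [t], it reuses the transcriptions given above, and the vowel set and the function name `join_with_apocope` are made up for this example.

```python
# Hypothetical sketch of the apocope pattern described above.
# Assumptions: two transcribed words as input; only a word-final [t] is
# eligible for deletion; the vowel set is a small, incomplete inventory.
VOWELS = set("aeiouyøœɛɔəɑ")


def join_with_apocope(first: str, second: str, final: str = "t") -> str:
    """Delete the final consonant of `first` when `second` begins with a consonant."""
    if first.endswith(final) and second and second[0] not in VOWELS:
        first = first[: -len(final)]
    return f"{first} {second}"


print(join_with_apocope("pɛtit", "ʀoz"))   # consonant-initial next word
print(join_with_apocope("pɛtit", "aɲø"))   # vowel-initial next word
```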
Finally, let's look at metathesis. It is sometimes called permutation; metathesis is the more common term. It's when you reorder sounds. [lehit] in Hebrew is the reflexive pronoun; it means 'self'. If you want to say ‘to arrange yourself’ or ‘to use yourself’ or ‘to apologize to yourself’, instead of [lehit] always being the form, it's going to change if the verb it goes with starts with a sibilant, an [s]-like sound. Instead of [lehit sader], you switch the [s] and the [t]: [lehis tader]. Instead of [lehit ʃameʃ], you switch the [ʃ] and the [t]: [lehiʃ tameʃ]. Instead of saying [lehit tsadek], you're going to switch the [t] and the affricate [ts]: [lehits tadek]. Pretty cool, right? Lest you think that this doesn't exist in languages like English…it does. For example, take the verb 'to ask'. Well, that's Standard American English or Mainstream American English. If you go to African American English, as well as Appalachian English and a number of other dialects, it is [æks]. You might think that [æsk] is the original, but it's not. In Old English, the verb was [aksjan], so clearly in certain dialects of English, including Queen's English and Standard British English some time ago, there was a metathesis between that [k] sound and the [s] sound, so that it is [æsk] versus [æks]. However, many dialects continue with the older pronunciation.
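The Hebrew reordering can be written as a short Python sketch: when the stem begins with a sibilant, the final [t] of [lehit] and that sibilant trade places. The function name `attach_lehit` and the sibilant list are assumptions for this example, the list covers only the three sounds mentioned above, and the output is written as one word rather than with the space used in the transcriptions above.

```python
# Hypothetical sketch of the Hebrew [t]/sibilant reordering described above.
# Assumption: only the three sibilants from the examples are handled.
# The affricate "ts" is listed first so it is matched as a single unit.
SIBILANTS = ("ts", "s", "ʃ")
PREFIX = "lehit"


def attach_lehit(stem: str) -> str:
    """Attach [lehit] to a stem, swapping its [t] with a stem-initial sibilant."""
    for sibilant in SIBILANTS:
        if stem.startswith(sibilant):
            # prefix without its [t] + sibilant + [t] + rest of the stem
            return PREFIX[:-1] + sibilant + "t" + stem[len(sibilant):]
    return PREFIX + stem


for stem in ("sader", "ʃameʃ", "tsadek"):
    print(stem, "->", attach_lehit(stem))
```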
Finally, let's talk about sequential constraints. Sequential constraints are when a specific language has rules as to what combinations are allowed and what are not allowed. For example, English cannot start a stressed syllable with a voiceless plosive that is not aspirated; that refers to the aspiration rule we talked about earlier. You cannot say, for example, [toʷmeʲtoʷ], without pronouncing that aspiration. It doesn't make sense, and it sounds weird. Likewise, we talked about the epenthesis rule in Spanish, that you cannot start a word with an [s] and a consonant; it has to have an [e] in front of it. For those who have studied German, you know that a word cannot end with a voiced consonant; it's always going to devoice, no matter how it's spelled. There are always accidental gaps, meaning possible words that follow the sequential constraints of phonemes for that language but that just don't exist. They haven't been used yet in the lexicon. That doesn't mean they won't get used in the future, and in many cases, this is where marketing gurus have fun. They use those accidental gaps to think of different names for different types of products.
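To make one of these constraints concrete, here is a small Python sketch of the German word-final devoicing just mentioned. It is an illustration under assumptions: the voiced-to-voiceless table covers only a handful of consonants, the function name `devoice_final` is invented, and the two sample words (Hund 'dog' and Rad 'wheel') are added examples rather than ones from the discussion above.

```python
# Hypothetical sketch of German word-final devoicing.
# Assumption: the input is a transcription that still ends in the voiced
# consonant suggested by the spelling; only these five pairs are covered.
DEVOICED = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def devoice_final(ipa: str) -> str:
    """Replace a word-final voiced consonant with its voiceless counterpart."""
    if ipa and ipa[-1] in DEVOICED:
        return ipa[:-1] + DEVOICED[ipa[-1]]
    return ipa


print(devoice_final("hʊnd"))   # Hund
print(devoice_final("ʁaːd"))   # Rad
```

A check like this is one way to think about what a sequential constraint does: any candidate word that would end in one of those voiced consonants simply never surfaces that way.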
But what this all points to is the fact that we have some kind of universal grammar. That grammar is not just how we put words together or how we put phrases together; it also includes how we put sounds together. This template, this early set of rules, gets massaged and added to as we learn our individual native language or native languages.