1.4: Dialects and Languages

Idiolects and Dialects

Exercise $\PageIndex{1}$

Two Americans are talking about a couple they have just met.

She sounded English to me, but he doesn't seem to have any accent at all.

Two English people are talking about the same couple.

He sounded American to me, but she doesn't seem to have any accent at all.

What's going on here? Who has the accent?

What I know about my language and how to use it is called my idiolect. It almost certainly varies in minor ways from the idiolects of all other speakers. But what is an idiolect? That is, what kinds of things do I know? In one sense, this whole book is an answer to that question, but we need to have a first cut at the answer here to help us get started.

I know words. I have a vocabulary, a set of words which I know how to pronounce and use appropriately. For example, I know how to say the word apple, I know that it refers to a particular type of fruit, I come up with this word when I want to refer to a particular apple, and I understand it when I hear it.
I know how to pronounce words and combinations of words more generally. That is, there are aspects of pronunciation that go beyond individual words. For example, I know to pronounce the ending that we spell -ed like /t/ in words like picked and watched but to pronounce it like /d/ in words like signed and burned.
I know how to put words together into sentences in meaningful ways. For example, I know that if I want to ask when a particular train leaves, I can say "when does the train leave?" but not "when leaves the train?".
I know how to use language appropriately to achieve my goals. I know that if I want a friend to lend me $100, it is better to say "I was wondering if you could lend me some money" than to say "give me $100".
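
The pronunciation knowledge in the -ed example can be written down as a small rule. The following is only a rough sketch, not a full account. The text gives just the /t/ and /d/ cases; the idea that the choice depends on whether the final sound of the stem is voiceless, and the third case after t or d (as in wanted), are assumptions added here from the usual textbook description. The function name and the spelling of the sounds are made up for illustration.

```python
# Hypothetical helper: pick the pronunciation of the past-tense ending -ed
# from the final SOUND (not the final letter) of the verb stem.
# The sound labels below are rough, ad hoc spellings, and the set is
# illustrative rather than exhaustive.

VOICELESS = {"p", "k", "f", "s", "sh", "ch", "th"}  # voiceless sounds other than t


def ed_pronunciation(final_sound):
    """Return the pronunciation of -ed after a stem ending in final_sound."""
    if final_sound in {"t", "d"}:
        return "ɪd"   # assumption beyond the text: wanted, needed
    if final_sound in VOICELESS:
        return "t"    # picked, watched
    return "d"        # signed, burned


# The four words from the text, paired with the final sound of their stems.
EXAMPLES = {"picked": "k", "watched": "ch", "signed": "n", "burned": "n"}

for word, final_sound in EXAMPLES.items():
    print(word, "-> /" + ed_pronunciation(final_sound) + "/")
```

Note that the rule has to look at sounds rather than letters: watch ends in the letter h, but what matters is the sound spelled ch.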

I'll be much more careful later on about how each of these types of knowledge is described, but for now I'll say (informally) that my idiolect involves knowledge about vocabulary, pronunciation, grammar, and usage.

Of course no one is really interested in describing idiolects. Linguists and other language scientists study the speech of communities of people, not of individuals. More specifically, they study the knowledge of vocabulary, pronunciation, grammar, and usage that is shared by the members of a speech community. Because the members of the community agree on this knowledge, because it differs (at least in some ways) from the knowledge shared by other communities, and because it is mostly arbitrary, I will refer to the knowledge as linguistic conventions.

But what is a speech community? I will use this term to refer to any group of people that shares a set of linguistic conventions differing in some noticeable way from the conventions found elsewhere. You may know that in the United States people in some cities have some characteristic features in their speech, although they are easily understood by people elsewhere in the United States. For example, people native to Pittsburgh are known for using you uns (or yinz) to mean 'you plural'. Here's an example from the (partly tongue-in-cheek) "Pittsburghese" website: if yinz wants served, raise your hands. The number of conventions that distinguish Pittsburghers from other English speakers in the northeastern United States is actually pretty small, but because there is such a set of conventions, we can consider these people to be a speech community. The speech patterns, that is, conventions of vocabulary, pronunciation, grammar, and usage, of a speech community are called a dialect, so we can speak of a "Pittsburgh dialect".

Note that a dialect may not be defined entirely on the basis of its physical location. Cities often contain a variety of ethnic and social groups with different speech patterns. For example, the African-American population of many US cities (for example, Pittsburgh) often has a quite different dialect from the Euro-American population of the same cities.

Which Dialect do you Speak? There may be a Number of Possible Answers.

What about larger communities? Pittsburghers share some speech conventions with speakers in other cities of the northeast and north midwest, for example, their pronunciation of the a in a word like hands, as in the example above (more on this pronunciation later on). And people in that larger region share some conventions with people in an even larger region encompassing speakers in most of the northern and western United States, for example, their pronunciation of the long English vowels (bite, beat, bait, boat, etc.). And people in that even larger region share many conventions with English speakers all over North America, including most of their grammar and usage conventions, as well as a number of pronunciation conventions, for example, the tendency to pronounce the words latter and ladder in roughly the same way.

This idea of larger and larger communities, each sharing fewer and fewer conventions, is an over-simplification in one sense. The fact is that the boundaries of the communities overlap in many ways. If we look at particular vocabulary, we may find a region with one boundary, whereas if we look at other vocabulary or at some pronunciation convention, we may find another boundary. For example, Pittsburghers tend to say pop (as opposed to soda or some other word) for carbonated drinks, and they share this convention with many speakers in the northern midwestern cities who also share their pronunciation of the vowel in hands, but not with speakers to the east of them, in New York City, for example, who share the pronunciation but not the word. (New Yorkers tend to say soda rather than pop.) Thus where we draw the boundaries around a dialect depends on which convention or set of conventions we're looking at.
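
One way to see why the boundaries depend on the convention is to record, for each community, the conventions it follows, and then ask which communities share a given one. The sketch below uses only the three communities and two conventions from the pop and hands example; the labels such as "word:pop" and the function name are my own shorthand, not standard notation.

```python
# Conventions per community, as described in the text.
# "vowel:hands" stands for the shared pronunciation of the vowel in hands.
CONVENTIONS = {
    "Pittsburgh": {"word:pop", "vowel:hands"},
    "northern midwestern cities": {"word:pop", "vowel:hands"},
    "New York City": {"word:soda", "vowel:hands"},
}


def communities_sharing(convention):
    """Return the set of communities that follow the given convention."""
    return {name for name, convs in CONVENTIONS.items() if convention in convs}


pop_region = communities_sharing("word:pop")
vowel_region = communities_sharing("vowel:hands")

print(sorted(pop_region))
print(sorted(vowel_region))
# The two "dialect regions" do not coincide.
print(pop_region == vowel_region)
```

Drawing the boundary by the word for carbonated drinks leaves New York City outside; drawing it by the vowel in hands puts New York City inside.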

What I've said so far is an over-simplification in another way as well: there is great variation within any of these regions. Some of this variation has to do with the constant contact between dialects that is a fact of life in most communities. Some of it also has to do with the fact that people often know a range of ways to say things, and in certain situations they may avoid their local dialect in favor of a standard (see below).

Each of these shared sets of conventions, whether at the level of a small village, a subculture within a city, or a larger region, is a dialect. A linguist can be interested in describing a dialect at any of these levels and in any of its aspects (pronunciation, vocabulary, grammar, usage). The pronunciation associated with a dialect is called an accent.

Languages

Exercise $\PageIndex{2}$

What is a language? How would you tell someone (say, an alien with no knowledge of human culture) what English is, without using the word language?

We can of course extend the boundaries in our example even further, beyond North America to include England, Scotland, Wales, Ireland, Australia, New Zealand, a number of Caribbean countries, and communities within many other countries. This large speech "community" is not really a community in the usual sense of the word, but it does share many conventions. For example, in all of these places, speakers make a question from a sentence like he ate potatoes by inserting the word did and changing the form of the verb ate: did he eat potatoes?, and of course speakers in all of these places share the word potato for referring to a class of tuberous vegetables. The conventions of this large "community" are what we refer to as "English", which we consider a language. Thus in one sense a language is a set of dialects. In another sense it is (like a dialect) a set of conventions shared by a speech community.

Two Dialects of One Language or Two Separate Languages?

But how do we decide when a collection of dialects is a language and not just another, more general dialect? As we've already seen, a dialect can also be a set of dialects (the North American English dialect consists of Southern dialect, New England dialect, Canadian dialect, etc.). What makes English a language and not just another very general dialect? What makes Canadian English a dialect of English and not a language in its own right?

The answer to this question is complicated. In fact there is no clear answer because the words dialect and language are used in different ways for different purposes. There are two completely different kinds of criteria related to the distinction between dialect and language: linguistic criteria and social or political criteria.

Linguistic Criteria

Given two overlapping sets of linguistic conventions associated with two different speech communities, for example, Mexican Spanish and Argentine Spanish, how do we decide whether they should count as two dialects or two separate languages? One criterion is the degree of overlap: how similar are the vocabulary, the pronunciation, the grammar, and the usage? Unfortunately there's no simple way to measure this overlap, at least no way that researchers would agree on. One way to have a sense of the overlap, though, is mutual intelligibility, the extent to which speakers from the two or more speech communities can understand each other. Mutual intelligibility is also not easy to measure, and it is often based on the impressions of speakers and hearers, how much they understand when they encounter members of the other group or how long it takes them to get accustomed to the speech of the other group. We also need to establish some sort of intelligibility threshold; no two speakers can be expected to understand each other all of the time. So none of this is precise at all. The idea is simply that if two sets of linguistic conventions are similar enough so that their speakers can usually understand each other, then the two sets of conventions should count as dialects of the same language rather than separate languages. On these grounds, we call Mexican Spanish and Argentine Spanish dialects of the same language (Spanish) because speakers of these dialects normally have little trouble understanding each other.

To find out what should count as a separate language on grounds of mutual intelligibility, a good resource is Ethnologue, an online database of all of the world's known languages, 6,912 according to their current listing. The Ethnologue compilers attempt to use mutual intelligibility to decide what should count as a language. While English is listed as a single language, both German and Italian are listed as multiple languages. Each of these languages, for example, the variety of Italian called Sicilian, is usually referred to as a "dialect", but, according to the Ethnologue compilers, these are distinct enough to be considered separate languages. Again, the criterion of mutual intelligibility is a rough one, and some of Ethnologue's claims are controversial.

Social and Political Criteria

Another sort of criterion for what counts as a dialect is the social or political unity of the group in question. In Bavaria, a state in southern Germany, and in parts of Austria most people speak a dialect called Bavarian or Austro-Bavarian, which on grounds of mutual intelligibility could be considered a language distinct from the speech of Germans and Austrians in other regions. Ethnologue calls Bavarian a language. But Bavarian is clearly closely related to those other dialects and not more closely related to dialects of some other language, and so for mainly political reasons, it is convenient to consider it a dialect of the German "language", rather than a language in its own right. Something similar can be said about the speaking conventions of the older generation in the Ryukyu Islands in southern Japan (because these dialects are dying out, most young people do not speak them). On the basis of mutual intelligibility, we could divide the island dialects into several separate languages, each distinct from the Japanese language (as is done in Ethnologue and in the Wikipedia article on these languages). But the Ryukyu Islands are politically part of Japan, and these dialects are clearly related to Japanese and not related at all to any other known language (unless we consider each of them to be a language). So for political reasons, it is convenient to consider them dialects of Japanese, just as the dialect of Osaka is considered a dialect of Japanese.

Mutually Intelligible "Languages"

At the other extreme are examples like the languages spoken in the northern European countries Sweden, Norway, and Denmark. These "languages" are all related to one another, and speakers from some pairs of these countries have little difficulty understanding one another when they are speaking the standard dialects of their languages, despite the obvious differences, especially in pronunciation. Thus on grounds of mutual intelligibility, we might consider some of these "languages" to be dialects of a single language. But Swedish, Danish, and Norwegian are official "languages" of separate countries, and there are separate spelling conventions for some of the sounds in the languages. So for mainly political reasons, they are considered separate languages rather than dialects of a single language.

Actually the situation is even more complex than this because Norway has two official dialects, and a fourth related language, Faroese, is spoken in the Faroe Islands, which are administered by Denmark.

To summarize, the line between dialects of one language and separate languages is somewhat arbitrary. However, wherever we draw the line, three points should be clear.

Every language has multiple dialects.
Every speaker of every language is also a speaker of at least one dialect of that language.
Since the pronunciation conventions of a dialect constitute an accent, every speaker of every language speaks with some accent. There is no such thing as "speaking without an accent".

Standard Dialects

Exercise $\PageIndex{3}$

The following appears on the website of a person who spent some time in Pittsburgh: "probably relating to the rest of Pittsburgh's terrible dialect, which I, fortunately, did not pick up". Why would some dialects be thought of as "terrible"?

Some dialects within a language may be singled out for special status. When we're dealing with a political unit, such as a nation, in which related dialects are spoken by most people, one dialect is often treated as the standard dialect. You know something about this already from the previous section of the book. The standard dialect is often the only dialect that is written, and it is the one that is taught in schools and (with some exceptions) used in the media. Thus in Germany, Austria, and the German-speaking part of Switzerland, it is Standard German that is taught in the schools and used in broadcasting, even though most people in this region are not native speakers of the Standard German dialect. This means that most people in the German-speaking countries end up bidialectal. The same situation holds in Japan, where it is Standard Japanese, based on Tokyo dialect, that is taught in the schools and used in the media. Note how this makes it possible to speak of a German or Japanese speech community, even when the native dialects of people in these communities are very different from one another, because all educated speakers in these communities share the standard dialect, often as a second dialect.

So what do we mean when we say "German" or "Japanese"? There are two possibilities. "German" could mean Standard German, that is, one of the set of dialects spoken in Germany and also the basis of written German. Or it could mean the collection of related dialects, some mutually unintelligible, which are spoken in Germany and other countries where Standard German is the official language (Austria) or one of the official languages (Switzerland). When linguists refer to "German" or "Japanese", without specifying the dialect, they normally mean the standard dialect.

Dialects of English in the US and England

In the United States, the situation is somewhat simpler than in Germany or Japan because the differences among most of the dialects are not nearly as great; native speakers of English in the United States have little trouble understanding each other. (An important exception is African-American Vernacular English (AAVE), spoken by many African-Americans.) As in Germany and Japan, we have an (informal) standard dialect, for vocabulary, grammar, and usage, if not for pronunciation. Thus children in Pittsburgh learn in school to write sentences like the school needs to be renovated rather than the school needs renovated, which would be grammatical in their local dialect. Americans tend to be relatively tolerant of differences in accent, however. Teachers in schools throughout the country teach the standard grammar but use their own local pronunciation. If we have a standard accent, it is the one people associate with television announcers, the accent characteristic of much of the Midwest and the West. This accent is called General American. I will have more to say about it later.

The situation in England is similar to that in the United States, and the standard vocabulary, grammar, and usage that children learn to write in English schools are very similar to the American standard. However, in England, there is a stronger idea of a standard accent than in the United States and more pressure for children to learn this accent if it differs from their home accent. This accent is referred to as Received Pronunciation (RP); it is based on the speech of educated speakers in southern England. (Note that RP is standard English English pronunciation, not British English; in Scotland, there is a quite different standard accent.) I'll have more to say later about RP and how it differs from General American and other English accents.

"Just" a "Dialect" or a Full-blown "Language"

The existence of a single standard dialect among a set of non-standard dialects has important social implications. The non-standard dialects have less prestige, and their use may be discouraged in formal situations, not just situations in which writing is called for. Sometimes, as in the Ryukyu Islands in Japan or in some regions of France and Spain, this leads to the decline and possible death of the non-standard "dialects" (which would be considered languages by the mutual intelligibility criterion). In other situations, speakers of non-standard dialects retain pride in their local speech patterns, while recognizing that they are not appropriate in certain situations. Finally, this pride, along with other cultural differences separating the speakers of the non-standard dialect from the speakers of other dialects (non-standard or standard), may lead to pressure to have non-standard dialects given official status, especially if they differ significantly from the standard. At this point the words dialect and language become politically charged terms because the supporters of official status for the non-standard dialect may feel the need to argue that it is not "just" a dialect of the larger language but rather a language in its own right. This has happened in the United States with AAVE (the sociolinguist John Rickford has written an essay on this topic) and in Europe with many languages that are normally considered "dialects" of other languages.

Language Families

Why Some Languages Resemble Each Other

We've seen how as we extend the boundaries of speech communities, we get fewer and fewer shared conventions. When we reach the level of a language such as English, Spanish, or Mandarin Chinese, we have a speech community which shares a set of conventions (in some cases a standard dialect) which allows people in the community to communicate with one another despite dialect differences. But we can go beyond a language. So for English, we could extend the boundaries to include the Netherlands, Germany, Scandinavia, and some other regions in western Europe. We'd now find a much smaller set of shared conventions. All speakers in this large "community", for example, share a word meaning 'all' which is similar in pronunciation to the English word all. But there would be no reason to call this set of conventions a "language" since the speakers obviously do not understand each other and do not belong to a single political unit with a single standard dialect. Instead we refer to this set of conventions, or set of languages, as a language family, in this case, the Germanic languages. The members of a language family resemble each other because they are genetically related; that is, historically they derived from a common ancestor language. (Note that this use of the word genetic differs somewhat from its use in biology; the speakers of Germanic languages are not necessarily genetically closer to one another than they are to the speakers of other languages.) The ancestor of the modern Germanic languages was not a written language, so we can only infer what it was like.

In most cases we can go even further back; the ancestor languages of two or more families themselves may have had a common ancestor language. Thus the modern Romance languages, including Spanish, French, Italian, Portuguese, Catalan, and Romanian; the modern Germanic languages; and many other languages spoken today in Europe, the Middle East, and South Asia, apparently descended from a much older (and also unwritten) language. This means we can group all of these languages into a single family, in this case the one we call Indo-European. Sometimes, to distinguish the lower from the higher levels within a family tree of languages, we use "language family" only for the largest grouping (for example, Indo-European) and "branch" to refer to groupings within this (for example, Germanic and Romance). Note that there may be many intermediate levels in the family tree of languages. Within Germanic, for example, there are North Germanic, including the Scandinavian languages, and West Germanic, including English, Dutch, and German.
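
The family tree described here is a nested structure, and it can help to see it written out as one. The sketch below is a minimal illustration that includes only the groupings and languages named in this chapter (with Swedish, Norwegian, and Danish standing in for the Scandinavian languages); it is nowhere near a complete tree of Indo-European. The function names are hypothetical.

```python
# A tiny fragment of the Indo-European family tree, using only the
# groupings and languages mentioned in the text.
FAMILY_TREE = {
    "Indo-European": {
        "Germanic": {
            "North": ["Swedish", "Norwegian", "Danish"],
            "West": ["English", "Dutch", "German"],
        },
        "Romance": ["Spanish", "French", "Italian",
                    "Portuguese", "Catalan", "Romanian"],
    }
}


def path_to(language, tree=FAMILY_TREE, prefix=()):
    """Return the groupings leading down to a language, or None if absent."""
    for name, sub in tree.items():
        here = prefix + (name,)
        if isinstance(sub, dict):
            found = path_to(language, sub, here)
            if found is not None:
                return found
        elif language in sub:
            return here
    return None


def closest_shared_grouping(a, b):
    """Return the lowest grouping in the tree that contains both languages."""
    path_a, path_b = path_to(a), path_to(b)
    if path_a is None or path_b is None:
        return None
    shared = None
    for group_a, group_b in zip(path_a, path_b):
        if group_a != group_b:
            break
        shared = group_a
    return shared


print(closest_shared_grouping("English", "German"))   # West
print(closest_shared_grouping("English", "Swedish"))  # Germanic
print(closest_shared_grouping("English", "Spanish"))  # Indo-European
```

The lower the shared grouping, the more recent the common ancestor, and (roughly) the more conventions we would expect the two languages to share.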

Note also that languages may resemble each other in one way or another for reasons other than a genetic relationship. The main non-genetic source of similarity is language contact; when the speech communities for two languages are in close cultural contact, their languages often influence one another. So modern Japanese vocabulary includes thousands of words borrowed from Chinese and uses the Chinese writing system (as well as writing systems specific to Japanese). But, except in the sense that all human languages may be ultimately related to one another, there is no evidence that Japanese is genetically related to Chinese. A more complicated situation occurred in Western Asia with the cultural influences among people speaking Arabic, Persian, and Turkish. These three languages belong to separate language families (Afro-Asiatic, Indo-European, and Turkic, respectively; Turkic is sometimes placed in a larger Altaic grouping, which is disputed), which are either unrelated to one another or only very distantly related, but Turkish and Persian have borrowed many words from Arabic, Turkish has also borrowed many words from Persian, and Persian borrowed its writing system from Arabic.