# 8.2: Reconstructions and Analysis

## 8.2.1 Reconstruction and Analysis, from Sarah Harmon

### Video Script

In the previous section, we started talking about a proto-language. In this section, we'll talk a little bit more about what reconstruction is and how we start to analyze historical linguistic data. This is just a primer; this is just the beginning. If you ever have a chance to take a historical linguistics course (usually an upper-division course), you're going to learn quite a bit more. For now, suffice it to say that this is your first step into being able to see diachronic change in action.

To start off, let's talk about language reconstruction. What do we mean when we say that we reconstruct a proto-language?

It starts off with the comparative method, which is pretty much what you think it is: you compare data that you can trust, and you come up with your best reconstruction of what the earlier language might have sounded like. Notice I said “sounded”; we tend to use the comparative method most with respect to sounds, that is, phonetics and phonology. We can use it for morphology and syntax, although it's less used there; we do compare and contrast, but not quite in the way that I’m going to show you. We don't really do this with respect to lexical reconstruction, or talking about how the lexicon changes over time, because that's a different animal altogether: historical linguistics focusing on semantics and pragmatics calls for a different type of analysis. Mostly we use the comparative method for sound, and the sound is going to help us understand the other parts as well.

When we look at a data set, we have a specific series of steps that we follow. The comparative method always focuses on one individual sound at a time, at each corresponding position. I’m going to walk you through a data set soon enough, but just to walk through the process, let's say we are analyzing a language family. We have data from four different languages, and the data include 25 lexical items. These are cognates, meaning that they are pretty close with respect to sound and meaning. For example, if you know a Romance language, the word for ‘water’ is pretty close, if not exactly the same, across the Romance languages. In some cases, they sound exactly the same, and in others, well, there have been a few changes to the sounds. Regardless, the word for ‘water’ is the same, and so, if we were to take água from Portuguese, agua from Spanish, aigua from Catalan, eau from French, acqua from Italian, and we were to line those up, we would be able to reconstruct the original word. We start with that [a] first, that first vowel, and go across, and it's the same vowel for pretty much everybody except French, so the earlier version probably also had an [a] at that same place. That consonant sound right afterwards is close but not exactly the same; sometimes it's a [g], sometimes it's a [k], and sometimes it has disappeared, like in French.

As we continue to reconstruct, we go across, analyzing each individual sound; we see where there are similarities and where there are differences. When there are differences, we first check for a ‘majority rules’ situation, meaning all the data but one show the same sound in the same place; if there's a majority, then we know the majority is probably going to win out. If you can't tell, then you use the consonant hierarchy. This is something I showed you when we talked about phonetics; I talked about how certain sounds are stronger than other sounds, and that we go from top to bottom on this list. In the case of agua, acqua, eau, and the other words for ‘water’ in the Romance languages, the big difference consonant-wise is that you either have a [g] or a [k], or you have nothing because it's been deleted. We see that deletion at the bottom, so clearly, there used to be some kind of consonant and French just deleted it. Going between a [k] and a [g], you're going to go with the sound that is higher in that list, meaning higher in the hierarchy. In that case, voiceless sounds are higher than voiced sounds; [k] is a harder, higher sound than [g]. That means in the earlier version, probably, it was a voiceless [k] that existed and, in fact, that is what we see in both Classical Latin and Vulgar Latin: that consonant is a [k], it's voiceless, so we know that it's right.
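As a rough sketch of the two decision rules just described, the snippet below applies ‘majority rules’ first and falls back to a consonant hierarchy when there is no clear majority. The function name `reconstruct_sound`, the three-item `HIERARCHY` list, and the sample reflexes are all assumptions made for illustration; they are simplified stand-ins, not phonetic transcriptions and not a full hierarchy.

```python
from collections import Counter

# Assumed, simplified ranking for illustration only:
# earlier in the list = higher on the hierarchy. "" stands for deletion.
HIERARCHY = ["k", "g", ""]


def reconstruct_sound(reflexes):
    """Propose a proto-sound for one sound position across cognates."""
    counts = Counter(reflexes)
    sound, count = counts.most_common(1)[0]
    # 'Majority rules' as described in the script:
    # every language but (at most) one shows the same sound.
    if count >= len(reflexes) - 1:
        return sound
    # No clear majority: choose the attested sound highest on the hierarchy.
    return min(counts, key=HIERARCHY.index)


# Hypothetical, simplified data for two positions in the 'water' words.
first_vowel = ["a", "a", "a", "o", "a"]
medial_consonant = ["g", "g", "g", "", "k"]

print(reconstruct_sound(first_vowel))       # a
print(reconstruct_sound(medial_consonant))  # k
```

The vowel position is settled by the majority, while the consonant position has no ‘all but one’ agreement, so the hierarchy decides in favor of voiceless [k] over [g] and deletion. A real analysis would need a much fuller hierarchy and careful alignment of the cognates before any rule like this could be applied.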

With respect to vowels, it's hard to reconstruct those exactly, but sometimes we can. For example, in the Polynesian languages, most of them have between three and five vowel sounds, sometimes up to seven. Chances are, in Proto-Polynesian there were probably a few more vowel sounds that merged together; across human languages, we tend to see more merger. That isn't always the case, and I will come back to a really famous example in the next section. Vowels are very hard to do with the comparative method; it's much more suited to consonants, but sometimes we catch a break.

The comparative method is one way that we start reconstructing proto-languages; we start giving our best hypothesis for what the earlier version looked like or sounded like. However, it's not the only thing that we do in historical linguistics. One of the other areas that we focus on is classification. We'll talk more about classification when we get to typology, which is a later section in this chapter. When we try to see which languages might be related to other languages, there are certain core facts, core data, that we are looking for. The first is the numerals, and most specifically it's numerals one through 10, for fairly obvious reasons: most of us, typically, are born with 10 digits on our hands, so the numbers one through 10 are going to tend to stay pretty close; you're not typically going to borrow any number one through 10.

We also look at nuclear family terms; as we talked about in semantics, family terms can vary and sometimes those terms can be borrowed. Terms for close family members (such as parents, children, and siblings, frequently but not always grandparents and grandchildren) do not tend to be borrowed from one language to another. There are exceptions to this rule; in fact, one really big one is that many of the Filipino languages borrowed a number of family terms from Spanish. While I don't actually know the reason why, I would suspect it could be that the Spanish terms gave more specificity than what the Filipino languages offered at the time. For example, in Spanish, you have specific terms for ‘aunt’ and ‘uncle’, for ‘cousin’, for some of these other kinds of family terms, and it could be that at the time of contact many of the Filipino languages just didn't have as many specific terms. To be clear, it's hard to know why that is the case.

The third area that is really important with respect to genetic classification is the core elements of syntax and morphology.
Notice it is specifically those two areas, and if we want to get very specific, it's how a given language embeds a clause or derives nouns and verbs. Why would that be? For example, English is a Germanic language. If you've ever learned a little bit about the etymology of many words in English, you'll know that we have borrowed a very large number of words, starting from the eighth and ninth centuries, when the Vikings started raiding the shores of the British Isles. The Angles, Saxons, and Jutes, who were there and had established themselves, were speaking a Germanic language, but it was a different Germanic language than what the Vikings spoke, which was Old Norse. And so there were terms borrowed from the Vikings. Then, fast forward to 1066, when William the Conqueror came over from Normandy, which is in France. He spoke Old (Norman) French, and Old French became the court language, which meant that a slew of terms for anything having to do with court life, government, and aspects of a castle (in fact, the term castle itself) were all borrowed from Old French at that time. Fast forward a few more centuries to the Renaissance, and you have a massive influx of Latinate terms that everybody shares. English has accumulated a very large number of lexical items from other languages, most of them Indo-European, and from well beyond Indo-European with the colonization of various parts of the world. Yet the core makeup of English (specifically how we derive our nouns and verbs, how we rely on certain processes, how we create our phrases) is very Germanic, and that has not changed in the history of our language with respect to external influences. Internal influences, yes, there have been changes and we'll talk about those coming up. For these reasons, we don't focus on the lexicon as a whole; we focus on the core elements of syntax and morphology. By that same token, we don't look at the phonology as much.
Certainly, there are characteristics that are typical; the sounds that we have in English are very Germanic. We have a number of fricatives and a lot of vowels; that's very Germanic. However, we can't look specifically at the sounds themselves, because we have borrowed or modified so much over history.

The fourth piece, as I said, is the external influences; we have to look at the migration patterns, at how the given language has moved through the world. What I have here is the key migrant routes from Africa to Europe, because this explains not just how a number of these African languages have changed over time through influence, colonization, or conquest by European groups, but also the major influences more generally. When we look at the migration patterns and we see a consistent migration pattern, we know that there's going to be some kind of influence of one language over the other. There's so much more to this part of the story, and we'll get to some pieces as we go through these next several sections.

I could go on forever talking about classification, and I will come back to more when we get to typology. For now, just know that when we talk about a language family, the languages in it have quite a few pieces in common. This is how we start figuring that out.

8.2: Reconstructions and Analysis is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by LibreTexts.