Social Sci LibreTexts

5.3: Structure within the sentence- Phrases, heads, and selection

    • Catherine Anderson, Bronwyn Bjorkman, Derek Denis, Julianne Doner, Margaret Grant, Nathan Sanders, and Ai Taniguchi
    • eCampusOntario

    From words to phrases

    Beyond the order of words, all human languages appear to group words together into constituents. The generalizations about which sentences people find grammatical and which ones they find ungrammatical don’t refer to purely linear properties like “fourth word in a sentence”, but instead to phrases in particular structural positions. In the rest of this section we’ll explore what it means to be a phrase in more detail; in the next section we’ll start talking about structural positions.

    A phrase is a set of words that act together as a unit. Let’s look at the example in (1) to see what this means:

    (1)   All kittens are very cute.

    What other groups of words can appear in the same position as the words all kittens in this sentence?

    (2) a. Puppies are very cute.
      b. The ducklings that I saw earlier are very cute.
      c. These videos of a baby panda sneezing are very cute.

    …and so on. It turns out that lots of different groups of words can go in this position, but not all of them! What all these examples have in common is that we’ve replaced [all kittens] with another group of words that includes at least one plural noun: puppies or ducklings or videos. If we swap in a singular noun, the sentences become ungrammatical, as we see in (3).

    (3) a. *The puppy are very cute.
      b. *The duckling that I saw earlier are very cute.
      c. *This video of a baby panda sneezing are very cute.

    …but if we change the plural verb are to the singular is they become good again (this is subject agreement inflection, last seen in 5.7 Inflectional morphology):

    (4) a. The puppy is very cute.
      b. The duckling that I saw earlier is very cute.
      c. This video of a baby panda sneezing is very cute.

    It turns out that the groups of words that we can easily substitute here are all ones that have a noun in them. But it’s not enough to just have some noun in the group of words at the front of the sentence, as the examples in (5) show. (5a) is ungrammatical even though the string of words at the beginning includes the pronoun I—and this sentence is ungrammatical whether we try the form is or are or even am. In (5b) the sentence is ungrammatical even though we have the compound noun baby panda, again no matter what form of the verb we try.

    (5) a. *That I saw earlier {is / are / am} very cute.
      b. *Of a baby panda {is / are} very cute.

    What distinguishes the grammatical sentences in (1), (2), and (4) from the ungrammatical sentences in (5) is that in (1), (2), and (4) the group of words at the beginning of each sentence is a noun phrase (remember that the sentences in (3) were ungrammatical just because they had the wrong agreement inflection). Noun phrases are groups of words that not only contain a noun, but where the noun is the “most important” element in some sense.

    By “most important” we mean that it’s the noun that determines an important part of the meaning of the subject, but also that it’s this noun that determines the category of the whole phrase, which determines where the phrase can go in relation to other phrases. The noun is the head of the phrase, the same kind of headedness we saw in 5.8 Compounding for compounds, but applied to words in a phrase instead of to morphemes in a word.
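
    To make the role of the head concrete, here is a minimal Python sketch. It assumes that the number of the head noun, rather than of any other noun inside the phrase, is what the verb agrees with, which is what examples (2c) and (4c) suggest. The dictionary layout and the function name copula_for are hypothetical and used only for illustration.

```python
# Each noun phrase records its head noun and the head's number.
# The layout of these dictionaries is an illustrative assumption.
plural_np = {
    "words": "these videos of a baby panda sneezing",
    "head": "videos",
    "head_number": "plural",
}
singular_np = {
    "words": "this video of a baby panda sneezing",
    "head": "video",
    "head_number": "singular",
}


def copula_for(np):
    """Pick the present-tense form of 'be' based on the head noun's number."""
    return "are" if np["head_number"] == "plural" else "is"


def build_sentence(np):
    """Combine a subject noun phrase with an agreeing verb and 'very cute'."""
    return f"{np['words']} {copula_for(np)} very cute"
```

    Both phrases contain the singular noun panda, but only the head (videos or video) affects the choice between are and is.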

    The head of a phrase also determines what else can go in the phrase; in particular it determines whether the phrase contains an object—though for heads that aren’t verbs, we usually use the more general term complement. Recall from the discussion of grammatical terminology in Section 6.2 that we classify verbs by their transitivity—that is, by how many objects they take. Each verb has an opinion about whether and how many objects it allows. By contrast, there’s no verb that cares whether it’s modified by an adverb (and also no verb that cares whether it has a subject or not, because all clauses in English require subjects). The technical term for this is selection: heads select their complements, both whether a complement is required or allowed, and what the complement’s category has to be.
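
    Selection can be pictured as a small lexicon in which each head lists the complements it allows. The sketch below is only an illustration: the verbs, their entries, and the names SELECTION and selects are assumptions chosen for the example, not part of the analysis in this chapter.

```python
# Toy lexicon: each verb lists the complement patterns it selects.
# An empty tuple means "no complement". The entries are illustrative assumptions.
SELECTION = {
    "sleep": {()},                          # no object allowed
    "devour": {("NP",)},                    # exactly one noun phrase object required
    "eat": {(), ("NP",)},                   # object optional
    "give": {("NP", "NP"), ("NP", "PP")},   # two complements
}


def selects(verb, complements):
    """Return True if the verb's entry allows this sequence of complement categories."""
    return tuple(complements) in SELECTION[verb]
```

    For example, selects("devour", ["NP"]) is True, while selects("devour", []) and selects("sleep", ["NP"]) are both False. Nothing in the lexicon mentions adverbs, which mirrors the point that verbs do not select for modifiers.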

    Headedness is important to the grammar of all languages, not just English. The right kinds of generalizations in syntax are never about single words like nouns or verbs, but instead about phrases like noun phrases or verb phrases.

    Importantly, phrases can contain other phrases of the same type inside of them. So for example, the noun phrase [these videos of a baby panda] contains a second noun phrase inside it, [a baby panda].

    The ability of a structure to contain another structure of the same type inside itself is called recursion. This is another key property of natural language grammars—even though there is some debate among linguists about whether all human languages exhibit recursion, everyone agrees that many or most languages do, and that one of the things we need to explain about our human language capacity is that all humans can acquire a language with recursion. You can learn more about child language acquisition in Chapter 11.
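
    Recursion is easy to see if we write a phrase as a data structure that can contain other phrases. The following Python sketch encodes [these videos of a baby panda] under the assumption that of a baby panda is a prepositional phrase inside the larger noun phrase; the class name Phrase and the function count_category are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Phrase:
    category: str                                  # e.g. "NP" or "PP"
    head: str                                      # the word that heads the phrase
    children: list = field(default_factory=list)   # words (str) or nested Phrase objects


def count_category(phrase, category):
    """Count phrases of the given category: the phrase itself plus any nested inside it."""
    total = 1 if phrase.category == category else 0
    for child in phrase.children:
        if isinstance(child, Phrase):
            total += count_category(child, category)
    return total


inner_np = Phrase("NP", "baby panda", ["a", "baby panda"])
pp = Phrase("PP", "of", ["of", inner_np])
outer_np = Phrase("NP", "videos", ["these", "videos", pp])
```

    Here count_category(outer_np, "NP") returns 2: the outer noun phrase and the noun phrase nested inside it. The function calls itself on nested phrases, so the same definition would handle deeper embedding as well.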

    Variation across languages: Word order within phrases

    As we’ve already seen, languages vary in their word order, but this variation isn’t random—it isn’t the case that anything goes in word order.

    This isn’t just true for the order of major constituents in a sentence (subjects, objects and verbs), but also for the order of elements inside phrases; in particular, the order of heads and what they select (their object or complement).

    In English it is always the case that heads precede their complements. This is true of verbs and their objects, prepositions and their noun phrase complements, and nouns and their prepositional phrase complements.

    (6) a. I [VP ate(V) [NP an apple ] ].
      b. [PP to(P) [NP Toronto ] ]
      c. [NP picture(N) [PP of a robot ] ]

    In contrast to English, Japanese is an SOV language. And in Japanese, heads always follow their complements. In other words, heads in Japanese don’t appear in the middle of their phrases like in English, but instead always at the end of their phrases.

    (7) a. Watasi-wa [VP [NP ringo-o ] tabe-ta. ]
        I-TOPIC     apple-ACC   eat-PAST  
        “I ate (an) apple.”
      b. [PP [NP Tokyo ] e ]
            Tokyo   to  
        “to Tokyo”
      c. [NP [PP robotto no ] shasin ]
            robot of   picture  
        “picture of (a) robot”

    This is the reverse of the order we get in English.

    Technically words like e (“to”) in Japanese would be postpositions instead of prepositions, and sometimes the more general term adpositions is used for both languages like English and languages like Japanese. These terms are parallel to suffix, prefix, and affix in morphology.

    The ability of heads to either precede or follow their complements is called head directionality. A language can be head-initial like English, or head-final like Japanese. If you’re analyzing an unfamiliar language and need to figure out its word order, one of the first questions you should ask is whether it appears to be head-initial or head-final.
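
    Head directionality can be sketched as a single switch that decides how a head and its complement are put in order. In the Python sketch below, a phrase is either a plain string or a (head, complement) pair; this representation and the function name linearize are assumptions made for illustration, and the example words come from (6) and (7).

```python
def linearize(phrase, head_initial):
    """Spell out a (head, complement) structure in head-initial or head-final order."""
    if isinstance(phrase, str):
        return phrase
    head, complement = phrase
    head_text = linearize(head, head_initial)
    complement_text = linearize(complement, head_initial)
    if head_initial:
        return f"{head_text} {complement_text}"
    return f"{complement_text} {head_text}"


# English, head-initial: [NP picture [PP of [NP a robot ] ] ]
english_np = ("picture", ("of", "a robot"))
# Japanese, head-final: [NP [PP robotto no ] shasin ]
japanese_np = ("shasin", ("no", "robotto"))
```

    With these definitions, linearize(english_np, head_initial=True) gives "picture of a robot" and linearize(japanese_np, head_initial=False) gives "robotto no shasin". The same structure is used in both cases; only the ordering switch differs.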

    In later sections of this chapter we’ll see other ways to derive differences in word order, involving differences in the movement (or transformations) available in a language’s grammar.


    If you are following the alternative path through this chapter that interleaves core concepts with tree structures, the previous section was 6.2 Word order and the next section is 6.4 Identifying phrases: Constituency tests.

    6.4: Identifying phrases- Constituency tests

    By identifying certain parts of sentences as phrases, we are making a claim that language users represent them as units in their mental grammar. The technical term for units inside a sentence is constituent: a constituent is any group of words that acts together within a sentence.

    Along with headedness, constituency is one of the central concepts in syntax. Both of these are highlighted when we represent the structure of language using tree diagrams, as we’ll see beginning in Section 6.13, but they’re fundamental to understanding the organization of sentences with or without trees.

    When we analyze a new sentence, how do we identify the phrases inside of it? We want to find evidence that certain groups of words actually do act together as units. To find that evidence, we use grammaticality judgements, and a few simple tests.

    The tests that identify constituents (often called constituency tests) that we’ll review in this chapter come in four basic types:

    • Replacement tests
    • Movement tests
    • It-clefts
    • Answers to questions

    Many textbooks also introduce a coordination test, but it is not always reliable, so we’ll discuss it briefly at the end of this section but won’t rely on it.

    REPLACEMENT TEST

    Here are two sentences to start with.

    (1)   The students saw a movie after class.
    (2)   The students saw a movie about dinosaurs.

    Let’s consider the string of words a movie. Based on discussion so far in this chapter, you might have the idea that this is a noun phrase—or at least that it could be a noun phrase. But whether or not you have that idea, we need evidence to decide one way or the other.

    One piece of evidence that something is a noun phrase is that you can replace it with a pronoun, and get a sentence with the same meaning (in a context where the meaning of the pronoun is made clear). In (3) we take the pronoun it and replace the string of words we’re interested in, then ask if the new sentence is grammatical and whether it has the same meaning.

    (3)   The students saw a movie after class. The students saw it after class.

    Replacing a movie with it in (3) does give us a new grammatical sentence that can mean the same thing as (1), so we have evidence not only that a movie is a constituent in (1), but also that that constituent is a noun phrase.

    What about a movie in (2)? Let’s run the same test there:

    (4)   The students saw a movie about dinosaurs. *The students saw it about dinosaurs.

    This time the result of replacing a movie with it is an ungrammatical sentence, so in (2) a movie is not a complete noun phrase. We might be surprised about this—we expect a noun like movie to be inside a noun phrase—but if we test other possible constituents we see that it’s not that there’s no noun phrase here, it’s just that it’s a bit bigger:

    (5)   The students saw a movie about dinosaurs. The students saw it.

    Based on comparing the results of our replacement tests in (4) and (5), we can conclude that in (2) a movie is not a complete noun phrase, but a movie about dinosaurs is both a constituent and a noun phrase.

    We can do the same pronoun replacement test with the string the students in (1). Because students is plural, the relevant pronoun is they:

    (6)   The students saw a movie after class. They saw a movie after class.

    The result of this replacement is grammatical, so we conclude that the students is also a constituent, and also a noun phrase.

    Replacement tests don’t have to involve pronouns. Verb phrases can be replaced with do (or do too), but seeing this usually requires setting up two sentences with different subjects or with a contrast in time like yesterday vs. today. Since we have just seen that the students in (1) is a noun phrase subject (because it comes at the beginning of a simple declarative sentence, before the verb), let’s set up a replacement test for a verb phrase, using a preceding sentence with a different subject:

    (7) a. The teachers saw a movie after class, and… The students did too.
      b. The teachers saw a movie after class, and… *The students did too before class.

    What we see in (7) is that did too can replace saw a movie after class, but can’t replace saw a movie alone. This tells us that saw a movie after class is a constituent, and it’s a verb phrase (because do (too) replaces verb phrases).

    What about the string after class? This string expresses a time, and we can replace it with the word then:

    (8)   The students saw a movie after class. The students saw a movie then.

    This shows that after class is a constituent; in fact, it’s a prepositional phrase. Not all prepositional phrases can be replaced by then, however—about dinosaurs is also a prepositional phrase, but can’t be replaced by then.

    (8′)   The students saw a movie about dinosaurs. *The students saw a movie then.

    Here the result of the replacement would be grammatical in other contexts, but it isn’t another way to say that the students saw a movie about dinosaurs. That is why it’s marked ungrammatical here: it’s ungrammatical on the intended meaning. You have to pay attention to both grammaticality and meaning when you do replacement tests.

    At this point, you’re probably wondering how you know what you can use as a replacement. Here are some handy tips:

    • Noun Phrases can be replaced with pronouns (it, them, they).
    • Verb Phrases can be replaced with do or do too (or did, does, doing).
    • Some Prepositional Phrases (but not all) can be replaced with then or there.
    • Adjective Phrases can be replaced with something that you know to be an adjective, such as happy (though in this case the meaning will change).

    Because replacement is category-specific, you can use the evidence of replacement tests both to identify constituents and to figure out the constituent’s category: If you can replace it with a pronoun, then you’ve got a noun phrase and you can look for the noun that’s the head. If you can replace it with do (too), then you’ve got a verb phrase which will have a verb as its head.
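
    The mechanical part of a replacement test can be sketched in a few lines of Python. The sketch only builds the candidate sentence; deciding whether the result is grammatical and keeps the intended meaning still requires a speaker’s judgement. The function name replace_span and the use of word positions to pick out a string are assumptions made for illustration.

```python
def replace_span(words, start, end, pro_form):
    """Return the words of the sentence with words[start:end] replaced by a pro-form."""
    return words[:start] + pro_form.split() + words[end:]


sentence_1 = "The students saw a movie after class".split()
sentence_2 = "The students saw a movie about dinosaurs".split()

# (3): replace "a movie" in sentence (1) with "it"
candidate_3 = " ".join(replace_span(sentence_1, 3, 5, "it"))
# (4): replace "a movie" in sentence (2) with "it"
candidate_4 = " ".join(replace_span(sentence_2, 3, 5, "it"))
# (5): replace "a movie about dinosaurs" in sentence (2) with "it"
candidate_5 = " ".join(replace_span(sentence_2, 3, 7, "it"))
```

    The three candidates are "The students saw it after class", "The students saw it about dinosaurs", and "The students saw it". The code produces all of them equally happily; it is the speaker’s judgement that (4) is ungrammatical that gives us the evidence about constituency.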

    MOVEMENT TEST

    Replacement is not the only tool we have for checking if a set of words is a constituent. Some constituents can be moved to somewhere else in the sentence without changing the sentence’s meaning or its grammaticality. Prepositional phrases are especially good at being moved. Consider this sentence:

    (9)   Nimra bought a scarf at that strange little shop.

    Let’s start by targeting the last string of words by moving it to the beginning. Move the string of words then ask yourself whether the resulting sentence is grammatical.

    (10)   Nimra bought a scarf at that strange little shop. At that strange little shop Nimra bought a scarf.

    It is! In isolation the sentence might sound a little unnatural, but we can imagine a context where it would be fine, such as, “At the department store she bought socks, at the pharmacy she bought some toothpaste, and at that strange little shop, she bought a scarf.”

    On the other hand, if we target a smaller string of words, as in (11), we get a different result.

    (11)   Nimra bought a scarf at that strange little shop. *At that strange Nimra bought a scarf little shop.

    The result of moving the string at that strange to the beginning of the sentence is a total disaster. The fact that the resulting sentence is totally ungrammatical gives us evidence that the string of words at that strange is not a constituent in this sentence.
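
    As with replacement, the mechanical step of the movement test can be sketched in Python, leaving the grammaticality judgement to a speaker. The function name front_span is hypothetical, and the sketch does not adjust capitalization or punctuation.

```python
def front_span(words, start, end):
    """Move words[start:end] to the beginning of the sentence."""
    return words[start:end] + words[:start] + words[end:]


sentence = "Nimra bought a scarf at that strange little shop".split()

# (10): front the whole string "at that strange little shop"
candidate_10 = " ".join(front_span(sentence, 4, 9))
# (11): front only "at that strange"
candidate_11 = " ".join(front_span(sentence, 4, 7))
```

    The first candidate is "at that strange little shop Nimra bought a scarf", which speakers accept in the right context. The second is "at that strange Nimra bought a scarf little shop", which they reject.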

    CLEFT TEST

    A cleft construction is one where you take two parts of a sentence and divide them from each other. (A cleft is a split or gap.)

    In English, a cleft is a sentence with the form: It is/was _ that _.

    To use the cleft test, we take the string of words that we’re investigating and put it after the words It was (or it is/it’s), then put the remaining parts of the sentence after the word that. Let’s try this for phrases that we’ve already shown to be constituents with our other tests.

    (12)   The students saw a movie after class.
      It was a movie that the students saw _ after class.
      It was after class that the students saw a movie _.
    (13)   The students saw a movie about dinosaurs.
      It was a movie about dinosaurs that the students saw _.
    (14)   Nimra bought a scarf at that strange little shop.
      It was at that strange little shop that Nimra bought a scarf _.
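
    The cleft template It is/was _ that _ can also be written as a small Python function that builds the candidate sentence. This is a sketch under simplifying assumptions: the function name cleft is hypothetical, the string under test is picked out by word positions, and the sketch does not handle verb phrases, which need a form of do in the remainder.

```python
def cleft(words, start, end, copula="was"):
    """Build 'It <copula> <focus> that <remainder>' from a sentence and a span of words."""
    focus = words[start:end]
    remainder = words[:start] + words[end:]
    return " ".join(["It", copula] + focus + ["that"] + remainder)


sentence_12 = "the students saw a movie after class".split()
sentence_14 = "Nimra bought a scarf at that strange little shop".split()

cleft_object = cleft(sentence_12, 3, 5)   # focus: "a movie"
cleft_time = cleft(sentence_12, 5, 7)     # focus: "after class"
cleft_place = cleft(sentence_14, 4, 9)    # focus: "at that strange little shop"
```

    These calls produce "It was a movie that the students saw after class", "It was after class that the students saw a movie", and "It was at that strange little shop that Nimra bought a scarf", matching (12) and (14) without the gap marker.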

    To cleft a verb phrase in English you need to put a present or past tense form of do in the position the verb phrase occupied in the original sentence, as shown in (15).

    (15)   The students saw a movie after class.
      (?)It was see a movie after class that the students did.

    Clefting a verb phrase doesn’t always sound totally natural in English, but most people find it better than clefting non-constituents. At the end of this section, go back and compare (15) to the sentences in (16) and (17), and see if you agree that (15) is better.

    By contrast, things that our tests showed were not constituents cannot be put into the first position of a cleft sentence:

    (16)   *It was a movie that the students saw _ about dinosaurs.
    (17)   *It was at that strange that Nimra bought a scarf _ little shop.

    Now let’s try the cleft test on a new sentence:

    (18)   Rathna’s brother baked these delicious cookies.
      It was these delicious cookies that Rathna’s brother baked _.
      It was Rathna’s brother that _ baked these delicious cookies.

    The cleft test shows us that the string of words these delicious cookies is a constituent, and that the words Rathna’s brother are a constituent. But look what happens if we apply the cleft test to other strings of words:

    (19)   Rathna’s brother baked these delicious cookies.
      *It was Rathna’s brother baked that _ these delicious cookies.
    (20)   Rathna’s brother baked these delicious cookies.
      *It was these delicious that Rathna’s brother baked _ cookies.
    (21)   Rathna’s brother baked these delicious cookies.
      *It was cookies that Rathna’s brother baked these delicious _.

    All of these applications of the cleft test result in totally ungrammatical sentences, which gives us evidence that the clefted strings of words (Rathna’s brother baked, these delicious, and cookies) are not constituents in this sentence. Remember, though, that just because a certain string of words isn’t a constituent in one sentence doesn’t mean it’s not a constituent in any sentence: the result of a constituency test only applies to the specific sentence you’re testing.

    ANSWERS TO QUESTIONS

    If a string of words is a constituent, it’s usually grammatical for it to stand alone as the answer to a question based on the sentence.

    (22)   Rathna’s brother baked these delicious cookies.
      a. What did Rathna’s brother bake? These delicious cookies.
      b. Who baked these delicious cookies? Rathna’s brother.

    Answers to questions can also help us identify a verb phrase, because they’re a good context for do-replacement (as a replacement test):

    (23)   Who baked these delicious cookies? Rathna’s brother did.

    In the answer, “Rathna’s brother did”, the word did replaces the verb phrase baked these delicious cookies.

    Again, if a string of words is not a constituent, then it is unlikely to be grammatical as the answer to a question. In fact, it’s difficult to even form the right kind of question:

    (24) a. What did Rathna’s brother bake cookies? *These delicious.
      b. Who of Rathna’s these delicious cookies? *Brother baked.

    SUMMARY

    Results of tests like these are how we investigate the structure of the mental grammar that underlies how people use the languages they know. We can’t observe mental grammar directly, so observing how words behave is how we make inferences about how it must work. These four tests are tools that we have for observing how words behave in sentences. If we discover a string of words that passes these tests, then we know that the string is a constituent, and that tells us something about the organization of the sentence as a whole.

    Not every constituent will pass every test, but if you’ve found that it passes two of the four tests, then you can be confident that the string is actually a constituent.
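
    The rule of thumb above can be written down as a small Python function that tallies a speaker’s judgements. The test names, the dictionary layout, and the function name likely_constituent are hypothetical; the threshold of two passing tests is the one suggested in this section.

```python
TESTS = ("replacement", "movement", "cleft", "answer")


def likely_constituent(judgements, threshold=2):
    """Return True if the string passed at least `threshold` of the four tests.

    `judgements` maps a test name to True (passed) or False (failed);
    tests that were not run are simply left out.
    """
    passed = sum(1 for test in TESTS if judgements.get(test, False))
    return passed >= threshold


# Judgements reported in this section for two strings from sentence (9)
at_that_strange_little_shop = {"movement": True, "cleft": True}
at_that_strange = {"movement": False, "cleft": False}
```

    With these judgements, likely_constituent(at_that_strange_little_shop) is True and likely_constituent(at_that_strange) is False. The function only counts results; the judgements themselves still have to come from speakers.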


    If you are following the alternative path through this chapter that interleaves core concepts with tree structures, the previous section was 6.3 Structure within the sentence: Phrases, heads, and selection and the next section is 6.5 Functional categories.


    This page titled 5.3: Structure within the sentence- Phrases, heads, and selection is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Catherine Anderson, Bronwyn Bjorkman, Derek Denis, Julianne Doner, Margaret Grant, Nathan Sanders, and Ai Taniguchi (eCampusOntario) via source content that was edited to the style and standards of the LibreTexts platform.