When good sentences sound bad

Understanding the limits of how the mind processes language.

Author’s rating for this post: 2 out of 4. It’s intended for language “dabblers” willing to put in a bit of effort.


Here’s a game for those of you who are native speakers of English. Which one of the following sentences has a grammatical error?

  1. The girls think the boys will eat ice cream.
  2. The girls think the boys who will eat ice cream.
(Image: Pampered Chef)

That was easy, right? (In case you were wondering: sentence (1) is fine, sentence (2) is not. No ice cream for correct answers, sorry.) What about the next two sentences?

  3. Yesterday, the girls who love dogs said the boys who will eat ice cream.
  4. Yesterday, the girls who love dogs said the boys who will eat ice cream love cats.

That was pretty easy, too: sentence (3) is bad, while sentence (4) is way better. As native speakers, we can usually tell whether a sentence in our language contains grammatical errors, even when we can’t say exactly what went wrong. We do so quickly and reliably, even for fairly complicated sentences. If someone actually said sentences like (2) and (3) to you, you might quickly conclude that this person doesn’t speak English very well.

The ease with which we can identify whether a sentence contains grammatical errors says something about our mind, and one of the major goals in language science is to spell out exactly how the mind is set up to process language so efficiently.

Somewhat paradoxically, one way to learn about language and the mind is by investigating the exceptional circumstances in which our language processing abilities falter and grammatical sentences consistently seem awful to us. “Consistently” is an important caveat here, since we all know from personal experience that no native speaker can be perfect all the time; we sometimes skip a word when we read, mishear an utterance, or fail to pay enough attention to it. What researchers are interested in are grammatical sentences that pose a problem for everyone, even in the most favorable environments. Our struggles with these sentences can then be blamed on some special cognitive “bug” that all of us have, rather than on, say, the fact that we differ in attention span and cognitive abilities (e.g. some people are good at doing mental arithmetic, others not so much).

Center-embedding

A class of grammatical sentences that has long baffled native speakers and excited researchers involves the phenomenon known as “center-embedding.” To see how this works, let’s begin with acceptable simple sentences.

  5. The cat chased the mouse.
  6. The dog growled at the cat.
Left: The cat chased the mouse; Right: The dog growled at the cat.
(Left image: Jeroen Moes; Right image: http://www.dogclipart.com/)

We can add more words to the nouns “the cat” and “the dog” to provide additional information about the cat and the dog in question, as in (7) and (8). (Technically, what we added are “relative clauses”: think of them as almost-complete sentences that describe nouns.) Sentences (7) and (8) are longer than (5) and (6), but you can still tell that they are fine.

  7. The cat that the dog growled at chased the mouse.
  8. The dog that the girl fed growled at the cat.

In fact, there doesn’t seem to be a cap on the number of relative clauses one could add or where they appear in a sentence. Sentence (9) is an even longer sentence with two relative clauses, where the relative clauses appear at the end, one inside the other. (The outer relative clause is “that growled at the cat that chased the mouse”; the inner one is “that chased the mouse.”) We can easily judge sentence (9) to be OK.

  9. The girl fed the dog that growled at the cat that chased the mouse.

Intuitively, we might predict that native speakers can judge whether a sentence is fine regardless of how many relative clauses it contains or where those relative clauses attach. This prediction turns out to be false.

Here is an example. Let’s take a perfectly fine sentence like “The cat that the dog growled at chased the mouse.” In the middle of the relative clause “that the dog growled at,” let’s insert another perfectly fine relative clause, like “that the girl fed.” This nesting of relative clauses, which is technically known as “center-embedding,” creates a new sentence, which is shown in (10).

  10. The cat that the dog that the girl fed growled at chased the mouse.

What did you think about (10)? Does it sound like something you could say, or something that you could understand? If you are like most native English speakers, you probably found (10) incomprehensible when you first read it. You might even have taken a few seconds (or more) to try to figure out who did what in that sentence. In fact, sentences like (10) are not just difficult to understand: experiments have shown that English speakers rate them as if they contained grammatical errors!
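For readers who like to tinker, the nesting recipe behind (10) can be sketched in a few lines of Python. This is only an illustration of the pattern, not a linguistic model: the function names `embed` and `sentence` and the way the nouns and verbs are listed are assumptions made up for this sketch.

```python
def embed(nouns, verbs):
    """Build a noun phrase with relative clauses nested inside each other.

    nouns[0] is the head noun; nouns[i + 1] is the subject of the relative
    clause describing nouns[i], and verbs[i] is that clause's verb.
    (Hypothetical helper, written only for this illustration.)
    """
    if len(nouns) == 1:
        return nouns[0]
    # The inner clause is completed first, so its verb comes out first.
    inner = embed(nouns[1:], verbs[1:])
    return f"{nouns[0]} that {inner} {verbs[0]}"


def sentence(nouns, verbs, predicate):
    """Attach the main predicate and capitalize the first letter."""
    text = f"{embed(nouns, verbs)} {predicate}."
    return text[0].upper() + text[1:]


# One level of nesting gives sentence (7); two levels give sentence (10).
print(sentence(["the cat", "the dog"], ["growled at"], "chased the mouse"))
print(sentence(["the cat", "the dog", "the girl"],
               ["growled at", "fed"], "chased the mouse"))
```

Notice what the recursion does to word order: all the nouns come first, and the verbs then arrive in reverse order, so the last noun mentioned ("the girl") is paired with the first verb ("fed").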

Note that the incomprehensibility of sentence (10) cannot be blamed on the complexity of the concept that is being conveyed, since it can be easily expressed in the form of (9) (or in the image below). 

A picture is worth a thousand words (or fewer)

This picture can be either described by sentence (9) “The girl fed the dog that growled at the cat that chased the mouse” or sentence (10) “The cat that the dog that the girl fed growled at chased the mouse.” Both sentences seem to be grammatically fine, but only (9) is easily comprehensible to native speakers. (Images: Silhouette Design Store; iStockPhoto; LuCiD 2018)

The difficulty of understanding (10) also cannot be simply blamed on the fact that it is a new sentence that we have never seen before, as there are many new utterances that we can easily understand or produce on the fly. For example, “Superman accidentally swallowed a fly” is a sentence that (presumably) no one has come across before, but we manage to figure out what it means.

Why are sentences with center-embedding difficult?

So why does a sentence like (10) create so much trouble for native speakers? What does it tell us about our minds? Elaborating on an intuition floated by George Miller and Noam Chomsky in the 1960s, Ted Gibson and James Thomas suggested the following hypothesis:

We know from existing research that we make predictions about upcoming words based on what we just read or heard. When we read “the cat,” we predict that we will see a verb like “chased” (or “meowed”, “played”, etc.). However, in a sentence like (10), we don’t see a verb right away; the next thing we see turns out to be the start of a relative clause: “that the dog.” Since English sentences must have verbs in them, we expect to see a verb eventually. For the time being, though, we park this prediction for a verb in our memory. Likewise, when we read “the dog,” we generate a second prediction for a verb, based on the fact that English relative clauses have verbs in them. Again, since the next thing read is “that the girl,” this prediction is not immediately borne out either, so it gets parked in our memory, too. The presence of multiple predictions of verbs then creates memory overload, interfering with our ability to interpret the sentence. In contrast, in a sentence like (9), our predictions for verbs (and nouns) are quickly borne out, and a memory overload problem does not arise.
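The "parking" idea can be made concrete with a toy counter. The sketch below is not Gibson and Thomas's actual model. It assumes a hand-made annotation in which each chunk of a sentence either opens a prediction for a verb (+1), fulfils one (-1), or does neither (0), and it simply reports the largest number of predictions that are parked at the same time. The name `max_pending_verbs` and the annotations themselves are assumptions made for this illustration.

```python
def max_pending_verbs(chunks):
    """Return the peak number of verb predictions waiting in memory."""
    pending = 0
    peak = 0
    for _text, change in chunks:
        pending += change          # +1 parks a prediction, -1 resolves one
        peak = max(peak, pending)
    return peak


# Hand-annotated chunks (an assumption of this sketch, not experimental data).
SENTENCE_9 = [
    ("The girl", +1), ("fed", -1), ("the dog", 0),
    ("that", +1), ("growled at", -1), ("the cat", 0),
    ("that", +1), ("chased", -1), ("the mouse", 0),
]
SENTENCE_10 = [
    ("The cat", +1), ("that the dog", +1), ("that the girl", +1),
    ("fed", -1), ("growled at", -1), ("chased", -1), ("the mouse", 0),
]

print(max_pending_verbs(SENTENCE_9))   # prints 1
print(max_pending_verbs(SENTENCE_10))  # prints 3
```

Under these assumptions, sentence (9) never has more than one verb prediction waiting, while sentence (10) has three piled up by the time "fed" arrives. The hypothesis says it is this pile-up that overloads our memory.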

Put differently, this memory constraint hypothesis says that our language processing resources are so limited that they can’t cope with center-embedding sentences. You might find this claim surprising. Our brains are powerful enough to deal with complex situations, such as planning a long trip, tracking and participating in a conversation at a noisy restaurant, memorizing the first 100 digits of pi, etc. Why aren’t there more resources allocated for language processing?

This hypothesis also predicts that speakers of all languages with center-embedding structures like the one in (10) should experience the same kind of difficulty with this kind of sentence. Cross-linguistic research on this topic has found only partial support for this prediction: while speakers of English, French, and Mandarin Chinese find sentences like (10) to be at least as bad as ungrammatical sentences of similar complexity, Shravan Vasishth, Stefan Frank, and their collaborators report that German and Dutch speakers do not. They observe that, compared to English, German and Dutch sentences are more likely to have verbs appearing toward the end, and they suggest that this difference makes German and Dutch speakers more experienced with processing center-embedding sentences like (10).

(Here’s an example that Stefan Frank and Patty Ernst found was relatively OK:

Het spannende boek dat de populaire schrijver die de recensenten nauwlettend bekritiseerden met veel vertrouwen publiceerde miste een aantal pagina’s.

“The exciting book that the popular author who the reviewers meticulously criticized published with much confidence missed a number of pages.” Frank and Ernst’s paper can be found here — it’s open access, so free to read!)

Furthermore, even within English, there are center-embedding sentences that are easier to identify as grammatical than others. Janet Fodor, a linguist who spearheaded research on this topic, notes that the center-embedding sentence in (11), which is structurally similar to (10), is easier to understand. To see her point, compare both sentences.

  11. The rusty old ceiling pipes that the plumber that my dad trained fixed continue to leak occasionally.

(adapted from a Language Log post reporting on a presentation Fodor gave)

Fodor suggests that the difference between (10) and (11) might be due to how we say the sentences rather than memory constraints. The general idea behind her argument is this: when we read (10) aloud, we do so as if we were reading a list of nouns and verbs, and not a sentence, which affects our ability to make sense of it. (11), on the other hand, can be read like a regular sentence, which facilitates comprehension.

To sum up, center-embedding sentences provide a classic case study of how language science research works: researchers identify a class of sentences that native speakers have unusual trouble processing and develop theories to explain why that might be the case. In the context of center-embedding, researchers have come up with competing hypotheses, attributing the difficulty of these sentences to memory limitations, linguistic experience, or how we pronounce sentences. While these hypotheses are very different from each other, their objective is the same: to use these bad sentences to shed light on how our minds process language.


Nick Huang is a PhD student in linguistics at the University of Maryland, interested in issues related to syntax and sentence processing.

