When good sentences sound bad
Understanding the limits of how the mind processes language.

Author’s rating for this post: 2 nerd emoji
It’s intended for language “dabblers”, willing to put in a bit of effort. What do you think? Add your rating at the bottom.

Here’s a game for those of you who are native speakers of English. Which one of the following sentences has a grammatical error?

  1. The girls think the boys will eat ice cream.
  2. The girls think the boys who will eat ice cream.
ice cream
(Image: Pampered Chef)

That was easy, right? (In case you were wondering: sentence (1) is fine, sentence (2) is not. No ice cream for correct answers, sorry.) What about the next two sentences?

  3. Yesterday, the girls who love dogs said the boys who will eat ice cream.
  4. Yesterday, the girls who love dogs said the boys who will eat ice cream love cats.

That was pretty easy, too: sentence (3) is bad, while sentence (4) is much better. As native speakers of a language, we can usually tell right away whether a sentence in that language contains grammatical errors, even if we can’t say exactly what went wrong. We do so quickly and reliably even for fairly complicated sentences. If someone actually said sentences like (2) and (3) to you, you might quickly conclude that this person doesn’t speak English very well.

The ease with which we can identify whether a sentence contains grammatical errors says something about our mind, and one of the major goals in language science is to spell out exactly how the mind is set up to process language so efficiently.

Somewhat paradoxically, one way to learn about language and the mind is by investigating the exceptional circumstances in which our language processing abilities falter and grammatical sentences consistently seem awful to us. “Consistently” is an important caveat here, since we all know from personal experience that no native speaker is perfect all the time: we sometimes skip a word when we read, mishear an utterance, or fail to pay enough attention to it. What researchers are interested in are grammatical sentences that pose a problem for everyone, even in the most favorable environments. Our struggles with these sentences can then be blamed on some special cognitive “bug” that all of us share, rather than on, say, individual differences in attention span and cognitive ability (e.g. some people are good at mental arithmetic, others not so much).


A class of grammatical sentences that has long baffled native speakers and excited researchers involves the phenomenon known as “center-embedding.” To see how this works, let’s begin with acceptable simple sentences.

  5. The cat chased the mouse.
  6. The dog growled at the cat.
Left: The cat chased the mouse; Right: The dog growled at the cat.
(Left image: Jeroen Moes; Right image: http://www.dogclipart.com/)

We can add more words to the nouns “the cat” and “the dog” to provide additional information about the cat and the dog in question, as in (7) and (8). (Technically, what we added are “relative clauses”: think of them as almost-complete sentences that describe nouns.) Sentences (7) and (8) are longer than (5) and (6), but you can still easily tell that they are fine.

  7. The cat that the dog growled at chased the mouse.
  8. The dog that the girl fed growled at the cat.

In fact, there doesn’t seem to be a cap on the number of relative clauses one could add or on where they appear in a sentence. Sentence (9) is an even longer sentence with two relative clauses, where the relative clauses appear at the end, one inside the other. (The outer relative clause is “that growled at the cat that chased the mouse”; the inner one is “that chased the mouse.”) We can easily judge sentence (9) to be OK.

  9. The girl fed the dog that growled at the cat that chased the mouse.

Intuitively, we might predict that native speakers can judge whether a sentence is fine regardless of the number of relative clauses it contains or where those relative clauses attach. This prediction turns out to be false.

Here is an example. Let’s take a perfectly fine sentence like “The cat that the dog growled at chased the mouse.” In the middle of the relative clause “that the dog growled at,” let’s insert another perfectly fine relative clause, like “that the girl fed.” This nesting of relative clauses, which is technically known as “center-embedding,” creates a new sentence, which is shown in (10).

  10. The cat that the dog that the girl fed growled at chased the mouse.

What did you think about (10)? Does it sound like something you could say, or something that you could understand? If you are like most native English speakers, you probably found (10) incomprehensible when you first read it. You might even have taken a few seconds (or more) to try to figure out who did what in that sentence. In fact, sentences like (10) are not just difficult to understand: experiments have shown that English speakers rate them as if they contained grammatical errors!

Note that the incomprehensibility of sentence (10) cannot be blamed on the complexity of the concept that is being conveyed, since it can be easily expressed in the form of (9) (or in the image below). 

A picture is worth a thousand words (or fewer)

This picture can be either described by sentence (9) “The girl fed the dog that growled at the cat that chased the mouse” or sentence (10) “The cat that the dog that the girl fed growled at chased the mouse.” Both sentences seem to be grammatically fine, but only (9) is easily comprehensible to native speakers. (Images: Silhouette Design Store; iStockPhoto; LuCiD 2018)
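The point that (9) and (10) describe the same situation can be made concrete with a small sketch. The code below is only an illustration, not anything from the research discussed here: the `EVENTS` list and the two function names are made up for this example. It takes one chain of who-did-what-to-whom facts and spells it out in two ways, once with the relative clauses stacked at the end and once with them nested in the middle.

```python
# Hypothetical representation of the picture: each event is
# (who acts, what they do, who or what is affected).
EVENTS = [
    ("the girl", "fed", "the dog"),
    ("the dog", "growled at", "the cat"),
    ("the cat", "chased", "the mouse"),
]


def as_sentence(text):
    """Capitalize the first letter and add a final period."""
    return text[0].upper() + text[1:] + "."


def right_branching(events):
    """Relative clauses pile up at the end, as in sentence (9)."""
    first_subject = events[0][0]
    clauses = [f"{verb} {obj}" for _, verb, obj in events]
    return as_sentence(first_subject + " " + " that ".join(clauses))


def center_embedded(events):
    """Relative clauses are nested inside one another, as in sentence (10)."""
    # All the subjects come first (last event's subject outermost) ...
    subjects = [subj for subj, _, _ in reversed(events)]
    # ... and all the verbs are held back until the subjects are done.
    verbs = [verb for _, verb, _ in events]
    final_object = events[-1][2]
    return as_sentence(
        " that ".join(subjects) + " " + " ".join(verbs) + " " + final_object
    )


print(right_branching(EVENTS))
print(center_embedded(EVENTS))
```

Run on the three events above, the first call prints sentence (9) and the second prints sentence (10). With only the first two events, `center_embedded` produces sentence (8), which is still easy to follow, so the trouble only starts once a second clause is nested inside the first.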

The difficulty of understanding (10) also cannot simply be blamed on the fact that it is a new sentence that we have never seen before, as there are many new utterances that we can easily understand or produce on the fly. For example, “Superman accidentally swallowed a fly” is a sentence that (presumably) no one has come across before, but we manage to figure out what it means.

Why are sentences with center-embedding difficult?

So why does a sentence like (10) create so much trouble for native speakers? What does it tell us about our minds? Elaborating on an intuition floated by George Miller and Noam Chomsky in the 1960s, Ted Gibson and James Thomas suggested the following hypothesis:

We know from existing research that we make predictions about upcoming words based on what we just read or heard. When we read “the cat,” we predict that we will see a verb like “chased” (or “meowed”, “played”, etc.). However, in a sentence like (10), we don’t see a verb right away; the next thing we see turns out to be the start of a relative clause: “that the dog.” Since English sentences must have verbs in them, we expect to see a verb eventually. For the time being, though, we park this prediction for a verb in our memory. Likewise, when we read “the dog,” we generate a second prediction for a verb, based on the fact that English relative clauses have verbs in them. Again, since the next thing read is “that the girl,” this prediction is not immediately borne out either, so it gets parked in our memory, too. The presence of multiple predictions of verbs then creates memory overload, interfering with our ability to interpret the sentence. In contrast, in a sentence like (9), our predictions for verbs (and nouns) are quickly borne out, and a memory overload problem does not arise.
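The bookkeeping described above can be mimicked with a very small toy counter. This is a deliberately crude sketch and not Gibson and Thomas’s actual model: the hand-assigned tags (`SUBJ`, `VERB`, `OTHER`) and the function name are assumptions made for illustration. Every chunk tagged `SUBJ` parks one prediction for a verb, every `VERB` settles one, and we record the largest number of predictions parked at the same time.

```python
def max_parked_predictions(tagged_chunks):
    """Return the peak number of verb predictions waiting at once."""
    parked = 0
    peak = 0
    for _chunk, tag in tagged_chunks:
        if tag == "SUBJ":      # a subject makes us expect a verb
            parked += 1
        elif tag == "VERB":    # a verb settles one waiting prediction
            parked -= 1
        peak = max(peak, parked)
    return peak


# Sentence (9): each "that" acts as the subject of its relative clause,
# so the verb it predicts arrives immediately.
SENTENCE_9 = [
    ("The girl", "SUBJ"), ("fed", "VERB"), ("the dog", "OTHER"),
    ("that", "SUBJ"), ("growled at", "VERB"), ("the cat", "OTHER"),
    ("that", "SUBJ"), ("chased", "VERB"), ("the mouse", "OTHER"),
]

# Sentence (10): three subjects arrive before the first verb does.
SENTENCE_10 = [
    ("The cat", "SUBJ"), ("that", "OTHER"),
    ("the dog", "SUBJ"), ("that", "OTHER"),
    ("the girl", "SUBJ"),
    ("fed", "VERB"), ("growled at", "VERB"), ("chased", "VERB"),
    ("the mouse", "OTHER"),
]

print(max_parked_predictions(SENTENCE_9))   # 1
print(max_parked_predictions(SENTENCE_10))  # 3
```

In this toy count, sentence (9) never has more than one prediction waiting, while sentence (10) has three waiting by the time “the girl” has been read. The counter says nothing about how much parking is too much; it only shows where the load builds up.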

Put differently, this memory constraint hypothesis says that our language processing resources are so limited that they can’t cope with center-embedding sentences. You might find this claim surprising. Our brains are powerful enough to deal with complex situations, such as planning a long trip, tracking and participating in a conversation at a noisy restaurant, memorizing the first 100 digits of pi, etc. Why aren’t there more resources allocated for language processing?

This hypothesis also predicts that speakers of all languages with center-embedding structures like the one in (10) should experience the same kind of difficulty with this kind of sentence. Cross-linguistic research on this topic has found only partial support for this prediction: while speakers of English, French, and Mandarin Chinese find sentences like (10) to be at least as bad as ungrammatical sentences of similar complexity, Shravan Vasishth, Stefan Frank, and their collaborators report that German and Dutch speakers do not. They observe that, compared to English, German and Dutch sentences are more likely to have verbs appearing toward the end, and they suggest that this difference makes German and Dutch speakers more experienced with processing center-embedding sentences like (10).

(Here’s an example that Stefan Frank and Patty Ernst found was relatively OK:

Het spannende boek dat de populaire schrijver die de recensenten nauwlettend bekritiseerden met veel vertrouwen publiceerde miste een aantal pagina’s.

“The exciting book that the popular author who the reviewers meticulously criticized published with much confidence missed a number of pages.” Frank and Ernst’s paper can be found here — it’s open access, so free to read!)

Furthermore, even within English, there are center-embedding sentences that are easier to identify as grammatical than others. Janet Fodor, a linguist who spearheaded research on this topic, notes that the center-embedding sentence in (11), which is structurally similar to (10), is easier to understand. To see her point, compare both sentences.

  11. The rusty old ceiling pipes that the plumber that my dad trained fixed continue to leak occasionally.

(adapted from a Language Log post reporting on a presentation Fodor gave)

Fodor suggests that the difference between (10) and (11) might be due to how we say the sentences rather than memory constraints. The general idea behind her argument is this: when we read (10) aloud, we do so as if we were reading a list of nouns and verbs, and not a sentence, which affects our ability to make sense of it. (11), on the other hand, can be read like a regular sentence, which facilitates comprehension.

To sum up, center-embedding sentences provide a classic case study of how language science research works: researchers identify a class of sentences that native speakers have unusual trouble processing and develop theories to explain why that might be the case. In the context of center-embedding, researchers have come up with competing hypotheses, attributing the difficulty of these sentences to memory limitations, linguistic experience, or how we pronounce sentences. While these hypotheses are very different from each other, their objective is the same: to use these bad sentences to shed light on how our minds process language.

Nick Huang is a PhD student in linguistics at the University of Maryland, interested in issues related to syntax and sentence processing.


Seeds for Thought: Is there Structure in Birdsong?
We may be underestimating the communication systems of birds.

Author’s rating for this post: 4 nerd emoji
It’s intended for experts. What do you think? Add your rating at the bottom!

A group of alien linguists receives a modest grant to do fieldwork on the communication systems of animals on Earth. They notice two groups making a lot of noise, birds and humans, and since there are far fewer humans than birds, they decide it’s more feasible to focus on the featherless bipeds. They record all the different sounds we make, the order in which they are usually produced, who we’re with, and what’s around us when we converse. After analyzing terabytes’ worth of data, they gain some compelling insights into our communication system: they discover phonological rules, dialects, vowel formants, and categorical perception. They posit reasonable hypotheses about the function of language in building alliances, courting mates, and managing conflict. But, I would argue, following Chomsky, that they’d be missing out on fundamental properties of language.

For one thing, since the relationship between sounds and meanings is essentially arbitrary, they would have a difficult time uncovering what our words mean or even which segments of the sound signal are functioning as words. On top of that, without access to meaning, they would have very little chance at discovering rules and principles of how words fit together, i.e. syntax. As a result, these alien linguists would miss out on the structure of human language: how words combine to form phrases that can join in infinite ways with other phrases. Despite having only studied the surface of our language, they would conclude that these humans lack the infinite generativity and recursive rules of their own beautifully unique alien language.

After decades of humans studying the communication systems of birds, the predominant consensus is that birdsong is all surface: no compositional meaning, no hierarchical combinations of phrases, only strings of sounds. And, while I agree that there is no strong evidence of structure below the surface in birdsong, I would argue that those alien linguists, without our intuitions about how words fit together and without our judgments of which constructions are acceptable and which are not, would be hard pressed to find evidence of structure below the surface in human language.

So, when it comes to birdsong, are we the alien linguists? Are we missing out on structure below the surface?

Budgerigar (Melopsittacus undulatus) flock. Originally posted to Flickr by anna banana and licensed under the Creative Commons Attribution-Share Alike 2.0 Generic license.

Traditionally, researchers have analyzed the sequences of sounds produced by songbirds and parrots by using concepts from language and music: a note is a continuous trace on a spectrogram, a syllable is a collection of notes, and a motif or phrase is a grouping of syllables. They then characterize patterns in the order of syllables or phrases by using Markov chain models. These models capture how the probability of an element occurring in a sequence depends on the previous elements. If the two preceding elements help to predict the next one, then it is a 2nd order Markov model; if three elements do, then it is 3rd order, and so on. Most birdsong can be characterized by low (1st or 2nd) order Markov models. Vicky Hsu, a former PhD student at UMD, studied the complex and variable warble song of parakeets and found that it can arguably be best captured by a 5th order Markov model. This is very impressive but still categorically different from the hierarchical depth of human language.
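To make the idea of “order” concrete, here is a minimal sketch of how such a model can be estimated by counting. The syllable labels (`A` to `E`), the two toy “songs,” and the function names are invented for illustration and do not come from any real recording or from the studies mentioned above.

```python
from collections import Counter, defaultdict


def fit_markov(sequences, order):
    """Count which element follows each context of `order` elements."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            context = tuple(seq[i - order:i])
            counts[context][seq[i]] += 1
    return counts


def next_probabilities(counts, context):
    """Estimate the probability of each next element after `context`."""
    following = counts.get(tuple(context), Counter())
    total = sum(following.values())
    return {element: n / total for element, n in following.items()}


# Two made-up songs that share the syllable "B" in the middle.
songs = [
    ["A", "B", "C"],
    ["D", "B", "E"],
]

first_order = fit_markov(songs, order=1)
second_order = fit_markov(songs, order=2)

# Knowing only the previous syllable leaves the next one uncertain.
print(next_probabilities(first_order, ["B"]))        # {'C': 0.5, 'E': 0.5}
# Knowing the previous two syllables settles it.
print(next_probabilities(second_order, ["A", "B"]))  # {'C': 1.0}
```

In this toy case a 2nd order model predicts perfectly where a 1st order model can only guess, which is the sense in which a higher order captures longer dependencies. Note that the model still only looks back over a fixed window of the string; it has no notion of phrases nested inside phrases.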

Yet, think about this: if you just studied the sound or sign patterns of human language, then you could also characterize them with Markov models, as our own Bill Idsardi has argued. You would have no idea, of course, that there is structure below the surface. Like the alien linguists, you would confidently assert that human language patterns are Markovian in nature, easily computable by finite-state machines. Could that also be the case for us and the birds?

In my view, this question is still open and important. Uncovering the putative structure of birdsong would give us an unparalleled window into the minds of birds, possibly helping us to better understand the workings of the computational devices inside all of our skulls (whether bird or human). But needless to say, it is not satisfying just to claim that structure in birdsong is possible. We want to find evidence either way. Here are two possible steps forward I think we could take:

  • While there is no evidence for anything like words in birdsong in our studies of syllables and motifs, we might be wrong about the fundamental units in their communication systems. Recent studies have shown that songbirds and parakeets are exquisitely sensitive to temporal fine structure (TFS): rapid changes in the amplitude and frequency of the acoustic waveform. Their perceptual abilities in this dimension actually exceed our own: zebra finches, for example, can hear changes in TFS for periods as short as 1-2 ms, while humans require periods 3-4 ms long. This means that birdsong may sound very different to the birds than to us. Thus, while many birdsongs appear simple and repetitive to us, more complex patterns and variability could be embedded in the strings. Examining temporal fine structure patterns and understanding perception in this acoustic dimension could help us uncover any hierarchical structure, if it exists.
  • While we are stuck looking at strings of sound when analyzing birdsong, there are tools out there to help us decode structural rules governing the strings. Much as we can make up minimal pairs of sentences and ask human subjects to make acceptability judgments in order to test hypotheses about grammatical dependencies, we can ask the birds to “judge” strings of sounds as valid or invalid. Constructing the strings is not an easy road to take, since, as I’ve argued, we lack the intuitions of a native speaker for what elements might fit together. But we can draw on clues to structural rules: for example, in parakeet warble song, “contact call-like” elements are the most common, perhaps playing a role like function words in human language, in which case they may mark the beginnings or ends of “phrases”. When it comes to asking the birds to “judge” strings, the path is a bit more straightforward. We can, for instance, ask birds to perform a preference task: we put two perches in a cage, and when a bird stands on one it triggers a set of sounds manipulated in one way (with, for instance, hierarchical embedding), and when it stands on the other it triggers sounds manipulated in another way (for instance, a random arrangement of the same sounds). If a bird prefers one set of sounds over another, this could tell us what strings are acceptable and allow us to uncover any grammar in birdsong, again if it exists.

These are, of course, only two ideas of many possible paths forward. What we really need is for linguists with theoretical insights, computer scientists with powerful algorithms and processors, biologists with expertise in animal behavior and cognition, birdsong researchers with innovative behavioral testing paradigms, and other passionate folks to come together and approach this question with minds as open as they are when it comes to presupposing the complexity of our own thoughts. My view could certainly turn out to be totally wrong. However, I’d rather run the risk of being wrong than miss out on an entire dimension of animal cognition and communication, which could help us better understand our own. I, for one, would like those alien linguists to come back with more colleagues and new ideas (and perhaps a generous new grant) to take a chance at delving below the surface of our language and minds.

Adam Fishbein is a PhD student in the Neuroscience and Cognitive Science program at UMD, using comparative work with birds to study the evolution of human language and cognition. He also has a Master’s in Professional Writing from USC and has published several short stories and a novel.
