Grammar sometimes gets a bad reputation, but we're actually doing grammar all the time! And we're pretty good at it! In this episode of Crash Course Linguistics, we'll begin our discussion of syntax by learning how we can take words and morphemes and turn them into sentences, questions, stories, and even videos like this!

Acknowledgements: Ian Woolford, Jill Vaughan, Gabrielle Hodge

I'm Taylor and welcome to Crash Course Linguistics! Let's say we have a bag of words and we want to use them to tell a story.

This should be simple enough! We pull out some words one at a time, and we get "sees," "Taylor," "rabbit," and "the." Okay, so we have some idea of what's going on, but we're left with an important question: am I stealthily sneaking up on the rabbit? Or has the rabbit seen me first, and hopped away before I have a chance to take a photo?

I need to know if I’m gonna get some sweet validation from the ‘gram. Words by themselves are great, but they're not enough. We also need some way of conveying the relationships between words.

In this case, that means conveying the difference between "Taylor sees the rabbit" and "the rabbit sees Taylor". We need what linguists call syntax. [THEME MUSIC] Distinguishing between sentences like these two is so fundamental that every language has some way of doing it.

Syntax is the study of how languages express relationships between words. One way of expressing relationships between words is to put the words in a consistent order, to tell us who did what to whom. For example, we can say the subject first, then the verb, then the object.

English uses this word order, as do many other languages like Nahuatl from Mexico, Portuguese, and Swahili from East Africa. The word order doesn’t have to go subject, verb, object; any order will work as long as it's consistent within a given language. For example, in Hindi, the typical order is subject, object, verb.

This order is also very common across languages, such as Japanese, Tibetan, and Korean. And in Irish, the typical order is verb, subject, object. This order is rarer, but it's also found in Hawaiian, Māori, and Chatino, another language of Mexico.
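As a rough illustration of the word-order strategy, here is a minimal Python sketch. The names `WORD_ORDERS`, `assign_roles`, and `linearize` are made up for this example, and it assumes the sentence has already been split into exactly three chunks, which real sentences rarely allow.

```python
# Hypothetical role templates for the three basic word orders mentioned above.
WORD_ORDERS = {
    "SVO": ("subject", "verb", "object"),  # e.g. English
    "SOV": ("subject", "object", "verb"),  # e.g. Hindi
    "VSO": ("verb", "subject", "object"),  # e.g. Irish
}


def assign_roles(chunks, order):
    """Pair each chunk of a three-part sentence with the role its position implies."""
    roles = WORD_ORDERS[order]
    if len(chunks) != len(roles):
        raise ValueError("expected exactly three chunks")
    return dict(zip(roles, chunks))


def linearize(roles, order):
    """Put subject, verb, and object back into the given order."""
    return [roles[role] for role in WORD_ORDERS[order]]


english = assign_roles(["Taylor", "sees", "the rabbit"], "SVO")
print(english)                    # who did what to whom, read off position
print(linearize(english, "SOV"))  # same roles, subject-object-verb order
print(linearize(english, "VSO"))  # same roles, verb-subject-object order
```

Swapping the first and last chunks before calling `assign_roles` flips who is seeing whom, which is the whole point of a consistent order.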

A second way of expressing relationships between words is by adding a morpheme, the smallest unit of meaning. That morpheme would indicate whether the thing being referred to is the do-er or the do-ee, the subject or the object. Even if we scramble the order of the words around, we'd still be able to tell the subject and object apart.

For example, in Latin, these two sentences have the same word order, but opposite meanings, and we can tell this because the words change their shape a bit. "Hospes leporem videt" is "the host sees the rabbit," while "hospitem lepus videt" means "the rabbit sees the host." Because of these morphemes, Latin can use word order for other things, like emphasis or making a poem rhyme better. And many other languages use this strategy, including Turkish, Modern Greek, and Yupik, the language group that spans Alaska and Siberia. These distinctions were created based on spoken languages.
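The morpheme strategy can be sketched in Python too. This is a toy under stated assumptions: the `FORMS` table is a hand-made lexicon covering only the Latin word forms quoted above, and `interpret` and `gloss` are hypothetical helper names, not a real Latin parser.

```python
# Toy lexicon covering only the Latin forms quoted above.
# Each form maps to (meaning, role); the role comes from the word's shape.
FORMS = {
    "hospes": ("host", "subject"),    # nominative form
    "hospitem": ("host", "object"),   # accusative form
    "lepus": ("rabbit", "subject"),   # nominative form
    "leporem": ("rabbit", "object"),  # accusative form
    "videt": ("sees", "verb"),
}


def interpret(sentence):
    """Read roles off the shape of each word, ignoring its position."""
    roles = {}
    for word in sentence.lower().split():
        meaning, role = FORMS[word]
        roles[role] = meaning
    return roles


def gloss(sentence):
    """Build a rough English gloss from the recovered roles."""
    r = interpret(sentence)
    return f"the {r['subject']} {r['verb']} the {r['object']}"


print(gloss("hospes leporem videt"))  # the host sees the rabbit
print(gloss("hospitem lepus videt"))  # the rabbit sees the host
print(gloss("leporem videt hospes"))  # the host sees the rabbit
```

The third call scrambles the word order, and the gloss is unchanged, because nothing in `interpret` looks at position.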

Signed languages use word order and a range of other strategies to distinguish between subjects and objects. For example, one strategy in ASL is setting up referents in space. Say I've already established that this is Gav.

I can say “I saw Gav” by signing the verb “see” from me to the object. English used to do the morphological strategy too, and you can still see some traces of it! For example, in "I see them" or "the employer hired the employee," the word order and the shapes of the words are reinforcing each other, so they may feel natural to you as an English speaker.

As linguists say, they feel grammatical. Meanwhile, in "me see they" or "the employee hired the employer," the word order and the shapes of the words are in tension. They're signalling opposite things, so these sentences may feel weird to you.

They feel ungrammatical. Linguists sometimes mark an ungrammatical sentence with an asterisk or star like *me see they. If you're not a native English speaker, you may not feel these same intuitions about these English sentences, but you do have a set of linguistic intuitions for grammaticality in your own native language or languages.

Now, there are two things that grammaticality doesn't mean. One, grammaticality has nothing to do with whether a sentence makes any sense. There's a famous example in linguistics that illustrates this point.

The sentence goes, "Colorless green ideas sleep furiously." This sentence was coined by the linguist Noam Chomsky as an example that's perfectly grammatical, but also completely nonsensical. I feel like I should apologize to Thought Cafe for having to figure out how to animate it. Another example, "Furiously sleep ideas green colorless" is equally bizarre in meaning, but this time the grammar is nonsensical too.

Even if you've never heard either sentence before, you can probably tell that "Colorless green ideas sleep furiously" is a grammatical sentence, but "Furiously sleep ideas green colorless" is /un/grammatical. Something about an ungrammatical sentence just feels...weird. Even though it’s the same words, it's not something anyone would say.

Two, grammaticality is also not about whether a sentence meets with the approval of teachers, editors, or other authorities. For example, "Don’t nobody know nothing," is perfectly grammatical. In fact, someone's probably saying it right now!

But "Nothing don’t nobody know" is /un/grammatical. It's not the way anyone would combine these words. It’s amazing that speakers of a language can have such similar grammatical intuitions without ever being formally taught them!

That said, our mental grammars are all slightly different from each other, based on our own unique personal version of language, also known as our idiolect. So you may sometimes notice exceptions or edge cases or things that I say here that don't quite work in your idiolect. That's great!

It means you're thinking like a linguist. Now that we’re paying attention to our linguistic intuitions about grammaticality, we can use them to figure out the relationships between words within sentences. Some words go together more closely than others, and we can test this.

If we can substitute a single word in for several words, while preserving the meaning, then we know that this group of words can act as a single unit. We can call this the substitution test. Let’s start with the sentence "Taylor sees the rabbit".

We can substitute Taylor with a longer phrase, like "The host of Crash Course Linguistics" ...sees the rabbit. Or with a shorter pronoun, like "She sees the rabbit." Since these sentences all mean the same thing, we know the subjects are all equivalent units and pass the substitution test. We can also substitute "the rabbit" with a longer phrase, like "the purple rabbit with long ears".

Or with a single name, like "Gavagai," or a pronoun, like "them." The subject or object can be one word, or many words, but they all act together as a unit. But the substitution test only gets us so far.
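The mechanical half of the substitution test is easy to sketch in Python. The function name `substitute` is made up for this example, and the code only builds the new sentences: deciding whether they still mean the same thing and feel grammatical is left to a speaker's intuition.

```python
def substitute(words, start, end, replacement):
    """Replace words[start:end] with a replacement phrase and return the new word list."""
    return words[:start] + replacement.split() + words[end:]


sentence = "Taylor sees the rabbit".split()

# Swap the subject (the first word) for a longer phrase or a pronoun.
print(" ".join(substitute(sentence, 0, 1, "The host of Crash Course Linguistics")))
print(" ".join(substitute(sentence, 0, 1, "She")))

# Swap the object (the last two words) for a single pronoun.
print(" ".join(substitute(sentence, 2, 4, "them")))
```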

Let’s go to the Thought Bubble to see what other relationships there are between groups of words in this sentence. There are other versions of “Taylor sees the rabbit” that we can make, and the combinations that work tell us how the verb relates to the subject and object. For example, we can shift parts of the original sentence to the beginning, saying, "It's Taylor who sees the rabbit" or "It's the rabbit that Taylor sees".

This type of sentence structure, with “it’s” and “that,” is known as a cleft construction. By looking at which words can be moved together as a group, we're going to do a cleft test. The test is to see which word or group of words is grammatical when we put it in the first slot of a cleft construction, between the "it's" and the "that".

Let's try: *it's rabbit that Taylor sees the. Okay, that sounds weird — it's ungrammatical. We'll mark it with a star. *it's sees the rabbit that Taylor.

Hmm, that's ungrammatical too. We can rescue it, if we make a small tweak: it's see the rabbit that Taylor does. But we can never take "see" all by itself, without "the rabbit": *it's sees that Taylor the rabbit. *it's see that Taylor does the rabbit.

And we can't take "sees" and "Taylor" together, without "the rabbit". *it's Taylor sees that the rabbit. So we've found that clefts are grammatical where the subject or the object is split off on its own, or when the verb "see" and object "the rabbit" are pulled away together. But other clefts are ungrammatical: the ones where we try to pull away the verb "see" all by itself, or the verb "see" and subject "Taylor" together without the object.

This suggests that the verb and the object have a closer relationship with each other than the subject and the verb do. This is why we sometimes also refer to a subject and a predicate when talking about syntax, so that we have a single word to describe this grouping of verb and object together. Thanks, Thought Bubble!
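The candidate sentences in the cleft test can be generated mechanically, as in the minimal Python sketch below. The names `cleft` and `all_clefts` are hypothetical. The sketch always uses "that" (never "who") and does not make the "does" tweak described above, so it only produces raw candidates; the starred-or-not judgment still has to come from a speaker.

```python
def cleft(words, start, end):
    """Move words[start:end] into the slot between "it's" and "that"."""
    focus = words[start:end]
    rest = words[:start] + words[end:]
    return " ".join(["it's"] + focus + ["that"] + rest)


def all_clefts(words):
    """Yield a cleft candidate for every contiguous span shorter than the whole sentence."""
    n = len(words)
    for start in range(n):
        for end in range(start + 1, n + 1):
            if end - start < n:
                yield cleft(words, start, end)


sentence = "Taylor sees the rabbit".split()
for candidate in all_clefts(sentence):
    print(candidate)
```

For this four-word sentence the loop prints nine candidates, including "it's the rabbit that Taylor sees" and the starred "it's Taylor sees that the rabbit" from the Thought Bubble.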

During these tests, we notice that some words group together more closely than others, like "the" plus "rabbit", and "see" plus "the" plus "rabbit". All of the different sub-groups that we can find in a sentence are called constituents. By the way, if you've encountered the word "constituent" before, it might have been in a political context.

You can call up your representative and say "Hi, I'm one of your constituents." A constituent is something that constitutes, or makes up a part of, a larger whole. When you're a constituent, you make up a part of your political district, and when some words are a constituent, they make up their own distinct part of a sentence. In English, because we use word order to tell how words are related to each other in a sentence, we also use word-order-based tests like cleft tests to figure out what's a constituent.
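One way to picture constituents is as nested groups. The Python sketch below hand-builds a bracketing of "Taylor sees the rabbit" from the groupings found in the tests above. The tuple representation and the names `TREE`, `leaves`, and `constituents` are assumptions made for illustration, not a standard linguistic notation.

```python
# A hand-built bracketing, following the groupings found above:
# ("the" + "rabbit") and ("sees" + "the" + "rabbit").
TREE = ("Taylor", ("sees", ("the", "rabbit")))


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node:
        words.extend(leaves(child))
    return words


def constituents(node):
    """List every multi-word constituent in the tree, outermost first."""
    if isinstance(node, str):
        return []
    found = [" ".join(leaves(node))]
    for child in node:
        found.extend(constituents(child))
    return found


print(constituents(TREE))
# ['Taylor sees the rabbit', 'sees the rabbit', 'the rabbit']
```

Notice that "Taylor sees" never shows up in the list, which matches the cleft that didn't work.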

And constituents in English are generally words right next to each other. But in languages like Latin, which add morphemes to words to show how they’re related to each other, their constituents can be scattered throughout the sentence. So we need to use different tests to figure out which parts are grouped together.

For example, in this sentence, we can tell that leporem "rabbit" and purpureum "purple" are a constituent, even though they sit on opposite ends of the sentence, because they have the same ending in -m. So the cleft and substitution tests that show constituents in English won't necessarily work in Latin, nor in Hindi, Irish, South African Sign Language or any other language, because we have to consider how each language has different structural patterns. But every language does have constituents, and linguists can figure out ways of testing for them that make sense for each particular language.
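For a language that marks relationships with morphemes, a test for constituents might group words by their marking rather than by their position. Here is a minimal Python sketch of that idea. The example sentence "leporem hospes videt purpureum" is a hypothetical stand-in, since the on-screen sentence is not part of this transcript, and the case tags in `LEXICON` are filled in by hand. Real agreement also involves gender and number, which this toy ignores.

```python
from collections import defaultdict

# Hypothetical example sentence: "leporem hospes videt purpureum",
# intended as "the host sees the purple rabbit".
# Case tags are supplied by hand in this toy lexicon.
LEXICON = {
    "leporem": ("rabbit", "accusative"),
    "purpureum": ("purple", "accusative"),
    "hospes": ("host", "nominative"),
    "videt": ("sees", None),  # the verb carries no case
}


def group_by_case(sentence):
    """Group the case-marked words of a sentence by case, ignoring position."""
    groups = defaultdict(list)
    for word in sentence.split():
        _meaning, case = LEXICON[word]
        if case is not None:
            groups[case].append(word)
    return dict(groups)


print(group_by_case("leporem hospes videt purpureum"))
# {'accusative': ['leporem', 'purpureum'], 'nominative': ['hospes']}
```

The two accusative words end up in one group even though they sit at opposite ends of the sentence.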

Linguists use the word grammar to talk about these structural patterns, how a language puts morphemes together into words, words together into constituents, and constituents into sentences. This combination of morphology and syntax is also called morphosyntax. In European history, grammar often meant learning the specific patterns of how Latin works.

That involved trying to awkwardly shoehorn English into being more like Latin or trying to undo the perfectly natural language changes that happen all the time. So even now, grammar sometimes has a bad reputation, of smug people telling you you're wrong about how you use language. But in fact, like we saw earlier, we're all doing grammar all the time, and we're all really good at intuitively feeling whether something is grammatical!

Grammar is what takes us from "rabbit!" to "is this the same rabbit as I saw yesterday?" Grammar is the thing that lets us transform a grab-bag of words and morphemes into questions and stories and videos like this. Next time: we're going to look at what happens when sentences get longer, and a handy tool so we can keep track of all these constituents. Thanks for watching this episode of Crash Course Linguistics.
