




Citation formatting is not guaranteed to be accurate.
MLA Full: "Unsupervised Learning: Crash Course AI #6." YouTube, uploaded by CrashCourse, 20 September 2019,
MLA Inline: (CrashCourse, 2019)
APA Full: CrashCourse. (2019, September 20). Unsupervised Learning: Crash Course AI #6 [Video]. YouTube.
APA Inline: (CrashCourse, 2019)
Chicago Full: CrashCourse, "Unsupervised Learning: Crash Course AI #6.", September 20, 2019, YouTube, 12:35,
Today, we’re moving on from artificial intelligence that needs training labels, called Supervised Learning, to Unsupervised Learning, which is learning by finding patterns in the world. We’ll focus on performing unsupervised clustering, specifically K-means clustering, and show you how we can extract meaningful patterns from data even when you don't know where those patterns are.

Crash Course AI is produced in association with PBS Digital Studios

Crash Course is on Patreon! You can support us directly by signing up at

Thanks to the following patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:

Eric Prestemon, Sam Buck, Mark Brouwer, Indika Siriwardena, Avi Yashchin, Timothy J Kwist, Brian Thomas Gossett, Haixiang N/A Liu, Jonathan Zbikowski, Siobhan Sabino, Zach Van Stanley, Jennifer Killen, Nathan Catchings, Brandon Westmoreland, dorsey, Kenneth F Penttinen, Trevin Beattie, Erika & Alexa Saur, Justin Zingsheim, Jessica Wode, Tom Trval, Jason Saslow, Nathan Taylor, Khaled El Shalakany, SR Foxley, Yasenia Cruz, Eric Koslow, Caleb Weeks, Tim Curwick, David Noe, Shawn Arnold, Andrei Krishkevich, Rachel Bright, Jirat, Ian Dundore


#CrashCourse #ArtificialIntelligence #MachineLearning
Thanks to Wix for supporting PBS Digital Studios.

Hey, I’m Jabril and welcome to Crash Course AI! So far in this series, we’ve focused on artificial intelligence that uses Supervised Learning.

These programs need a teacher to use labeled data to tell them “right” from “wrong.” And we humans have places where supervised learning happens, like classrooms with teachers, but that’s not the only way we learn. We can also learn lots of things on our own by finding patterns in the world. We can look at dogs and elephants and know they’re different animals without anyone telling us.

Or we can even figure out the rules of a sport just by watching people play. This kind of learning without a teacher is called Unsupervised Learning and, in some cases, computers can do it too.

INTRO

The key difference between supervised and unsupervised learning is what we’re trying to predict.

In supervised learning, we’re trying to build a model to predict an answer or label provided by a teacher. In unsupervised learning, instead of a teacher, the world around us is basically providing training labels. For example, if I freeze this video of a tennis ball RIGHT NOW, can you draw what could be the next frame?

Unsupervised learning is about modeling the world by guessing like this, and it’s useful because we don’t need labels provided by a teacher. Babies do a lot of unsupervised learning by watching and imitating people, and we’d like computers to be able to learn like this as well. This lets us utilize lots of freely available data in the world or on the internet.

In many cases, one of the easiest ways to understand how AI can use unsupervised learning is by doing it ourselves, so let’s look at a few photos of flowers with no labels. The most basic way to model the world is to assume that it’s made up of distinct groups of objects that share properties. So, for example, how many types of flowers are here?

We could say there are two because there are two colors, purple and yellow. Or we could look at the petal shapes, and divide them into round petals and tall vertical ones. Or maybe we have some more experience with flowers and realize that two of these are tulips, one is a sunflower, and one is a daisy, so there are three categories.

Immediately recognizing different properties like this and creating categories is called unsupervised clustering. We don’t have labels provided by a teacher, but we do have a key assumption about the world that we’re modeling: certain objects are more similar to each other than others. We can program computers to perform clustering too.

But to do that, we need to choose a few properties of flowers we’re interested in looking at, like how we picked color or shape just now. For a more realistic example, let’s say I bought a packet of iris seeds to plant in my garden. After the flowers bloom though, it looks like there were several species of irises mixed up in that one packet.

Now I’m no expert gardener, but I can use some AI to help me analyze my garden. To construct a model, we have to answer two key questions. First, what observations can we measure?

All of these flowers are purple, so that’s probably not the best way to tell them apart. But different irises seem to have different petal lengths and widths, which we can measure and place on this graph with petal length on the Y axis and width on the X axis. And second, how do we want to represent the world?

We’re going to stick to a very simple assumption here: there are clusters in our data. Specifically, we’re going to say there are some number of groups called K clusters, but we don’t know where they are. To help us, we’re going to use the K-means clustering algorithm.

K-means clustering is a simple algorithm. All it needs is a way to compare observations, a way to guess how many clusters exist in the data, and a way to calculate averages for each cluster it predicts. In particular, we want to calculate the mean by adding up all data points in a cluster and dividing by the total number of points.
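As a rough illustration of two of those ingredients, here is a minimal Python sketch of a way to compare observations and a way to average a cluster. The function names `distance` and `cluster_mean` and the petal measurements are made up for this example; straight-line (Euclidean) distance is one common choice for the comparison, not the only one.

```python
import math

def distance(a, b):
    """Straight-line (Euclidean) distance between two observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_mean(points):
    """Add up the points coordinate by coordinate, then divide by the count."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

# Hypothetical (petal width, petal length) measurements in centimeters.
cluster = [(0.2, 1.4), (0.4, 1.6), (0.3, 1.5)]
center = cluster_mean(cluster)  # close to (0.3, 1.5), up to floating-point rounding
```

Each flower is just a pair of numbers here, so the same two functions would work unchanged if we measured more than two properties per flower.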

Remember, unsupervised learning is about modeling the world, so our algorithm will have two steps: First, our AI will predict. What does the model expect the world to look like? In other words, which flowers should be clustered together because they’re the same species?

Second, our AI will correct or learn. The model will update its beliefs to agree with its observation of the world. To start the process, we have to specify how many clusters the model should look for.

I’m guessing there are three clusters in the data, so that becomes the model’s initial understanding of the world, and we’re looking for K=3 averages, or three types of irises. But to start, our model doesn’t really know anything, so the averages are random and so are its predictions. Each data point (which is a flower) is given a label as type1, type2, or type3, based on the algorithm’s beliefs.
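One way this starting point could look in code is sketched below. It assumes a common variant in which K randomly chosen flowers serve as the initial averages and every flower is labeled with its nearest average; the helper names `nearest` and `initial_guess` and the measurements are hypothetical.

```python
import math
import random

def nearest(point, averages):
    """Index of the average that is closest to the given point."""
    return min(range(len(averages)), key=lambda i: math.dist(point, averages[i]))

def initial_guess(points, k, seed=0):
    """Pick k random data points as starting averages, then label every point."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    averages = rng.sample(points, k)   # random, so probably not good averages yet
    labels = [nearest(p, averages) for p in points]
    return averages, labels

# Hypothetical (petal width, petal length) measurements in centimeters.
flowers = [(0.2, 1.4), (0.3, 1.5), (1.3, 4.2), (1.5, 4.5), (2.1, 5.8), (2.3, 6.1)]
averages, labels = initial_guess(flowers, k=3)
```

The labels 0, 1, and 2 play the role of type1, type2, and type3. Because the starting averages are random, these first labels should not be trusted yet.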

Next, our model tries to correct itself. The average of each cluster of data points should be in the middle, so the model corrects itself by calculating new averages. We can see those averages here, marked with Xs, which gives our updated model of the three (or so we guessed) types of irises.
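The correction step can be sketched as a small function that recomputes each average from the flowers currently carrying that label. The name `update_averages` and the toy numbers are assumptions for illustration, and keeping the old average when a cluster ends up empty is just one simple way to handle that case.

```python
def update_averages(points, labels, old_averages):
    """Move each average to the middle of the points that carry its label."""
    new_averages = []
    for cluster_id, old in enumerate(old_averages):
        members = [p for p, label in zip(points, labels) if label == cluster_id]
        if members:
            mean = tuple(sum(coords) / len(members) for coords in zip(*members))
            new_averages.append(mean)
        else:
            # No flowers were given this label, so keep the previous average.
            new_averages.append(old)
    return new_averages

# Toy data: two points labeled 0, one point labeled 1, nothing labeled 2.
points = [(1.0, 1.0), (3.0, 1.0), (10.0, 10.0)]
labels = [0, 0, 1]
old_averages = [(0.0, 0.0), (5.0, 5.0), (7.0, 7.0)]
new_averages = update_averages(points, labels, old_averages)
# new_averages == [(2.0, 1.0), (10.0, 10.0), (7.0, 7.0)]
```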

The graph is still pretty noisy. For example, it’s a little weird that there are type2 flowers so close to the average for type3. But we did start with a random model, so we can’t expect too much accuracy.

Logically, we know that irises of the same species tend to have similar petals, so those data points should be clustered together. Since we just did a correction or learning step, we can repeat the process, starting with a new prediction step. Let’s predict new labels using the Xs that mark the averages of each label.

We’ll give every data point the label of its closest X -- type1, type2, or type3 -- and then we’ll calculate new averages. That’s better, but still not the cleanest clusters, so we can repeat the process again: Predict, Learn, Predict, Learn. Eventually, the Xs will stop moving and we have a model of iris clusters created with unsupervised learning!
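Putting the two steps together, the Predict, Learn loop might be sketched like this. It is a minimal version under stated assumptions: the starting averages are K randomly chosen flowers, closeness is straight-line distance, and the loop stops once the labels (and therefore the Xs) stop changing. The function name `k_means` and the measurements are made up for this example.

```python
import math
import random

def k_means(points, k, seed=0, max_rounds=100):
    """Alternate between predicting labels and learning new averages."""
    rng = random.Random(seed)
    averages = rng.sample(points, k)   # random starting averages
    labels = None
    for _ in range(max_rounds):
        # Predict: give every point the label of its closest average.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, averages[i]))
            for p in points
        ]
        if new_labels == labels:
            break                      # nothing changed, so the averages are settled
        labels = new_labels
        # Learn: move each average to the middle of its labeled points.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:                # an empty cluster keeps its old average
                averages[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return averages, labels

# Hypothetical (petal width, petal length) measurements in centimeters.
flowers = [(0.2, 1.4), (0.3, 1.5), (1.3, 4.2), (1.5, 4.5), (2.1, 5.8), (2.3, 6.1)]
averages, labels = k_means(flowers, k=3)
```

The clusters this finds depend on the random starting averages, so different seeds can settle on different groupings. Running it a few times and comparing the results is a common precaution.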

Now the ultimate question is, did we find meaningful patterns about the world with our AI? We made an assumption that there were three types of irises, and we assumed that they have different petal lengths and widths. Was this true?

Lucky for us, I have a friend who is a master gardener. I showed him the real-life flowers closest to each of the three averages and he said that type1 is Versicolor, type2 is Setosa and type3 is Virginica. Three different iris species!

We learned about the world from observation, which is what makes this unsupervised learning, even though we relied a tiny bit on a teacher (the master gardener) for confirmation and help. Now that we’ve learned the basics, we can experiment with harder examples. Let’s say we want to use an unsupervised learning algorithm to sort a bunch of different photos, not just three iris species.

First, what observations can we measure? How much green there is? Whether there’s a nose and fur?

To have a computer make these observations, we need to measure thousands of red, green, and blue pixels in each image. Second, how do we want to represent the world? Before, we were only working with 2 features, so we could just use averages of the clustered data points and get a meaningful abstraction from them.

But when dealing with images, we can’t use the same method, because we won’t get much meaning out of averaging colored pixels for what we want to accomplish. Somehow, we need the model to create a representation that tells us if two images are similar. There are meaningful patterns in the data that are more abstract than individual pixels, and finding them across many images is called Representation Learning.

These patterns help us understand what’s in the images and how to compare them to each other. Representation learning happens both in supervised and unsupervised learning models, so we can do it with or without labels to find patterns in the world. To understand the basic idea of representation learning, check out this experiment: I’m gonna look at a picture really fast and then try to draw it.

Ready, Set, Go! Woah. That was 5 seconds?

My eyes took in the picture and remembered important features, so I’m building a representation in my mind. But I can’t just show you my thoughts to get feedback on what parts I misremembered, so I have to produce a reconstruction, or draw the original image from memory. Alright, so this is what I’ve got.

Now let’s compare my drawing to the original image. Let's see: round plate, triangle slice of pizza, some cheese, some crust, tablecloth. Pretty good.

For an AI, making a reconstruction would mean producing all the right pixel values. Our K-means clustering algorithm from before predicted classes for flowers based on how close the data points were to the averages. For images, we will have learned image representations instead of averages.

After that step, just like before, the AI will have to correct itself. Previously, we updated the K clusters based on how well our predicted labels fit the data. But for images, we’d have to update the model’s internal representations based on its reconstructions.

There are different ways to use unsupervised learning in combination with representation learning so that an AI can compare images. Like, for example, there’s a type of neural network called an auto-encoder, which uses the same basic principles of weights and biases to process inputs, pass data onto hidden neuron layers, and finally to a prediction output layer. If John-Green-bot was programmed with an auto-encoder, the input would be an image, the hidden layers would contain representations, and the output would be a full reconstruction of the original image (which gets more accurate the more we train his AI).
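To make the auto-encoder idea a bit more concrete, here is a heavily simplified Python sketch. It is not how John-Green-bot or any real image model is built: it assumes each "image" is just two numbers, uses a single hidden number as the representation, leaves out biases and nonlinearities, and trains with plain gradient descent. All names, starting weights, and data values are made up for illustration.

```python
def encode(x, w_enc):
    """Squeeze a 2-number input into a 1-number representation."""
    return sum(w * xi for w, xi in zip(w_enc, x))

def decode(h, w_dec):
    """Rebuild a 2-number output from the 1-number representation."""
    return [w * h for w in w_dec]

def reconstruction_loss(data, w_enc, w_dec):
    """Average squared difference between each input and its reconstruction."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, w_enc), w_dec)
        total += sum((xh - xi) ** 2 for xh, xi in zip(x_hat, x))
    return total / len(data)

def train(data, steps=200, lr=0.05):
    """Gradient descent on the reconstruction loss, starting from small weights."""
    w_enc, w_dec = [0.1, 0.1], [0.1, 0.1]
    n = len(data)
    for _ in range(steps):
        grad_enc, grad_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            h = encode(x, w_enc)
            err = [xh - xi for xh, xi in zip(decode(h, w_dec), x)]
            back = sum(e * w for e, w in zip(err, w_dec))  # error passed back to the encoder
            for j in range(2):
                grad_dec[j] += 2 * err[j] * h / n
                grad_enc[j] += 2 * back * x[j] / n
        w_enc = [w - lr * g for w, g in zip(w_enc, grad_enc)]
        w_dec = [w - lr * g for w, g in zip(w_dec, grad_dec)]
    return w_enc, w_dec

# Toy inputs that all lie on one line, so a single number can describe each of them.
data = [[t, 2 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
loss_before = reconstruction_loss(data, [0.1, 0.1], [0.1, 0.1])
w_enc, w_dec = train(data)
loss_after = reconstruction_loss(data, w_enc, w_dec)
```

The only feedback the model gets is how far its reconstruction is from its own input, so no labels are involved. Comparing `loss_before` with `loss_after` shows the reconstructions improving with training.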

Theoretically, I could give John-Green-bot a representation of a pizza and he could reconstruct the original pizza image. What’s so powerful about unsupervised learning is that the world is our teacher. By looking around, taking in a lot of data, and predicting what we’ll see and hear next, we learn about how the world works and how it should be represented.

When asked how AI will fulfill its grand ambitions, 2018 Turing Award winner Professor Yann LeCun said: “We all know that unsupervised learning is the ultimate answer.” So I guess we better keep working on it! Unsupervised learning is a huge area of active research. The human brain is specially designed for this kind of learning and has different parts for vision, language, movement, and so on.

These structures and what kinds of patterns our brains look for were developed over billions of years of evolution. But it’s really tricky to build an AI that does unsupervised learning well because AI systems can’t learn exactly like humans often do, just by watching and imitating. Someone, like us, has to design the models and tell them how to look for patterns before letting them loose.

Next time, we’ll look at applying similar concepts to AI systems that find patterns in words and language, in what’s called Natural Language Processing. See you then! Thanks to Wix for supporting PBS Digital Studios.

Checkout if you’re looking to make your own website. Wix is a platform that allows you to build a personalized website for almost any purpose from promoting your business or creating an online shop to a place for you to test out new ideas. Their technology allows you to create something unique no matter your skill level with templates and all in one management.

If you’d like to check it out you can go to Or click the link in the description. Crash Course AI is produced in association with PBS Digital Studios. If you want to help keep Crash Course free for everyone, forever, you can join our community on Patreon.

And if you want to learn more about the math of k-means clustering, check out this video from Crash Course Statistics.