crashcourse
Let's make an AI that destroys video games: Crash Course AI #13
YouTube: | https://youtube.com/watch?v=osbmLJb2Tkc |
Previous: | What's new with Crash Course |
Next: | Reform and Revolution 1815-1848: Crash Course European History #25 |
Categories
Statistics
View count: | 94,142 |
Likes: | 2,053 |
Comments: | 66 |
Duration: | 13:26 |
Uploaded: | 2019-11-08 |
Last sync: | 2024-10-22 14:30 |
Citation
Citation formatting is not guaranteed to be accurate. | |
MLA Full: | "Let's make an AI that destroys video games: Crash Course AI #13." YouTube, uploaded by CrashCourse, 8 November 2019, www.youtube.com/watch?v=osbmLJb2Tkc. |
MLA Inline: | (CrashCourse, 2019) |
APA Full: | CrashCourse. (2019, November 8). Let's make an AI that destroys video games: Crash Course AI #13 [Video]. YouTube. https://youtube.com/watch?v=osbmLJb2Tkc |
APA Inline: | (CrashCourse, 2019) |
Chicago Full: | CrashCourse, "Let's make an AI that destroys video games: Crash Course AI #13.", November 8, 2019, YouTube, 13:26, https://youtube.com/watch?v=osbmLJb2Tkc. |
Follow along: https://colab.research.google.com/drive/1uYXTDeBbPeuJfM1teufZ9nUaiRIN9nHW
Today we create a game and then build an AI to destroy it. Our game is called TrashBlaster, and it’s like Asteroids but with trash in the ocean, and instead of a spaceship John Green Bot is wielding a laser. We'll use machine learning techniques such as an evolutionary neural network alongside a carefully crafted fitness function to create an unstoppable AI.
To install the game on your computer download our repo on Github: https://github.com/crash-course-ai/lab3-games
Crash Course is produced in association with PBS Digital Studios: https://www.youtube.com/pbsdigitalstudios
Crash Course is on Patreon! You can support us directly by signing up at http://www.patreon.com/crashcourse
Thanks to the following patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:
Eric Prestemon, Sam Buck, Mark Brouwer, Indika Siriwardena, Avi Yashchin, Timothy J Kwist, Brian Thomas Gossett, Haixiang N/A Liu, Jonathan Zbikowski, Siobhan Sabino, Zach Van Stanley, Jennifer Killen, Nathan Catchings, Brandon Westmoreland, dorsey, Kenneth F Penttinen, Trevin Beattie, Erika & Alexa Saur, Justin Zingsheim, Jessica Wode, Tom Trval, Jason Saslow, Nathan Taylor, Khaled El Shalakany, SR Foxley, Sam Ferguson, Yasenia Cruz, Eric Koslow, Tim Curwick, David Noe, Shawn Arnold, William McGraw, Andrei Krishkevich, Rachel Bright, Jirat, Ian Dundore
--
Want to find Crash Course elsewhere on the internet?
Facebook - http://www.facebook.com/YouTubeCrashCourse
Twitter - http://www.twitter.com/TheCrashCourse
Tumblr - http://thecrashcourse.tumblr.com
Support Crash Course on Patreon: http://patreon.com/crashcourse
CC Kids: http://www.youtube.com/crashcoursekids
#CrashCourse #ArtificialIntelligence #MachineLearning
(00:00) to (02:00)
J: John Green-Bot, are you serious? I made this game and you beat my high score?
JGB: Pizza!
J: So, John Green-Bot is pretty good at Pizza Jump, but what about this new game we made, Trashblaster?
JGB: Hey, that's me.
J: Yeah. Let's see what you got.
JGB: That's not fair, Jabril.
J: It's okay, John Green-Bot. We've got you covered. Today, we're gonna design and build an AI program to help him play this game like a pro.
(CrashCourse AI Intro)
Hey, I'm Jabril, and welcome to Crash Course AI. Last time, we talked about some of the ways that AI systems learn to play games. I've been playing video games for as long as I can remember. They're fun, challenging, and tell interesting stories where the player gets to jump on goombas, build cities, cross the road, or flap a bird, but games are also a great way to test AI techniques, because they usually involve simpler worlds than the one we live in. Plus, games involve things that humans are often pretty good at, like strategy, planning, coordination, deception, reflexes, and intuition.
Recently, AIs have become good at some tough games, like Go or StarCraft 2. So our goal today is to build an AI to play a video game that our writing team and friends at Thought Cafe designed called Trashblaster. The player's goal in Trashblaster is to swim through the ocean as a little virtual John Green-Bot and destroy pieces of trash, but we have to be careful, because if John Green-Bot touches a piece of trash, then he loses and the game restarts.
Like in previous labs, we'll be writing all the code using a language called Python and a tool called Google Colaboratory, and as you watch this video, you can follow along with the code in your browser from the link we put in the description. In these Colaboratory files, there's some regular text explaining what we're trying to do, and pieces of code that you can run by pushing the play button.
(02:00) to (04:00)
These pieces of code build on each other so keep in mind that we have to run them in order from top to bottom, otherwise we might get an error. To actually run the code and experiment with changing it, you have to either click open in playground at the top of the page, or open the file menu and click 'Save a copy in Drive', and just FYI, you'll need a Google account for this.
So to create this game-playing AI system, first, we need to build the game and set up everything like the rules and graphics. Second, we'll need to think about how to create a Trashblaster AI model that can play the game and learn to get better, and third, we'll need to train the model and evaluate how well it works.
Without a game, we can't do anything, so we've got to start by generating all the pieces of one. To start, we're gonna need to fill up our toolbox by importing some helpful libraries, such as PyGame. The first step in 1.1 and 1.2 loads the libraries, and step 1.3 saves the game so we can watch it later. This might take a second to download. The basic building blocks of any game are different objects that interact with each other. There's usually something or someone the player controls and enemies that you battle. All these objects and their interactions with each other need to be defined in the code, so to make Trashblaster, we need to define three objects and what they do: a blaster, a hero, and trash to destroy. The blaster is what actually destroys the trash, so we're gonna load an image that looks like a laser ball and set some properties. How far does it go? What direction does it fly? And what happens to the blast when it hits a piece of trash?
Our hero is John Green-Bot. So now, we've got to load his image and define properties like how fast he can swim and how a blast appears when he uses his blaster, and we need to load an image for the trash pieces, and then code how they move and what happens when they get hit by a blast. Like, for example, total destruction or splitting into two smaller pieces.
Finally, all these objects are floating in the ocean, so we need a piece of code to generate the background. The shape of this game's ocean is toroidal, which means it wraps around and if any object flies off-screen to the right, then it will immediately appear on the far left side.
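The wrap-around behavior described above can be sketched with modular arithmetic. This is a minimal illustration, not the lab's code: the screen size (`WIDTH`, `HEIGHT`) and the function name `wrap_position` are assumptions made up for the example.

```python
# Hypothetical screen size; the lab's actual dimensions may differ.
WIDTH, HEIGHT = 800, 600


def wrap_position(x, y, width=WIDTH, height=HEIGHT):
    """Return the position after wrapping around the screen edges."""
    # Python's % gives a non-negative result for a positive modulus,
    # so objects leaving on the left or top reappear on the right or bottom.
    return x % width, y % height


print(wrap_position(805, 300))  # just past the right edge -> (5, 300)
print(wrap_position(-5, 610))   # past the left and bottom edges -> (795, 10)
```

Applying this after every movement update keeps every object on screen without any special-case edge checks.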
Every game needs some way to track how the player is doing, so we'll show the score, too.
(04:00) to (06:00)
Now that we have all the pieces in place, we can actually build the game and decide how everything interacts. The key to how everything fits together is the run function. It's a loop checking whether the game is over, moving all the objects, updating the game, checking whether our hero is okay, and making trash. As long as our hero hasn't bumped into any trash, the game continues. That's pretty much it for the game mechanics. We've created a hero, a blaster, trash, a scoreboard, and code that controls the interactions.
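The shape of such a run loop can be sketched as follows. This is a toy stand-in under stated assumptions, not the lab's `run` function: the ocean is one-dimensional and 100 units wide, the hero never moves, and every number (drift, collision radius, spawn interval) is invented for illustration.

```python
import random


def run(max_steps=1000, seed=0):
    """Toy game loop: returns the number of steps before the game ended."""
    rng = random.Random(seed)
    hero_x = 50.0  # hero position on a 1-D, 100-unit ocean (assumed)
    trash = [rng.uniform(0, 100) for _ in range(3)]
    steps = 0
    game_over = False
    while not game_over and steps < max_steps:
        # 1. Move all the objects (trash drifts and wraps around the edges).
        trash = [(t + rng.uniform(-2, 2)) % 100 for t in trash]
        # 2. Check whether our hero is okay (collision within 1 unit).
        game_over = any(abs(t - hero_x) < 1.0 for t in trash)
        # 3. Make more trash every 100 steps.
        if steps % 100 == 99:
            trash.append(rng.uniform(0, 100))
        steps += 1
    return steps


print(run(max_steps=200, seed=1))
```

The loop ends either when the hero touches trash or when the step limit is reached, mirroring the "as long as our hero hasn't bumped into any trash, the game continues" rule.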
Step 2 is modeling the AI's brain, so John Green-bot can play, and for that, we can turn back to our old friend, the neural network. When I play games, I try and watch for the biggest threat, because I don't wanna lose, so let's program John Green-bot to use a similar strategy. For his neural network input layer, let's consider the five pieces of trash that are closest to his avatar, and remember, the closest trash might actually be on the other side of the screen. Really, we want John Green-bot to pay attention to where the trash is and where it's going. So, we want the X and Y positions and X and Y velocities relative to the hero, and the size of each piece of trash. That's five inputs for five pieces of trash, so our input layer is gonna have 25 nodes.
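One way to assemble those 25 inputs is sketched below. It is an illustration only: the tuple layouts, the screen size, and the names `wrapped_delta` and `build_inputs` are assumptions, and the lab's own code may order or scale the features differently. The key detail is that distances are measured across the screen edges, so trash on the far side of the screen can still be the nearest.

```python
import math

WIDTH, HEIGHT = 800, 600  # assumed screen size


def wrapped_delta(a, b, size):
    """Shortest signed offset from a to b on an axis that wraps at `size`."""
    d = (b - a) % size
    return d - size if d > size / 2 else d


def build_inputs(hero, trash_list, k=5):
    """hero: (x, y, vx, vy); each trash: (x, y, vx, vy, size).

    Returns 5 * k numbers: relative position, relative velocity, and size
    for the k nearest pieces of trash, nearest first.
    """
    hx, hy, hvx, hvy = hero
    candidates = []
    for x, y, vx, vy, size in trash_list:
        dx = wrapped_delta(hx, x, WIDTH)
        dy = wrapped_delta(hy, y, HEIGHT)
        candidates.append((math.hypot(dx, dy), [dx, dy, vx - hvx, vy - hvy, size]))
    candidates.sort(key=lambda c: c[0])  # nearest first
    inputs = [value for _, feats in candidates[:k] for value in feats]
    # Pad with zeros when fewer than k pieces of trash exist.
    inputs += [0.0] * (5 * k - len(inputs))
    return inputs


hero = (10, 10, 0, 0)
trash = [(100, 10, 0, 0, 2), (790, 10, 1, 0, 3)]
print(build_inputs(hero, trash)[:5])  # the piece across the left edge is nearest
```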
For the hidden layers, let's start small and create two layers with 15 nodes each. This is just a guess, so we can change it later if we want. Because the output of this neural network is gameplay, we want the output nodes to be connected to the movement of the hero and shooting blasts, so there will be five nodes total: an X and Y for movement, an X and Y direction for aiming the blaster, and whether or not to fire the blaster. To start, the weights of the neural network are initialized to zero, so the first time John Green-bot plays, he basically sits there and does nothing. To train its brain with regular supervised learning, we'd normally say what the best action is at each time step, but because losing Trashblaster depends on lots of collective actions and mistakes, not just one key moment, supervised learning might not be the right approach for us.
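The 25-15-15-5 network with zero-initialized weights can be sketched in plain Python. The layer sizes come from the transcript; the `tanh` activation, the absence of bias terms, and the names `make_network` and `forward` are assumptions for this sketch rather than details confirmed by the lab.

```python
import math

LAYER_SIZES = [25, 15, 15, 5]  # input, two hidden layers, output


def make_network(sizes=LAYER_SIZES):
    """One weight matrix per pair of adjacent layers, all weights zero.

    Each matrix is a list of rows; each row holds the incoming weights
    of one node in the next layer.
    """
    return [[[0.0] * n_in for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(network, inputs):
    """Feed the inputs through every layer (tanh activation is an assumption)."""
    activations = inputs
    for weights in network:
        activations = [math.tanh(sum(w * a for w, a in zip(row, activations)))
                       for row in weights]
    return activations


net = make_network()
print(forward(net, [1.0] * 25))  # all-zero weights give all-zero outputs
```

With every weight at zero, the five outputs are zero regardless of the inputs, which matches the observation that an untrained John Green-bot just sits there.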
(06:00) to (08:00)
Instead, we'll use reinforcement learning strategies to train John Green-bot based on all the moves he makes from the beginning to the end of the game, and we'll evolve a better AI using a genetic algorithm which is commonly referred to as GA. To start, we'll create some number of John Green-bots with empty brains. Let's say 200, and we'll have them play Trashblaster. They're all pretty terrible, but because of luck, some will probably be a little bit less terrible. In biological evolution, parents pass on most of their characteristics to their offspring when they reproduce, but the new generation may have some small differences or mutations.
To replicate this process, we'll use code to take the hundred highest scoring John Green-bots and clone each of them as a reproduction step. Then, we'll slightly and randomly change the weights in those 100 clone neural networks, which is our mutation step. Right now, we'll program a 5% chance that any given weight will be mutated and randomly choose how much that weight mutates. So it could be barely any change or a huge one, and you can experiment with this if you like.
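The clone-and-mutate step can be sketched like this. The 5% per-weight mutation chance comes from the transcript; drawing the change from a Gaussian with standard deviation `scale` is an assumption, as are the function name `mutate` and the nested-list weight layout.

```python
import copy
import random

MUTATION_RATE = 0.05  # 5% chance per weight, as described above


def mutate(network, rng, rate=MUTATION_RATE, scale=1.0):
    """Return a mutated clone of `network`; the parent is left untouched.

    `network` is a list of layers, each a list of rows of weights.
    The Gaussian perturbation is an illustrative choice, not the lab's.
    """
    clone = copy.deepcopy(network)  # reproduction step
    for layer in clone:
        for row in layer:
            for i in range(len(row)):
                if rng.random() < rate:  # mutation step
                    row[i] += rng.gauss(0.0, scale)
    return clone


parent = [[[0.0] * 10 for _ in range(10)]]  # one tiny 10x10 layer
child = mutate(parent, random.Random(0))
changed = sum(w != 0.0 for row in child[0] for w in row)
print(f"{changed} of 100 weights mutated")
```

Raising `rate` or `scale` makes each generation differ more from its parents, which is the trade-off between steady improvement and big, risky jumps.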
Mutation affects how much the AI changes overall, so it's a little bit like the learning rate that we talked about in previous episodes. We have to try and balance steadily improving each iteration with making big changes that might be helpful or harmful. After we've created these 100 mutant John Green-bots, we'll combine them with the 100 unmutated original models, just in case the mutations were harmful, and then have them all play the game. Then, we evaluate, clone, and mutate them over and over again. Over time, the genetic algorithm usually makes AI gradually better at whatever they're being asked to do, like play Trashblaster. This is because models with the better mutations will be more likely to score high and reproduce in the future.
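The evaluate, clone, and mutate cycle can be sketched on a toy problem. Everything here other than the population of 200, the top-half selection, and the 5% mutation chance is made up for illustration: a "model" is just three numbers, and `toy_fitness` rewards closeness to an arbitrary `TARGET` instead of playing a game.

```python
import random

rng = random.Random(0)
TARGET = [0.5, -0.25, 0.75]  # made-up "ideal" weights for the toy problem


def toy_fitness(model):
    """Higher is better: negative squared distance to TARGET."""
    return -sum((w - t) ** 2 for w, t in zip(model, TARGET))


def toy_mutate(model):
    """Clone the model, changing each weight with a 5% chance."""
    return [w + rng.gauss(0.0, 0.1) if rng.random() < 0.05 else w for w in model]


def next_generation(population, fitness_fn, mutate_fn):
    """Keep the top half unchanged and add one mutated clone of each survivor."""
    ranked = sorted(population, key=fitness_fn, reverse=True)
    survivors = ranked[: len(ranked) // 2]
    mutants = [mutate_fn(m) for m in survivors]
    return survivors + mutants


population = [[0.0, 0.0, 0.0] for _ in range(200)]  # 200 "empty brains"
best_per_generation = []
for _ in range(20):
    population = next_generation(population, toy_fitness, toy_mutate)
    best_per_generation.append(max(toy_fitness(m) for m in population))

print(best_per_generation[0], best_per_generation[-1])
```

Because the unmutated survivors are carried over, the best fitness in this toy setting never gets worse from one generation to the next. In the real game, where scores also depend on luck, that guarantee is weaker.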
All of this stuff, from building John Green-bot's neural networks to defining mutations for our genetic algorithm, are in this section of the code.
(08:00) to (10:00)
After setting all that up, we have to write code to carefully define what doing better at the game means. Destroying a bunch of trash? Staying alive for a long time? Avoiding off-target blaster shots? Together, these decisions about what better means define an AI model's fitness. Programming this function is pretty much the most important part of this lab, because how we define fitness will affect how John Green-bot's AI will evolve. If we don't carefully balance our fitness function, his AI could end up doing some pretty weird things.
For example, we could just define fitness as how long the player stays alive, but then John Green-bot's AI might play Trashavoider and dodge trash instead of Trashblaster and destroy trash, but if we define the fitness to only be related to how many trash pieces are destroyed, we might get a wild hero that's constantly blasting. So for now, I'm gonna try a fitness function that keeps the player alive and blasts trash. We'll define the fitness as +1 for every second that John Green-bot stays alive, and +10 for every piece of trash that is zapped, but it's not as fun if the AI just blasts everywhere, so let's add a penalty of -2 for every blast he fires.
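Written out as code, the fitness rule above is a one-line weighted sum. The three weights come from the transcript; the function and parameter names are assumptions for this sketch, and the example players are invented.

```python
def fitness(seconds_alive, trash_destroyed, blasts_fired):
    """+1 per second alive, +10 per piece of trash zapped, -2 per blast fired."""
    return 1 * seconds_alive + 10 * trash_destroyed - 2 * blasts_fired


# A balanced player: 30 s alive, 4 hits from 10 shots.
print(fitness(30, 4, 10))   # 30 + 40 - 20 = 50
# A pure dodger: survives 60 s without firing.
print(fitness(60, 0, 0))    # 60
# A wild blaster: 10 s alive, 5 hits from 40 shots.
print(fitness(10, 5, 40))   # 10 + 50 - 80 = -20
```

The blast penalty means a shot only pays off when better than one in five shots hits, since a hit earns 10 and each blast costs 2.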
The fitness for each John Green-bot AI will be updated continuously as he plays the game, and it'll be shown on the scoreboard we created earlier. You can take some time to play around with the fitness function and watch how John Green-bot's AI can learn and evolve differently.
Finally, we can move on to Step 3 and actually train John Green-bot's AI to blast some trash. So first, we need to set up our game, and to kick off the genetic algorithm, we have to define how many randomly wired John Green-bot models we want to use in our starting population. Let's just stick with 200 for now. If we waited for each John Green-bot model to start, play, and lose the game, this training could take days, but because our computer can multi-task, we can use a multi-processing package to make all 200 AI models play separate games at the same time, which will be much faster, and this is all part of the training. This is where we'll code in the details of the genetic algorithm, like sorting John Green-bots by their fitness and choosing which ones will reproduce.
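Running many games in parallel can be sketched with Python's standard `multiprocessing.Pool`. Whether the lab uses this exact package and call is an assumption. `play_one_game` is a hypothetical stand-in that returns a made-up score instead of playing Trashblaster, and the worker count of 4 is arbitrary.

```python
import random
from multiprocessing import Pool


def play_one_game(seed):
    """Stand-in for one full game: returns a made-up fitness for this seed."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(1000))


def evaluate_population(seeds, processes=4):
    """Play one game per seed, spread across several worker processes."""
    with Pool(processes=processes) as pool:
        # map() returns the scores in the same order as the seeds.
        return pool.map(play_one_game, seeds)


if __name__ == "__main__":
    scores = evaluate_population(range(200))
    print(len(scores), "games played; best score:", max(scores))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the script, because it stops each worker from launching its own pool.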
(10:00) to (12:00)
Now that we have the 100 John Green-bots we want to reproduce, this code will clone and mutate them so that we have a combined group of 100 old and 100 mutant AI models. Then, we can run 200 more games for these 200 John Green-bots. It just takes a few seconds to go through them all, thanks to that last chunk of code, and we can see how well they do. The average score of the AI models that we picked to reproduce is almost twice as high as the overall average, which is good. It means that John Green-bot is learning something. We can even watch a replay of the best AI.
Uhh, even the best isn't very exciting right now. We can see the fitness function changing as time passes, but the hero's just sitting there, not getting hit and shooting forward. We want John Green-bot to actually play, not just sit still and get lucky. We can also see visual representations of the specific neural network, where higher weights are represented by the redness of the connection. It's tough to interpret exactly what this diagram means, but we can keep it in mind as we continue to train John Green-bot.
Genetic algorithms take time to evolve a good model, so let's change the number of iterations in the looping step 3.3, and run the training step 10 times to repeatedly copy, mutate, and test the fitness of these AI models. Okay, now we've trained for 10 more iterations, and if we view a replay of the last game, we see that John Green-bot is doing a little better. He's moving around and actually sort of aiming. If we keep training, one model might get lucky, destroy a bunch of trash, have a high fitness, and get copied and mutated to make future generations even better, but John Green-bot needs lots of iterations to get really good at Trashblaster. You might consider changing the number of iterations to 50 or 100 times per click, which might take a while.
Now here's an example of the game after 15,600 training iterations. Just look at John Green-bot swimming and blasting trash like a pro, and all this was done by training a genetic algorithm, raw luck, and a carefully crafted fitness function.
(12:00) to (13:27)
Genetic algorithms tend to work pretty well on small problems like getting good at Trashblaster. When the problems get bigger, the random mutations of genetic algorithms are sometimes, well, too random to create consistently good results, so part of the reason this worked so well is because John Green-bot's neural network is pretty tiny compared to many AIs created for industrial-sized problems. But still, it's fun to experiment with AI in games like Trashblaster. For example, you can try and change the values of the fitness function and see how John Green-bot's AI evolves differently, or you can change how the neural network gets mutated, like by messing with the structure instead of the weights, or you can change how much the run function loops per second, from five times a second to ten or twenty and give John Green-bot superhuman reflexes.
You can download the clip of your AI playing Trashblaster by looking for game_animation.gif in the file browser on the left-hand side of the Colaboratory file. You can also download the source code from GitHub to run on your computer if you wanna experiment. We'll leave a link in the description.
And next time, we'll start shifting away from games and learning about other ways that humans and AI can work together in teams. See you then.
Crash Course AI is produced in association with PBS Digital Studios. If you wanna help keep Crash Course free for everyone, forever, you can join our community on Patreon, and if you wanna learn more about evolution and genetics, check out Crash Course Biology.