AI Playing Games: Crash Course AI #12
YouTube: | https://youtube.com/watch?v=nw7zmdBLQ6U |
Previous: | How to Communicate with Customers: Crash Course Entrepreneurship #11 |
Next: | The Industrial Revolution: Crash Course European History #24 |
Categories
Statistics
View count: | 83,719 |
Likes: | 1,829 |
Comments: | 105 |
Duration: | 11:31 |
Uploaded: | 2019-11-01 |
Last sync: | 2024-10-22 11:00 |
Citation
Citation formatting is not guaranteed to be accurate.
MLA Full: | "AI Playing Games: Crash Course AI #12." YouTube, uploaded by CrashCourse, 1 November 2019, www.youtube.com/watch?v=nw7zmdBLQ6U. |
MLA Inline: | (CrashCourse, 2019) |
APA Full: | CrashCourse. (2019, November 1). AI Playing Games: Crash Course AI #12 [Video]. YouTube. https://youtube.com/watch?v=nw7zmdBLQ6U |
APA Inline: | (CrashCourse, 2019) |
Chicago Full: | CrashCourse. "AI Playing Games: Crash Course AI #12." November 1, 2019. YouTube video, 11:31. https://youtube.com/watch?v=nw7zmdBLQ6U.
As we mentioned last episode, one of the best test spaces for building new AI systems is games. This is because games provide a great framework for an AI to learn an objective and slowly improve. In recent years, AI has made huge strides in games, from beating Jeopardy! champions to crushing a five-person team in Dota 2. Today, we’re going to walk you through creating a Tic Tac Toe bot that uses the minimax algorithm to become undefeatable, and we’ll talk about evolutionary neural networks like in SethBling’s MarI/O project. Next week we’ll build our very own game and create an AI to master it!
Watch SethBling’s MarI/O video: https://www.youtube.com/watch?v=qv6UVOQ0F44
Crash Course is produced in association with PBS Digital Studios: https://www.youtube.com/pbsdigitalstudios
Crash Course is on Patreon! You can support us directly by signing up at http://www.patreon.com/crashcourse
Thanks to the following patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:
Eric Prestemon, Sam Buck, Mark Brouwer, Indika Siriwardena, Avi Yashchin, Timothy J Kwist, Brian Thomas Gossett, Haixiang N/A Liu, Jonathan Zbikowski, Siobhan Sabino, Zach Van Stanley, Jennifer Killen, Nathan Catchings, Brandon Westmoreland, dorsey, Kenneth F Penttinen, Trevin Beattie, Erika & Alexa Saur, Justin Zingsheim, Jessica Wode, Tom Trval, Jason Saslow, Nathan Taylor, Khaled El Shalakany, SR Foxley, Sam Ferguson, Yasenia Cruz, Eric Koslow, Caleb Weeks, Tim Curwick, David Noe, Shawn Arnold, William McGraw, Andrei Krishkevich, Rachel Bright, Jirat, Ian Dundore
--
Want to find Crash Course elsewhere on the internet?
Facebook - http://www.facebook.com/YouTubeCrashCourse
Twitter - http://www.twitter.com/TheCrashCourse
Tumblr - http://thecrashcourse.tumblr.com
Support Crash Course on Patreon: http://patreon.com/crashcourse
CC Kids: http://www.youtube.com/crashcoursekids
#CrashCourse #ArtificialIntelligence #MachineLearning
(00:00) to (02:00)
Jabril: D-7
John Green-Bot: Miss. I-9?
J: Noooo!
(Intro)
J: Hey, I'm Jabril, and welcome to Crash Course AI. John Green-Bot might struggle with some things like natural language processing and moving (?~0:48), but using AI, he's pretty good at board games. AI researchers have spent a lot of time trying to teach AI how to beat humans at games, and this isn't just because games are fun. Games provide constrained scenarios for testing new AI algorithms and approaches. In a game, it's easy to know whether the AI is winning or losing, because there's usually a score or some objective measure of winning. This is great because AI learns from examples, trying things out, and slowly improving. Games basically provide their own training data, which is a big relief, because AI systems need lots of training data to get really good.
An AI can even play games against itself to generate training data and evolve better game strategies, or an AI can be programmed to look at previous games, even games played by expert humans, for strategies that lead to victory. Comparing AIs against expert human gamers can also help us figure out how an AI is improving over time. This comparison also gives us a sense of the difficulty of problems an AI can solve. In other words, if I can teach John Green-bot to beat me at Battleship, I can probably also teach him to beat me at any game that's simpler than Battleship.
Finally, games are cool, and that's important, too. Sometimes, AI programming can feel a bit dry because of all the math and troubleshooting, and games can provide a fun motivation to learn all this stuff.
(02:00) to (04:00)
This is the reason why some of my first AI demos were games. For all these reasons, games and computing have a rich history that's worth diving into. For that, let's go to the Thought Bubble.
Humans have always been fascinated by the idea that machines could beat us at our own strategy games. In 1770, the inventor Wolfgang von Kempelen revealed an invention to the Empress Maria Theresa of Austria. The Mechanical Turk was a clock-like contraption that appeared to play chess. Chess pieces were attached to rods that were hooked into a bathtub-sized box. After the Empress made a move on the board, she turned a crank that activated the machine, which would move chess pieces mechanically. To her surprise, the machine was able to beat most challengers. However, it was an elaborate hoax, and the Mechanical Turk was actually controlled by a person hidden inside.
Getting a machine to play chess is actually really complicated, so when AI researchers first tried to tackle the problem in the late 1940s and early 1950s, they focused on simpler chess situations, like games with only a few pieces remaining on the board or full games played on a small 6x6 board without any bishops. At the same time, researchers worked on an AI that could play checkers, because checkers looked easier, although it was almost as complicated. The first program to play a legitimate game of checkers was built by IBM in 1956 and in a classic Cold War move, two programs that could play a full game of chess were developed in parallel by the US and Russia in 1957, but these programs didn't get good for another 40 years.
Checkers was first, with a program called Chinook, which started dominating masters in 1995, and chess followed when a computer called Deep Blue beat the world chess champion Garry Kasparov in 1997. Thanks, Thought Bubble.
Since then, strategy games have been mastered one by one, with the most recent victories over humans at Go in 2017, Dota 2 in 2018, and Starcraft 2 in 2019. Okay, so the best way to understand the difficulty of teaching AIs to play games is through an example.
(04:00) to (06:00)
Oh, John Green-Bot. So, let's start with a really simple goal, like teaching John Green-Bot here how to play Tic-Tac-Toe. One way we can think about playing Tic-Tac-Toe is as a tree of all the possible moves from any given state of the game board. For example, if this is the current game state, it's John Green-Bot's turn, and he's using Xs, there are three places he can go. We can draw a tree representing all possible outcomes for each of these options, and all the options his opponent, me or anyone else, can take. Because computers think with numbers, each outcome can be assigned a reward: a number, like 1 for a win and -1 for a loss or a tie. Basically, John Green-Bot will need to search through the tree of possibilities to find his win.
To decide which choice to make, John Green-Bot will assume that in each tree, both he and his opponent will make the best possible choices. In other words, his policy, or his framework for making decisions, will alternate between choosing the branch that maximizes his chance of winning on his turn and minimizes his opponent's chance of winning on their turn. This is called the minimax algorithm. Then each game state can be assigned a value based on how likely it is to lead to John Green-Bot winning, and he can decide which move to make based on his policy.
Looking at this tree, John Green-Bot will always pick the option with value 1.0 and win the game. Of course, this is a pretty small tree, because we were looking at a game already in progress. To draw the whole Tic-Tac-Toe tree from beginning to end, we'd need to represent about 250,000 boards. Now, that seems like a lot, but it would take, like, half a second for a powerful modern computer to compute this many options. By laying out all the possibilities and taking the paths that lead to a win, John Green-Bot can solve Tic-Tac-Toe.
(06:00) to (08:00)
This means that John Green-Bot will always achieve the best possible outcome, either a win or tie, no matter how his opponent plays. Thanks, John Green-Bot.
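To make that concrete, here's a minimal minimax sketch in Python. It's not the episode's code, just one way to write the idea down, and it takes one small liberty: the episode scores a tie the same as a loss, but this sketch uses the more common convention of 0 for a tie, which is what matches the "win or tie" guarantee above.

```python
# A minimal minimax sketch for Tic-Tac-Toe. The board is a list of 9 cells,
# each "X", "O", or None; "X" is the maximizing player.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Value of `board` when it's `player`'s turn, assuming best play by both."""
    win = winner(board)
    if win:
        return 1 if win == "X" else -1
    if all(board):                            # board full with no winner: a tie
        return 0
    other = "O" if player == "X" else "X"
    values = []
    for i in range(9):
        if board[i] is None:
            board[i] = player                 # try a move...
            values.append(minimax(board, other))
            board[i] = None                   # ...then undo it
    return max(values) if player == "X" else min(values)

def best_move(board, player):
    """Pick the empty square whose subtree has the best value for `player`."""
    other = "O" if player == "X" else "X"
    choose = max if player == "X" else min
    def value(i):
        board[i] = player
        v = minimax(board, other)
        board[i] = None
        return v
    return choose((i for i in range(9) if board[i] is None), key=value)

# A game already in progress, like in the episode; "X" (John Green-Bot) to move:
board = ["X", "O", "X",
         "O", "O", None,
         None, "X", None]
print(best_move(board, "X"))   # 5: the only move that doesn't hand "O" the win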
But we can't solve all games this way. Checkers, for example, has about 10^20 board states, or 1 followed by 20 zeros. That's more board states than there are grains of sand on Earth. Chess has 10^50 board states, and Go has 10^250 board states. To put those huge numbers into perspective, there are only about 10^80 atoms in the entire known universe. Computer scientists have theorized that it would be impossible for conventional computers to calculate this many states due to the laws of physics.
Like, for example, if you combined all of the planets and stars and everything in the whole universe into a single supercomputer, it still wouldn't be powerful enough to solve the game of Go, but some people have hope that quantum computers may be able to get there someday. So, if figuring out all the board states could be mathematically impossible, how did computers beat the number one ranked human masters in Chess and Go?
Many modern systems, including Google's AlphaGo computer that beat a human master at Go in 2017, use an algorithm called Monte Carlo Tree Search. Monte Carlo is a famous casino, so whenever you see the term Monte Carlo, it's a good bet that the algorithm will be using randomness and chance to solve a problem.
Combining Monte Carlo randomness with regular tree search like minimax, modern game AIs decide which part of the huge tree to search by estimating the odds. Basically, they want better odds that the part of the tree they search will lead to a win, but these aren't just random guesses like we would make in many casino games. AI systems simulate millions of what-if scenarios and use math to estimate the likelihood of winning if they choose one path or another.
(08:00) to (10:00)
In each what-if scenario, the AI considers making one particular move and then simulates playing a large number of, but not all, possible games where the next moves are chosen at random. By averaging these possible outcomes, the AI estimates how good that particular move is. It's so much faster to estimate a handful of choices than exhaustively calculate each branch of the game tree, and some computers can even do this estimation in real time.
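Here's a sketch of that random-playout idea in Python, using Tic-Tac-Toe so it stays small and self-contained. This is pure Monte Carlo evaluation; the full Monte Carlo Tree Search used by systems like AlphaGo additionally grows a search tree and biases which branches get simulated. The function names and playout count are illustrative.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def random_playout(board, player):
    """Finish the game with uniformly random moves; return "X", "O", or None."""
    board = board[:]                       # copy so the caller's board survives
    while winner(board) is None and None in board:
        move = random.choice([i for i in range(9) if board[i] is None])
        board[move] = player
        player = "O" if player == "X" else "X"
    return winner(board)

def estimate_move(board, move, player, playouts=1000):
    """Average the results of many random continuations after playing `move`."""
    board = board[:]
    board[move] = player
    opponent = "O" if player == "X" else "X"
    total = 0
    for _ in range(playouts):
        result = random_playout(board, opponent)
        total += 1 if result == player else -1 if result == opponent else 0
    return total / playouts               # closer to +1 means a better-looking move

# What-if comparison: from an empty board, the center usually beats a corner.
empty = [None] * 9
print(estimate_move(empty, 4, "X"), estimate_move(empty, 0, "X"))
```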
One example of this is Google's DeepMind, which defeated human professional players at Starcraft 2 in 2019, a game where timing is critical. Of course, Starcraft 2, Go, and Tic-Tac-Toe aren't all the types of games that humans play. Other games require other strategies and have other computational challenges.
IBM Watson's question-answering system was able to beat human Jeopardy! champions in two televised matches in 2011. Watson listened for key words in the clue and tried to use a knowledge graph to figure out responses; we'll talk more about knowledge graphs in a future episode. Watson wasn't perfect and struggled a bit with context. For example, it famously guessed 'What is Toronto?' on a clue in the category U.S. Cities, but Watson was still able to do better than the human contestants overall.
Evolutionary neural networks use the environment as an input, like reinforcement learning, but this approach introduces multiple agents that try multiple neural network structures and then build on the successful ones for the next generation, sorta like animals: the ones that are better at surviving get to pass on their genes. For example, the AI MarI/O can learn how to play a Super Mario World level after being told what buttons it can push and that getting further to the right in the level is good. This AI will start by basically mashing buttons at random, but as some button mashes get it further to the right, it remembers and learns from those successful attempts.
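As a toy sketch of that generational loop, here's the select-and-mutate cycle in Python. This is not MarI/O's actual code; SethBling's project uses NEAT, which also evolves the network's structure, while this sketch only mutates the weights of a fixed-size agent. To keep it self-contained, the fitness function is a stand-in for "how far right did the agent get," since a real setup would run the game itself.

```python
import random

def make_agent():
    return [random.gauss(0, 1) for _ in range(4)]    # 4 weights = one tiny "brain"

def fitness(agent):
    # Stand-in for running the level: agents whose weights land closer to a
    # hidden target policy score higher (0 is perfect).
    target = [0.5, -1.0, 2.0, 0.0]
    return -sum((w - t) ** 2 for w, t in zip(agent, target))

def mutate(agent, rate=0.1):
    return [w + random.gauss(0, rate) for w in agent]  # small random tweaks

population = [make_agent() for _ in range(50)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)   # best agents first
    survivors = population[:10]                  # selection: keep the top 20%
    offspring = [mutate(random.choice(survivors)) for _ in range(40)]
    population = survivors + offspring           # the next generation
print(max(fitness(a) for a in population))       # climbs toward 0 over generations
```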
In the next lab, we'll build our own AI to use this approach to crush a video game that we've built where John Green-bot destroys trash.
So, are there any games that are safe to play, where humans will always have an edge and AI won't be able to beat us?
(10:00) to (11:31)
Computers definitely seem to struggle with parts of language, like humor, irony, metaphor, and wordplay. Computers also aren't great at understanding and predicting real people who don't always act optimally, so social games could be more difficult, too, but AI systems are finding some success in bluffing games like online poker, so it's important to not underestimate them.
John Green-bot: All in.
J: Computers might also struggle with creativity or surprise, because there's not a really clear way to assign values to states. It's really difficult to assign a number to how creative something is, compared to saying 'go as far right as you can' in a Mario level or 'achieve a winning state' in a game of chess. So considering all of that, maybe games like charades will be pretty stacked for human victory. Or what about Pictionary? Maybe hide and seek? We'd love to hear in the comments what games you think are safe from AI, but in the next episode, which is another lab, we'll program an AI system to learn how to play an arcade game, and I'll beg John Green-bot for my poker money back.
See you then.
Crash Course is produced in association with PBS Digital Studios. If you want to help keep Crash Course free for everyone forever, you can join our community on Patreon, and if you want to learn more about the history of games, we have a whole series about that.