Today we’re going to talk about why many predictions fail - specifically we’ll take a look at the 2008 financial crisis, the 2016 U.S. presidential election, and earthquake prediction in general. From inaccurate or just too little data to biased models and polling errors, knowing when and why we make inaccurate predictions can help us make better ones in the future. And even knowing what we can’t predict can help us make better decisions too.

[Complexly theme]

Hi, I'm Adriene Hill, and welcome back to Crash Course Statistics. We've learned a lot about how statistics can help us understand the world better, make better decisions, and guess what'll happen in the future.

Prediction is a big part of how modern statistical analysis is used, and it's helped us make improvements to our lives big and small. But predictions are just educated guesses. We use the information that we have to build up a model of how the world works. A lot of the examples we talked about earlier in the series were making predictions about the present–things like, which coffee shop has better coffee, or how much does an increase in cigarette smoking decrease heart health?

But in this episode, we're gonna focus on using statistics to make predictions about the future. Like, who will win the next World Series, or what stock's gonna do well next month?

Looking back at times when we've failed to make accurate predictions can help us understand more about how to get it right, or whether we just don't have enough information. Today, we're gonna talk about three areas of prediction: markets, earthquakes, and elections. We'll look at why predicting these events can be tricky, and why we get it wrong.

[Crash Course theme]

Banks were influential in creating the perfect storm that led to the 2008 financial crisis. If you've seen "The Big Short" or read the book it's based on, you know that. You also know that Steve Carell should never go blonde again.

The financial crisis is really complicated, and we're gonna simplify it a lot here, but if you're interested, you can check out episode 12 of our Economics series. For now, we're gonna focus on two prediction issues related to the crisis: 1) Overestimating the independence of loan failures, and 2) Economists who didn't see the crisis coming.

So before the crisis, banks were giving out mortgages to pretty much anyone. Normally, banks and lenders in general are choosy about who they lend to. If you give someone a million-dollar loan, and they can't pay it back, you lose out. But banks weren't holding on to that debt; they were selling it to others.

They combined mortgages into groups and sold shares of the loans as mortgage backed securities. The banks knew that some people wouldn't pay their loan in full, but when the mortgages were packaged together, the risk was supposedly mitigated.

Say there's a 10% chance that each borrower will default on, or fail to repay, their loan. While not totally risky, it's not ideal for investors. But if you packaged even five similar loans together, the probability that all of them will default is now only 0.001%, because the probability of all of them failing–if each loan failing is independent of another loan failing–is 0.1 to the 5th power, or 0.00001.
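The arithmetic in that example can be checked with a minimal sketch. The function name `prob_all_default` is made up for illustration, and the calculation is only valid under the independence assumption the episode describes.

```python
def prob_all_default(p_default: float, n_loans: int) -> float:
    """Chance that every loan in a bundle defaults, ASSUMING each
    default is independent of the others (multiply the probabilities)."""
    return p_default ** n_loans


# Figures from the episode: 10% default chance per loan, 5 loans bundled.
p_all = prob_all_default(0.10, 5)

print(f"As a proportion: {p_all:.5f}")
print(f"As a percentage: {p_all * 100:.3f}%")
```

The result is 0.00001 as a proportion, which is the 0.001% quoted above.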

But we just made a prediction mistake. Many investors overestimated the independence of loan failures. They didn't take into account that if the then-overvalued housing market, and subsequently the economy, began to crumble, the probability of loans going unpaid would shoot way up.
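To see how much the independence assumption can matter, here is a rough sketch of a toy "shared shock" model. Everything in it is an assumption made for illustration, not a figure from the episode or from real mortgage data: a downturn happens with probability 0.10, each loan defaults with probability 0.55 during a downturn, and with probability 0.05 otherwise. Those numbers are chosen so that each individual loan still has a 10% overall chance of default.

```python
def marginal_default(p_shock: float, p_bad: float, p_calm: float) -> float:
    """Overall default chance for ONE loan, averaging over both scenarios."""
    return p_shock * p_bad + (1 - p_shock) * p_calm


def prob_all_default_shared_shock(
    n_loans: int, p_shock: float, p_bad: float, p_calm: float
) -> float:
    """Chance that ALL loans default when they share one economy.

    Within each scenario the loans are treated as independent, but the
    scenario itself (downturn or not) is common to every loan.
    """
    return p_shock * p_bad ** n_loans + (1 - p_shock) * p_calm ** n_loans


# Hypothetical parameters, picked so the per-loan default rate stays at 10%.
P_SHOCK, P_BAD, P_CALM = 0.10, 0.55, 0.05

per_loan = marginal_default(P_SHOCK, P_BAD, P_CALM)
correlated = prob_all_default_shared_shock(5, P_SHOCK, P_BAD, P_CALM)
independent = per_loan ** 5

print(f"Per-loan default chance:     {per_loan:.2%}")
print(f"All five default (shared):   {correlated:.4%}")
print(f"All five default (indep.):   {independent:.4%}")
print(f"Ratio shared / independent:  {correlated / independent:.0f}x")
```

With these made-up numbers, each loan looks exactly as risky on its own, yet the chance that all five fail together comes out at roughly 0.5% instead of 0.001%. The exact size of the gap depends entirely on the assumed parameters; the point is only that a shared cause breaks the "multiply the probabilities" shortcut.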

They also had bad estimates of just how risky some of these loans were. Families were losing their homes, and the unemployment rate in the US steadily increased from around 5% to as high as 10% in just a couple of years. There was a global recession that most economists' models hadn't predicted. And to this day, they're still debating exactly why.

Economist John T. Harvey claims: "Economics is skewed towards rewarding people for building complex mathematical models, not for explaining how the actual economy works." Others theorize that we need to focus more on people and their sometimes irrational behavior. Wharton finance professor Franklin Allen partly attributes our inability to predict the financial crisis to models that underplayed the impact of banks–the same banks that were involved in the lending practices that helped create, and then deflate, the housing bubble. He claims: "That's a large part of the issue. They simply didn't believe the banks were important."

But they were. Prediction depends a lot on whether or not you have enough data available, but it also depends on what your model deems as "important." You can collect a huge amount of data predicting the rates of diabetes in each country, but if your model only considers hair color, whether or not the person drives a hybrid, and the number of raccoons they think they can fight, it probably won't be a good model.

When we create a model to predict things, we're trying to use data, math, and statistics in order to approximate how the world works. We're never going to get it perfect, but if we include most of the important things, we can usually get pretty close.

Even if we can tell what features will be important, it might be hard to get enough data. Earthquakes are particularly difficult to predict. The United States Geological Survey even has a webpage dedicated to telling the public that currently, earthquakes just aren't predictable. Clusters of smaller earthquakes often happen before larger ones, but these pre-quakes aren't that helpful in predicting when a big earthquake will hit, because they're almost just as often followed by nothing.

In order to accurately predict an earthquake, you would need three pieces of information: its location, magnitude, and time. It can be relatively easy to get two out of three of those. For example, I predict that there will be an earthquake in the future in Los Angeles, California. And I'd be right, but unless I can also specify an exact time, no one's going to be handing me any honorary degrees in seismology.

We're not bad at earthquake forecasting even if we struggle with accurate earthquake prediction. Earthquake forecasting focuses on the probabilities of earthquakes, usually over longer periods of time. It can also help predict likely effects and damage. This forecasting work is incredibly important for mitigating the sometimes devastating effects of larger earthquakes. For example, scientists might look at the likelihood of severe earthquakes along the San Andreas fault. Their estimates can help inform building codes, disaster plans, and even earthquake insurance rates.
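One common way to turn a long-run average into a forecast probability is to treat quakes as a Poisson process. The sketch below does that and nothing more. The Poisson assumption is a simplification (it ignores clustering and aftershocks), and the rate used here, one large quake per 150 years on average, is a made-up number for illustration, not an estimate for the San Andreas or any real fault.

```python
import math


def prob_at_least_one(rate_per_year: float, years: float) -> float:
    """P(at least one event in `years`), ASSUMING events follow a Poisson
    process with a constant average rate: 1 - exp(-rate * years)."""
    return 1.0 - math.exp(-rate_per_year * years)


# Hypothetical long-run rate: one large quake every 150 years on average.
HYPOTHETICAL_RATE = 1 / 150

for window in (1, 10, 30, 100):
    p = prob_at_least_one(HYPOTHETICAL_RATE, window)
    print(f"Chance of at least one quake in {window:>3} years: {p:.1%}")
```

A forecast like this says how likely a quake is over a window of years, which is the kind of number building codes and insurance rates can use. It says nothing about the exact day one will strike, which is the prediction problem.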

And earthquakes are not without some kind of pattern. They do tend to occur in clusters, with aftershocks following quakes in a pretty predictable pattern. But in his book, "The Signal and the Noise," Nate Silver warns about looking so hard at the data that we see noise–random variation with no pattern–as signal. The causes of earthquakes are incredibly complex, and the truth is, we're not in a place where we can currently predict when, where, and how they'll occur.

Especially the larger, particularly destructive earthquakes. To predict a magnitude 9 earthquake, we'd need to look at data on other similar earthquakes. But there just isn't that much out there. Realistically, it could be centuries before we have enough to make solid predictions.

Even for more common magnitude earthquakes, it could take a lot of data before we have enough to see the pattern amidst all the randomness.