


When outbreaks happen, we need to be able to predict the course they’ll take in the future, but of course we can’t run experiments on real people to figure that out. Thankfully we can simulate outbreaks and use models to find out how different scenarios could play out! In this episode of Crash Course Outbreak Science, we’ll look at what models are, how they help predict the course of an outbreak, and how we can use them to manage real world outbreaks.

This episode of Crash Course Outbreak Science was produced by Complexly in partnership with Operation Outbreak and the Sabeti Lab at the Broad Institute of MIT and Harvard—with generous support from the Gordon and Betty Moore Foundation.



Thanks to the following patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:
DL Singfield, Jeremy Mysliwiec, Shannon McCone, Amelia Ryczek, Ken Davidian, Brian Zachariah, Stephen Akuffo, Toni Miles, Oscar Pinto-Reyes, Erin Nicole, Steve Segreto, Michael M. Varughese, Kyle & Katherine Callahan, Laurel A Stevens, Vincent, Michael Wang, Stacey Gillespie, Jaime Willis, Krystle Young, Michael Dowling, Alexis B, Rene Duedam, Burt Humburg, Aziz Y, DAVID MORTON HUDSON, Perry Joyce, Scott Harrison, Mark & Susan Billian, Junrong Eric Zhu, Alan Bridgeman, Rachel Creager, Jennifer Smith, Matt Curls, Tim Kwist, Jonathan Zbikowski, Jennifer Killen, Sarah & Nathan Catchings, Brandon Westmoreland, team dorsey, Trevin Beattie, Divonne Holmes à Court, Eric Koslow, Jennifer Dineen, Indika Siriwardena, Khaled El Shalakany, Jason Rostoker, Shawn Arnold, Siobhán, Ken Penttinen, Nathan Taylor, William McGraw, Andrei Krishkevich, ThatAmericanClare, Rizwan Kassim, Sam Ferguson, Alex Hackman, Jirat, Katie Dean, neil matatall, TheDaemonCatJr, Wai Jack Sin, Ian Dundore, Matthew, Justin, Jessica Wode, Mark, Caleb Weeks

When outbreaks happen, we have to consider some important questions:

How fast could the disease spread? How many people could get sick?

What would work best to stop the outbreak? We can’t run experiments on real  people to answer questions like these. But we can simulate outbreaks to see how different  scenarios would play out during an outbreak.

In other words, we can use a model. In this episode, we’ll have a look at what models  are, how they help predict an outbreak’s future, and their potential to help us  manage outbreaks in the real world. I’m Pardis Sabeti, and this is  Crash Course Outbreak Science! [Theme Music].

Before we get started, it’s worth mentioning that since outbreaks of infectious disease can be fatal, we’re going to be talking about deaths in this episode. We know that loss of life and the other impacts of outbreaks are the kind of tragedy that math and numbers don’t fully describe. While graphs and models can seem abstract (who wants to be reduced to a number?), real people, from the scientists who make models to the public health officials who use them, to the people saved by their predictions, are at the heart of the models we’ll be talking about this episode.

Models can help us make decisions that can save many lives in the real world. Generally speaking, a scientific model is a description of some features of the world and their relationship to one another.

As long as you have a good understanding of the  relationship between the things you’re studying, you can model pretty much anything,  from economies to ecosystems. To see this, let’s look at  something near to my heart: music! Besides studying outbreaks, I also write  and sing in the rock band Thousand Days.

If our band was going on tour, one of the things we’d want to know is how many tickets we would sell overall. And with some simple modelling, we could work out roughly what that number might be. First off, we’re gonna make an assumption, a simplification about the world that gets us into the right ballpark by smoothing over some of the details.

Let’s assume that the number of tickets  sold depends on how many gigs we play, and that at every gig a hundred people come along. So if we play five gigs, we sell five  hundred tickets for the whole tour. That sounds imprecise, but we can  always revisit that assumption later on.
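As a rough sketch, this first version of the ticket model fits in a few lines of Python. The names `ATTENDANCE_PER_GIG` and `tickets_sold` are made up for illustration; only the numbers come from the example above.

```python
# Assumption from the example: about a hundred people come to every gig.
ATTENDANCE_PER_GIG = 100


def tickets_sold(num_gigs: int) -> int:
    """Total tickets for the tour, assuming the same attendance at every gig."""
    return ATTENDANCE_PER_GIG * num_gigs


print(tickets_sold(5))  # 500
```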

Or, in this case, I might already have data  that tells me that about a hundred people always turn up to our gigs. Right now, our model is quantitative. Its features  describe the number of some particular thing, like the number of gigs and the  number of people who attend per gig.

But we might also be interested in  qualities or categories of some kind. For instance, we could make a more detailed  assumption that there are two kinds of gigs. There are weekday gigs, where only seventy  five people attend, and weekend gigs, where up to two hundred people might show up.

Our model now has a couple of different features. There are the number of gigs, the  category of gigs, weekday or weekend, and the number of tickets sold. All of these features are things  that vary depending on circumstances, so we call them variables!

In general, the goal of a model is to describe  how variables are related to each other and the values they have in certain scenarios, such as the five hundred tickets in our example. But to do this, our model has to  rely on numbers that don’t vary, which are the minimum and maximum number of  tickets sold at weekday and weekend gigs. These fixed numbers that we put into  the model are called parameters.

Parameters are usually numbers that come from  data, or our assumptions about the world. They help us understand the  relationship between variables, like how more gigs means we sell more tickets. In fact, the relationship between  variables is really an equation.

Equations capture relationships like these in a way that lets us plug numbers in, and get numbers out. The numerical relationships described by equations are helpful because they tell us the extent to which variables might affect one another. In our case, the model tells us that the number of tickets scales linearly with the number of small or large gigs.
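One way to write the weekday and weekend model as an equation is sketched below. The function and constant names are hypothetical, and because the example says weekend gigs draw "up to" two hundred people, this sketch gives an upper estimate rather than an exact count.

```python
# Parameters: fixed numbers taken from the example.
WEEKDAY_ATTENDANCE = 75        # people at a weekday gig
WEEKEND_MAX_ATTENDANCE = 200   # at most this many people at a weekend gig


def max_tickets_sold(weekday_gigs: int, weekend_gigs: int) -> int:
    """Upper estimate of tickets sold; the two gig counts are the variables."""
    return (WEEKDAY_ATTENDANCE * weekday_gigs
            + WEEKEND_MAX_ATTENDANCE * weekend_gigs)


print(max_tickets_sold(3, 2))  # 625
```

Doubling both gig counts doubles the result, which is what scaling linearly means here.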

So, if you put the number of ticket sales  and the number of gigs played on a graph, it would look like a straight line. But there are other kinds of  mathematical relationships too, with curvy lines when you plot them  on a graph, which we call non-linear. Non-linear models are a little trickier but  they let us study a broader range of events with complex behaviors, including outbreaks!

One of the models most commonly used for  studying outbreaks is the S-I-R model. The model gets its name  from three groups of people. The first is “susceptible”, the people who  haven’t yet caught the disease but could.

The second group is “infected”, the  group who currently have the disease and could spread it to others. And finally there’s the “removed” group, the people who have already had the  disease and have either recovered and gained immunity or, sadly, died of it. The SIR model describes the  relationship between these three groups and how their interactions affect the fraction of  people in each group at a given moment in time.

The proportions of a given population in each of these three groups are the three variables of interest in our model. The value of these variables matters to us since the number of people who do or don’t catch the disease is one of the most important outcomes of an outbreak! The basic SIR model makes some assumptions about how the fraction of people in each group changes over time, based on real-world observations.

Assumption One: when susceptible and infectious people interact, it can cause susceptible people to become infected. So, the number of susceptible people declines over time, proportional to how many people are in both of these groups. Like, if there’s a lot of susceptible and infectious people, more susceptible people will become infected, so the number of susceptible people declines more quickly.

Assumption Two: the number of people removed from the susceptible group is the same as the number of people added to the infectious group since they’re infected now.

And Assumption Three: after people are infected for a while, they move to the removed group, since they’ve recovered, or sadly, died, and can’t transmit the infection anymore. And, the decrease of the number of infectious people over time is equivalent to the increase in removed people.

If we consider what we know about outbreaks in real life, that all makes sense. Over some period of time, infected people do infect susceptible people who then become infected, and eventually, those infected people recover or die.
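The three assumptions above are often written as a set of equations like the ones below. This is the conventional textbook form, not something stated in the episode: the symbols β (a transmission rate) and γ (a removal rate) are assumed here, and S, I, and R stand for the fractions of the population that are susceptible, infected, and removed.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta S I \\
\frac{dI}{dt} &= \beta S I - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
```

The first line is Assumption One, the matching βSI term in the second line is Assumption Two, and the γI terms are Assumption Three. The three right-hand sides add up to zero, so the fractions always sum to one.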

The exact relationships in the model depend  on a parameter called the reproductive number, or “R” for short. This is the average number of susceptible  people one infected person can infect and it varies based on the pathogen. So if R were three, then a single infected person  infects three susceptible people on average.
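To get a feel for what a value like three means, here is a minimal sketch that counts expected new cases generation by generation. It assumes R stays constant and that susceptible people never run out, which only holds early in an outbreak; the function name is made up for illustration.

```python
def cases_per_generation(r: float, generations: int) -> list:
    """Expected new cases in each generation, starting from a single case.

    Assumes R is constant and susceptible people never run out.
    """
    return [r ** n for n in range(generations + 1)]


print(cases_per_generation(3, 3))  # [1, 3, 9, 27]
```

With a value below one, the same calculation shrinks instead of grows.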

R itself can change over time. We might start social distancing, for instance, or the disease might enter a  new susceptible population. So we often define R at the start of the outbreak, when nearly everyone is in the susceptible group, and we call it the basic  reproductive number or R naught.

Outbreaks happen when R  naught is greater than one, that is, when each infected person  can infect more than one person, making the number of infected  people increase over time. But since R can change, our model  variables S, I, and R can, too. To see how, let’s go to the Thought Bubble.

If we plot the values of our three variables over time, we can see the course an outbreak takes. On this graph, the horizontal axis follows time, while the vertical axis shows the proportion of people in each of the three groups. At the start of our model outbreak, we have a small proportion of people in the “infected” group and everyone else in the susceptible group. R naught is greater than one.

Then, we step ahead in time, and the proportions in each group change according to the model’s equations: some susceptible people become infected and some infected people become “removed”, by recovering or dying when enough time passes. At first, since the number of infected people is small, the change in this group is slow and only a few new people become infected. But as the number of infected people increases, its rate of change increases too and suddenly there’s a steep rise in the number of people becoming infected and a steep drop in the number of susceptible people.

But some people will also be recovering or dying, which means people will move over time from the infected to the removed group. During this entire time, R is greater than one. Eventually, at the peak of the outbreak, since there are fewer susceptible people to become infected, infections don’t happen as often, which creates a turning point.

At that moment, susceptible people become infected at exactly the same rate as infected people become removed. Then, more people will be recovering than becoming infected, and the number of infected people will begin to drop, since R becomes less than one. Slowly, the fraction of the population that’s infected tails off towards zero, and the fractions of susceptible and recovered people reach a steady state, marking the end of the outbreak.
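The walkthrough above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the R naught of 2.5, the five-day infectious period, the starting infected fraction, and the simple fixed-step update are all illustrative choices, not values from the episode.

```python
def simulate_sir(r0=2.5, infectious_days=5.0, i0=0.001, days=200, dt=0.1):
    """Step a basic SIR model forward in time with small fixed steps.

    Returns a list of (day, susceptible, infected, removed) fractions.
    All default values are hypothetical.
    """
    gamma = 1.0 / infectious_days   # removal rate per day
    beta = r0 * gamma               # transmission rate per day
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(0.0, s, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i * dt   # susceptible -> infected
        new_removals = gamma * i * dt        # infected -> removed
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((step * dt, s, i, r))
    return history


history = simulate_sir()
# The peak is the moment with the largest infected fraction.
peak_time, peak_s, peak_infected, _ = max(history, key=lambda row: row[2])
final_s = history[-1][1]
print(f"peak: {peak_infected:.1%} infected around day {peak_time:.0f}")
print(f"ever infected: {1 - final_s:.1%}")
```

Plotting the three columns of `history` against the day gives the kind of curves described in the Thought Bubble.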

Thanks Thought Bubble! There are a few useful observations  we can make about these results. First off, we can subtract the final fraction  of susceptible people from the whole, to find the total number of people who  became infected during the outbreak.
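For the basic SIR model there is also a standard shortcut to that final fraction, without simulating every time step: the final susceptible fraction s satisfies s = s0 · exp(−R0 · (1 − s)). The sketch below solves that relation by repeated substitution. The relation is a textbook result rather than something from the episode, the function name is made up, and the approach assumes R naught is greater than one.

```python
import math


def final_susceptible_fraction(r0, s0=1.0, tol=1e-12, max_iter=10_000):
    """Solve s = s0 * exp(-r0 * (1 - s)) by fixed-point iteration.

    Assumes a basic SIR model, r0 > 1, and nobody removed at the start.
    """
    s = 0.0  # start from "nobody left susceptible" and iterate upward
    for _ in range(max_iter):
        s_next = s0 * math.exp(-r0 * (1.0 - s))
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s


# Fraction of people ever infected for a hypothetical R naught of 2.
print(f"{1 - final_susceptible_fraction(2.0):.2f}")  # about 0.80
```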

We can also predict how long  the outbreak could last for, and at what point the number of infections might  peak and potentially strain the healthcare system. While this version of an SIR model  captures the basics of an outbreak, there are things we can do  to make it more accurate. For example, many infections have a  period where a person is infected, but still can’t transmit them to anyone else.

So we could include another group in our model  between susceptible and infected called “exposed”, which would make the duration  of the outbreak more realistic. There are other details we could include too, such as the possibility that some recovered  people could become infected again, if immunity isn’t guaranteed. We can reflect these details in the equations in  the model by adding new variables and parameters.

A broader understanding of other factors surrounding an outbreak, like these, helps improve our model’s ability to predict the course of an outbreak. Finally, there are other kinds of variables and parameters we need to include to help make decisions: the kind that capture our response. After all, the goal of the model is to tell us what could happen so we can act.

We may also include variables and parameters  that represent changes during the outbreak like implementing contact  tracing, or social distancing. These would decrease R since  infected people would, on average, go on to infect fewer susceptible people. And, as we mentioned, if we can  make R become less than one, an outbreak will begin to decline.
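Here is a minimal sketch of how an intervention could be added to a basic SIR simulation: the reproductive number switches from one value to a lower one on a chosen day. The numbers (2.5 before, 0.9 after, a switch on day 15, a five-day infectious period) are hypothetical, as are the names.

```python
def peak_infected(r_before, r_after, switch_day,
                  infectious_days=5.0, i0=0.001, days=300, dt=0.1):
    """Largest infected fraction when R changes on switch_day.

    r_before and r_after are the values R would have if everyone were
    susceptible; all defaults are hypothetical.
    """
    gamma = 1.0 / infectious_days
    s, i = 1.0 - i0, i0
    peak = i
    for step in range(int(round(days / dt))):
        t = step * dt
        r_now = r_before if t < switch_day else r_after
        beta = r_now * gamma                 # transmission rate right now
        new_infections = beta * s * i * dt
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        peak = max(peak, i)
    return peak


no_action = peak_infected(2.5, 2.5, switch_day=0)
distancing = peak_infected(2.5, 0.9, switch_day=15)
print(f"peak without intervention: {no_action:.1%}")
print(f"peak with distancing from day 15: {distancing:.1%}")
```

Running the function with different switch days or different values after the switch is one way to compare interventions.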

We can run our models with different  combinations of these interventions to forecast the effect they have on the  length and severity of the outbreak. But, at its heart, this model is only a simplified  description of the dynamics of an outbreak, so there are also some challenges  to using the SIR model. For starters, we need to be able to  determine parameters like R naught, which are calculated from real world data.

These might not be available during  the early stages of an outbreak. And even with data, we won’t have a totally  exact value for parameters like these. Instead, they’ll have some uncertainty, meaning there’s a range of possible numbers  that will be compatible with the data.

Since the inputs into our models are uncertain, naturally, their outputs will be uncertain too. That doesn’t mean the models aren’t helpful at all, but it means instead of getting a precise prediction like “the outbreak will last 53 days”, we’ll have estimates, like “the outbreak could last anywhere from 30 to 70 days.”

So models can help our decision making by forecasting which of our actions could have the greatest impact. The catch is our ability to predict these outcomes is limited by the extent to which we can accurately model the effects of the intervention.
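To illustrate how uncertain inputs lead to a range of outputs, the sketch below runs a basic SIR model for several plausible values of R naught and reports the spread in how long the outbreak lasts. The candidate values, the five-day infectious period, and the definition of "over" (the infected fraction falling back below its starting value) are all assumptions made for illustration.

```python
def outbreak_duration(r0, infectious_days=5.0, i0=0.001, dt=0.1,
                      max_days=2000):
    """Days until the infected fraction falls back below its starting value.

    Assumes a basic SIR model with r0 > 1; all defaults are hypothetical.
    """
    gamma = 1.0 / infectious_days
    beta = r0 * gamma
    s, i = 1.0 - i0, i0
    for step in range(1, int(max_days / dt) + 1):
        new_infections = beta * s * i * dt
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        if i < i0:
            return step * dt
    return float("inf")  # never dropped below the threshold in max_days


# A hypothetical range of R naught values compatible with the data.
plausible_r0 = [2.0, 2.5, 3.0]
durations = [outbreak_duration(r0) for r0 in plausible_r0]
print(f"outbreak could last roughly {min(durations):.0f} "
      f"to {max(durations):.0f} days")
```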

But the better we get at developing models, the more useful a model will be. Before we can add interventions into a model, we should understand what options are available to us when an outbreak happens, and how they might help us tackle it. So, in our next episode, we’ll look at how to intervene in an outbreak in the context of public health.

We at Crash Course and our  partners Operation Outbreak and the Sabeti Lab at the Broad  Institute at MIT and Harvard want to acknowledge the Indigenous people  native to the land we live and work on, and their traditional and ongoing  relationship with this land. We encourage you to learn about the  history of the place you call home through resources like and by engaging with your local  Indigenous and Aboriginal nations through the websites and resources they provide. Thanks for watching this episode  of Crash Course Outbreak Science, which was produced by Complexly in  partnership with Operation Outbreak and the Sabeti Lab at the Broad  Institute of MIT and Harvard— with generous support from the  Gordon and Betty Moore Foundation.

If you want to help keep Crash  Course free for everyone, forever, you can join our community on Patreon.