


Today we’re going to talk about ethical data collection. From the Tuskegee syphilis experiments and Henrietta Lacks’ HeLa cells to the horrifying experiments performed at Nazi concentration camps, many strides have been made from Institutional Review Boards (or IRBs) to the Nuremberg Code to guarantee voluntariness, informed consent, and beneficence in modern statistical gathering. But as we’ll discuss, with the complexities of research in the digital age many new ethical questions arise.

Crash Course is on Patreon! You can support us directly by signing up at

Thanks to the following Patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:

Mark Brouwer, Glenn Elliott, Justin Zingsheim, Jessica Wode, Eric Prestemon, Kathrin Benoit, Tom Trval, Jason Saslow, Nathan Taylor, Divonne Holmes à Court, Brian Thomas Gossett, Khaled El Shalakany, Indika Siriwardena, Robert Kunz, SR Foxley, Sam Ferguson, Yasenia Cruz, Eric Koslow, Caleb Weeks, Tim Curwick, Evren Türkmenoğlu, Alexander Tamas, D.A. Noe, Shawn Arnold, mark austin, Ruth Perez, Malcolm Callis, Ken Penttinen, Advait Shinde, Cody Carpenter, Annamaria Herrera, William McGraw, Bader AlGhamdi, Vaso, Melissa Briski, Joey Quek, Andrei Krishkevich, Rachel Bright, Alex S, Mayumi Maeda, Kathy & Tim Philip, Montather, Jirat, Eric Kitchen, Moritz Schmidt, Ian Dundore, Chris Peters, Sandra Aft, Steve Marshall

Hi, I'm Adriene Hill, and welcome back to Crash Course Statistics.

Today we're going to take a step back from sampling and regressions to talk about the impact of all that statistical gathering. We've seen that the interpretation of this information can have real, lasting effects on our society, but its collection can also have lasting effects on the subjects.

The process of gathering and applying statistics can affect real people's lives, which means there's a responsibility to gather and use this data ethically. Today we're gonna discuss five stories. Four of them are real and all of them can help us learn where collecting data can go wrong and how we can help prevent these things from happening again.

Our first story begins in 1822, when a young fur trapper named Alexis St. Martin got shot in the stomach when another trapper's gun accidentally went off.

The wound was serious, but a local army doctor, William Beaumont, was able to stabilize St. Martin through a series of presumably painful, anesthetic-free surgeries. But Dr. Beaumont couldn't close the wound, which left a small hole called a gastric fistula that allowed access to the stomach. St. Martin was out of a job, since it's hard to be an active fur trapper with a hole in your stomach.

So he signed a contract to become a servant to Dr. Beaumont. In addition to traditional chores, St. Martin participated in all sorts of experiments at the whim of the doctor. Beaumont used the gastric fistula to study how the body digested food. He made huge strides in the field, including exploring the influence of mental disturbance on the process of digestion and correcting the long-held belief that the stomach digested food by grinding it up.

When, in 1838, the two finally parted ways, Beaumont spent the last 15 years of his life pleading with St. Martin to come back. Maybe unsurprisingly, St. Martin declined. Without this strange situation, the field of gastroenterology may have progressed more slowly. In fact, St. Martin's fistula was an inspiration to Pavlov, who used fistulas in dogs during his famous classical conditioning experiments. But all this progress came at a cost to St. Martin and also to those dogs.

One of the most important ethical considerations in research is whether humans who participate are able to feasibly say no.  People with little power, resources, or money can be coerced into participating in experiments that they're uncomfortable with.  Most research institutions have a committee called the Institutional Review Board, or IRB, which oversees all the research at that institution to make sure that it's ethical.  Voluntariness is one of the most important things that they check for.  This prohibits people with undue power or influence over us from asking that we participate in a research study.  

For example, your boss or professor is limited in how they can ask you to participate in a research study, because you might feel that you have no choice, that you have to participate, otherwise they might fire you or give you a failing grade. Ethical research needs to be voluntary, at least in humans.

Animal rights activists argue that since animals cannot volunteer for a study, we shouldn't use them.

In addition to their voluntary participation, subjects should also know what will happen to them during the study. This was not the case in 1932, when the Tuskegee Institute began a 40-year-long study on over 600 black men. Under the guise of free medical care, the men were secretly enrolled in a study to observe the long-term progression of syphilis. Over 300 of the men enrolled had the disease, but researchers failed to treat them with anything but fake or innocuous medicines like aspirin, even after it became clear that penicillin was a highly effective treatment for the disease. Late-stage symptoms of syphilis include serious neurological and cardiovascular problems, yet the institute allowed the study to go on. Some wives and kids also contracted syphilis. In 1972, public outrage caused the study to close down, when news of unethical conditions was leaked to the media.

In 1951, at the same time the Tuskegee study was running, a poor tobacco farmer named Henrietta Lacks went to Johns Hopkins Hospital in Maryland and had cells from a tumor collected without her knowledge or consent. These cells were used to grow a new cell line, called the HeLa line, which scientists used to do in vitro experiments. The cells' ability to thrive and multiply outside her body made the cell line useful to researchers. It's still used today for medical research, lending itself to cancer and AIDS research as well as immunology studies like the one that led Jonas Salk to discover the polio vaccine, and in 1955, HeLa cells were the first human cells to be successfully cloned. Over time, the cell line and the discoveries it facilitated became extremely lucrative for researchers, but Lacks and her family didn't receive any financial benefit.

These studies emphasize the need for informed consent.  Subjects have the right to not only receive all the facts relevant to their decision to participate, they have the right to understand them.  Many institutions require that information must be presented clearly and in a way that's appropriate for the subject's comprehension level.  Even children whose parents are legally allowed to consent for them must get an age appropriate explanation of what will happen in the study.  This is incredibly important, because it respects the dignity and autonomy of the subject, allowing them to stop research procedures at any time.  That incentivizes researchers to design studies with more acceptable levels of risk.  

In all three of those stories, the research procedures didn't have any benefit to the patients. In 1947, the Nuremberg Code was created in order to establish guidelines for the ethical treatment of human subjects. One of the main tenets is beneficence, which not only requires that researchers minimize the potential risk to subjects but also requires that the risk should be outweighed by potential benefits to the patient and the scientific community.

The Nuremberg Code was created and implemented after the Second World War, during which horrifying experiments were conducted on prisoners in Nazi concentration camps. The Nuremberg Code lays out 10 principles to which modern-day studies still must adhere. These 10 principles stand as the basis for much of current research ethics and include things like voluntariness and informed consent and beneficence, but as we settle into the age of technology, the application of these ethical principles can become more cloudy.

Our last story here isn't real, but it illustrates the complexities of research ethics in the digital age.  In the 7th season of the hit show, Parks and Recreation, a giant internet corporation comes to the small town of Pawnee, Indiana to offer free WiFi to the entire city.  Everyone gladly accepts.  They like the free service.  But when boxes of personalized gifts arrive at every citizen's doorstep, some become a little concerned, because the gifts are perfect, fitting the exact interests of the recipient.  Someone who collects stuffed pigs dressed as celebrities gets Hamuel L. Jackson, and someone obsessed with politics gets the newest Joe Biden poetry collection.  These boxes are perfect for the people who receive them, eerily perfect.  So how did the internet company know what each person would want?  Well, in the show, it turns out that the free WiFi came with a pretty high cost: privacy.  In exchange for the free WiFi, the internet company, Gryzzl, was collecting all data that was transferred over the network.  This gets called data mining, and it may seem far-fetched, but it's happening right now.  Not the gift stuff, the data mining.  

Grocery stores track what we buy with our rewards cards, Netflix keeps track of everything we watch, Amazon knows exactly what we buy, what we look at, and those Terms of Service agreements we click on without reading them when we download an app or sign up for a social media account? They often include some kind of stipulation. When we use free internet services, we agree to pay not with money, but usually with our information. Facebook and Google offer their services for free in part because they are profiting off of our data. They might be using it for research or to customize our experience on the site so that maybe we buy or watch more stuff on Amazon and YouTube. They also use it to sell targeted ads, giving advertisers the opportunity to select exactly the type of people who are gonna see their ads, and sometimes the way that these ads are targeted can be pretty unethical. For example, companies discriminating based on age by specifying that job ads should only be shown to young people. Data is being used in ways that affect every facet of our lives, but since we're still in the beginning stages of this huge influx of digital information, we get to see the progression of ethics in this area unfold right in front of us. The laws that will protect your data and privacy and mine, like the Nuremberg Code protects participants in scientific experiments, are still being written, and many of the same concepts are coming up.

For example, using the internet, using Google, social media, have become so entrenched in some societies that it's almost impossible to hold a job without them, and if that's the case, we need to ask whether it's ethical to require that users sign over their right to privacy in order to use them, or, like in most clinical studies, does that border on coercion? We also need to ask whether companies that use or sell our information should be held to the standard of informed consent, which requires agreements to be in language that's simple enough for the user to understand what they're agreeing to, even if they don't have a law degree, or, on the other hand, whether companies should be exempt from this requirement if they only use the data internally. It's possible to draw parallels between data mining and the stories we talked about at the beginning of the episode, though admittedly, it's not quite as harrowing.

Like Alexis St. Martin may have felt pressure to stay with Dr. Beaumont because he couldn't work as a fur trapper anymore, it can be argued, to a much lesser degree, that we use sites like Google or Twitter because we feel there's no other option, as we try to remain informed in our hyperconnected world, and we might not be getting all the information we need to consent in an understandable way, similar to how Henrietta Lacks was not informed why her cells were being taken or what they'd be used for. These situations are obviously not exactly the same and we, as a society, need to decide how to apply the principles of research ethics in these new digital spaces. As we move forward and gain the ability to do things like sequence an entire genome in days rather than years, we open the door for amazing advances in personalized medicine that could save millions of lives, but we also open the door for abuse of this sensitive information. The conversation about how to handle these types of situations is still going on. We're the ones who will decide what is said and we're gonna be the subject of those decisions.

Thanks for watching, I'll see you next time.

Crash Course Statistics is filmed in the Chad and Stacey Emigholz Studio in Indianapolis, Indiana and it's made with the help of all these nice people. Our animation team is Thought Cafe. If you'd like to keep Crash Course free for everyone forever, you can support the series at Patreon, a crowdfunding platform that allows you to support the content you love. Thank you to all our Patrons for your continued support. Crash Course is a production of Complexly. If you like content designed to get you thinking, check out some of our other channels at  Thanks for watching.