When you have a study with a small sample size, how do you know that the results represent the broader population? Well, thanks to a brewer who needed to assess beer quality in the early 1900s, we now have a simple statistical test that lets us do just that!

SciShow is supported by Brilliant, which is offering 20% off an annual Premium subscription.

Hosted by: Michael Aranda

SciShow has a spinoff podcast! It's called SciShow Tangents.

Support SciShow by becoming a patron on Patreon:
Huge thanks go to the following Patreon supporters for helping us keep SciShow free for everyone forever:

Alisa Sherbow, Silas Emrys, Drew Hart, Jeffrey Mckishen, James Knight, Christoph Schwanke, Jacob, Matt Curls, Christopher R Boucher, Eric Jensen, Adam Brainard, Nazara, GrowingViolet, Ash, Sam Lutfi, Piya Shedden, KatieMarie Magnone, charles george, Alex Hackman, Chris Peters, Kevin Bealer, Jason A Saslow

Thanks to Brilliant for supporting this episode of SciShow.

Go to to learn how you can take your STEM skills to the next level!

[♪ INTRO]

One of the trickiest issues in science has always been small studies.

Like, how much do the results of small studies actually tell you about a broader population? And how much of their results are just random noise? Fortunately, these days, we do have a way to gauge how much we can trust small studies.

And it’s all thanks to a beer brewer working at Guinness Brewery in the early 1900s. Today, a simple statistical test that was invented to assess beer quality is one of the most important tests used in biology, medicine, and some other scientific fields. The guy at the heart of this story is William Sealy Gosset, a British chemist and mathematician born in 1876.

Gosset had enjoyed experimenting and inventing things from a young age, and he studied math and chemistry at both Winchester College and Oxford. Then, when he was just 23, he got a job at Guinness as a brewer (in other words, a beer chemist). Back when Gosset was hired, Guinness was still a pretty small operation, and their brewing process wasn’t much of a science.

Basically, they would mix some barley or other grain with flavored water and let the yeast do its thing. The solution would ferment, and in just over a week, voilà! Beer.

Along the way, brewers would sample and sniff their products… and that was how they kept the quality consistent. But now Guinness was looking to start brewing beer on an industrial scale. So sniff and taste tests weren’t going to cut it anymore.

It just wasn’t practical to have someone sampling that much stuff every step of the way. But they needed some way to guarantee that the quality of their beer was still good—and consistent. Without wasting too much time or money.

That was Gosset’s challenge. For instance, one of his specific projects was to compare the sugar content in the barley malt extract from different batches. This sugar is what feeds the yeast, so any differences can change the outcome of the beer.

But Gosset could only use a small set of samples to make this comparison. And he knew that any difference between the sets could possibly mean one of two things. It could mean the two batches had different concentrations of sugar…

But it could also just mean that at least one of the samples had a different average concentration than the batch it came from. Maybe the batch wasn’t uniformly mixed, or something fluke-y like that. Basically, he was running into the same problem scientists have with any small study:

It’s easy to randomly select samples that don’t really represent the whole batch. It’s like if you wanted to estimate the average cost of a home in a certain town, but you only sampled the price of ten houses. That wouldn't necessarily tell you much of anything about the town as a whole.

So, Gosset realized that his key challenge was how to figure out whether or not a given sample set made a reliable proxy for the whole batch. And so far, statisticians hadn’t bothered to investigate small samples, since they weren’t considered useful for analyses. So this was uncharted territory.

But fortunately, Gosset’s stats background came in handy here, and he decided to figure it out himself. He compared the average concentration of small sets of samples to the averages you get from much bigger sample sizes (specifically, the classic bell curve called a normal distribution). And he found that the smaller the sample size, the more its mean concentration could differ from that of a large batch.
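
To get a feel for that last point, here is a minimal simulation sketch in Python. It is an illustration only, not Gosset's actual procedure: the "batch" is assumed to be normally distributed with a made-up mean of 10 and standard deviation of 2, and `spread_of_sample_means` is a hypothetical helper name. It draws many small and many large samples and measures how much their averages bounce around.

```python
import random
import statistics

def spread_of_sample_means(sample_size, trials=2000, seed=0):
    """Standard deviation of the sample mean across many simulated samples.

    The population (mean 10.0, standard deviation 2.0) is an invented
    stand-in for sugar concentration in a batch, not real data.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(10.0, 2.0) for _ in range(sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)

small = spread_of_sample_means(4)    # tiny samples, like a handful of measurements
large = spread_of_sample_means(100)  # much bigger samples

print(f"spread of sample means, n=4:   {small:.3f}")
print(f"spread of sample means, n=100: {large:.3f}")
```

Under these assumptions, the averages from samples of 4 should scatter roughly five times as widely as the averages from samples of 100, since the spread shrinks with the square root of the sample size.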

So, to deal with this situation, Gosset developed the concept of a t-distribution. A t-distribution looks a lot like your classic bell curve, which is used to depict a range of probabilities. Like, say you’re trying to get the distribution of test scores in a class.

You’ve got the average score at the top of the bell, and the curve of the bell shows how much variability there is. The difference is that while a normal bell curve quickly trails off to zero on either end, a t-distribution doesn’t. Instead, its ends slowly taper off, with long tails that represent the greater amount of noise, or unreliability, that’s inherent in a small sample.
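
The heavier tails can be checked numerically. The sketch below uses only Python's standard library and the textbook density formulas for the standard normal and the t-distribution. The function names and the choice of 3 and 30 degrees of freedom are assumptions made for illustration.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom at x."""
    # lgamma avoids overflow for large df
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

# Compare the curves at the center and further out in the tail.
for x in (0.0, 2.0, 4.0):
    print(f"x={x}: normal={normal_pdf(x):.5f}  "
          f"t(df=3)={t_pdf(x, 3):.5f}  t(df=30)={t_pdf(x, 30):.5f}")
```

Far from the center, the t-distribution with only 3 degrees of freedom keeps far more probability than the normal curve does, while with 30 degrees of freedom the two are already close. That is the sense in which a bigger sample makes the extra tail weight fade away.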

Gosset’s t-distribution related the size of a set of samples to how much variability it was likely to have. And it came with a cutoff: a critical value. If the difference in concentrations between two sample sets was significantly bigger than this value, he could be pretty sure that the difference really existed in the larger batches he was actually trying to compare.

If the difference was smaller, there was a good chance that the batches were similar enough to be considered the same. He called this comparison a t-test. And the approach worked!
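
Here is a rough sketch of what that comparison can look like in Python. It uses the pooled two-sample form of the t statistic found in modern textbooks, which is an assumption and not necessarily the exact calculation Gosset ran. The sugar readings are invented, `pooled_t_statistic` is a hypothetical helper name, and the critical value 2.306 is the standard two-sided 5% cutoff for 8 degrees of freedom (5 + 5 - 2).

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Two-sample t statistic, assuming both groups share the same variance."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / standard_error

# Made-up sugar readings from two batches, five samples each.
batch_a = [12.1, 11.8, 12.4, 12.0, 12.2]
batch_b = [11.2, 11.5, 11.0, 11.4, 11.3]

# Two-sided 5% critical value for 8 degrees of freedom.
CRITICAL_T = 2.306

t_stat = pooled_t_statistic(batch_a, batch_b)
differs = abs(t_stat) > CRITICAL_T

print(f"t = {t_stat:.2f}, critical value = {CRITICAL_T}")
print("batches likely differ" if differs else "no clear difference")
```

With these invented numbers the statistic lands well past the cutoff, so the difference would be treated as real. Readings that overlap more would give a smaller statistic and no clear verdict.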

The development of the t-test made it possible for Guinness to industrialize with confidence and start brewing the massive amounts of beer it puts out today. And more than a hundred years later, the t-test isn’t just for beer. It’s been adopted by all kinds of scientists needing to interpret their experimental results.

Like, whether you need to compare the concentration of sodium in patients’ blood samples, or test which variety of crop yields the starchiest wheat, the t-test is your go-to. Today, it’s known as the Student’s t-test, not because it’s the bane of stats learners everywhere, but because Guinness would only allow Gosset to publish the results under a pen name. He seems to have pulled the name “Student” from a brand stamped on his lab notebook!

But whether or not we know Gosset’s name, we’ve probably all been touched by science that his test made possible and reliable, so we can raise a glass to that! We hope we’ve whetted your appetite to learn more about stats, because today’s episode is brought to you by Brilliant. And they’ve got loads of courses that will teach you all about stats.

Like Random Variables and Distributions, which will help you understand why Gosset had to invent a whole test just for small sample sizes. But you don’t have to stop there, because Brilliant has courses in basic science, engineering, and computer science too. If you’re interested, you can head to, where you can sign up and get 20% off an annual Premium subscription.

And thanks for checking them out!

[♪ OUTRO]