There are lots of different ways to encrypt a message, from early, simple ciphers to the famous Enigma machine. But it’s tough to make a code truly unbreakable.

Hosted by: Michael Aranda
Dooblydoo thanks go to the following Patreon supporters -- we couldn't make SciShow without them! Shout out to Justin Ove, John Szymakowski, Fatima Iqbal, Justin Lentz, David Campos, and Chris Peters.

(SciShow Intro plays)

Michael: This probably looks like gibberish to you and it should because it's a cryptogram, a message in code. But if I told you that all I did was shift every letter in the sentence to the next one in the alphabet, then you'd know that it translates to this.

To encrypt a message, you need 2 main parts: the cipher and the key. The cipher is the set of rules that you're using to encode the information, for example, shifting the alphabet by a certain number of letters.

The key tells you how to arrange those rules, otherwise they'd be the same every time and it would be easy to decode the message. In this case, the key would be one because we shifted the alphabet by one letter. To decrypt the information you need to know what kind of cipher was used and also have the key. Or you can just crack the code, either by trying all possible combinations you can think of or by analyzing the code and working backward from it, known as deciphering.

But is it possible to come up with a combination of a cipher and key that could never be determined? Is there such a thing as an unbreakable code? Well, people keep coming up with new and better ciphers, but it's hard to make them unbreakable. No matter what, you're using a set of rules to encrypt the information, and with enough time and enough data, someone can usually uncover those rules.

That puzzle I just gave you is one of the oldest and simplest ways to encrypt a message. It's usually called a Caesar cipher, and in this case the key was just a number representing how many letters I shifted the alphabet by. But it's also very easy to crack. Even if you didn't know the key, it would take you at most 25 tries to decode the message because you know the whole alphabet has to be shifted by a certain amount. Since there are 26 letters in the alphabet, there are only 25 other options.
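Here's a minimal sketch of that idea in Python. The function names `caesar_shift` and `brute_force` and the sample message are made up for illustration; the only thing taken from the episode is the rule itself: shift every letter by the key, and crack it by trying all 25 other shifts.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar_shift(text, key):
    """Shift each letter forward by `key` places; leave spaces and punctuation alone."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)

def brute_force(ciphertext):
    """Undo every possible shift from 1 to 25 and return all candidate messages."""
    return [(key, caesar_shift(ciphertext, -key)) for key in range(1, 26)]

secret = caesar_shift("HELLO WORLD", 1)
print(secret)  # IFMMP XPSME

# Only one of the 25 candidates will read as English.
for key, guess in brute_force(secret):
    print(key, guess)
```

A person (or a dictionary check) still has to pick the readable candidate out of the 25, but with so few options that takes seconds.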

A Caesar cipher is one simple type of mono-alphabetic cipher - a class of ciphers where the whole code is based on one letter of the alphabet standing in for another letter consistently throughout the whole message. Basically you just scramble the alphabet. In that case, the key would just be a list of which letters correspond to which, like this one.

There are over 400 septillion possible ways to encrypt this kind of message, so you'd think it would be hard to crack, and it is, but only a little bit because there are lots of ways to decode messages. 
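That "400 septillion" figure is just the number of ways to arrange 26 letters, which is 26 factorial. The sketch below checks the number and builds one scrambled-alphabet key. The helper names `make_key` and `substitute`, the seed, and the sample message are assumptions for illustration only.

```python
import math
import random
import string

ALPHABET = string.ascii_uppercase

# The number of ways to arrange 26 letters is 26 factorial.
print(math.factorial(26))  # 403291461126605635584000000

def make_key(seed):
    """Build one scrambled alphabet: a table mapping each letter to its stand-in."""
    rng = random.Random(seed)  # seeded only so the demo is repeatable
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))

def substitute(text, key):
    """Swap every letter for its stand-in; leave other characters alone."""
    return "".join(key.get(ch, ch) for ch in text.upper())

key = make_key(seed=42)
inverse = {cipher: plain for plain, cipher in key.items()}

secret = substitute("ATTACK AT DAWN", key)
print(secret)
print(substitute(secret, inverse))  # ATTACK AT DAWN
```

Decoding is the same table read backwards, which is why anyone who recovers the table can read everything.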

Just trying all of the possible keys to a code is probably the most obvious and least subtle, and it has an equally unsubtle name - brute force. But you can try a more sophisticated technique, something called frequency analysis which is based on the idea that every language has its own specific patterns. In English for example, the letter 'E' shows up a lot. I used it 7 times in just the last sentence, and some words like 'the' are so common that it's hard to even use a sentence without them. Cryptographers call these words 'cribs'. 

So frequency analysis looks for common words and also common letters or sets of letters like 'ed' or 'ing' at the end of words. If you find that the letter 'x' is showing up a lot in a message and so is the 3-letter word 'irx', you might guess that in the key 'x' corresponds to the letter 'e' and 'irx' spells 'the'. And once you've figured out those letters, you can figure out the rest by recognizing other words and using the process of elimination. And since longer cryptograms contain more clues, they're easier to crack.
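The counting step of frequency analysis is easy to sketch in Python. The function names and the tiny sample cryptogram below are invented for illustration; a real analysis would need a much longer message and would compare the counts against known English letter frequencies.

```python
from collections import Counter

def letter_frequencies(ciphertext):
    """Count how often each letter appears, most common first."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    return Counter(letters).most_common()

def common_words(ciphertext, length=3):
    """Count the words of a given length, most common first."""
    words = [w for w in ciphertext.upper().split() if len(w) == length and w.isalpha()]
    return Counter(words).most_common()

# Made-up cryptogram in which X stands for E and IRX stands for THE.
secret = "IRX QMI LXXL IRX DXX"

print(letter_frequencies(secret)[0])  # ('X', 6)
print(common_words(secret)[0])        # ('IRX', 2)
```

Seeing `X` at the top of the letter count and `IRX` at the top of the 3-letter words is exactly the kind of clue that suggests `X` is `E` and `IRX` is `THE`.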

So mono-alphabetic ciphers are fun, but they're not hard to break. If you want to get a little fancier with your encryption, you can use poly-alphabetic ciphers instead. They're much more effective. In a poly-alphabetic cipher, the way you scramble the alphabet actually changes throughout the message. In the first word, 's' might translate to 'w', but in the last word, 's' might translate to 'h'. It all depends on the particular encryption method you're using, and on your key.

One of the earliest poly-alphabetic ciphers was the Vigenère cipher. Developed in the 16th century, it was pretty simple because the key was just a word. So let's say you want to encrypt "SciShow is the greatest" using a Vigenère cipher. Well, the first thing you need to do is write out a Vigenère square. The alphabet goes across the top and along the left side, and each row contains the letters A to Z shifted over by one more than the row above it. So the first line starts with A and ends with Z, but the second starts with B, goes all through the letters until Z and then ends with A, and so on. You end up with 26 differently scrambled alphabets and now you're ready to encode the message. You just have to pick a key.

Let's just say your key is 'Michael'. You write out your key multiple times until it fills the same number of letters as your message, so "SciShow is the greatest" would correspond to this. Then to encrypt it, you take each letter of the message and move along its row in the Vigenère square until you get to the column of the corresponding letter in Michael. So "SciShow is the greatest" turns into this.
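Looking up a row and column in the square works out to adding the two letters' positions in the alphabet and wrapping around at Z, so the whole cipher fits in a few lines of Python. This is a sketch under two assumptions that the episode doesn't spell out: the key only advances on letters (spaces are skipped), and everything is uppercased. The function name `vigenere` is made up for illustration.

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere(text, key, decrypt=False):
    """Shift each letter by the matching key letter (A = 0, B = 1, and so on)."""
    key = key.upper()
    out = []
    used = 0  # how many key letters have been consumed so far
    for ch in text.upper():
        if ch in ALPHABET:
            shift = ALPHABET.index(key[used % len(key)])
            if decrypt:
                shift = -shift
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
            used += 1
        else:
            out.append(ch)  # spaces pass through and do not use up a key letter
    return "".join(out)

secret = vigenere("SciShow is the greatest", "Michael")
print(secret)                                    # EKKZHSH UA VOE KCQIVLSX
print(vigenere(secret, "Michael", decrypt=True)) # SCISHOW IS THE GREATEST
```

Notice that the two S's in "SciShow" come out as different letters (E and Z), which is what throws off simple letter counting.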

That's much tougher to decode unless you have the key because those letter frequencies are all different now. Since the key word 'Michael' is 7 letters long, each letter of your message is encrypted using one of 7 different scrambled alphabets. But, if your text is long enough, it's still crackable using a type of frequency analysis developed in the 19th century by cryptographer Charles Babbage.

Babbage realized that in a long enough message, some patterns in the coded message will still show up. Like if your key only has 7 letters, that means that there are only 7 ways to encrypt the word 'the'. But if your message uses the word 'the' 8 times, there are definitely going to be repeats. So he just counted how many letters separated those repeated patterns. If they were separated by 7, 14, or 21 letters, he knew that the key was probably 7 letters long. And from there, he would just use frequency analysis to figure out the 7 scrambled alphabets. 
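The counting part of that method can be sketched in Python: find every repeated chunk of letters, measure the gaps between repeats, and see which key length divides the most gaps. The function names, the 3-letter chunk size, and the cutoff of 20 for the longest key considered are all assumptions for illustration, not details from the episode.

```python
from collections import Counter, defaultdict

def repeat_distances(ciphertext, length=3):
    """Return the gaps between repeated letter sequences of the given length."""
    letters = "".join(ch for ch in ciphertext.upper() if ch.isalpha())
    positions = defaultdict(list)
    for i in range(len(letters) - length + 1):
        positions[letters[i:i + length]].append(i)
    distances = []
    for spots in positions.values():
        distances.extend(b - a for a, b in zip(spots, spots[1:]))
    return distances

def likely_key_lengths(distances, max_len=20):
    """Rank candidate key lengths by how many gaps each one divides evenly."""
    votes = Counter()
    for d in distances:
        for n in range(2, max_len + 1):
            if d % n == 0:
                votes[n] += 1
    return votes.most_common()

# Gaps of 7, 14 and 21 letters all point to a 7-letter key.
print(likely_key_lengths([7, 14, 21])[0])  # (7, 3)
```

Once the key length is known, every 7th letter was shifted by the same amount, so each of those 7 groups can be attacked with ordinary frequency analysis.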

Babbage's method is just one example of why it's so hard to create an unbreakable cipher. Your key creates a pattern within the encrypted message and with enough work, a spy can uncover that pattern.

It turns out that the only way to have a really unbreakable cipher is to use what's known as a one-time pad encryption, which uses a key that is as long as the message itself. That way, there aren't any patterns in the encrypted text. There's nothing to analyze, so there's no way to work backwards. The sender and the recipient both have the same "pad" and each sheet contains a long set of random letters, which is used as the key. Once a sheet is used to decode a message, you destroy it. Then you use the next sheet for the next message so you never repeat a key. As long as you keep the pad safe, the message can't be decrypted by anyone else.
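A one-time pad can be sketched as the same letter-shifting idea, except the key is a fresh sheet of random letters at least as long as the message. In the Python sketch below, the function names are hypothetical, the message is reduced to letters only, and Python's `secrets` module stands in for whatever source of randomness a real pad would use.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase

def new_sheet(length):
    """Make one sheet of the pad: `length` random letters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def _combine(text, sheet, sign):
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    if len(sheet) < len(letters):
        raise ValueError("the sheet must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + sign * ALPHABET.index(k)) % 26]
        for ch, k in zip(letters, sheet)
    )

def pad_encrypt(message, sheet):
    """Shift each message letter by the matching letter on the sheet."""
    return _combine(message, sheet, +1)

def pad_decrypt(ciphertext, sheet):
    """Undo the shift using the same sheet."""
    return _combine(ciphertext, sheet, -1)

sheet = new_sheet(20)          # both sides hold a copy of this sheet
secret = pad_encrypt("MEET AT NOON", sheet)
print(secret)
print(pad_decrypt(secret, sheet))  # MEETATNOON
# After this message, the sheet is destroyed and never used again.
```

Because no part of the key ever repeats, there are no repeated gaps or skewed letter counts for an eavesdropper to work with.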

But you can't always use one-time pad encryption. Let's say you needed to get a message to someone halfway across the world whom you'd never met. You wouldn't have a chance to give them a matching pad. In warfare, that sort of situation comes up a lot, which is why in the 20th century there was suddenly plenty of incentive to come up with better ciphers.

Remote communication, using technology like the telegraph, was incredibly valuable during wartime, but it was essential that only your allies understood what you were saying. The Germans experimented with a new, more complicated mono-alphabetic cipher during World War I, but eventually, the French managed to crack it. Then, during World War II, the Germans again came up with a new cipher, and this time its security seemed perfect.

Maybe you've heard of it: the Enigma machine. The machine used a poly-alphabetic cipher that scrambled the alphabet in a different way each time you typed a new letter. As far as the Germans knew, the only way to decipher the message was to have your own Enigma machine and set it up using a secret key that changed every day.

The machine was meant to work like a one-time pad in the sense that the alphabet was re-scrambled for every letter of the message, but instead of having to distribute a set of sheets to everyone, you could just use a key that told users how to set up their Enigma machines, and you could change that key as often as you wanted.

But it had a few flaws. For example, no letter could be encoded as itself. That might not sound like a big deal, but it ended up being a fatal weakness. British mathematician Alan Turing, along with the rest of his team, designed a machine of their own that could crack the Enigma code, as long as they knew around 20 of the characters contained in the message.

Which they often did because some phrases tended to show up a lot in Nazi communiques. Especially nice things about Hitler. So part of the strategy of Turing's team was to look for cribs, those common words and phrases, and see where they might fit. For instance, if they knew a message contained the word "führer," they could look for places in the text that didn't have the letter f, since they knew that the f in führer couldn't be encoded as itself.
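That "no letter encodes as itself" rule can be turned into a simple filter. The Python sketch below slides a crib along an intercepted message and keeps only the positions where no letter lines up with itself. The function name, the made-up intercept, and the simplification of writing the crib as plain "FUHRER" are all assumptions for illustration. This only narrows down where the crib could sit; it is not the machine Turing's team built.

```python
def possible_crib_positions(ciphertext, crib):
    """Return every offset where the crib could sit without any letter matching itself."""
    ciphertext = ciphertext.upper()
    crib = crib.upper()
    positions = []
    for start in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[start:start + len(crib)]
        # An Enigma never encoded a letter as itself, so any match rules this offset out.
        if all(c != p for c, p in zip(window, crib)):
            positions.append(start)
    return positions

# Toy example: the crib CD cannot sit at offset 2, where it would line up with itself.
print(possible_crib_positions("ABCDE", "CD"))  # [0, 1, 3]

# Hypothetical intercept, purely for illustration.
print(possible_crib_positions("QFZRWUHERMAB", "FUHRER"))
```

Every offset the filter rules out is one fewer place the codebreakers had to test.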

Those clues helped them figure out how the Enigma rotors were set up to encrypt the message. Cracking the Enigma code was a huge advantage for the Allies. Many historians attribute some of the most important victories during the war to information the Allies got from Enigma-encrypted messages.

These days, encryption is most important in digital computing, and that isn't perfect either. When websites announce that hackers now know everything about you, that's often because their security, including their encryption methods, turned out to be breakable.

Companies that store your data have to take into account a whole new set of considerations. For example, when a computer can complete billions of operations per second, brute force suddenly becomes a lot more practical. So the principles that Vigenère and Turing used are the same ones that allow you to pay your bills online and keep North Korea out of your email. Most of the time. But how is a story for another episode.

Thank you for watching this episode of SciShow, which was brought to you by our patrons on Patreon. If you want to help us keep making videos like this, check out and don't forget to go to and subscribe.