SciShow
Uncovering the Secrets of the Past with AI
YouTube: | https://youtube.com/watch?v=ehTqSTTJ8Mk |
Previous: | Can We Keep Neurons Active…with Algae? |
Next: | These Chimps Treat Each Other’s Wounds... with Bugs |
Categories
Statistics
View count: | 167,916 |
Likes: | 7,987 |
Comments: | 266 |
Duration: | 06:03 |
Uploaded: | 2022-02-10 |
Last sync: | 2024-12-03 22:15 |
Citation
Citation formatting is not guaranteed to be accurate.
MLA Full: | "Uncovering the Secrets of the Past with AI." YouTube, uploaded by SciShow, 10 February 2022, www.youtube.com/watch?v=ehTqSTTJ8Mk. |
MLA Inline: | (SciShow, 2022) |
APA Full: | SciShow. (2022, February 10). Uncovering the Secrets of the Past with AI [Video]. YouTube. https://youtube.com/watch?v=ehTqSTTJ8Mk |
APA Inline: | (SciShow, 2022) |
Chicago Full: | SciShow. "Uncovering the Secrets of the Past with AI." February 10, 2022. YouTube video, 06:03. https://youtube.com/watch?v=ehTqSTTJ8Mk. |
Head to https://cometeer.com/scishow3 to get $20 off your Cometeer order + free shipping - That’s over 30% in savings!
It’s probably not a surprise that many ancient texts are a bit worn out and tattered, and that makes deciphering what they say quite a task. But with new computer tech and artificial intelligence, we are getting much clearer glimpses of what people of the past thought was important enough to write down.
Hosted by: Hank Green
SciShow is on TikTok! Check us out at https://www.tiktok.com/@scishow
----------
Support SciShow by becoming a patron on Patreon: https://www.patreon.com/scishow
----------
Huge thanks go to the following Patreon supporters for helping us keep SciShow free for everyone forever:
Bryan Cloer, Sam Lutfi, Kevin Bealer, Jacob, Christoph Schwanke, Jason A Saslow, Eric Jensen, Jeffrey Mckishen, Nazara, Ash, Matt Curls, Christopher R Boucher, Alex Hackman, Piya Shedden, Adam Brainard, charles george, Jeremy Mysliwiec, Dr. Melvin Sanicas, Chris Peters, Harrison Mills, Silas Emrys, Alisa Sherbow
----------
Looking for SciShow elsewhere on the internet?
SciShow Tangents Podcast: https://scishow-tangents.simplecast.com/
Facebook: http://www.facebook.com/scishow
Twitter: http://www.twitter.com/scishow
Instagram: http://instagram.com/thescishow
----------
Sources:
https://doi.org/10.1017/S0959774318000471
https://ieeexplore.ieee.org/document/7004460
https://www.cl.cam.ac.uk/teaching/0809/CompVision/
https://link.springer.com/article/10.1007/s10462-020-09827-4#Sec25
https://doi.org/10.1109/ICEngTechnol.2017.8308186
https://www.inf.ed.ac.uk/teaching/courses/nlu/assets/reading/Gurney_et_al.pdf
https://www.nature.com/articles/s41583-021-00473-5
https://link.springer.com/chapter/10.1007/978-3-540-75171-7_2
https://link.springer.com/chapter/10.1007/978-3-642-30223-7_87
https://arxiv.org/pdf/1707.02968.pdf
https://link.springer.com/article/10.1007/s10462-020-09827-4
https://www.science.org/doi/10.1126/sciadv.1601247
https://dl.acm.org/doi/pdf/10.1145/3457607
https://doi.org/10.1145/3290605.3300773
https://doi.org/10.1117/12.2010051
https://ieeexplore.ieee.org/document/6628705
https://link.springer.com/chapter/10.1007/978-981-33-6912-2_6
https://link.springer.com/article/10.1007/s11042-021-10775-6
https://link.springer.com/article/10.1007/s42452-019-1340-4
https://doi.org/10.1109/ICTC.2017.8191045
https://www.sciencedirect.com/science/article/pii/S1877050920316033
https://dl.acm.org/doi/abs/10.1145/3460961
https://link.springer.com/chapter/10.1007/978-3-030-68787-8_21
https://aclanthology.org/D19-1668/
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0249769
https://www.youtube.com/watch?v=aircAruvnKk&vl=en
https://www.youtube.com/watch?v=IHZwWFHWa-w
Image Sources:
https://commons.wikimedia.org/wiki/File:Edwin_Smith_Papyrus_v2.jpg
https://commons.wikimedia.org/wiki/File:Ancient_greek_text.jpg
https://commons.wikimedia.org/wiki/File:Mawangdui_Ancient_Texts_on_Silk_or_Wood_(10113031274).jpg
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0249769
https://pixabay.com/vectors/neural-network-thought-mind-mental-3816319/
https://commons.wikimedia.org/wiki/File:Deep_Learning.jpg
https://www.istockphoto.com/photo/data-scientists-male-programmer-using-laptop-analyzing-and-developing-in-various-gm1295900106-389481184
https://commons.wikimedia.org/wiki/File:Supervised_machine_learning_in_a_nutshell.svg
https://commons.wikimedia.org/wiki/File:Bactrian_document_Northern_Afghanistan_4th_century.jpg
https://commons.wikimedia.org/wiki/File:Codex_Vercellensis_-_Old_Latin_gospel_(John_ch._16,_v._23-30)_(The_S.S._Teacher%27s_Edition-The_Holy_Bible_-_Plate_XXXII).jpg
https://picryl.com/media/fragments-from-a-book-of-the-dead-892f17
Thank you to Cometeer for sponsoring today’s episode! Cometeer offers barista-quality coffee that’s delivered in frozen, recyclable aluminum capsules.
Click the link in the description and you’ll get $20 off your first purchase, plus free shipping. That’s 10 free cups of coffee and over 30% off! [♪ INTRO] The art of writing is thousands of years old, and, amazingly, some texts have even lasted that long.
Whether written on paper or etched in stone, the bits that survive give us lots of insight into ancient worlds and their cultures, but not before jumping through some hoops. You see, some of these scripts are divided across thousands of tiny fragments, which are tricky to read, even for skilled researchers. So experts often spend more time figuring out what something says than understanding what it really means.
But computer vision is starting to change that. Reading fragments of documents to decipher individual letters and words is pretty challenging. So, the main goal for historians is to leave that tedious task to the machines.
Computer vision can process images of text with algorithms and artificial intelligence to automatically extract the text from those pictures, producing a clean digital text that experts can easily study and share with others. While computer vision has been around in many forms, today's most powerful techniques tend to rely on deep learning. Deep learning involves a unique algorithm called a neural network, named for its “network” of interconnected nodes, which are called “neurons.” The neurons, in this case, are simple mathematical functions.
They're arranged in a sequence of layers, so that when data are fed into them, the data are processed from one layer to the next to make a prediction or classification. As the name implies, neural networks are sort of analogous to our brain structure, albeit very simplified. So, modern networks have multiple layers of neurons, which scientists describe as “deep.” And they become helpful at certain tasks by learning from different data examples, hence the term “deep learning.” Now the idea is that scientists repeatedly show the network pairs of inputs and outputs.
The inputs are data the algorithm needs to process, like a digital image of a text fragment. Outputs are the desired outcome or what the text from the picture is supposed to say. So just to simplify, researchers might show the network an image of the letter “R” and let it make a prediction about which letter it is.
It might guess correctly, or it might guess wrong and classify it as a “Q.” When the algorithm classifies something incorrectly, that's when the “learning” comes in. It involves using the difference between the desired outcome and the algorithm’s guess to tweak the network’s parameters, so it can make a more accurate prediction the next time around.
As the network is shown thousands of these kinds of examples, again and again, its parameters configure themselves to get better and better at figuring out what the images say. With enough data, networks can eventually correctly identify letters and words they have never even seen before, which is how we check how accurate they really are. Neural networks are a pretty general tool, so they can do more things beyond just reading images.
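The learning loop just described — guess, compare against the label, nudge the parameters — can be sketched with a single artificial neuron in Python. Everything here is a toy illustration: the 3×3 “glyphs” are made-up stand-ins for images of the letters “R” and “Q,” not real character data, and a real system would use many layers of such neurons.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up 3x3 binary "glyphs" standing in for letter images,
# flattened to 9-pixel vectors. Label 0 = "R", label 1 = "Q".
R = np.array([1, 1, 0, 1, 1, 0, 1, 0, 1], dtype=float)
Q = np.array([1, 1, 1, 1, 0, 1, 1, 1, 1], dtype=float)

def noisy(glyph, n):
    """Simulate n worn, smudged copies of a glyph."""
    return np.clip(glyph + rng.normal(0, 0.2, size=(n, glyph.size)), 0, 1)

# Training data: pairs of inputs (noisy images) and outputs (labels).
X = np.vstack([noisy(R, 50), noisy(Q, 50)])
y = np.concatenate([np.zeros(50), np.ones(50)])

# One "neuron": a weighted sum of the pixels squashed to a 0-1 guess.
w = rng.normal(0, 0.1, size=9)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The learning loop: predict, measure the error against the label,
# and tweak the parameters a little in the direction that reduces it.
lr = 0.5
for epoch in range(200):
    p = sigmoid(X @ w + b)          # the network's current guesses
    error = p - y                   # difference from the desired outcome
    w -= lr * (X.T @ error) / len(y)
    b -= lr * error.mean()

# Check accuracy on noisy glyphs the neuron has never seen before.
test = np.vstack([noisy(R, 10), noisy(Q, 10)])
pred = (sigmoid(test @ w + b) > 0.5).astype(int)
acc = (pred == np.concatenate([np.zeros(10), np.ones(10)])).mean()
print(f"accuracy on unseen examples: {acc:.2f}")
```

After a few hundred passes over the data, the neuron's parameters have configured themselves to separate the two glyphs, and it classifies noisy examples it was never trained on — the same principle, scaled up enormously, behind the deep networks described above.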
They can be trained to identify which parts of an image actually contain the text in the first place, make an attempt to fill in missing text, and even digitally reconstruct whole documents that are too fragile to physically touch. Most of these applications currently involve using lots of training data of correctly labeled images with the correct output, and the more, the better. But the correct output, in this case, usually comes from humans who read the text in the first place.
So, in this kind of learning, machines might come to basically copy the examples humans give them. And, as you probably know, we sometimes make biased or flawed judgements, so machines can come to inherit the same kinds of biases. Thankfully, in the case of text, the output we want tends to be pretty unambiguous.
So, for now, it still takes lots of initial human work to create enough data examples for these algorithms to “learn” how to do their jobs. But once you go through that initial effort, scientists can train models that work quickly and efficiently on ancient documents. So far, computer vision has been used to read printed Latin letters, like the kind used for English, for decades, but new techniques are expanding to different historical languages and writing styles from all over the world.
Researchers have created models that can read stylized versions of Latin script from old German, Tamil, Devanagari, the ancient Ethiopian language of Ge’ez, Korean, and Japanese, just to name a few. And, as we mentioned, these algorithms can do more than just read text. One 2021 study used computer vision to digitally piece together torn papyrus fragments from the Dead Sea.
Another 2021 study made old documents with faded text easier to read, while a different study from 2019 helped guess missing ancient Greek words in broken stone tablets based on existing examples. These algorithms might also be able to help with the actual “history” part too. A 2021 study used neural networks to try and identify if an ancient Hebrew scroll from the Dead Sea was written by a single person based on features of the handwriting.
And researchers found that it wasn’t just a single person that wrote it; it was multiple scribes that were careful enough to have similar handwriting to each other! We are still in the early days for this kind of work, but it’s beginning to look like the future of archeology could be a little more “Tony Stark” than “Indiana Jones.” And that goes for the world of coffee, too, thanks to today’s sponsor Cometeer. Cometeer brews barista-quality coffee from the world’s best specialty roasters.
The coffee is ground, brewed, and immediately flash-frozen to lock in the coffee bean’s freshness and flavor. Then it gets delivered right to your door inside convenient recyclable aluminum capsules. And you can make Cometeer coffee at home in just a couple of minutes: just add hot water, or cold if you’d rather have iced coffee, and you’ll have a cup that tastes just like a barista brewed it for you.
If you’re interested in trying them out, they have a special offer going for fans of SciShow right now. You can get $20 off your first purchase plus free shipping when you click on our link. That’s ten free cups of coffee and over 30% off! [♪ OUTRO]