Nerdfighteria Wiki

YouTube:	https://youtube.com/watch?v=lo82twBZT8Q
Previous:	The Sun Is Green
Next:	Planet Powered Protein! #shorts #science #SciShow

Statistics

View count:	154,279
Likes:	7,435
Comments:	561
Duration:	12:05
Uploaded:	2023-06-06
Last sync:	2025-08-12 12:30

Citation

Citation formatting is not guaranteed to be accurate.
MLA Full:	"Why Is ChatGPT Bad At Math?" YouTube, uploaded by SciShow, 6 June 2023, www.youtube.com/watch?v=lo82twBZT8Q.
MLA Inline:	(SciShow, 2023)
APA Full:	SciShow. (2023, June 6). Why Is ChatGPT Bad At Math? [Video]. YouTube. https://youtube.com/watch?v=lo82twBZT8Q
APA Inline:	(SciShow, 2023)
Chicago Full:	SciShow, "Why Is ChatGPT Bad At Math?", June 6, 2023, YouTube, 12:05, https://youtube.com/watch?v=lo82twBZT8Q.

Head to https://linode.com/scishow to get a $100 60-day credit on a new Linode account. Linode offers simple, affordable, and accessible Linux cloud solutions and services.

Sometimes, you ask ChatGPT to do a math problem that an arithmetically-inclined grade schooler can do with ease. And sometimes, ChatGPT can confidently state the wrong answer. It's all due to its nature as a large language model, and the neural networks it uses to interact with us.

Want to hear our ChatGPT dinosaur poem? Check out our Patreon at patreon.com/scishow!

Hosted by: Stefan Chin
----------
Support SciShow by becoming a patron on Patreon: https://www.patreon.com/scishow
----------
Huge thanks go to the following Patreon supporters for helping us keep SciShow free for everyone forever: Matt Curls, Alisa Sherbow, Dr. Melvin Sanicas, Harrison Mills, Adam Brainard, Chris Peters, charles george, Piya Shedden, Alex Hackman, Christopher R, Boucher, Jeffrey Mckishen, Ash, Silas Emrys, Eric Jensen, Kevin Bealer, Jason A Saslow, Tom Mosner, Tomás Lagos González, Jacob, Christoph Schwanke, Sam Lutfi, Bryan Cloer
----------
Looking for SciShow elsewhere on the internet?
SciShow Tangents Podcast: https://scishow-tangents.simplecast.com/
TikTok: https://www.tiktok.com/@scishow
Twitter: http://www.twitter.com/scishow
Instagram: http://instagram.com/thescishow
Facebook: http://www.facebook.com/scishow

#SciShow #science #education #learning #complexly
----------

Sources:

https://www.youtube.com/watch?v=1I5ZMmrOfnA
https://www.sciencedirect.com/science/article/pii/S0262885607001096?casa_token=Q13niAJrUtgAAAAA:1Fib_lLmB0EH_C7nbqdI_DfepZHwuy3QDKaAX0hiZVQzFfCNYOkwmUqZLB19yU8vS_fBIgPYSxE
https://books.google.co.uk/books?hl=en&lr=&id=3yj_IdO1zPEC&oi=fnd&pg=PT2&dq=scunthorpe+penistone&ots=thw6jucWMd&sig=8teF04MOIDozpOY_Fp_aCSiZzNE&redir_esc=y#v=onepage&q=penistone&f=false
https://intjem.biomedcentral.com/articles/10.1186/s12245-015-0078-z
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6800670/#R25
https://hal.science/hal-03913837v1/preview/ChatGPT.pdf
https://ai.stackexchange.com/questions/38220/why-is-chatgpt-bad-at-math
https://www.britannica.com/technology/computer/The-first-computer
https://www.technologyreview.com/2023/02/08/1068068/chatgpt-is-everywhere-heres-where-it-came-from/
https://news.mit.edu/2023/large-language-models-in-context-learning-0207
https://www.mdpi.com/2079-9292/10/20/2470
https://cds.cern.ch/record/400313/files/p21
https://www.psychologytoday.com/gb/blog/your-internet-brain/202303/think-chatgpt-is-smart-ask-it-to-do-arithmetic
https://arxiv.org/pdf/2302.03494.pdf
https://arxiv.org/pdf/2301.13867.pdf

Images

https://www.gettyimages.com
https://commons.wikimedia.org/wiki/File:Fulladder.gif

This SciShow video is supported by Linode!

You can get a $100 60-day credit on a new Linode account at linode.com/scishow. You can ask ChatGPT to write sonnets about dinosaurs on skateboards.

Surely it can handle basic addition. But when I asked it this, ChatGPT returned this.

Which is… Not right. Now, it’s not always wrong, but somehow, humanity has developed a computer program that occasionally screws up grade school math. Much like an actual grade schooler.

And that’s weird, right? I mean, if there’s anything computers ought to be great at, it’s, you know, computing. But it turns out there’s a very good reason why ChatGPT is bad at math, and it’s because we’ve spent a lot of time and effort trying to get our computers to think less like calculators, and more like us.

[Intro]

To understand how we got here, it helps to take a step back and see how computers were originally designed to do math.

All modern computers have special components called arithmetic logic units, or ALUs, which do all your behind-the-scenes number-crunching. And the basic building block of an ALU is a kind of electronic circuit called a logic gate. Logic gates receive a set of input values…a series of 1s, 0s, or a mix of both…and then they apply rigid, logical operations to produce an output…which is also either a 1 or a 0.

For example, an AND gate basically asks the question, “Are both my first and second inputs 1s?” If so, it outputs a 1. And if not, it outputs a 0. So 1s are basically stand-ins for “true”, and 0s for “false”.
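To make that concrete, here is a minimal sketch of an AND gate written as a tiny Python function, along with its full truth table. Python and the name and_gate are our own illustrative choices; nothing like this appears in the video.

```python
def and_gate(a: int, b: int) -> int:
    """Return 1 only when both inputs are 1, otherwise 0."""
    return 1 if (a == 1 and b == 1) else 0

# Every possible pair of inputs, in order.
truth_table = [(a, b, and_gate(a, b)) for a in (0, 1) for b in (0, 1)]

for a, b, out in truth_table:
    print(f"{a} AND {b} = {out}")
```

Only the last row, 1 AND 1, produces a 1.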

As simple as it seems, the magic happens by taking the outputs of logic gates, and feeding them as inputs into other logic gates. Stringing gates together like this makes circuits in a computer that can do math. For example, you can create a circuit called an adder that can add two binary numbers together.

And by combining adder circuits, you can handle larger and larger numbers! By setting them up the right way, logic gates can perform all your favorite grade school math, and maybe less favorite high school math that’s ultimately based on grade school math. So long as the numbers are in a range the ALU can handle, it’ll perform that math with airtight accuracy.
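Here is a rough software sketch of that chaining idea, assuming the textbook "full adder" and "ripple-carry" layout (a real ALU may well be wired differently). The gate functions, the function names, and the 8-bit default width are all our own illustrative choices.

```python
def and_gate(a: int, b: int) -> int:
    return a & b

def or_gate(a: int, b: int) -> int:
    return a | b

def xor_gate(a: int, b: int) -> int:
    return a ^ b

def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three single bits using only gates; return (sum_bit, carry_out)."""
    partial = xor_gate(a, b)
    sum_bit = xor_gate(partial, carry_in)
    carry_out = or_gate(and_gate(a, b), and_gate(partial, carry_in))
    return sum_bit, carry_out

def ripple_carry_add(x: int, y: int, width: int = 8) -> tuple[int, int]:
    """Chain `width` full adders, feeding each carry into the next column.

    Returns (result, final_carry). A final carry of 1 means the true sum
    did not fit in `width` bits.
    """
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1  # bit i of x
        b = (y >> i) & 1  # bit i of y
        sum_bit, carry = full_adder(a, b, carry)
        result |= sum_bit << i
    return result, carry

print(ripple_carry_add(5, 3))      # (8, 0)
print(ripple_carry_add(200, 100))  # (44, 1): 300 does not fit in 8 bits
```

The second call shows what "a range the ALU can handle" means in this sketch: inside that range the answer is exact every time, and outside it the leftover carry flags the overflow.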

But ALUs aren’t just used in calculators, or Google when you ask if your American friend is right to complain about it being 20 degrees outside. Anything on your computer involving calculations or decision making, like balancing a budget on a spreadsheet or shuffling songs on a playlist, involves a series of mathematical computations performed on an ALU. But those rigid, logical operations can make working with computers on a fundamental level pretty tricky.

Say you want to use a computer to constantly monitor the live-stream of a forest, and spot any fires that pop up. After all, humans need things like sleep and bathroom breaks. So you create an algorithm, or a set of rules for the computer to follow.

Fires are a noticeably different color than your typical tree, so one rule you include is something like “If a pixel is 40% redder than it is on average, it corresponds to a place on fire.” These little rules of thumb are sometimes called heuristics. And while a fairly sophisticated fire-detection algorithm might use a whole bunch of heuristics, it could still fail in unexpected situations.

Like, for example, if a setting Sun reflects off a lake in the image, causing the lake to look over 40% redder than usual… A human could take a look and easily tell it wasn’t a real fire, but the computer, running on those very strict rules, would sound a false alarm. It turns out that some complex tasks are pretty difficult for humans to translate into instructions that a rigidly logical computer can follow exactly how we mean it to. But in the last decade, a different approach has exploded in popularity for tackling these tasks.
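The 40% rule and its sunset problem can be sketched in a few lines. This is only an illustration: the function name and every pixel value below are made up, and a real fire-detection system would use far more than one rule.

```python
def looks_like_fire(red_now: float, red_average: float, threshold: float = 0.40) -> bool:
    """Heuristic: flag a pixel that is more than `threshold` redder than its usual value."""
    return red_now > red_average * (1 + threshold)

# Hypothetical red-channel readings (0 to 255) for three pixels.
flame_on_tree = looks_like_fire(red_now=200, red_average=80)   # True: a real fire
ordinary_tree = looks_like_fire(red_now=85, red_average=80)    # False: nothing unusual
sunset_on_lake = looks_like_fire(red_now=95, red_average=60)   # True: a false alarm

print(flame_on_tree, ordinary_tree, sunset_on_lake)
```

The rule does exactly what it was told to do in all three cases. The trouble is that the third case is not what we meant.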

It’s called a neural network, and as the name implies, it takes inspiration from the neurons in our brains. Neural networks are a kind of algorithm that connect up thousands or even millions of mathematical components called neurons. Much like a logic gate, neurons take numbers as input and, according to some rules, produce numbers as output that can then go on to be inputs for other neurons.

But unlike logic gates, a neural network’s inputs and outputs can be any number that the computer’s hardware can represent, not just one or zero. So we can approximate almost any complex mathematical function by using a large enough neural network. That makes it easier to create reliable solutions to problems that require more nuance.
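The video doesn’t spell out what rules a neuron follows, so the sketch below assumes one common textbook version: multiply each input by a weight, add everything up along with a bias, and squash the total into the range between 0 and 1 with a sigmoid function. The weights and inputs here are arbitrary numbers picked for illustration.

```python
import math

def neuron(inputs: list[float], weights: list[float], bias: float) -> float:
    """Weighted sum of the inputs plus a bias, squashed to a value between 0 and 1."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Inputs can be any numbers, not just 0 or 1.
first = neuron([0.2, 0.9], weights=[1.5, -2.0], bias=0.1)

# One neuron's output can feed straight into another neuron.
second = neuron([first, 0.5], weights=[0.8, 0.3], bias=-0.2)

print(first, second)
```

Unlike a logic gate, both outputs land somewhere between 0 and 1 instead of exactly on one or the other.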

Another key difference between neural networks and logic gates is that in order to come up with the rules they follow, neural networks are trained. Back to our forest fire example, a neural network can be shown pictures of forests that are or aren’t on fire, and examples of the outputs we want, like highlighting all the parts that correspond to a real forest fire. By feeding these pairs of inputs and outputs over and over, the network learns what output we actually want for a given input.

And along the way, it writes its own rules to follow, as opposed to having humans program in every single rule accounting for as many scenarios as possible from the get-go. With all their training, neural networks can sometimes create more robust and less error-prone algorithms than we can write using heuristics. And that’s especially true for tasks that involve complex and unstructured data, like human language.
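As a toy illustration of learning from input and output pairs, here is a single neuron that is never told the rule for AND. It only sees the four examples over and over and nudges its own numbers whenever it gets one wrong. This uses the classic perceptron update rule, which is our choice for the sketch and far simpler than anything used to train a model like ChatGPT.

```python
# Each example pairs two inputs with the output we want (the AND truth table).
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0, 0]
bias = 0

def predict(inputs: tuple[int, int]) -> int:
    """Output 1 if the weighted sum plus bias is positive, otherwise 0."""
    total = weights[0] * inputs[0] + weights[1] * inputs[1] + bias
    return 1 if total > 0 else 0

# Show the same examples repeatedly, adjusting after every mistake.
for _ in range(20):
    for inputs, wanted in examples:
        error = wanted - predict(inputs)  # 0 when correct, +1 or -1 when wrong
        weights[0] += error * inputs[0]
        weights[1] += error * inputs[1]
        bias += error

learned = [predict(inputs) for inputs, _ in examples]
print(learned)  # [0, 0, 0, 1]
```

Nobody typed in the rule "output 1 only when both inputs are 1". The weights and bias that produce that behavior came out of the repeated corrections.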

Which is where ChatGPT finally enters the fray. It’s what’s called a large language model, or LLM. It was trained on huge bodies of writing on the internet like Wikipedia, to take a piece of text as input and produce a piece of text in response as an output.

The general idea has been around for a few years, but ChatGPT is so eerily good at responding to certain requests, it can feel like a major step forward. That’s partially because ChatGPT’s neural network is designed to pay better attention to both the context of a piece of data, and the most important bits of the input text. For example, it can craft long sentences that make way more sense than you’d get by continuously hitting the auto-complete options on your phone.

But it’s also because ChatGPT was trained with a lot of human-assisted feedback, to specifically curate outputs that the trainers consider “high quality”. We’re glossing over lots of detail, but needless to say ChatGPT has generated a lot of news for its ability to convincingly emulate human responses to all kinds of funky problems. It’s blown open the door to how we interact with computers.

Rather than painstakingly coding sophisticated solutions to many problems, the interface of ChatGPT allows us to basically talk to a computer in natural language to make requests. Which includes asking it to do math! For example, you can type in: “Give me the sum of the first 20 numbers, divided by 4.” And not only does it give the correct answer of 52.5, it demonstrates that it correctly used a famous formula for adding numbers!
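That answer is easy to double-check. The sketch below adds the numbers 1 through 20 the slow way, then again with the shortcut n × (n + 1) ÷ 2, which is presumably the famous formula being referred to (the transcript doesn’t name it).

```python
n = 20

by_counting = sum(range(1, n + 1))  # 1 + 2 + ... + 20
by_formula = n * (n + 1) // 2       # the shortcut for the same sum

answer = by_formula / 4

print(by_counting, by_formula, answer)  # 210 210 52.5
```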

Which is like… what. WHAT. Since this is a computer correctly doing math, you might think some part of ChatGPT’s enormous network resembles the logic gate structure of an adder, making it capable of doing calculations the same way.

But before you throw your calculator away for good, remember the showstopper at the beginning of this video. Asking ChatGPT an even simpler question, just taking the sum of two large numbers, sometimes gives a wrong answer. Admittedly, in our example, it only gets one digit in the middle of the number incorrect.
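The numbers from the video aren’t in this transcript, so the sketch below uses made-up ones. It shows how you might check a claimed sum against exact integer arithmetic and pinpoint which digits are off. The function name and both the numbers and the "claimed" answer are hypothetical.

```python
def wrong_digit_positions(a: int, b: int, claimed: int) -> list[int]:
    """Return the positions (counted from the left, starting at 0) where
    the claimed sum disagrees with the exact sum of a and b."""
    exact = str(a + b)
    given = str(claimed)
    length = max(len(exact), len(given))
    exact, given = exact.zfill(length), given.zfill(length)
    return [i for i in range(length) if exact[i] != given[i]]

# Hypothetical large numbers, not the ones used in the video.
a = 4815162342108
b = 2718281828459

# A made-up wrong answer that is off by exactly one digit in the middle.
claimed = a + b + 1_000_000

print(a + b)
print(claimed)
print(wrong_digit_positions(a, b, claimed))
```

Python’s integers are exact at any size, so the first printed value can be trusted as the reference.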

But it’s kind of weird. You’d never get that error on a cheap, plastic, solar powered calculator, provided the numbers could fit on the screen. And it’s not an isolated glitch either!

Lots of people have noticed that ChatGPT often fails on reasonably straightforward math when larger numbers are involved. Since outside researchers don’t have direct access to the model… to the rules that the latest version of ChatGPT has taught itself… there’s no fully transparent study available. But based on what we know, some of it likely comes down to the training process.

LLMs are trained to basically regurgitate a collage of words that closely resembles the patterns they’ve encountered in their training data. Some of that data not only includes examples of adding numbers, but also encodes the broader structures of how people talk about numbers and the functions we perform with them. So somewhere deep in ChatGPT, there’s probably some bits of the network that resemble basic arithmetic.

After all, the numbers I used in my example don't show up in any internet searches, so it can’t just regurgitate an answer it found online. And in the end, it did still manage to correctly add up most of the digits. In fact, a recent preprint by Chinese researchers found that the latest model of ChatGPT could accurately add and subtract numbers under one trillion about 99% of the time!

Unfortunately, that accuracy drops when it comes to multiplication. The model only managed to get the right answer about two thirds of the time! These failures imply that it doesn’t form perfect, logic-gate style math with unfaltering accuracy, like an ALU would do.
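Accuracy figures like these come from asking lots of questions and counting the right answers. Here is a rough sketch of that kind of scoring, not the researchers’ actual method. Since we can’t call ChatGPT from this page, a deliberately flawed stand-in plays its part: an adder that forgets every carry. All names and the random test problems are our own.

```python
import random

def accuracy(answerer, problems) -> float:
    """Fraction of addition problems the answerer gets exactly right."""
    correct = sum(1 for a, b in problems if answerer(a, b) == a + b)
    return correct / len(problems)

def add_without_carrying(a: int, b: int) -> int:
    """Flawed stand-in: adds each column of digits but drops every carry."""
    result, place = 0, 1
    while a > 0 or b > 0:
        result += ((a % 10 + b % 10) % 10) * place
        a, b, place = a // 10, b // 10, place * 10
    return result

# 1,000 random pairs of numbers under one trillion (fixed seed for repeatability).
rng = random.Random(0)
problems = [(rng.randrange(10**12), rng.randrange(10**12)) for _ in range(1000)]

print(accuracy(lambda a, b: a + b, problems))   # exact arithmetic scores 1.0
print(accuracy(add_without_carrying, problems))
```

Swapping in a different answerer is all it takes to score another system on the same problems.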

If it does have that little bit of self-written code, it can’t find a way to use it consistently every time it’s supposed to. It’s not great when you’re hoping to get an answer that’s 100% correct, 100% of the time, but you know what ChatGPT’s math skills remind me of? Me!

Humans are prone to using unreliable reasoning too, especially for wordy math problems that cause us to flub even simple arithmetic problems. And how many of us have forgotten to carry a 1 or two? But with a bit of thought and guidance, we can improve our problem-solving skills.

And very weirdly… so might ChatGPT. There are cases of people coaxing ChatGPT into becoming more accurate with addition by explaining its logic more carefully. Yep, that includes adding up those large numbers!

So it might take some prodding, but ChatGPT has the potential to be reliable. But ultimately, we can’t guarantee its answers will be accurate, like we can for the old-school ALUs. Its reasoning, performance, and abilities are still shaped more by human expression and the data we produce, rather than hard logical rules.

So for now, you probably want to treat ChatGPT’s answers the same way you’d treat those coming from a human: Understanding that whatever their intentions, they’re fallible, sometimes unreliable, and need verifying from other sources of information, even if we think they’re right. But not everything we do requires hard facts and calculations. ChatGPT’s best quality may be its capacity to throw things out there and give us food for thought.

Like, “Give me some catchy title suggestions for my comedic novel about a team of people creating a science show on the internet, building a wholesome and nerdy community in the process.” Okay, maybe now I know what not to title it… Taking ChatGPT’s suggestions as a creative starting point, or even combining it with more reliable code that can produce precise outputs, might be the best way forward. And this seems to be the direction its designers are headed in. OpenAI, the organization behind ChatGPT, is testing out connecting their program with the platform Wolfram Alpha, which does contain hard coded, logical ways of processing math.
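To give a feel for the general idea of pairing a language model with reliable code, here is a very loose sketch. It is not how OpenAI or Wolfram Alpha actually connect their systems. It just checks whether a request is a plain arithmetic question and, if so, answers with exact math. Everything else gets handed to a language model, which here is a hypothetical stand-in function passed in by the caller.

```python
import re

# Matches requests like "123 + 456", with +, - or * between two whole numbers.
ARITHMETIC = re.compile(r"^\s*(\d+)\s*([+\-*])\s*(\d+)\s*$")

def answer(prompt: str, language_model) -> str:
    """Use exact arithmetic when the prompt is a simple sum, difference,
    or product; otherwise fall back to the supplied language model."""
    match = ARITHMETIC.match(prompt)
    if match:
        left, op, right = int(match.group(1)), match.group(2), int(match.group(3))
        if op == "+":
            return str(left + right)
        if op == "-":
            return str(left - right)
        return str(left * right)
    return language_model(prompt)

def pretend_model(prompt: str) -> str:
    """Hypothetical placeholder for a real language model."""
    return "Here's a draft: ..."

print(answer("123456789 * 987654321", pretend_model))
print(answer("Suggest a title for my novel", pretend_model))
```

The math request never reaches the model at all, so its answer is as dependable as the arithmetic underneath it.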

But for now, as impressive as ChatGPT is, if you’re looking to crunch numbers rather than draft emails… you might be better off with the calculator. Oh, and by the time you’re watching this video, ChatGPT may have learned to do this math problem correctly. Don’t worry.

With a little trial and error, you might be able to find a new problem it can’t solve yet. Thanks for watching this SciShow video, supported by Linode! Linode is a cloud computing company from Akamai that provides storage space, databases, analytics and more to you or your company.

And they do all of that really well. User reviews ranked Linode above average in ease of use. And by “above average,” I mean easier than other big companies you might be familiar with.

Reviews also ranked Linode above average in ease of setup and quality of support. But user reviews are just one metric. Linode literally wins awards for their customer support.

You can talk to a real person, which is shockingly rare these days, at any time of day and any time of year. After almost two decades of cloud computing, Linode has figured out how to get you the information you need. You can try out Linode by clicking the link in the description down below or going to linode.com/scishow for a $100 60-day credit on a new Linode account.

Thanks to Linode for supporting this SciShow video! [ OUTRO ]
