The First Programming Languages: Crash Course Computer Science #11
YouTube: | https://youtube.com/watch?v=RU1u-js7db8 |
Previous: | Max Weber & Modernity: Crash Course Sociology #9 |
Next: | The Language of Film: Crash Course Film History #5 |
Statistics
View count: | 1,116,494 |
Likes: | 21,778 |
Comments: | 658 |
Duration: | 11:52 |
Uploaded: | 2017-05-10 |
Last sync: | 2024-10-23 03:00 |
Citation
Citation formatting is not guaranteed to be accurate.
MLA Full: | "The First Programming Languages: Crash Course Computer Science #11." YouTube, uploaded by CrashCourse, 10 May 2017, www.youtube.com/watch?v=RU1u-js7db8. |
MLA Inline: | (CrashCourse, 2017) |
APA Full: | CrashCourse. (2017, May 10). The First Programming Languages: Crash Course Computer Science #11 [Video]. YouTube. https://youtube.com/watch?v=RU1u-js7db8 |
APA Inline: | (CrashCourse, 2017) |
Chicago Full: | CrashCourse, "The First Programming Languages: Crash Course Computer Science #11.", May 10, 2017, YouTube, 11:52, https://youtube.com/watch?v=RU1u-js7db8.
Get your first two months of CuriosityStream free by going to http://curiositystream.com/crashcourse and using the promo code “crashcourse”.
So we ended last episode with programming at the hardware level with things like plugboards and huge panels of switches, but what was really needed was a more versatile way to program computers - software! For much of this series we’ve been talking about machine code, or the 1’s and 0’s our computers read to perform operations, but giving our computers instructions in 1’s and 0’s is incredibly inefficient, and a “higher-level” language was needed. This led to the development of assembly code and assemblers, which let us use mnemonics and operands to write programs more easily, but assembly language is still tied to the underlying hardware. So by 1952, Navy officer Grace Hopper had helped create the first high-level programming language, A-0, and a compiler to translate that code for our machines. This would eventually lead to IBM’s FORTRAN and then a golden age of programming languages over the coming decades. Most importantly, these new languages used new abstractions to make programming easier and more powerful, giving more and more people the ability to create new and amazing things.
Produced in collaboration with PBS Digital Studios: http://youtube.com/pbsdigitalstudios
Want to know more about Carrie Anne?
https://about.me/carrieannephilbin
The Latest from PBS Digital Studios: https://www.youtube.com/playlist?list=PL1mtdjDVOoOqJzeaJAV15Tq0tZ1vKj7ZV
Want to find Crash Course elsewhere on the internet?
Facebook - https://www.facebook.com/YouTubeCrash...
Twitter - http://www.twitter.com/TheCrashCourse
Tumblr - http://thecrashcourse.tumblr.com
Support Crash Course on Patreon: http://patreon.com/crashcourse
CC Kids: http://www.youtube.com/crashcoursekids
This episode is brought to you by CuriosityStream.
Hi, I’m Carrie Anne and welcome to CrashCourse Computer Science! So far, for most of this series, we’ve focused on hardware -- the physical components of computing -- things like: electricity and circuits, registers and RAM, ALUs and CPUs.
But programming at the hardware level is cumbersome and inflexible, so programmers wanted a more versatile way to program computers - what you might call a “softer” medium. That’s right, we’re going to talk about Software! In episode 8, we walked through a simple program for the CPU we designed.
The very first instruction to be executed, the one at memory address 0, was 0010 1110. As we discussed, the first four bits of an instruction are the operation code, or OPCODE for short. On our hypothetical CPU, 0010 indicated a LOAD_A instruction -- which moves a value from memory into Register A.
The second set of four bits defines the memory location, in this case, 1110, which is 14 in decimal. So what these eight numbers really mean is “LOAD Address 14 into Register A”. We’re just using two different languages.
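To make that split concrete, here is a minimal sketch (mine, not from the episode) of decoding such an 8-bit instruction in Python; the opcode table holds just the one mnemonic the transcript names.

```python
# Decode an 8-bit instruction from the episode's hypothetical CPU:
# the high four bits are the opcode, the low four bits the address.
OPCODES = {0b0010: "LOAD_A"}  # only the opcode named in the transcript

def decode(instruction):
    opcode = instruction >> 4        # top four bits
    address = instruction & 0b1111   # bottom four bits
    return f"{OPCODES[opcode]} {address}"

print(decode(0b00101110))  # -> LOAD_A 14
```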
You can think of it like English and Morse Code. “Hello” and “.... . .-.. .-.. ---” mean the same thing -- hello! -- they’re just encoded differently. English and Morse Code also have different levels of complexity. English has 26 different letters in its alphabet and way more possible sounds.
Morse only has dots and dashes. But, they can convey the same information, and computer languages are similar. As we've seen, computer hardware can only handle raw, binary instructions.
This is the “language” computer processors natively speak. In fact, it’s the only language they’re able to speak. It’s called Machine Language or Machine Code.
In the early days of computing, people had to write entire programs in machine code. More specifically, they’d first write a high-level version of a program on paper, in English, for example “retrieve the next sale from memory, then add this to the running total for the day, week and year, then calculate any tax to be added” ...and so on. An informal, high-level description of a program like this is called Pseudo-Code.
Then, when the program was all figured out on paper, they’d painstakingly expand and translate it into binary machine code by hand, using things like opcode tables. After the translation was complete, the program could be fed into the computer and run. As you might imagine, people quickly got fed up with this process.
So, by the late 1940s and into the 50s, programmers had developed slightly higher-level languages that were more human-readable. Opcodes were given simple names, called mnemonics, which were followed by operands, to form instructions. So instead of having to write instructions as a bunch of 1’s and 0’s, programmers could write something like “LOAD_A 14”.
We used this mnemonic in Episode 8 because it’s so much easier to understand! Of course, a CPU has no idea what “LOAD_A 14” is. It doesn’t understand text-based language, only binary.
And so programmers came up with a clever trick. They created reusable helper programs, in binary, that read in text-based instructions, and assemble them into the corresponding binary instructions automatically. This program is called -- you guessed it -- an Assembler.
It reads in a program written in an Assembly Language and converts it to native machine code. “LOAD_A 14” is one example of an assembly instruction. Over time, Assemblers gained new features that made programming even easier. One nifty feature is automatically figuring out JUMP addresses.
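As a rough sketch of the idea (my own illustration; only the LOAD_A opcode comes from the episode, the rest is assumed), an assembler’s core job is a table lookup plus some bit-packing:

```python
# A one-instruction "assembler": look up the mnemonic's opcode bits,
# then pack the operand into the low four bits.
OPCODES = {"LOAD_A": 0b0010}  # LOAD_A is from the episode; a real
                              # table would list every mnemonic

def assemble_line(line):
    mnemonic, operand = line.split()
    return (OPCODES[mnemonic] << 4) | int(operand)

print(f"{assemble_line('LOAD_A 14'):08b}")  # -> 00101110
```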
This was an example program I used in episode 8: notice how our JUMP NEGATIVE instruction jumps to address 5, and our regular JUMP goes to address 2. The problem is, if we add more code to the beginning of this program, all of the addresses would change. That’s a huge pain if you ever want to update your program!
And so an assembler does away with raw jump addresses, and lets you insert little labels that can be jumped to. When this program is passed into the assembler, it does the work of figuring out all of the jump addresses. Now the programmer can focus more on programming and less on the mechanics under the hood; by hiding that unnecessary complexity, more sophisticated things can be built.
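Assemblers typically do this in two passes, something like the sketch below (the little program and mnemonics are illustrative assumptions, not the episode’s exact example):

```python
# Two-pass label resolution, roughly how an assembler handles it.
program = [
    "top:",            # a label names the address of the next instruction
    "LOAD_A 14",
    "JUMP_NEG done",
    "JUMP top",
    "done:",
    "HALT",
]

# Pass 1: record the address each label refers to.
labels, address = {}, 0
for line in program:
    if line.endswith(":"):
        labels[line[:-1]] = address
    else:
        address += 1

# Pass 2: swap label operands for their recorded addresses.
for line in program:
    if line.endswith(":"):
        continue                      # labels emit no machine code
    parts = line.split()
    if len(parts) == 2 and parts[1] in labels:
        parts[1] = str(labels[parts[1]])
    print(" ".join(parts))
# Prints: LOAD_A 14 / JUMP_NEG 3 / JUMP 0 / HALT
```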
As we’ve done many times in this series, we’re once again moving up another level of abstraction. A NEW LEVEL OF ABSTRACTION! However, even with nifty assembler features like auto-linking JUMPs to labels, Assembly Languages are still a thin veneer over machine code.
In general, each assembly language instruction converts directly to a corresponding machine instruction – a one-to-one mapping – so it’s inherently tied to the underlying hardware. And the assembler still forces programmers to think about which registers and memory locations they will use. If you suddenly needed an extra value, you might have to change a lot of code to fit it in.
Let’s go to the Thought Bubble. This problem did not escape Dr. Grace Hopper.
As a US naval officer, she was one of the first programmers on the Harvard Mark 1 computer, which we talked about in Episode 2. This was a colossal, electro-mechanical beast completed in 1944 as part of the allied war effort. Programs were stored and fed into the computer on punched paper tape.
By the way, as you can see, they “patched” some bugs in this program by literally putting patches of paper over the holes on the punch tape. The Mark 1’s instruction set was so primitive, there weren’t even JUMP instructions. To create code that repeated the same operation multiple times, you’d tape the two ends of the punched tape together, creating a physical loop.
In other words, programming the Mark 1 was kind of a nightmare! After the war, Hopper continued to work at the forefront of computing. To unleash the potential of computers, she designed a high-level programming language called “Arithmetic Language Version 0”, or A-0 for short.
Assembly languages have direct, one-to-one mapping to machine instructions. But, a single line of a high-level programming language might result in dozens of instructions being executed by the CPU. To perform this complex translation, Hopper built the first compiler in 1952.
This is a specialized program that transforms “source” code written in a programming language into a low-level language, like assembly or the binary “machine code” that the CPU can directly process. Thanks, Thought Bubble. So, despite the promise of easier programming, many people were skeptical of Hopper’s idea.
She once said, “I had a running compiler and nobody would touch it… they carefully told me, computers could only do arithmetic; they could not do programs.” But the idea was a good one, and soon many efforts were underway to craft new programming languages -- today there are hundreds! Sadly, there are no surviving examples of A-0 code, so we’ll use Python, a modern programming language, as an example. Let’s say we want to add two numbers and save that value.
Remember, in assembly code, we had to fetch values from memory, deal with registers, and handle other low-level details. But this same program can be written in Python like so (reconstructed below): notice how there are no registers or memory locations to deal with -- the compiler takes care of that stuff, abstracting away a lot of low-level and unnecessary complexity. The programmer just creates abstractions for needed memory locations, known as variables, and gives them names.
So now we can just take our two numbers, store them in variables we give names to -- in this case, I picked a and b but those variables could be anything - and then add those together, saving the result in c, another variable I created. It might be that the compiler assigns Register A under the hood to store the value in a, but I don’t need to know about it! Out of sight, out of mind!
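The exact on-screen code isn’t in the transcript, but from the description it was something like this (the literal values are my assumption; only the names a, b, and c are given):

```python
a = 3        # first number, stored in a variable named a
b = 5        # second number, in b
c = a + b    # add them, saving the result in c -- no registers in sight
print(c)     # -> 8

# Under the hood, a compiler might expand that one assignment into
# several instructions, e.g. (hypothetical, episode-8-style assembly):
#   LOAD_A a
#   LOAD_B b
#   ADD B A
#   STORE_A c
```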
It was an important historical milestone, but A-0 and its later variants weren’t widely used. FORTRAN, derived from "Formula Translation", was released by IBM a few years later, in 1957, and came to dominate early computer programming. John Backus, the FORTRAN project director, said: "Much of my work has come from being lazy.
I didn't like writing programs, and so ... I started work on a programming system to make it easier to write programs." You know, typical lazy person. They’re always creating their own programming systems.
Anyway, on average, programs written in FORTRAN were 20 times shorter than equivalent handwritten assembly code. Then the FORTRAN compiler would translate and expand that into native machine code. The community was skeptical that the performance would be as good as hand-written code, but the fact that programmers could write more code more quickly made it an easy choice economically: trading a small increase in computation time for a significant decrease in programmer time.
Of course, IBM was in the business of selling computers, and so initially, FORTRAN code could only be compiled and run on IBM computers. And most programming languages and compilers of the 1950s could only run on a single type of computer. So, if you upgraded your computer, you’d often have to re-write all the code too!
In response, computer experts from industry, academia and government formed a consortium in 1959 -- the Committee on Data Systems Languages, advised by our friend Grace Hopper -- to guide the development of a common programming language that could be used across different machines. The result was the high-level, easy to use, Common Business-Oriented Language, or COBOL for short. To deal with different underlying hardware, each computing architecture needed its own COBOL compiler.
But critically, these compilers could all accept the same COBOL source code, no matter what computer it was run on. This notion is called write once, run anywhere. It’s true of most programming languages today, a benefit of moving away from assembly and machine code, which is still CPU specific.
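As a toy illustration of that split (entirely my own, not real COBOL tooling): the same source goes in, and a per-architecture backend emits different native instructions with the same meaning.

```python
# One source line, one backend per architecture: "write once, run anywhere".
SOURCE = "ADD a TO b GIVING c"  # COBOL-flavored pseudo-source

BACKENDS = {  # hypothetical architectures and instruction sets
    "arch_x": lambda s: ["LOAD_A a", "LOAD_B b", "ADD B A", "STORE_A c"],
    "arch_y": lambda s: ["MOV R1, a", "ADD R1, b", "MOV c, R1"],
}

for arch, backend in BACKENDS.items():
    print(arch, "->", backend(SOURCE))
```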
The biggest impact of all this was reducing computing’s barrier to entry. Before high level programming languages existed, it was a realm exclusive to computer experts and enthusiasts. And it was often their full time profession.
But now, scientists, engineers, doctors, economists, teachers, and many others could incorporate computation into their work. Thanks to these languages, computing went from a cumbersome and esoteric discipline to a general-purpose and accessible tool. At the same time, abstraction in programming allowed those computer experts – now “professional programmers” – to create increasingly sophisticated programs, which would have taken millions, tens of millions, or even more lines of assembly code.
Now, this history didn’t end in 1959. In fact, a golden era of programming language design kicked off, evolving in lockstep with dramatic advances in computer hardware. In the 1960s, we had languages like ALGOL, LISP and BASIC.
In the 70s, Pascal, C and Smalltalk were released. The 80s gave us C++, Objective-C, and Perl. And the 90s: Python, Ruby, and Java.
And the new millennium has seen the rise of Swift, C#, and Go - not to be confused with Let it Go and Pokemon Go. Anyway, some of these might sound familiar -- many are still around today. It’s extremely likely that the web browser you’re using right now was written in C++ or Objective-C.
That list I just gave is the tip of the iceberg. And languages with fancy, new features are proposed all the time. Each new language attempts to leverage new and clever abstractions to make some aspect of programming easier or more powerful, or take advantage of emerging technologies and platforms, so that more people can do more amazing things, more quickly.
Many consider the holy grail of programming to be the use of “plain ol’ English”, where you can literally just speak what you want the computer to do, it figures it out, and executes it. This kind of intelligent system is science fiction… for now. And fans of 2001: A Space Odyssey may be okay with that.
Now that you know all about programming languages, we’re going to deep dive for the next couple of episodes, and we’ll continue to build your understanding of how programming languages, and the software they create, are used to do cool and unbelievable things. See you next week. Hey guys, this week’s episode was brought to you by CuriosityStream, which is a streaming service full of documentaries and nonfiction titles from some really great filmmakers, including exclusive originals.
I just watched a great series called “Digits” hosted by our friend Derek Muller. It’s all about the Internet - from its origins, to the proliferation of the Internet of Things, to ethical, or white hat, hacking. And it even includes some special guest appearances… like that John Green guy you keep mentioning in the comments.
And Curiosity Stream offers unlimited access starting at $2.99 a month, and for you guys, the first two months are free if you sign up at curiositystream.com/crashcourse and use the promo code "crash course" during the sign-up process.