TWiT News 401 Transcript
Please be advised this transcript is AI-generated and may not be word for word & speakers may be incorrect at times. Time codes refer to the approximate times in the ad-supported version of the show.
00:02 - Keynote (Announcement)
podcasts you love from people you trust.
00:06 - Mikah Sargent (Host)
This is TWiT special number 401 with Jeff Jarvis and Mikah Sargent, recorded Monday, March 18th 2024: NVIDIA GTC Keynote 2024. Hello and welcome to our special TWiT News. This is the NVIDIA GTC, that is, GPU Technology Conference. This is the AI conference from Nvidia that kicks off today, or actually I guess it kicked off yesterday. March 18th is the kind of official kickoff with the keynote. We will be watching the keynote today. I am Mikah Sargent and I am joined across the web by Jeff Jarvis, who's here to join us. Hello, hey, Mikah, good to see you. Good to see you as well.
00:56 - Jeff Jarvis (Host)
You know a thing: last time I saw you, you were getting headaches with the... I was.
01:01 - Mikah Sargent (Host)
With the Vision Pro, yes. Oh, that thing has long since been returned to Apple. Very happy that it's there. You know, I even had a moment of like, am I going to miss it? I don't miss it at all, I really don't. Yeah, it was a literal pain in the head, not the neck, though, which is good, I guess. But we are currently watching the kind of... it's almost akin to when they flash the lights at a theater, right? It's getting everybody settled in.
01:30
So they're presenting this living art, as they call it, though we have no further details about it. It's just some sort of visualization on screen that Jeff and I have both called a wallpaper and a music visualizer and a screensaver. And a screensaver, yeah. I don't know if you noticed what exactly the data is that's being used to create the sculpture, but there is an individual on stage who sort of longingly and lovingly looks at the screen every now and then, and then taps on their tablet to, one would assume, make changes to what's happening on screen, as sort of a visualization DJ, if you will, which is going on my business cards. A prompter, yeah, exactly. But outside of that, I'm not sure really what it is, but we are looking forward to seeing what Nvidia has to announce on stage today as they talk about AI advances, because, frankly, Nvidia's hardware is kind of at the root of a lot of the generative AI pushes. Yeah, so it should be interesting to see.
02:43 - Jeff Jarvis (Host)
Yeah, what's interesting to me is that I'm trying to come up with an analogy: Nvidia makes the hardware. It's rather as if Samsung wanted to make a Hollywood presentation, or Kodak wanted to make a Hollywood presentation. Like, we're the hardware behind this creativity, but we're not the fun part, right? Or the really rich part, making tons of money on all this stuff, but I think they want to try to claim a piece of that. The other thing that interests me is that Jensen Huang, the CEO, said, I think two weeks ago at a discussion, that universities should stop training computer scientists. What? It's domain expertise that's going to be needed. The quote here is "everybody in the world is now a programmer. This is the miracle of artificial intelligence." So it'll be really interesting to hear where he goes strategically.
03:38 - Mikah Sargent (Host)
Okay, that's fascinating because that could really upset some people. Okay, it's about to begin. All right, sounds like we're about to kick off with the Nvidia GTC conference keynote March 18th in the San Jose Convention Center. Looks like we'll be starting with a movie.
04:07 - Jeff Jarvis (Host)
It's just a black screen right now?
04:08 - Mikah Sargent (Host)
Yeah, probably starting with a movie film. Don't worry, do not adjust your dials.
04:28 - Keynote (Announcement)
Space, the final frontier.
04:35 - Keynote (Announcement)
I am a visionary illuminating galaxies to witness the birth of stars, Sharpening our understanding of extreme weather events.
04:58 - Mikah Sargent (Host)
It looks like they're showing some of the ways that Nvidia technology has been used in different Science applications. A large contraption that helps people move or ambulate while having low or no vision. I am a transformer, harnessing gravity to store renewable power.
05:36 - Jeff Jarvis (Host)
My friend Bill Gross has a company that does just that, moving concrete to store electricity. Huh.
05:44 - Mikah Sargent (Host)
I'm not aware of a lot of that. How does that...
05:46 - Jeff Jarvis (Host)
work? That'd be good. That's cool. I am a trainer, teaching robots to assist us.
05:52 - Mikah Sargent (Host)
Oh dear. Sorry. All right, for those listening, there's a robot on the screen that is doing someone's physical therapy, and that's not frightening at all.
06:05 - Keynote (Announcement)
And help save lives. I am a healer providing a new generation of cures.
06:18 - Jeff Jarvis (Host)
Protein structures, which is the most impressive thing that AI has done so far. Absolutely.
06:23 - Keynote (Announcement)
Is it still okay to take the medications? Definitely, these antibiotics don't contain penicillin.
06:28 - Jeff Jarvis (Host)
I don't know that we need to talk to a fake human being AI. No, no, no, no, no, no, no, no, no. Practical applications are great, I agree. It's when they go overboard in what they think they can do that you get into danger.
06:53 - Mikah Sargent (Host)
Okay, showing self-driving. Oh, interesting: it helped to write the script. Yeah, helped to write the script, and also wrote the music.
07:12 - Jeff Jarvis (Host)
Original music performed by human beings. However.
07:14 - Mikah Sargent (Host)
Yes, it's composed by AI and performed by the London Symphony Orchestra. I wonder what they got paid for that, yeah. I also wonder if they had to make any changes to the music. "I am AI," says the AI. No, "we are"; yes, "we are AI."
07:39 - Keynote (Announcement)
Please welcome to the stage NVIDIA founder and CEO Jensen Huang.
07:46 - Mikah Sargent (Host)
All right. Jensen Huang takes the stage after showing where NVIDIA technology has been used in practical applications to kick off the keynote.
07:55 - Jeff Jarvis (Host)
Large stage, large audience, I should say.
07:59 - Jensen Huang (Other)
Welcome to GTC. I hope you realize this is not a concert. You have arrived at a developer's conference. There will be a lot of science described algorithms, computer architecture, mathematics, uh-oh.
08:26 - Mikah Sargent (Host)
But we don't need any of that because the computers do it for us. That's what you said, Jensen. Yeah.
08:31 - Jeff Jarvis (Host)
I got a.
08:34 - Jensen Huang (Other)
I sensed a very heavy weight in the room all of a sudden, almost like you were in the wrong place. No conference in the world. I don't know if we should start by insulting the audience Is there a greater assembly of researchers from such diverse fields of science, from climate tech, to radio sciences, trying to figure out how to use AI to robotically control MIMOs for next generation 6G radios, robotic self-driving cars, even artificial intelligence, even artificial intelligence.
09:13 - Jeff Jarvis (Host)
That was a laugh line.
09:14 - Jensen Huang (Other)
You guys didn't laugh. I noticed a sense of relief there all of a sudden. Also, this conference is represented by some amazing companies. This list, this is not the attendees. These are the presenters.
09:34 - Mikah Sargent (Host)
And what's amazing is this: it's a huge number of companies that are presenting. Amazon, Adobe...
09:42 - Jensen Huang (Other)
Now it's just called... Oracle, Pixar. In the IT industry.
09:49 - Mikah Sargent (Host)
Dell.
09:53 - Jensen Huang (Other)
All of the friends I grew up with in the industry. If you take away that list, this is what's amazing. These are the presenters of the non-IT industries using accelerated computing to solve problems that normal computers can't. Snapchat, maybe PayPal.
10:10 - Mikah Sargent (Host)
Merch Chat Interesting.
10:11 - Jeff Jarvis (Host)
Siemens, Lowe's.
10:13 - Jensen Huang (Other)
It's represented in life sciences, healthcare, genomics, transportation, of course, retail, logistics, manufacturing, industrial. The gamut of industries represented is truly amazing. You're not here to attend only, you're here to present, to talk about your research.
10:36 - Mikah Sargent (Host)
I am, who knows?
10:38 - Jensen Huang (Other)
One hundred trillion dollars of the world's industries is represented in this room today. This is absolutely amazing.
10:45 - Jeff Jarvis (Host)
I hope AI doesn't destroy us today. Not today. There goes the stock market.
10:51 - Jensen Huang (Other)
There is absolutely something happening. There is something going on. The industry is being transformed, not just ours, because the computer industry, the computer, is the single most important instrument of society today. Fundamental transformations in computing affects every industry. But how did we start? How did we get here? I made a little cartoon for you. Literally I drew this In one page. This is Nvidia's journey, started in 1993. This might be the rest of the talk.
11:29 - Mikah Sargent (Host)
It's sort of an illustration of different strange graphs. We were founded in 1993.
11:33 - Jensen Huang (Other)
There are several important events that happened along the way. I'll just highlight a few. In 2006, CUDA, which has turned out to have been a revolutionary computing model. We thought it was revolutionary then; it was going to be an overnight success, and almost 20 years later it happened. We saw it coming, two decades later. In 2012, AlexNet: AI and CUDA made first contact. CUDA, by the way, stands for Compute Unified Device Architecture.
12:13 - Jeff Jarvis (Host)
Thank you.
12:14 - Jensen Huang (Other)
Recognizing the importance of this computing model, we invented a brand new type of computer. We call it the DGX-1: 170 teraflops in this supercomputer, eight GPUs connected together. For the very first time, I hand-delivered the very first DGX-1 to a startup located in San Francisco called OpenAI.
12:43 - Jeff Jarvis (Host)
I want to know how they shifted from graphics to changing the world.
12:47 - Jensen Huang (Other)
DGX-1 was the world's first AI supercomputer. Remember, 170 teraflops. 2017: the transformer arrived. 2022: ChatGPT captured the world's imagination. Thank you. We realized the importance and the capabilities of artificial intelligence.
13:06 - Mikah Sargent (Host)
I was going to say, just slide right over that.
13:10 - Jensen Huang (Other)
Generative AI emerged and a new industry begins. Why is it a new industry? Because the software never existed before. We are now producing software, using computers to write software, producing software that never existed before. It is a brand new category. It took share from nothing. It's a brand new category. The way you produce the software is unlike anything we have ever done before.
13:45
In data centers, generating tokens, producing floating point numbers at very large scale. As if in the beginning of the last industrial revolution, when people realized that you would set up factories, apply energy to them, and this invisible, valuable thing called electricity came out: AC generators. A hundred years later, 200 years later, we are now creating new types of electrons, tokens, using infrastructure we call factories, AI factories.
14:26 - Mikah Sargent (Host)
New types of electrons. A new incredibly valuable thing called electricity. Is this the extended conceit that is going on here?
14:35 - Jensen Huang (Other)
We are going to talk about how we are going to do computing next.
14:42
We are going to talk about the type of software that you build because of this new industry, the new software, how you would think about this new software, what about applications in this new industry, and then, maybe, what's next, and how can we start preparing today for what is about to come next? Well, before I start, I want to show you the soul of NVIDIA, the soul of our company, at the intersection of computer graphics, physics and artificial intelligence, all intersecting inside a computer, in Omniverse, in a virtual world simulation. Literally, everything we are going to show you today is a simulation, not animation. It's only beautiful because it's physics. The world is beautiful. It's only amazing because it's being animated with robotics.
15:47 - Mikah Sargent (Host)
I am honestly having a little trouble following this thread.
15:50 - Jensen Huang (Other)
It's bouncing all over the place. It's completely generated, completely simulated in Omniverse, and all of it, what you are about to enjoy, is the world's first concert where everything is homemade.
16:05 - Mikah Sargent (Host)
What? Okay, so NVIDIA Omniverse is some sort of generative AI space.
16:11 - Jeff Jarvis (Host)
Everything is homemade. You are about to watch some home videos, so sit back and enjoy yourself.
16:18 - Mikah Sargent (Host)
Okay, so here we go with a demo: Adobe Substance 3D Painter, Blender and Maxon ZBrush.
16:25 - Jeff Jarvis (Host)
A charming cabin, now an overdone Greek statue. Now a watch in fine detail.
16:34 - Mikah Sargent (Host)
Looks like an ad for one. Now close-ups of ships and architecture.
16:43 - Jeff Jarvis (Host)
Now code, now animation of a drone, robots, all made up though.
16:57 - Mikah Sargent (Host)
Yeah, lots of simulations, physics simulations and 3D modeling.
17:07 - Keynote (Announcement)
Fire.
17:13 - Mikah Sargent (Host)
I have to say that fire looks more real than the fire I see in a lot of things. Each of these videos that they're showing includes a little bit of text about what tools are being used to produce them. Some fabric, again code, and, together with robotics, 3D printing, robotic arms, and Amazon robotics for doing shipping.
17:43 - Jeff Jarvis (Host)
It's basically NVIDIA inside.
17:45 - Mikah Sargent (Host)
There you go.
17:46 - Jeff Jarvis (Host)
yes, we're in all of it. Now a warehouse. Hello, Amazon.
18:00 - Mikah Sargent (Host)
Build a Sim Ready warehouse. Yeah, prompt to create a warehouse with issues, which is good. It's got some simulations going.
18:09 - Jeff Jarvis (Host)
They said it didn't pull on some on a fake human beam.
18:14 - Mikah Sargent (Host)
Ah, building the next mega ship to get stuck in the canal.
18:23 - Jeff Jarvis (Host)
Now humans talking almost human-like.
18:28 - Mikah Sargent (Host)
Oh my, the factory is going mad. The factory just absolutely falls apart in every way, to see how they handle that, it seems. Driving, self-driving.
18:42 - Jeff Jarvis (Host)
It doesn't matter because it's only in the omniverse, exactly.
18:45 - Mikah Sargent (Host)
It's all handmade or homemade, I mean, that's the word they used. A simulation of Earth called Earth-2. Okay, that's where I sign off. No, thank you.
19:03 - Jensen Huang (Other)
This is getting into 42 territory. Accelerated computing has reached the tipping point.
19:14 - Jeff Jarvis (Host)
That's another new phrase General purpose computing has run out of steam.
19:18 - Jensen Huang (Other)
We need another way of doing computing so that we can continue to scale, so that we can continue to drive down the cost of computing, so that we can continue to consume more and more computing while being sustainable, accelerating computing is a dramatic speed up over general purpose computing and in every single industry we engage and I'll show you many the impact is dramatic.
19:45
But in no industry is it more important than our own, the industry of using simulation tools to create products. In this industry it is not about driving down the cost of computing, it's about driving up the scale of computing.
20:03 - Mikah Sargent (Host)
They're worried that Moore's law is starting to fail and that we won't be able to keep making things smaller and smaller. So we've got to figure out a new way to bring up computing.
20:13 - Jensen Huang (Other)
What we call digital twins. We would like to design it, build it, simulate it, operate it completely digitally, bop it, twist it. In order to do that, we need to accelerate an entire industry, and today I would like to announce that we have some partners who are joining us in this journey to accelerate their entire ecosystem so that we can bring the world into accelerated computing.
20:42 - Jeff Jarvis (Host)
But there's a bonus.
20:44 - Jensen Huang (Other)
When you become accelerated, your infrastructure is generated. When you take this pill, it's exactly the same infrastructure for generative AI. And so I'm just delighted to announce several very important partnerships. These are some of the most important companies in the world. Ansys does engineering simulation for what the world makes. We're partnering with them to CUDA-accelerate the Ansys ecosystem, to connect Ansys to the Omniverse digital twin. Incredible.
21:17 - Mikah Sargent (Host)
This is interesting because Microsoft had this concept of digital twins before Nvidia was doing it, and it was also for simulation reasons. I don't know that.
21:29 - Jensen Huang (Other)
We'll have a giant installed base to go serve. End users will have amazing applications and, of course, system makers and CSPs will have great customer demand. Synopsys. Synopsys is NVIDIA's literally first software partner. They were there on the very first day of our company. Synopsys revolutionized the chip industry with high-level design. We are going to CUDA-accelerate Synopsys. We're accelerating computational lithography, one of the most important applications that nobody's ever known about. In order to make chips, we have to push lithography to its limit. NVIDIA has created a domain-specific library that accelerates computational lithography.
22:16 - Jeff Jarvis (Host)
As a Gutenberg geek, I like that. Lithography is a pretty technology.
22:20 - Jensen Huang (Other)
Software-defined, all of TSMC, who is announcing today that they're going to go into production with NVIDIA cuLitho once it's software-defined and accelerated. A lot of terms are being thrown around. The next step is to apply generative AI to the future of semiconductor manufacturing, pushing geometry even further. Cool. Cadence builds the world's essential EDA and SDA tools.
22:42 - Mikah Sargent (Host)
Go back to that screen that had actual information on it. Between these two? Exactly.
22:46 - Jensen Huang (Other)
Synopsys and Cadence, we basically build NVIDIA together. We are CUDA-accelerating Cadence. They're also building a supercomputer out of NVIDIA GPUs so that their customers could do fluid dynamics simulation at a hundred, a thousand times the scale. Basically, a wind tunnel in real time. Cadence Millennium, a supercomputer with NVIDIA GPUs inside. A software company building supercomputers, I love seeing that. We're building Cadence co-pilots together. There will even be a day when Cadence, Synopsys, Ansys tool providers would offer you AI co-pilots, so that we have thousands and thousands of co-pilot assistants helping us design chips, design systems.
23:35 - Mikah Sargent (Host)
Interesting choice to use that same word that Microsoft is using for its branding across everything.
23:39 - Jensen Huang (Other)
As you can see the trend here, we're accelerating the world's CAE, EDA and SDA so that we could create our future in digital twins, and we're going to connect them all to Omniverse, the fundamental operating system for future digital twins.
23:55 - Mikah Sargent (Host)
I would love to know what any of this actually means. What does this look like in practice?
23:59 - Jensen Huang (Other)
One of the industries that's been in this scale, and you all know this one very well large language models.
24:05
Basically, after the transformer was invented, we were able to scale large language models at incredible rates, effectively doubling every six months. Now, how is it possible that by doubling every six months we have grown the industry, we have grown the computational requirements so far? The reason for that is quite simply this: if you double the size of the model, you double the size of your brain, and you need twice as much information to go fill it. Every time you double your parameter count, you also have to appropriately increase your training token count. It's not like a shelf to fill, right?
24:44
So one of those two numbers becomes the computation scale you have to support. The latest, state-of-the-art OpenAI model is approximately 1.8 trillion parameters. 1.8 trillion parameters required several trillion tokens to go train. So a few trillion parameters, on the order of a few trillion tokens; when you multiply the two of them together, approximately 30, 40, 50 billion quadrillion floating point operations.
25:20 - Mikah Sargent (Host)
Here's the math that he promised.
25:21 - Jensen Huang (Other)
Now we just have to do some CEO math right now. Just hang with me. So you have 30 billion quadrillion. A quadrillion is like a peta, and so if you had a petaflop GPU, you would need 30 billion seconds to go compute, to go train that model. 30 billion seconds is approximately 1,000 years. Oh my God. Well, 1,000 years. It's worth it. I'd like to do it sooner, but it's worth it. This is usually my answer when most people come in.
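To make the arithmetic Huang is gesturing at concrete, here is a rough sketch in Python. It uses the common rule of thumb of roughly 6 floating point operations per parameter per training token; that factor of six is our assumption, not something stated in the keynote, but it lands in the same ballpark as his "30, 40, 50 billion quadrillion" figure and the roughly 1,000-year single-GPU estimate.

# Rough training-compute sketch for the numbers quoted on stage.
# Assumption: ~6 FLOPs per parameter per training token (a common
# heuristic); the keynote does not state the formula it is using.

params = 1.8e12    # ~1.8 trillion parameters
tokens = 3e12      # "several trillion" training tokens (assumed 3T here)

total_flops = 6 * params * tokens            # ~3.2e25 FLOPs
print(f"total training FLOPs: {total_flops:.1e}")

gpu_flops_per_second = 1e15                  # a one-petaflop GPU, as in the talk
seconds = total_flops / gpu_flops_per_second
years = seconds / (3600 * 24 * 365)
print(f"{seconds:.1e} seconds, roughly {years:,.0f} years on one petaflop GPU")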
25:59 - Jeff Jarvis (Host)
This is why the stochastic parrots paper says the mistake is making ever bigger models. What's this...
26:07 - Jensen Huang (Other)
kind of macho "size matters"? What's the task? And so, 1,000 years, 1,000 years. So what we need, what we need are bigger GPUs. Of course you do, because you make them.
26:23 - Mikah Sargent (Host)
Exactly what can we do to solve this problem?
26:26 - Jensen Huang (Other)
And we realized that the answer is to put a whole bunch of GPUs together and, of course, innovate a whole bunch of things along the way, like inventing Tensor Cores, inventing NVLink so that we could create essentially virtually giant GPUs, and connecting them all together with amazing networks from a company called Mellanox, InfiniBand, so that we could create these giant systems. And so DGX-1 was our first version, but it wasn't the last. We built supercomputers all along the way. In 2021, we had Selene, 4,500 GPUs or so, and then in 2023 we built one of the largest AI supercomputers in the world. It's just come online: EOS. And as we're building these things, we're trying to help the world build these things, and in order to help the world build these things, we've got to build them first. We build the chips, the systems, the networking, all of the software necessary to do this. Building these systems means writing a piece of software that runs across the entire system, distributing the computation across thousands of GPUs, but inside are thousands of smaller GPUs.
27:40 - Mikah Sargent (Host)
Millions of CPUs, millions of CPUs.
27:42 - Jensen Huang (Other)
Distribute work across all of that.
27:43 - Mikah Sargent (Host)
GPUs all the way down? Yeah, exactly.
27:46 - Jensen Huang (Other)
So that you can get the most energy efficiency, the best computation time, keep your cost down. And those fundamental innovations are what got us here. And here we are as we see the miracle of ChatGPT emerging in front of us.
28:05 - Mikah Sargent (Host)
It's interesting to hear them say anything about the environment. We have a long ways to go.
28:10 - Jensen Huang (Other)
We need even larger models. We're going to train it with multi-modality data, not just text on the internet, but we're going to train it on text and images and graphs and charts and just as we learn watching TV, and so there's going to be a whole bunch of watching video so that these models can be grounded in physics. Understands that an arm doesn't go through.
28:33 - Jeff Jarvis (Host)
So far, they're not doing a very good job of that.
28:35 - Jensen Huang (Other)
Models would have common sense, by watching a lot of the world's video combined with a lot of the world's languages. We drink hemlock when he says AGI. And we're going to generate it, just as you and I do when we try to learn: we might use our imagination to simulate how it's going to end up, just as I did when I was preparing for this keynote. I was simulating it all along the way.
28:58 - Mikah Sargent (Host)
So he's saying he's an AI. "I hope it's going to turn out as well as I had it in my head." No, but that is an interesting thing to say, that me thinking about how something is going to go is the same way that an AI works, when it's not. Not really the case.
29:19 - Jensen Huang (Other)
Did her performance completely on a treadmill so that she could be in shape to deliver it with full energy. I didn't do that. If I get a little winded about 10 minutes into this, you know what happened. And so, where were we? We're sitting here using synthetic data generation. We're going to use reinforcement learning. We're going to practice it in our mind. We're going to have AI working with AI, training each other, just like student, teacher, debaters. All of that is going to increase the size of our model. It's going to increase the amount of data that we have, and we're going to have to build even bigger GPUs. Hopper is fantastic. Ever bigger GPUs.
30:04
But we need bigger GPUs. Somewhere, timnit Gebru is screaming. I would like to introduce you to our bigger GPU to a very, very big GPU.
30:21 - Mikah Sargent (Host)
Yeah, that's the clap for the thing that's going to never mind.
30:29 - Jensen Huang (Other)
Named after David Blackwell: mathematician, game theorist, probability. We thought it was a perfect, perfect, perfect name. Blackwell. Ladies and gentlemen, enjoy this.
30:50 - Mikah Sargent (Host)
The largest chip physically possible. Let's join two of them together: 208 billion transistors. Twice the size, a massive leap in compute. 20 petaflops of AI performance. Oh my Lord. Let's put two of them side by side.
31:17 - Jeff Jarvis (Host)
40 petaflops. Trillion-parameter scale. 40 petaflops, 80 petaflops of AI performance.
31:39 - Mikah Sargent (Host)
Oh, and then they just keep going. They're dropping it into the actual machine, an infrastructure processor for the network, and then you get a bunch of them stacked, stack upon stack, in a computer. Do they think we read faster than we do?
32:15 - Jeff Jarvis (Host)
We're not. Yeah, it's hard to take in all of this. Exactly. 72 Blackwell GPUs connected in one rack, 1.4 exaflops of AI performance.
32:28 - Mikah Sargent (Host)
Quantum InfiniBand switch, again for networking stuff. Oh my God, folks who are listening, they just keep making it bigger.
32:42 - Jeff Jarvis (Host)
Yeah, people get weirded out by robots. This is weird beyond that. This is weird to me.
32:46 - Mikah Sargent (Host)
Yeah, this is a little oh, and then, of course, like cooling and energy efficiency, yeah, well, you know we're going to talk more about that, oh, and then you stack a bunch of them next to each other.
32:58 - Jeff Jarvis (Host)
Oh God, and they're going to be on Mars before you know it. Now it's like an infinite view of these huge machines.
33:05 - Mikah Sargent (Host)
You see, that's not... A full data center has 645 exaflops of AI performance. I don't even know. I can't even conceptualize 645 exaflops.
33:23 - Jensen Huang (Other)
Blackwell is not a chip. Blackwell is the name of a platform. People think we make GPUs, you do and we do. Okay. Gpus don't look the way they used to.
33:36
Here's the, here's the, if you will, the heart of the Blackwell system. And this, inside the company, is not called Blackwell; it's just a number. And this is Blackwell sitting next to... oh, this is the most advanced GPU in the world in production today. Close up: this is a Hopper, this is a Hopper. Hopper changed the world. Good name. Yeah, this is Blackwell.
34:08 - Mikah Sargent (Host)
It's bigger, just so everybody understands. It's bigger than Hopper.
34:17 - Jeff Jarvis (Host)
People are applauding a square inch.
34:20 - Mikah Sargent (Host)
It's so funny. Yeah, we see. Thank you.
34:25 - Jensen Huang (Other)
You're very good.
34:27 - Jeff Jarvis (Host)
Now he's being kind of condescending to Grace Hopper and she's, she's yesterday's news. Oh, he did say good girl.
34:33 - Mikah Sargent (Host)
Did he say that?
34:35 - Jensen Huang (Other)
208 billion transistors, and so you could see, I can see there's a small line between two dies. This is the first time two dies have been joined in such a way that the two dies think it's one chip. Your face says it all. There's 10 terabytes per second of data between them.
34:55 - Mikah Sargent (Host)
Don't read the world outside that room. If you'd only simulated it in that digital twin, you would have known not to say that. There are no memory locality issues, no cache issues.
35:06 - Jensen Huang (Other)
It's just one giant chip. And so when we were told that Blackwell's ambitions were beyond the limits of physics, the engineers said, so what?
35:16 - Mikah Sargent (Host)
I don't know because that's reminiscent of what Apple does with Apple silicon.
35:20 - Mikah Sargent (Host)
And so this is the Blackwell chip, and it goes into two types of systems.
35:24 - Jensen Huang (Other)
They think it's just one. The first one is form-fit-function compatible to Hopper. Physics said it couldn't be done. You slide out Hopper and you push in Blackwell. That's the reason why one of the challenges of ramping is going to be so efficient. There are installations of Hoppers all over the world and they could be...
35:42
They could be... okay, so they made this swappable with Hoppers. The power to buy more Blackwells, and the software identical: push it right back in. And so this is a Hopper version for the current HGX configuration, and this is what the second one looks like. Now, this is a prototype board, and...
36:07 - Mikah Sargent (Host)
Can I just borrow Hopper? That looks like it has Blackwell on it.
36:11 - Jeff Jarvis (Host)
Yeah.
36:17 - Jensen Huang (Other)
And so this is the. This is a fully functioning board, and I'll just be careful here.
36:23 - Mikah Sargent (Host)
Yeah, they've got to sell that. Right here is...
36:26 - Jensen Huang (Other)
I don't know, $10 billion.
36:31 - Mikah Sargent (Host)
It's got a face on it.
36:34 - Jensen Huang (Other)
Yeah, the second one's five. It gets cheaper after that, so any customers in the audience, I don't understand these jokes.
36:44 - Mikah Sargent (Host)
How about that? What do you mean, it's $10 billion, the first one? This one's quite expensive.
36:52 - Jensen Huang (Other)
And the way it's going to go to production is like this one here. Okay, and so you're going to take this. It has two Blackwell chips and four Blackwell dies connected to a Grace CPU. The Grace CPU has a super fast chip-to-chip link. So Grace Hopper still gets to stay. This computer is the first of its kind.
37:13
Yeah, she's just serving Blackwell now. This computation, first of all, fits into this small of a place. Second, it's memory coherent. They feel like they're just one big happy family working on one application together, and so everything is coherent within it. Just the amount of... you know, you saw the numbers.
37:35 - Mikah Sargent (Host)
Yeah, I do wonder why they're placed at a diagonal to each other. Yeah, I wonder too.
37:40 - Jensen Huang (Other)
This is a miracle. This is a reason... let's see, what are some of the things on here? There's an NVLink on top, PCI Express on the bottom. I'm going to get one of those for my personal computer. The big deal is... my and your left, one of them, it doesn't matter. One of them is a CPU chip-to-chip link. It's my left or yours, depending on which side. Okay, just keep going, just sort that out. It's okay, just keep going. It doesn't matter.
38:17 - Jeff Jarvis (Host)
Hopefully it comes like this.
38:18 - Mikah Sargent (Host)
He's occasionally awkward in a way that's kind of almost endearing. It does wrap around to endearing, you're right.
38:24 - Jensen Huang (Other)
Okay, so this is the Grace Blackwell system, but there's more.
38:38 - Mikah Sargent (Host)
He pulled something out of his pocket.
38:40 - Jensen Huang (Other)
So it turns out, it turns out... All of this in your phone? All of the specs are fantastic, but we need a whole lot of new features in order to push beyond, if you will, the limits of physics. We would like to always get a lot more X factors, and so one of the things that we did was we invented another transformer engine, another transformer engine, the second generation. It has the ability to dynamically and automatically rescale and recast numerical formats to a lower precision whenever it can. Remember, artificial intelligence is about probability, and so you kind of have approximately 1.7 times approximately 1.4 to be approximately something else. Does that make sense? Yes. And so the ability for the mathematics to retain the precision and the range necessary in that particular stage of the pipeline is super important.
39:40
And so this is not just about the fact that we designed a smaller ALU. It's not quite that simple; the world is not quite that simple. You've got to figure out when you can use that across a computation that spans thousands of GPUs, that's running for weeks and weeks on end, and you want to make sure that the training job is going to converge. And so, this new transformer engine. We have a fifth-generation NVLink. It's now twice as fast as Hopper, but, very importantly, it has computation in the network. And the reason for that is because, when you have so many different GPUs working together, we have to share our information with each other, we have to synchronize and update each other, and every so often we have to reduce the partial products and then rebroadcast the partial products, some of the partial products, back to everybody else. And so there's a lot of what is called all-reduce and all-to-all and all-gather.
40:38
It's all part of this scenario of synchronization and collectives, so that we can have GPUs working with each other. Having extraordinarily fast links and being able to do mathematics right in the network allows us to essentially amplify even further. So, even though it's 1.8 terabytes per second, it's effectively higher than that, and so it's many times that of Hopper. The likelihood of a supercomputer running for weeks on end is approximately zero, and the reason for that is because there are so many components working at the same time. The probability of them working continuously is very low, and so we need to checkpoint and restart as often as we can.
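For listeners unfamiliar with the collectives being named here, a toy Python illustration of what an all-reduce accomplishes: every GPU starts with its own partial results and ends up holding the sum of everyone's. Real systems (NCCL over NVLink or InfiniBand) use ring or tree algorithms, and the Blackwell switch can do the reduction in the network itself; this naive version only shows the semantics.

# Toy all-reduce: each worker holds partial products (e.g. partial gradients);
# afterwards every worker holds the element-wise sum of all of them.
# Naive reduce-then-broadcast; real collectives use ring/tree algorithms.

partials = [
    [0.1, 0.2, 0.3],   # GPU 0's partial results
    [0.4, 0.5, 0.6],   # GPU 1's
    [0.7, 0.8, 0.9],   # GPU 2's
]

reduced = [round(sum(column), 6) for column in zip(*partials)]  # the "reduce"
all_reduced = [list(reduced) for _ in partials]                 # the "broadcast back"

for rank, buffer in enumerate(all_reduced):
    print(f"GPU {rank} now holds {buffer}")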
41:23 - Jeff Jarvis (Host)
At least he's not making Boeing jets. Amen. But he is making things inside them.
41:26 - Mikah Sargent (Host)
Yeah how to do them going forward, how to build them going forward.
41:30 - Jensen Huang (Other)
So if we have a weak chip or a weak node, we detect it early and we can retire it and maybe swap in another processor. That ability to keep the utilization of the supercomputer high, especially when you just spent $2 billion building it, is super important. So we put in a RAS engine, a reliability engine, that does 100% self-test, in-system test, of every single gate, every single bit of memory on the Blackwell chip and all the memory that's connected to it. It's almost as if we shipped with every single chip its own advanced tester that we test our chips with. This is the first time we're doing this. Super excited about it.
42:20 - Jeff Jarvis (Host)
Secure AI. You're right, that board does look like a face. For those of you listening, it has two golden chips on top.
42:29 - Jensen Huang (Other)
Only at this conference do they clap for that. And a multi-colored nose.
42:32 - Mikah Sargent (Host)
And then a flat smiley face. Yeah, I guess not smiley, but yeah, face.
42:36 - Jensen Huang (Other)
Obviously, you've just spent hundreds of millions of dollars creating a very important AI, and the code, the intelligence of that AI is encoded in the parameters. You want to make sure that, on the one hand, you don't lose it, on the other hand, it doesn't get contaminated, and so we now have the ability to encrypt data, of course, at rest, but also in transit.
42:59 - Mikah Sargent (Host)
Okay, so these are purpose-built engines for creating AI.
43:04 - Jensen Huang (Other)
All encrypted, and so we now have the ability to encrypt in transmission, not just running but creating, and it's a trusted environment, trusted engine environment.
43:16
And the last thing is decompression. Moving data in and out of these nodes when the compute is so fast, becomes really essential, and so we've put in a high line speed compression engine and effectively moves data 20 times faster in and out of these computers. These computers are so powerful and there's such a large investment, the last thing we want to do is have them be idle, and so all of these capabilities are intended to keep Blackwell fed and as busy as possible. Overall, compared to Hopper, it is two and a half times the FP8 performance for training per chip.
44:05 - Mikah Sargent (Host)
It also has this new format called FP6, for the AI hardware engineers, I guess. Even though the computation speed is the same...
44:12 - Jensen Huang (Other)
The bandwidth is amplified because of the memory. The amount of parameters you can store in the memory is now amplified. FP4 effectively doubles the throughput. This is vitally important for inference. One of the things that is becoming very clear is that whenever you use a computer with AI on the other side, when you're chatting with the chatbot, when you're asking it to review or make an image, remember, in the back is a GPU generating tokens. Some people call it inference, but it's more appropriately generation.
44:54
The way that computing has done in the past was retrieval. You would grab your phone, you would touch something, some signals go off. Basically, an email goes off to some storage. Somewhere there's pre-recorded content. Somebody wrote a story or somebody made an image or somebody recorded a video. That pre-recorded content is then streamed back to the phone and recomposed in a way based on a recommender system to present the information to you. You know that in the future, the vast majority of that content will not be retrieved. The reason for that is because that was pre-recorded by somebody who doesn't understand the context, which is the reason why we have to retrieve so much content.
45:38
If you could be working with an AI that understands the context, who you are, for what reason you're fetching this information, and produces the information for you just the way you like it. If only all of these knew what the F a fact was.
45:55 - Mikah Sargent (Host)
To suggest that it's always newly generated is kind of false because it's partial retrieval. You are retrieving information if you're asking.
46:05 - Jensen Huang (Other)
It's modified retrieval, right? It's molded.
46:11 - Jeff Jarvis (Host)
It's like Burger King: have it your way. You already have the burger. You want mayo?
46:15 - Jensen Huang (Other)
on that.
46:18 - Mikah Sargent (Host)
Blackwell brings the mayo. There's the new tag.
46:21 - Jensen Huang (Other)
This format is FP4. Well, that's a lot of computation. 5x the token generation, 5x the inference capability of Hopper. Seems like enough.
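A toy sketch of the "rescale and recast to a lower precision" idea behind these FP8/FP6/FP4-style formats, as described a few minutes earlier. This is a simplified, per-tensor integer-scaled stand-in of our own, not NVIDIA's transformer-engine math: values are scaled so the few available bits cover the observed range, stored small, then scaled back up, trading precision for memory and throughput.

# Simplified stand-in for recasting a tensor to a low-precision format.
# Real FP8/FP4 keep an exponent per value; here we just use a per-tensor
# scale and a tiny signed-integer range to show the range/precision trade-off.

def recast_low_precision(values, bits=4):
    qmax = 2 ** (bits - 1) - 1                        # e.g. 7 for 4-bit signed
    scale = (max(abs(v) for v in values) / qmax) or 1.0
    codes = [round(v / scale) for v in values]        # what actually gets stored
    recovered = [c * scale for c in codes]            # cast back up for compute
    return codes, recovered, scale

values = [0.02, -1.7, 0.9, 1.4]
codes, recovered, scale = recast_low_precision(values, bits=4)
print("scale:", round(scale, 4))
print("stored 4-bit codes:", codes)
print("recovered values:  ", [round(v, 3) for v in recovered])   # close, not exact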
46:39 - Mikah Sargent (Host)
But wait, there's more, but why stop?
46:42 - Jensen Huang (Other)
there Call today.
46:45 - Keynote (Announcement)
It's not enough.
46:48 - Jensen Huang (Other)
I'm going to show you why.
46:49 - Mikah Sargent (Host)
Is it math?
46:51 - Jensen Huang (Other)
We would like to have a bigger GPU, even bigger than this one, bigger than Blackwell. No, we decided to scale it. But first let me just tell you how we've scaled. Over the course of the last eight years, we've increased computation by 1,000 times. Eight years, 1,000 times. Remember back in the good old days of Moore's law? It was 2x, well, 10x every five years. That's the easiest math: 10x every five years, 100 times every 10 years, in the middle of the heyday of the PC revolution. 100 times every 10 years. In the last eight years, we've gone 1,000 times. We have two more years to go, and so that puts it in perspective.
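As a quick sanity check on the comparison being drawn, a back-of-the-envelope sketch using his own round numbers: the historical 10x-every-five-years pace compounds to roughly 40x over eight years, against the claimed 1,000x.

# Back-of-the-envelope comparison of the scaling rates quoted on stage.

moores_pace = 10 ** (8 / 5)   # 10x every 5 years, compounded over 8 years
claimed = 1000                # the keynote's figure for the last 8 years

print(f"Historical pace over 8 years: ~{moores_pace:.0f}x")
print(f"Claimed GPU-computing growth: {claimed}x")
print(f"That is ~{claimed / moores_pace:.0f}x faster than the historical pace")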
47:44 - Jeff Jarvis (Host)
So Moore and Hopper are both outdone and replaced. The rate at which we're advancing computing is insane.
47:50 - Jensen Huang (Other)
And it's still not fast enough. So we built another chip, this chip the size of a dinner plate, it's just an incredible chip. We call it the NVLink switch.
48:01 - Mikah Sargent (Host)
Oh, come on, not a good name.
48:03 - Jensen Huang (Other)
Come on, give it a name. It's almost the size of Hopper all by itself. This switch chip has four NVLinks in it, each 1.8 terabytes per second, and it has 50 billion transistors. What is this chip for?
48:21 - Mikah Sargent (Host)
And that's what I wonder.
48:22 - Jensen Huang (Other)
If we were to build such a chip, we can have every single GPU, talk to every other GPU at full speed at the same time. That's insane. It doesn't even make sense. But if you could do that, if you can find a way to do that and build a system to do that, that's cost-effective. That's cost-effective.
48:54 - Mikah Sargent (Host)
And good for the environment right.
48:55 - Jensen Huang (Other)
How incredible would it be if we could have all these GPUs connect over a coherent link so that they effectively are one giant GPU. Well, one of the great inventions in order to make it cost-effective is that this chip has to drive copper directly. The SerDes of this chip is just a phenomenal invention, so that we could do direct drive to copper and, as a result, you can build a system that looks like this. It looks like a black box.
49:33 - Jeff Jarvis (Host)
It's 720 petaflops.
49:35 - Jensen Huang (Other)
Now this system, this system: 1.44 exaflops of inference. This is one DGX. This is what a DGX looks like now. Remember, just six years ago it was pretty heavy, but I was able to lift it.
49:54 - Mikah Sargent (Host)
Now I can't.
49:55 - Jensen Huang (Other)
I delivered the first DGX-1 to OpenAI and the researchers there. The pictures are on the internet and we all autographed it, and if you come to my office, it's autographed there. It's really beautiful. But you could lift it. This DGX, this DGX... that DGX, by the way, was 170 teraflops. If you're not familiar with the numbering system, that's 0.17 petaflops, so this is 720.
50:31 - Mikah Sargent (Host)
Wow, that's the first one I delivered to open.
50:33 - Jensen Huang (Other)
AI was 0.17. You could round it up to 0.2; it won't make any difference. And back then it was like, wow, you know, 30 more teraflops. And so this is now 720 petaflops, almost an exaflop for training, and the world's first one-exaflop machine in one rack.
50:55 - Mikah Sargent (Host)
That is pretty impressive. Mm-hmm.
51:02 - Jensen Huang (Other)
Just so you know, there are only a couple of, two, three exaflops machines on the planet as we speak, and so this is an exaflops AI system in one single rack. Well, let's take a look at the back of it. So this is what makes it possible.
51:22 - Jeff Jarvis (Host)
It looks like a cable box.
51:24 - Jensen Huang (Other)
That's the back.
51:24 - Mikah Sargent (Host)
That's the back. That's what the back of the DGX looks like. Oh wow, they have these things coming up out of the floor.
51:29 - Jensen Huang (Other)
That's kind of cool. 130 terabytes per second goes through the back of that chassis. That is more than the aggregate bandwidth of the internet.
51:37 - Mikah Sargent (Host)
For those who like how people organize cables, it's a cool look at the cable organization on the back of the rack.
51:45 - Jeff Jarvis (Host)
Feels kind of old fashioned to still have cables.
51:47 - Jensen Huang (Other)
So we could basically send everything to everybody within a second, and so we have 5,000 cables, 5,000 NVLink cables, in total two miles.
51:59
Two miles of cables. Now, this is the amazing thing: if we had to use optics, we would have had to use transceivers and retimers, and those transceivers and retimers alone would have cost 20,000 watts, 20 kilowatts, of just transceivers alone, just to drive the NVLink spine. As a result, we did it completely for free over the NVLink switch, and we were able to save the 20 kilowatts for computation. This entire rack is 140 kilowatts.
52:28 - Mikah Sargent (Host)
They're saying save it in total. But no, they're like no, we put it through the computation, it's the future.
52:32 - Jensen Huang (Other)
It's liquid cooled. What goes in is 25 degrees C about room temperature, what comes out is 45 degrees C about your jacuzzi. So room temperature goes in, jacuzzi comes out two liters per second.
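The cooling figures are enough for a rough heat-balance check (our own sketch, assuming the coolant behaves roughly like water; the keynote does not do this math): 2 liters per second warming from 25 to 45 degrees C carries away on the order of 170 kW, the right order of magnitude for a rack in this class.

# Rough heat-balance check on the liquid-cooling figures quoted on stage.
# Assumes a water-like coolant; the keynote does not specify the fluid.

flow_liters_per_s = 2.0
inlet_c, outlet_c = 25.0, 45.0
specific_heat = 4186.0     # J per kg per kelvin, water
density = 1.0              # kg per liter, water

heat_watts = flow_liters_per_s * density * specific_heat * (outlet_c - inlet_c)
print(f"Heat carried away: ~{heat_watts / 1000:.0f} kW")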
52:49 - Mikah Sargent (Host)
So I can heat my jacuzzi with it. Is that what you're saying? Because that's not a bad idea.
52:55 - Jensen Huang (Other)
We could sell a peripheral. That's what I was, yeah.
52:58 - Jeff Jarvis (Host)
Oh, you can heat San Francisco Bay with that. 600,000 parts.
53:06 - Jensen Huang (Other)
Somebody used to say, you know, you guys make GPUs, and we do. But this is what a GPU looks like to me. When somebody says GPU, I see this. Two years ago, when I saw a GPU, it was the HGX: it was 70 pounds, 35,000 parts. Our GPUs now are 600,000 parts and 3,000 pounds. 3,000 pounds, 3,000 pounds. It's kind of like the weight of a carbon fiber Ferrari.
53:39 - Jeff Jarvis (Host)
I don't know if that's a useful metric, but yeah. I don't go lifting Ferraris, I feel it.
53:45 - Jensen Huang (Other)
I feel it. I get it. I get that, now that you mentioned that, I feel it. I don't know what's 3,000 pounds, okay, so 3,000 pounds, ton and a half, so it's not quite an elephant. So this is what a DGX looks like. Now let's see what it looks like in operation. Okay, let's imagine how do we put this to work and what does that mean?
54:08 - Mikah Sargent (Host)
Two dairy cows' worth of weight. A 1.8 trillion parameter model.
54:13 - Jeff Jarvis (Host)
We understand.
54:15 - Jensen Huang (Other)
It took apparently about three to five months or so with 25,000.
54:21 - Mikah Sargent (Host)
I don't know his current weight, I'm not.
54:25 - Jensen Huang (Other)
And it would consume 15 megawatts, 8,000 GPUs.
54:28 - Mikah Sargent (Host)
It does say it's 15 times the average mass of an adult male in.
54:32 - Jensen Huang (Other)
America. Oh, okay, there we go: 15 Leos. 15 Leos. A groundbreaking AI model, and this is obviously not as expensive as anybody would think, but it's 8,000 GPUs. It's still a lot of money. And so, 8,000 GPUs, 15 megawatts. If you were to use Blackwell to do this, it would only take 2,000 GPUs. 2,000 GPUs, same 90 days, but this is the amazing part: only four megawatts of power. So from 15... that's right.
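Putting his Hopper-versus-Blackwell training numbers side by side (a simple sketch; the GPU counts, power figures, and 90-day window are all as quoted on stage):

# Energy comparison for the training scenario quoted in the keynote:
# the same 90-day job on Hopper vs. Blackwell, using the stated figures.

days = 90
hours = days * 24

systems = {
    "Hopper":    {"gpus": 8000, "megawatts": 15},
    "Blackwell": {"gpus": 2000, "megawatts": 4},
}

for name, cfg in systems.items():
    energy_mwh = cfg["megawatts"] * hours
    print(f"{name}: {cfg['gpus']} GPUs at {cfg['megawatts']} MW "
          f"for {days} days = {energy_mwh:,.0f} MWh")
# Hopper ~32,400 MWh vs. Blackwell ~8,640 MWh: roughly a quarter of the energy.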
55:05 - Mikah Sargent (Host)
Okay, that's good to see a quarter of the power Okay.
55:08 - Jensen Huang (Other)
But they're making them bigger and bigger and bigger. And that's our goal. Our goal is to continuously drive down the cost and the energy (they're directly proportional to each other), the cost and energy associated with the computing, so that we can continue to expand and scale up the computation that we have to do to train the next generation, so we can keep using the same amount of power. Inference, or generation, is vitally important going forward. Probably some half of the time that NVIDIA GPUs are in the cloud these days, they're being used for token generation. They're either doing Copilot this or ChatGPT that, or all these different models that are being used when you're interacting with them, or generating images, or generating videos, generating proteins, generating chemicals. There's a bunch of generation going on.
55:58
All of that is in the category of computing we call inference. But inference is extremely hard for large language models because these large language models have several properties. One they're very large, and so it doesn't fit on one GPU. This is imagine Excel doesn't fit on one GPU. Imagine some application you're running on a daily basis doesn't fit on one computer.
56:22 - Jeff Jarvis (Host)
That's like my old Osborne. I had to take the disk in and out.
56:26 - Jensen Huang (Other)
Do, and many times in the past, WordStar would not fit in memory.
56:32 - Jeff Jarvis (Host)
And had to keep going back to the disk.
56:34 - Jensen Huang (Other)
All of a sudden this one inference application, or you're?
56:37 - Mikah Sargent (Host)
interacting with this. I remember doing that with a game called Crayola Rock. I had to switch CDs.
56:44 - Jensen Huang (Other)
And that's the future. The future is generative with these chatbots, and these chatbots are trillions of tokens, trillions of parameters, and they have to generate tokens at interactive rates. Now, what does that mean? Oh well, three tokens is about a word.
57:04 - Mikah Sargent (Host)
Wow, three tokens per word.
57:08 - Jensen Huang (Other)
You know, space, the final frontier, these are the adventures. That's like 80 tokens. Okay, I don't know if that's useful to you.
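For anyone who wants to sanity-check tokens-per-word figures themselves, a small sketch using OpenAI's open-source tiktoken tokenizer; the tokenizer choice is ours for illustration, the keynote does not say which one it has in mind, and the ratio varies by model and by text.

# Count tokens in a line of text with a real BPE tokenizer.
# Requires: pip install tiktoken. Other models' tokenizers give different counts.

import tiktoken

text = "Space, the final frontier. These are the voyages of the starship Enterprise."
encoding = tiktoken.get_encoding("cl100k_base")   # GPT-4-era encoding
token_ids = encoding.encode(text)

words = len(text.split())
print(f"{len(token_ids)} tokens for {words} words "
      f"(about {len(token_ids) / words:.2f} tokens per word)")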
57:21 - Jeff Jarvis (Host)
How many leos is that?
57:23 - Mikah Sargent (Host)
I don't know if I can turn like tokens to leos.
57:29 - Jensen Huang (Other)
This is not going well.
57:30 - Mikah Sargent (Host)
Oh.
57:32 - Jensen Huang (Other)
My jokes are flopping. I don't know what he's talking about. My jokes are kind of flopping.
57:40 - Jeff Jarvis (Host)
And so here we are with the tokens.
57:41 - Jensen Huang (Other)
When you're interacting with it, you're hoping that the tokens come back to you as quickly as possible and as quickly as you can read them, and so the ability to generate tokens quickly is really important. You have to parallelize the work of this model across many, many GPUs so that you could achieve several things. On the one hand, you would like throughput, because that throughput reduces the cost, the overall cost per token of generating, so your throughput dictates the cost of delivering the service. On the other hand, you have another tokens-per-second rate, the interactive rate, which is about per user, and that has everything to do with quality of service. And so these two things compete against each other, and we have to find a way to distribute work across all of these different GPUs.
58:29 - Jeff Jarvis (Host)
When are we going to get to the Nvidia watch?
58:32 - Mikah Sargent (Host)
Yeah, where's the ring? Where's the wearables?
58:37 - Jensen Huang (Other)
You know, I told you there's going to be math involved. Oh, I thought we already did that. Everybody's going. Oh dear. I heard some gasps just now when I put up that slide. It was me.
58:48
So this right here: the y-axis is tokens per second, data center throughput. The x-axis is tokens per second, interactivity of the person. And notice, the upper right is the best. You want interactivity to be very high, the number of tokens per second per user. You want the tokens per second per data center to be very high. The upper right is terrific.
59:10
However, it's very hard to do that, and in order for us to search for the best answer across every single one of those intersections, those x-y coordinates (okay, so just look at every single x-y coordinate), all those blue dots came from some repartitioning of the software. Some optimizing solution has to go and figure out whether to use tensor parallel, expert parallel, pipeline parallel or data parallel, and distribute this enormous model across all these different GPUs, and sustain the performance that you need. This exploration space would be impossible if not for the programmability of NVIDIA's GPUs, and so, because of CUDA, because we have such a rich ecosystem, we could explore this universe and find that green roofline.
01:00:02
It turns out that green roofline... notice, you've got TP2, EP8, DP4: it means tensor parallel across two GPUs, expert parallel across eight, data parallel across four. Notice, on the other end, you've got tensor parallel across four and expert parallel across 16. The configuration, the distribution of that software: it's a different runtime that would produce these different results, and you have to go discover that roofline. Well, that's just one model, and this is just one configuration of a computer. Imagine all of the models being created around the world and all the different configurations of systems that are going to be available. So now that you understand the basics, let's take a look.
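A tiny sketch of what that search space looks like, purely for illustration: enumerate every way to factor a fixed GPU count into tensor-, expert-, pipeline-, and data-parallel degrees. A real serving optimizer also scores each layout for memory fit, interconnect traffic, and the throughput-versus-per-user-latency trade-off shown on the chart; none of that is modeled here.

# Enumerate candidate parallelism layouts for a fixed GPU budget.
# Illustrative only: real systems also check memory limits and measure
# tokens/sec per user and per data center for each candidate layout.

from itertools import product

NUM_GPUS = 64
degrees = [1, 2, 4, 8, 16]

layouts = [
    (tp, ep, pp, dp)
    for tp, ep, pp, dp in product(degrees, repeat=4)
    if tp * ep * pp * dp == NUM_GPUS
]

print(f"{len(layouts)} candidate layouts for {NUM_GPUS} GPUs, e.g.:")
for tp, ep, pp, dp in layouts[:5]:
    print(f"  TP{tp} x EP{ep} x PP{pp} x DP{dp}")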
01:00:52 - Mikah Sargent (Host)
Oh yeah, thank you, I definitely understand.
01:00:55 - Jeff Jarvis (Host)
That was the basics, micah, the basics. Now we're going to grad school. Oh, okay.
01:01:00 - Jensen Huang (Other)
That's the extraordinary thing. In one generation, because we created a system that's designed for trillion-parameter generative AI, the inference capability of Blackwell is off the charts, and in fact it is some 30 times Hopper. Yeah.
01:01:19 - Jeff Jarvis (Host)
You know, I think that's all they needed to say it's 30 times better.
01:01:23 - Mikah Sargent (Host)
We didn't need to know all of the math and everything.
01:01:25 - Jensen Huang (Other)
For large language models. For large language models. I make fun of the doomsters talking about controlling compute.
01:01:31 - Jeff Jarvis (Host)
When you hear a presentation like this, I start to at least understand the... Say we took the architecture of Hopper.
01:01:36 - Jensen Huang (Other)
We just made it a bigger chip.
01:01:39 - Jeff Jarvis (Host)
Fearful awe at the size of all this.
01:01:43 - Jensen Huang (Other)
We used the greatest 10 terabytes per second, we connected the two chips together, we got this giant 208 billion transistor chip. How would we have performed if nothing else changed? And it turns out, quite wonderfully, quite wonderfully, and that's the purple line, but not as great as it could be. And that's where the FP4 Tensor Core, the new transformer engine and, very importantly, the NVLink switch come in. And the reason for that is because all these GPUs have to share the results, partial products, whenever they do all-to-all, all-gather, whenever they communicate with each other. That NVLink switch is communicating almost 10 times faster than what we could do in the past using the fastest networks. So Blackwell is going to be just an amazing system for generative AI, and in the future, data centers are going to be thought of, as I mentioned earlier, as an AI factory, and an AI factory's goal in life is to generate revenues, generate, in this case, intelligence in this facility, not generating electricity as in the AC generators of the last industrial revolution, but, in this industrial revolution, the generation of intelligence. And so this ability is super, super important.
01:03:11 - Mikah Sargent (Host)
They're really pushing this idea that it's industrial revolution number two. So what are your thoughts on that, Jeff?
01:03:17 - Jensen Huang (Other)
Do you feel that at all? This is a year and a half ago, two years ago, I guess. Two years ago we never started to communicate. No, okay, we could talk about more, but yeah.
01:03:25 - Keynote (Announcement)
I just.
01:03:25 - Mikah Sargent (Host)
We had the benefit of Maybe we'll talk about it on twig, because I'd love to hear your in-depth thoughts on what you think it has any?
01:03:32 - Mikah Sargent (Host)
You presume much sir.
01:03:34 - Jensen Huang (Other)
We had two customers.
01:03:37 - Jeff Jarvis (Host)
We have more now. Or this idea that you could manufacture intelligence; that's not sitting well, but that's the ego of this industry, right?
01:03:50 - Mikah Sargent (Host)
It's kind of the Unbelievable. What's that mandate? There's a historical mandate.
01:03:55 - Jensen Huang (Other)
Unbelievable excitement. And there's a whole bunch of different configurations. Gosh, what is it... manifest destiny.
01:03:58 - Mikah Sargent (Host)
Yes, manifest destiny. That's exactly what I was thinking.
01:04:00 - Jensen Huang (Other)
This configuration is to slide into the Hopper form factor, so that's easy.
01:04:04 - Jeff Jarvis (Host)
I mean, just to say that the computer has intelligence is itself anthropomorphic, right? Then to say that you could make more of it faster is egotistical. Hubris.
01:04:15 - Mikah Sargent (Host)
And, by the way, for folks who are listening wondering what's going on, why are you talking? All that's been said is that they had two customers a long time ago and now they have many customers. That's literally all that's been said while Jeff and I were talking. I'm going to figure this out. Geared up.
01:04:30 - Jensen Huang (Other)
All the OEMs and ODMs, regional clouds, sovereign AIs and telcos all over the world are signing up to launch with Blackwell. Blackwell would be the most successful product launch in our history, and so I can't wait to see that. I want to thank some partners that are joining us in this. AWS is gearing up for Blackwell. They're going to build the first GPU with secure AI. They're building out a 222-exaflop system. Just now, when we animated the digital twin, if you saw all of those clusters coming down, by the way, that is not just art, that is a digital twin of what we're building. That's how big it's going to be.
01:05:26
Besides infrastructure, we're doing a lot of things together with AWS. We're CUDA accelerating SageMaker AI. We're CUDA accelerating Bedrock AI. Amazon Robotics is working with us using NVIDIA Omniverse and Isaac Sim. AWS Health has NVIDIA Health integrated into it. So AWS has really leaned into accelerated computing.
01:05:49
Google is gearing up for Blackwell. GCP already has A100s, H100s, T4s, L4s, a whole fleet of NVIDIA CUDA GPUs, and they recently announced the Gemma model that runs across all of it. We're working to optimize and accelerate every aspect of GCP. We're accelerating Dataproc, the data processing engine, JAX, XLA, Vertex AI and MuJoCo for robotics. So we're working with Google and GCP across a whole bunch of initiatives. Oracle is gearing up for Blackwell. Oracle is a great partner of ours for NVIDIA DGX Cloud. We're also working together to accelerate something that's really important to a lot of companies: Oracle Database. Microsoft is accelerating and Microsoft is gearing up for Blackwell. Microsoft and NVIDIA have a wide-ranging partnership where we're CUDA accelerating all kinds of services: when you chat, obviously, and the AI services that are in Microsoft Azure, it's very, very likely NVIDIA is in the back doing the inference and the token generation.
01:06:54
They built the largest NVIDIA InfiniBand supercomputer, basically a digital twin of ours or a physical twin of ours. We're bringing the NVIDIA ecosystem to Azure, NVIDIA DGX Cloud to Azure. NVIDIA Omniverse is now hosted in Azure, NVIDIA Healthcare is in Azure, and all of it is deeply integrated and deeply connected with Microsoft Fabric. The whole industry is gearing up for Blackwell. This is what I'm about to show you. Most of the scenes that you've seen so far of Blackwell are the full-fidelity design of Blackwell. Everything in our company has a digital twin. In fact, this digital twin idea is really spreading and it helps companies build very complicated things perfectly the first time. And what could be more exciting than creating a digital twin to build a computer that was built in a digital twin? Let me show you what Wistron is doing.
01:07:57 - Mikah Sargent (Host)
Thank you. I need to see some example of what this means and how that is helpful.
01:08:02 - Keynote (Announcement)
To meet the demand for NVIDIA accelerated computing, Wistron, one of our leading manufacturing partners, is building digital twins of NVIDIA DGX and HGX factories using custom software developed with Omniverse SDKs and APIs. For their newest factory, Wistron started with the digital twin to virtually integrate their multi-CAD and process simulation data into a unified view. Testing and optimizing layouts in this physically accurate digital environment increased worker efficiency by 51%. During construction, the Omniverse digital twin was used to verify that the physical build matched the digital plans. Identifying any discrepancies early has helped avoid costly change orders, and the results have been impressive.
01:08:47
Using a digital twin helped bring Wistron's factory online in half the time, just two and a half months instead of five.
01:08:53 - Jeff Jarvis (Host)
Mikah is getting an empathetic headache. I am indeed.
01:08:56 - Keynote (Announcement)
The Omniverse digital twin helps Wistron rapidly test new layouts to accommodate new processes or improve operations in the existing space.
01:09:08 - Mikah Sargent (Host)
I just imagine you start to build a digital twin and then you go in as a person who has no construction knowledge and you tell the person who does have construction knowledge that they're doing it wrong.
01:09:19 - Keynote (Announcement)
That's kind of troubling. NVIDIA's global ecosystem of partners is ushering in a new era of extraordinary AI-enabled digitalization.
01:09:27 - Mikah Sargent (Host)
It's a little bit, I don't know. You just have to see. So there's a factory that was built after the creation of a digital twin.
01:09:39 - Jensen Huang (Other)
That's the way it's going to be in the future: we'll manufacture everything digitally first, and then we'll manufacture it physically.
01:09:44 - Mikah Sargent (Host)
It's like taking the Sims and making it into a business platform.
01:09:50 - Jensen Huang (Other)
What was it that you saw that caused you to put it all in on this incredible idea? And it's this. Sorry, what Hang on a second oh okay, Guys, that was going to be such a moment.
01:10:16 - Jeff Jarvis (Host)
Oh, another punchline Teraflops.
01:10:19 - Mikah Sargent (Host)
That's what happens when you don't rehearse. That's what happens when you don't rehearse. He said oh, no rehearsal.
01:10:25 - Jensen Huang (Other)
Yes, as you know, it was first contact: 2012, AlexNet.
01:10:32 - Mikah Sargent (Host)
Okay, how do you feel?
01:10:33 - Mikah Sargent (Host)
about using that term first contact, as if it's a new life form.
01:10:42 - Mikah Sargent (Host)
And we said first contact with AI change everything via computer vision.
01:10:49 - Jensen Huang (Other)
You take one million numbers. You take one million numbers across three channels, RGB. These numbers make no sense to anybody. You put it into this software and it compresses it, dimensionally reduces it. It reduces it from a million dimensions, a million dimensions, it turns it into three letters, one vector, one number, and it's generalized.
01:11:18
You could have the cat be different cats and you could have it be the front of the cat and the back of the cat. And you look at this thing and you say unbelievable. You mean any cats, yeah, any cat. And it was able to recognize all these cats. And we realized how it did it Systematically, structurally, it's scalable. How big can you make it? Well, how big do you want to make it? And so we imagine that this is a completely new way of writing software. And now, today, as you know, you can have you type in the word CAT and what comes out is a cat. It went the other way. Am I right? Unbelievable? How is it possible? That's right. How is it possible? You took three letters and you generated a million pixels from it and it made sense. Well, that's the miracle, it had five legs. But other than that, yeah, exactly Ten years later, where we recognize text, we recognize images, we recognize videos and sounds.
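A minimal sketch of the idea being described here: roughly a million pixel values collapsing into a single vector that lines up with the text "cat". It assumes the openly available CLIP checkpoint on Hugging Face and a local cat.jpg purely as stand-ins for illustration; this is not the model or pipeline NVIDIA is describing.

```python
# Sketch: one image (hundreds of thousands of RGB numbers) becomes one vector,
# and the word "cat" becomes another vector in the same space.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg")   # hypothetical file: a 640x480 RGB image is ~0.9M numbers
inputs = processor(text=["a photo of a cat"], images=image,
                   return_tensors="pt", padding=True)

with torch.no_grad():
    image_vec = model.get_image_features(pixel_values=inputs["pixel_values"])   # shape (1, 512)
    text_vec = model.get_text_features(input_ids=inputs["input_ids"],
                                       attention_mask=inputs["attention_mask"])

# Cosine similarity between the two vectors tells us the image and the caption "mean" the same thing.
sim = torch.nn.functional.cosine_similarity(image_vec, text_vec)
print(image_vec.shape, sim.item())
```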
01:12:37 - Jeff Jarvis (Host)
Google has learned to understand everything. Google organized everything they want to learn and understand it yeah, in 2017, the transformer model hit.
01:12:50 - Jensen Huang (Other)
It understood, not just recognize the English. It understood the English. It doesn't just recognize the pixels, it understood the pixels. And you can you can.
01:12:58 - Jeff Jarvis (Host)
It understands brain waves and amino acids and proteins.
01:13:00 - Jensen Huang (Other)
You can have language, condition, image and generate all kinds of interesting things. Well, if you can understand these things, what else can you understand that you've digitized? The reason why we started with text and you know images is because we digitized those. But what else have we digitized? Well, it turns out we digitized a lot of things proteins and genes and brain waves Anything you can digitize so long as there's structure. We can probably learn some patterns from it, and if we can learn the patterns from it, we can understand its meaning. If we can understand its meaning, we might be able to generate it as well, and so, therefore, the generative AI revolution is here. Well, what else can we generate?
01:13:42 - Jeff Jarvis (Host)
Pattern, to meaning, to generation.
01:13:44 - Jensen Huang (Other)
Well, one of the things that we would love to learn, we would love to learn, is climate. We would love to learn extreme weather. Before we ruin it. We would love to learn how we can predict future weather at regional scales at sufficiently high resolution such that we can keep people out of harm's way before harm comes. Extreme weather costs the world $150 billion, surely more than that, and it's not evenly distributed. That $150 billion is concentrated in some parts of the world and, of course, on some people of the world. We need to adapt and we need to know what's coming, and so we're creating Earth-2.
01:14:27
Earth-2, a digital twin of the Earth for predicting weather. And we've made an extraordinary invention called CorrDiff, the ability to use generative AI to predict weather at extremely high resolution.
01:14:41 - Mikah Sargent (Host)
It will say the way that you stop climate change from happening is by stopping the company NVIDIA from continuing to create larger and larger GPUs, which is why they have not asked it that question.
01:14:58 - Jeff Jarvis (Host)
That would be a great novel. You build the most powerful, brilliant machine and it tells you you're the problem. Yes, I like that.
01:15:07 - Keynote (Announcement)
I have no sense of how much NVIDIA does models, whether that competes with their customers. Yeah, that's a good question.
01:15:28 - Mikah Sargent (Host)
I don't know what the business relationship is there. What is an AI voice?
01:16:07 - Jeff Jarvis (Host)
That accent is unusually British.
01:16:15 - Jensen Huang (Other)
The Weather Company, the most trusted source of global weather prediction. We are working together to accelerate their weather simulation, first-principles-based simulation. However, they're also going to integrate Earth-2 CorrDiff so that they could help businesses and countries do regional high-resolution weather prediction. And so if you have some weather predictions you'd like to know about, reach out to The Weather Company. Really exciting, really exciting work. NVIDIA Healthcare, something we started 15 years ago. We're super, super excited about this.
01:16:48 - Mikah Sargent (Host)
This is an area where we're very, very proud.
01:16:50 - Jensen Huang (Other)
I'm honestly, genuinely excited about this aspect as well. Whether it's medical imaging or gene sequencing or computational chemistry, it is very likely that NVIDIA is the computation behind it. It's not just diagnostic.
01:17:01 - Jeff Jarvis (Host)
We've done so much like in this area.
01:17:04 - Jensen Huang (Other)
Today we're announcing Personalized treatment. That we're going to do something, really really.
01:17:07 - Mikah Sargent (Host)
Because of the long history of healthcare research and study, white people have the advantage of having had that amount of research done for them, and people of color have had such a short amount. So to be able to accelerate that part of it and give us the same, I think, is a really cool idea. ...is now passed through machine learning so that we understand the language of life.
01:17:36 - Jensen Huang (Other)
I gave a talk at a pharma company some years ago and it was like listen to all of them talk.
01:17:41 - Jeff Jarvis (Host)
It's all about finding a molecule.
01:17:43 - Keynote (Announcement)
It's all about drugs as a molecule.
01:17:46 - Jeff Jarvis (Host)
It's a single molecule, single molecule. That's what the whole business is. A molecule Just amazed me when you think about it. Yes, of course it is. Yeah, but finding the molecule, finding out what it will do, predicting it, testing it, 200,000 of them.
01:18:03 - Jensen Huang (Other)
In just, what is it, less than a year or so, AlphaFold has reconstructed 200 million proteins, basically every protein of every living thing that's ever been sequenced.
01:18:16 - Keynote (Announcement)
This is completely revolutionary.
01:18:18 - Jensen Huang (Other)
Well, those models are incredibly hard to use, incredibly hard for people to build, and so what we're going to do is we're going to build them. We're going to build them for the researchers around the world, and it won't be the only one. There will be many other models that we create, and so let me show you what we're going to do with it.
01:18:41 - Keynote (Announcement)
Virtual screening for new medicines is a computationally intractable problem. Existing techniques can only scan billions of compounds and require days on thousands of standard compute nodes to identify new drug candidates. NVIDIA BioNeMo NIMs enable a new generative screening paradigm. Using NIMs for protein structure prediction with AlphaFold, molecule generation with MolMIM and docking with DiffDock, we can now generate and screen candidate molecules in a matter of minutes. There it is, the MolMIM. MolMIM can connect to custom applications to steer the generative process, iteratively optimizing for desired properties. These applications can be defined with BioNeMo microservices or built from scratch. Here, a physics-based simulation optimizes for a molecule's ability to bind to a target protein while optimizing for other favorable molecular properties. In parallel, MolMIM generates high-quality drug-like molecules that bind to the target and are synthesizable, translating to a higher probability of developing successful medicines faster.
01:19:47
BioNeMo is enabling a new paradigm in drug discovery, with NIMs providing on-demand microservices that can be combined to build powerful drug discovery workflows, like de novo protein design, guided molecule generation or virtual screening. BioNeMo NIMs are helping researchers and developers reinvent computational drug design.
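A toy sketch of the generative screening loop the video describes: generate candidate molecules, score them by predicted binding, and feed the best one back in for the next round. The functions generate_candidates and dock_score are hypothetical stubs standing in for calls to the MolMIM and DiffDock services; only the control flow is illustrated here, not the real chemistry.

```python
# Sketch of an iterative generate -> score -> refine screening loop.
def generate_candidates(seed_smiles: str, n: int) -> list[str]:
    """Stub for a MolMIM-style call: return n molecules similar to the seed."""
    return [seed_smiles] * n  # placeholder; a real call would return new SMILES strings

def dock_score(smiles: str, target: str) -> float:
    """Stub for a DiffDock-style call: higher means better predicted binding."""
    return 0.0  # placeholder; a real call would return a docking score

def screen(seed: str, target: str, rounds: int = 5, per_round: int = 100) -> str:
    best = seed
    for _ in range(rounds):
        candidates = generate_candidates(best, per_round)
        # Keep the best-binding candidate and steer the next round toward it.
        best = max(candidates, key=lambda s: dock_score(s, target))
    return best

print(screen("CCO", target="some-target-protein"))  # arbitrary seed and target, just for shape
```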
01:20:07 - Jeff Jarvis (Host)
A lot of human testing needed. Yes, a lot of steps between here and there, absolutely, but maybe cutting out some steps of human testing too, if you find out what doesn't work Exactly.
01:20:18 - Jensen Huang (Other)
MoLM, corediff. There's a whole bunch of other models, whole bunch of other models Computer vision models, robotics models and even, of course, some really, really terrific open-source language models. These models are groundbreaking. However, it's hard for companies to use. How would you use it? How would you bring it into your company and integrate it into your workflow? How would you package it up and run it?
01:20:43
Remember, earlier I just said that inference is an extraordinary computation problem. How would you do the optimization for each and every one of these models and put together the computing stack necessary to run that supercomputer so that you can run these models in your company? And so we have a great idea. We're going to invent a new way for you to receive and operate software. This software comes basically in a digital box. We call it a container and we call it the NVIDIA inference microservice, a NIM. And let me explain to you what it is A NIM. It's a pre-trained model, so it's pretty clever and it is packaged and optimized to run across NVIDIA's installed base, which is very, very large. What's inside? It is incredible. You have all these pre-trained, state-of-the-art, open source models. They could be open source, they could be from one of our partners, it could be created by us. Like NVIDIA Moment, it is packaged up with all of its dependencies, so CUDA the right version.
01:21:56
cuDNN the right version, and then you have NVIDIA TensorRT-LLM distributing across the multiple GPUs, Triton Inference Server, all completely packaged together. It's optimized depending on whether you have a single GPU, multi-GPU or multi-node of GPUs. It's optimized for that and it's connected up with APIs that are simple to use. Now think about what an AI API is. An AI API is an interface that you just talk to, and so this is a piece of software in the future that has a really simple API, and that API is called human. And these packages, incredible bodies of software, will be optimized and packaged.
01:22:37
This is why he said you could stop training computer scientists. And you can download it, you could take it with you, you could run it in any cloud, you could run it in your own data center, you could run it on workstations if it fits, and all you have to do is come to ai.nvidia.com. We call it NVIDIA inference microservice, but inside the company we all call it NIMs. Okay, applause. Just imagine, you know, some day there's going to be one of these chatbots, and these chatbots are just going to be in a NIM, and you'll assemble... Did you watch?
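A minimal sketch of what calling a NIM can look like in practice, assuming a container running locally that exposes an OpenAI-compatible chat endpoint, which is how NIMs are typically consumed. The URL, model name and prompt here are assumptions for illustration, not anything shown in the keynote.

```python
# Sketch: a packaged model running as a local microservice, reached through a simple API.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",   # hypothetical NIM container on your own machine
    api_key="not-used-locally",
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",        # whichever model the container packages
    messages=[{"role": "user", "content": "Summarize today's open bugs for me."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```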
01:23:20 - Mikah Sargent (Host)
Yay, there it is Our little pin that you put on your shirt.
01:23:24 - Jensen Huang (Other)
Yes, it is unlikely that you'll write it from scratch or write a whole bunch of Python code or anything like that. It is very likely that you assemble a team of AIs. There's probably going to be a super AI that you use that takes the mission that you give it and breaks it down into an execution plan. Some of that execution plan could be handed off to another NIM.
01:23:47 - Jeff Jarvis (Host)
This idea of agency is the next step that AI people see, but it doesn't come until we trust it. We're not going to give it important tasks. We shouldn't give it important tasks like driving a car and killing people unless we fully trust it, or even making an airplane reservation, or deciding what medicine to give.
01:24:06 - Mikah Sargent (Host)
Agent tree is an opportunity to what pull away from the larger systems. Is that the idea there, where each agent is greater service.
01:24:15 - Jeff Jarvis (Host)
So you think about the first stage of AI was Computational comparison and so on. The next stage is generational it'll make stuff. The next stage is agentic.
01:24:29 - Jensen Huang (Other)
Got it. It knows what answer.
01:24:32 - Jeff Jarvis (Host)
Right answer.
01:24:33 - Mikah Sargent (Host)
I don't trust it yeah, yeah, yeah, I don't trust it to order a meal for me, right, oh?
01:24:37 - Jensen Huang (Other)
You know, something that has something to do with a build plan or some forecast or some customer alert or some bugs database or whatever it happens to be, and we could assemble it using all these NIMs. And because these NIMs have been packaged up and ready to work on your systems, so long as you have NVIDIA GPUs in your data center or in the cloud, these NIMs will work together as a team and do amazing things. And so we decided this is such a great idea, we're going to go do that. And so NVIDIA has NIMs running all over the company. We have chatbots being created all over the place, and one of the most important chatbots, of course, is a chip designer chatbot.
01:25:18
You might have hoped he was going to say lunch. I was hoping so too. And so we want co-pilots that are co-designers with our engineers, and so this is the way we did it.
01:25:32
So we got ourselves a Llama, Llama 2. This is a 70B, and it's, you know, packaged up in a NIM. And we asked it, you know, what is a CTL? It turns out CTL is an internal program and it has an internal proprietary language. But it thought CTL was a combinatorial timing logic, and so it describes, you know, conventional knowledge of CTL. But that's not very useful to us.
01:25:59
So, from that, we gave it a whole bunch of new examples. You know, this is no different than onboarding an employee. We say, you know, thanks for that answer.
01:26:09 - Mikah Sargent (Host)
It's completely wrong. I appreciate your enthusiasm. And then we present to them what CTL is at NVIDIA, and the CTL, as you can see...
01:26:21 - Jensen Huang (Other)
You know CTL stands for compute trace Library, which makes sense. You know we're tracing compute cycles all the time and it wrote the program. Is that amazing?
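A rough sketch of the kind of "new examples" you would hand a pre-trained model so it learns what an internal term means, which is the onboarding step described above. The file name, the example answers and the ctl.* calls are all hypothetical; NVIDIA's actual CTL library and fine-tuning pipeline are internal and not shown here.

```python
# Sketch: domain-specific prompt/completion pairs that teach a model your internal vocabulary.
import json

examples = [
    {"prompt": "What is CTL?",
     "completion": "At our company, CTL is the compute trace library, used to trace compute cycles."},
    {"prompt": "Write a CTL snippet that records the cycles spent in a kernel.",
     "completion": "ctl.trace_begin('kernel'); run_kernel(); ctl.trace_end('kernel')"},  # hypothetical API
]

with open("ctl_finetune.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# This JSONL would then be fed to whatever fine-tuning or customization tooling you use
# (for instance NeMo-style curation and fine-tuning), followed by guardrailing and evaluation.
```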
01:26:32 - Mikah Sargent (Host)
Yay training works.
01:26:38 - Jensen Huang (Other)
And so the productivity of our chip designers can go up. This is what you can do with a NIM. First thing you can do with this: customize it. We have a service called NeMo microservice that helps you curate the data, prepare the data, so that you could teach, onboard, this AI. You fine-tune them and then you guardrail it. You can even evaluate the answer, evaluate its performance against other examples.
01:27:01
And so that's what's called the NeMo microservice. Now, the thing that's emerging here is this: there are three elements, three pillars of what we're doing. The first pillar is, of course, inventing the technology for AI models and running AI models and packaging it up for you. The second is to create tools to help you modify it. First is having the AI technology. Second is to help you modify it. And third is infrastructure for you to fine-tune it and, if you like, deploy it.
01:27:29
You could deploy it on our infrastructure called DGX Cloud, or you can deploy it on-prem, you could deploy it anywhere you like. Once you develop it, it's yours to take anywhere. And so we are effectively an AI foundry. We will do for you and the industry for AI what TSMC does for us building chips. We go to TSMC with our big ideas, they manufacture, and we take it with us. Exactly the same thing here: an AI foundry, and the three pillars are the NIMs, the NeMo microservice and DGX Cloud. The other thing that you could teach the NIM to do is understand your proprietary data.
01:28:10
Remember, inside our company, the vast majority of our data is not in the cloud, it's inside our company. It's been sitting there, you know, being used all the time and, gosh, it's basically NVIDIA's intelligence. We would like to take that data, learn its meaning, like we learned the meaning of almost anything else that we just talked about, learn its meaning and then re-index that knowledge into a new type of database called a vector database. And so you essentially take structured data or unstructured data, you learn its meaning, you encode its meaning. So now this becomes an AI database, and that AI database, in the future, once you create it, you can talk to it. And so let me give you an example of what you could do.
01:28:54
So suppose you've got a whole bunch of multi-modality data, and one good example of that is PDF. So you take the PDF, you take all of your PDFs, all your favorites, you know, the stuff that is proprietary to you, critical to your company. Damn PDFs. You can encode it. Just as we encode the pixels of a cat and it becomes the word cat, we can encode all of your PDFs and it turns into... If you hadn't made it a PDF in the first place, you'd still have the data.
01:29:22 - Jeff Jarvis (Host)
That is very true.
01:29:24 - Jensen Huang (Other)
The proprietary information of your company, and once you have that proprietary information, you could chat with it.
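A minimal sketch of the encode-then-chat idea: pull text out of proprietary PDFs, turn each chunk into a vector, index the vectors, and retrieve the closest chunks as context for a question. The embedding model, the FAISS index and the PDF file names are assumptions for illustration; this is not NeMo Retriever itself.

```python
# Sketch: a tiny "AI database" over PDFs, built from embeddings plus a vector index.
import faiss
import numpy as np
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # any text-embedding model works here

# 1. Pull text out of the PDFs and split it into chunks.
chunks = []
for path in ["design_spec.pdf", "bugs_report.pdf"]:     # hypothetical file names
    for page in PdfReader(path).pages:
        text = page.extract_text() or ""
        chunks.extend(text[i:i + 800] for i in range(0, len(text), 800))

# 2. Encode every chunk into one vector and index them.
vectors = encoder.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(vectors.shape[1])
index.add(np.asarray(vectors, dtype="float32"))

# 3. "Talk to" the database: embed the question, retrieve the closest chunks,
#    then hand them to whatever chat model you like as context.
question = "What did we decide about the cooling layout?"
q = encoder.encode([question], normalize_embeddings=True)
_, hits = index.search(np.asarray(q, dtype="float32"), k=3)
context = "\n".join(chunks[i] for i in hits[0])
print(context)   # this retrieved context would be prepended to the chat prompt
```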
01:29:29 - Mikah Sargent (Host)
So it looks like NVIDIA is kind of trying to show, look, we can do the things that OpenAI and all of the others, our customers, do. This is what's weird about this to me. Very strange.
01:29:40 - Jensen Huang (Other)
Our software team, you know, they just chat with the bugs database. Because Adobe already has that functionality in Adobe Acrobat.
01:29:45 - Jeff Jarvis (Host)
I was just chatting with my PDF the other day talking to this right, it's not a euphemism, am you PDF? You said.
01:29:54 - Jensen Huang (Other)
So so we have another chat box.
01:29:56 - Jeff Jarvis (Host)
It's a classic business-to-business versus business-to-consumer problem. I've seen this in the media business all the time. You can do it, but if you're a B2B, you can't compete with your C's, right, by going down through your B's, it seems.
01:30:11 - Jensen Huang (Other)
Okay. So we call this NeMo Retriever, and the reason for that is because, ultimately, its job is to go retrieve information as quickly as possible. And you just talk to it: hey, retrieve me this information. It goes off and it brings it back to you. Do you mean this? You go, yeah, perfect, okay. And so we call it the NeMo Retriever. Well, the NeMo service helps you create all these things, and we have all these different NIMs. We even have NIMs of digital humans.
01:30:34 - Keynote (Announcement)
I'm Rachel, your AI care manager.
01:30:39 - Jensen Huang (Other)
No, no, no, no, no, no. So it's a really short clip, but there were so many videos to show you, I guess so many other demos to show you, and so... yeah, he really did not do a rehearsal before this. Diana, she is a digital human NIM, and you just talk to her, and she's connected, in this case, to Hippocratic AI's large language model for healthcare, and it's truly amazing. She is just super smart about healthcare things.
01:31:08
I'll believe it when I see it. After you're done, after my Dwight, my VP of software engineering, talks to the chatbot for the bugs database, then you come over and talk to Diane. And so Diane is completely animated with AI and she's a digital human.
01:31:25 - Mikah Sargent (Host)
When I go to the Hippocratic AI website, it, like, requires a password to access the website.
01:31:30 - Jensen Huang (Other)
The enterprise IT industry is sitting on a goldmine. It's a goldmine because they have so much understanding of the way work is done. They have all these amazing tools that have been created over the years and they're sitting on a lot of data. If they could take that goldmine and turn it into co-pilots, these co-pilots could help us do things. And so just about every IT franchise, IT platform in the world that has valuable tools that people use is sitting on a goldmine for co-pilots, and they would like to build their own co-pilots and their own chatbots. And so we're announcing that NVIDIA AI Foundry is working with some of the world's great companies. SAP generates 87% of the world's global commerce. Basically, the world runs on SAP. We run SAP. NVIDIA and SAP are building SAP Joule co-pilots using NVIDIA NeMo and DGX Cloud. ServiceNow: 85% of the world's Fortune 500 companies run their people and customer service operations on ServiceNow, and they're using NVIDIA AI Foundry to build ServiceNow Assist virtual assistants. Other keynotes bring on people from the companies.
01:32:42 - Jeff Jarvis (Host)
He's going on an hour and a half now, all by just himself. 10,000 companies.
01:32:48 - Jensen Huang (Other)
NVIDIA AI Foundry is working with them, helping them build their Gaia generative AI agent. Snowflake is a company that stores the world's digital warehouse in the cloud and serves over three billion queries a day for 10,000 enterprise customers. Snowflake is working with NVIDIA AI Foundry to build co-pilots with NVIDIA NeMo and NIMs. NetApp: nearly half of the files in the world are stored on-prem on NetApp. NVIDIA AI Foundry is helping them build chatbots and co-pilots, like those vector databases and retrievers, with NVIDIA NeMo and NIMs. And we have a great partnership with Dell. Everybody who is building these chatbots and generative AI, when you're ready to run it, you're going to need an AI factory, and nobody is better at building end-to-end systems of very large scale for the enterprise than Dell. And so anybody, any company, every company will need to build AI factories, and it turns out that Michael is here. He's happy to take your order.
01:34:00 - Jeff Jarvis (Host)
Wow, they were like all over the VIP section of ladies and gentlemen in front. Michael Dell.
01:34:06 - Mikah Sargent (Host)
Hello, it's me Michael Dell.
01:34:08 - Jensen Huang (Other)
Hello. Okay, okay, let's talk about the next wave of robotics, the next wave of AI robotics, physical AI. So far, all of the AI that we've talked about is one computer. Data comes into one computer, lots of the world's, if you will, experience, in digital text form. The AI imitates us by reading a lot of the language to predict the next words. It's imitating you by studying all of the patterns and all the other previous examples. Of course it has to understand context and so on, so forth, but once it understands the context, it's essentially imitating you. We take all of the data, we put it into a system like DGX, we compress it into a large language model. Trillions and trillions of tokens become billions of parameters. These billions of parameters become your AI.
01:35:01
Well, in order for us to go to the next wave of AI, where the AI understands the physical world, we're gonna need three computers. The first computer is still the same computer. It's that AI computer that now is gonna be watching video, and maybe it's doing synthetic data generation, and maybe there are a lot of human examples. Just as we have human examples in text form, we're gonna have human examples in articulation form, and the AIs will watch us, understand what is happening and try to adapt it for themselves into the context. And because it can generalize with these foundation models, maybe these robots can also perform in the physical world fairly generally. So I just described in very simple terms essentially what just happened in large language models, except the ChatGPT moment for robotics may be right around the corner.
01:35:55
He just went from AGI... We've been building the end-to-end system for robotics for some time. The robotics servant, it could do anything. We have the AI system on DGX. Yeah, we have the lower system, which is called AGX, for autonomous systems, a
01:36:10
robotics processor. When we first built this thing, people asked, what are you guys building? It's an SoC, it's one chip, designed to be very low power, but it's designed for high-speed sensor processing and AI. And so if you want to run transformers in a car, or you want to run transformers in, you know, anything that moves, we have the perfect computer for you. It's called Jetson. And so the DGX on top for training the AI, that's the autonomous processor, and in the middle we need another computer.
01:36:42
Whereas large language models have the benefit of you providing your examples and then doing reinforcement learning with human feedback, what is the reinforcement learning human feedback of a robot? Well, it's reinforcement learning with physical feedback. That's how you align the robot. That's how the robot knows that, as it's learning these articulation capabilities and manipulation capabilities, it's going to adapt properly into the laws of physics. And so we need a simulation engine that represents the world digitally for the robot, so that the robot has a gym to go learn how to be a robot. We call that virtual world Omniverse. There we go.
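A minimal sketch of the "gym for robots" idea in its simplest software form: a loop where a policy acts in a simulator and gets physical feedback back as a reward. It uses the open-source gymnasium package and a random policy purely as stand-ins for Isaac/Omniverse simulation and a real robot policy.

```python
# Sketch: the reinforcement-learning loop a simulated "gym" provides.
import gymnasium as gym

env = gym.make("CartPole-v1")          # stand-in for a physics simulation of a robot
obs, info = env.reset(seed=0)

for step in range(1000):
    action = env.action_space.sample()  # a real robot policy would choose this
    obs, reward, terminated, truncated, info = env.step(action)
    # reward is the "physical feedback": the simulator tells the policy how well it did
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```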
01:37:28
There's a computer that runs Omniverse; it's called OVX, and OVX, the computer itself, is hosted in the Azure cloud. Oh, the Microsoft Azure cloud.
01:37:36 - Mikah Sargent (Host)
So basically we built these three things, no wonder they could say co-pilot on top of it. We have algorithms for everything With Microsoft. Now I'm going to show you one super example of how AI and Omniverse are going to work together.
01:37:46 - Jensen Huang (Other)
The example I'm going to show you, it's kind of insane, but it's going to be very, very close to tomorrow. It's a robotics building. This robotics building is called a warehouse. Inside the robotics building are going to be some autonomous systems. Some of the autonomous systems are going to be called humans and some of the autonomous systems are going to be called forklifts, and these autonomous systems are going to interact with each other, of course, autonomously, and it's all going to be watched over by this warehouse to keep everybody out of harm's way. The warehouse is essentially an air traffic controller, and whenever it sees something happening, it will redirect traffic and give new waypoints, just new waypoints, to the robots and the people, and they'll know exactly what to do.
01:38:32
This warehouse, this building, you can also talk to. Of course you could talk to it: hey, you know, SAP Center, how are you feeling today, for example? And so you could ask the warehouse the same questions. Basically, the system I just described will have Omniverse Cloud that's hosting the virtual simulation, and AI running on DGX Cloud, and all of this is running in real time. Let's take a look. All right, the future of heavy industries starts as a digital twin.
01:39:05 - Keynote (Announcement)
The AI agents helping robots, workers and infrastructure navigate unpredictable events in complex industrial spaces, will be built and evaluated first in sophisticated digital twins.
01:39:20 - Mikah Sargent (Host)
This makes me wonder if this is part of what Microsoft is doing, because they are called Azure digital twins.
01:39:25 - Keynote (Announcement)
Ah, and he just said that it's hosted in Azure, so maybe Nvidia does the other part of the digital twin.
01:39:34 - Mikah Sargent (Host)
I mean, as you pointed out earlier on the show. It's simply a strategy. Yeah.
01:39:39 - Keynote (Announcement)
I'll tell you. Software-in-loop testing of AI agents in this physically accurate simulated environment enables us to evaluate and refine how the system adapts to real-world unpredictability. Here, an incident occurs along this AMR's planned route, blocking its path as it moves to pick up a pallet. NVIDIA Metropolis updates and sends a real-time occupancy map to cuOpt, where a new optimal route is calculated. The AMR is enabled to see around corners and improve its mission efficiency. With generative AI powered Metropolis vision foundation models, operators can even ask questions using natural language. The visual model understands nuanced activity and can offer immediate insights to improve operations. All of the sensor data is created in simulation and passed to the real-time AI running as NVIDIA inference microservices, or NIMs. And when the AI is ready to be deployed in the physical twin, the real warehouse, we connect Metropolis and Isaac NIMs to the real sensors, with the ability for continuous improvement. Got it.
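A toy sketch of the replanning step just described, reduced to its simplest form: given an occupancy map with a blocked aisle, compute a new set of waypoints around it. A real deployment would use cuOpt on a live occupancy map from Metropolis; the grid and breadth-first search below are only for illustration.

```python
# Sketch: reroute an AMR around a blockage using an occupancy grid.
from collections import deque

def plan(grid, start, goal):
    """Return a list of waypoints from start to goal, avoiding cells marked 1."""
    rows, cols = len(grid), len(grid[0])
    queue, came_from = deque([start]), {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # no route available

occupancy = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],   # an incident blocks the middle of the aisle
    [0, 0, 0, 0],
]
print(plan(occupancy, start=(0, 0), goal=(2, 3)))   # new waypoints around the blockage
```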
01:40:50 - Mikah Sargent (Host)
So instead of simulating boxes falling in real life, you can do all of the unsafe stuff in the digital twin, and then, in theory, it learns from that and you bring it into real life.
01:41:03 - Jensen Huang (Other)
That's pretty cool.
01:41:04 - Mikah Sargent (Host)
Future facility Warehouse factory building will be software defined simulate a new place is running, kind of a thing.
01:41:12 - Jensen Huang (Other)
How else would you test the software? So you, you test the software to build the warehouse.
01:41:16 - Jeff Jarvis (Host)
The human's become what, directed by the AI, this becomes.
01:41:20 - Mikah Sargent (Host)
You know, we've got to think about quality of work. Exactly, quality of work and, frankly, quality of life for the humans. Yeah, we're being told what to do by an AI. ...for robotic systems is with digital twins.
01:41:32 - Jeff Jarvis (Host)
Or you hope it's the opposite, that the humans are the master exactly access we're gonna create basically.
01:41:37 - Jensen Huang (Other)
An omnivorous idol master is not right.
01:41:39 - Jeff Jarvis (Host)
That's working.
01:41:40 - Jensen Huang (Other)
Yeah, well, just a channel, and you can connect your application to it. So this is how wonderfully, beautifully simple Omniverse is going to be in the future, and with these APIs, you're going to have this magical digital twin capability. We also have turned Omniverse into an AI and integrated it with the ability to chat USD. Our language is, you know, human, and Omniverse's language, as it turns out, is Universal Scene Description.
01:42:11 - Mikah Sargent (Host)
Oh, USD, that's a good language. That's a common, rather complex language, and so we've taught it that language, and so you can speak to it in English
01:42:19 - Jensen Huang (Other)
And it would directly generate USD and it would talk back in USD, but converse back to you in English. You could also look for information in this world semantically. Instead of the world being encoded semantically in language, now it's encoded semantically in scenes, and so you could ask it about certain objects or certain conditions and certain scenarios, and it can go and find that scenario for you. It also can collaborate with you in generation. You could design some things in 3D, it could simulate some things in 3D, or you could use AI to generate something in 3D. Let's take a look at how this is all going to work. We have a great partnership with Siemens. Siemens is the world's largest industrial engineering and operations platform. You've seen now so many different companies in the industrial space. Heavy industries is one of the greatest final frontiers of IT, and we finally now have the necessary technology to go and make a real impact. Siemens is building the industrial metaverse, and today we're announcing that Siemens is connecting their crown jewel, Xcelerator, to NVIDIA Omniverse. Let's take a look.
01:43:28 - Keynote (Announcement)
Siemens technology transforms every day for everyone. Teamcenter X, our leading product lifecycle management software from the Siemens Xcelerator platform, is used every day by our customers to develop and deliver products at scale. Now we are bringing the real and digital worlds even closer by integrating NVIDIA AI and Omniverse technologies into Teamcenter X. Omniverse APIs enable data interoperability and physics-based rendering for industrial-scale design and manufacturing projects. Our customer HD Hyundai... Don't use this accent around robots.
01:44:09
Often comprising over seven million discrete parts, Omniverse APIs and Teamcenter X let companies like HD Hyundai unify and visualize these massive engineering datasets. I'll be back. And integrate generative AI to generate 3D objects or HDRI backgrounds to see their projects in context.
01:44:32
The result: an ultra-intuitive, photoreal UI that eliminates waste and errors, delivering huge savings in cost and time. Huge. And we are building this for collaboration, whether across more Siemens Xcelerator tools like Siemens NX or STAR-CCM+, or across teams working on their favorite devices in the same scene together. And this is just the beginning. Working with NVIDIA, we will bring accelerated computing, generative AI and Omniverse integrations across the Siemens Xcelerator portfolio.
01:45:17 - Jensen Huang (Other)
The, the professional voice actor happens to be a good friend of mine, Roland Busch, who happens to be the CEO of Siemens.
01:45:30 - Mikah Sargent (Host)
Good, thank you.
01:45:35 - Jensen Huang (Other)
Once you get Omniverse connected into your workflow, your ecosystem, from the beginning of your design to engineering, to manufacturing planning, all the way to digital twin operations, once you connect everything together, it's insane how much productivity you can get, and it's just really, really wonderful. All of a sudden, everybody's operating on the same ground truth. You don't have to exchange data and convert data, make mistakes.
01:46:07
Everybody is operating on the same ground truth, from the design department, the architecture department, all the way to the engineering and even the marketing department. Let's take a look at how Nissan has integrated Omniverse into their workflow, and it's all because it's connected by all these wonderful tools and these developers that we're working with. Take a look.
01:46:33 - Jeff Jarvis (Host)
We're seeing vehicles, we're seeing purses and nice design is the best example of design.
01:47:01 - Jensen Huang (Other)
Every color can make this car look unique.
01:47:04 - Keynote (Announcement)
I like the gray color. Very good choice.
01:47:08 - Mikah Sargent (Host)
Interesting. So talking to an AI to have a certain color or other design change and having it do that In any language, yeah, good point. Rendering an environment based on a description Show me a vehicle that can go anywhere. Add on the Explorer pack, so that is obviously trained specifically to what Nissan has for it to know what the Explorer pack was. I'm looking forward to more contextual generative AI, because right now it's just not contextual enough.
01:48:05 - Jeff Jarvis (Host)
I like that car. I wonder what it is.
01:48:07 - Jensen Huang (Other)
That was not an animation, that was Omniverse. Today we're announcing that Omniverse Cloud streams to the Vision Pro.
01:48:20 - Mikah Sargent (Host)
So Omniverse Cloud comes to the Vision Pro so you could walk around in a physical space and see your digital twin in real life, that you walk around virtual doors and blame everything wrong with your twin Exactly it's my human twin. Why did you do that?
01:48:38 - Jensen Huang (Other)
It's really quite amazing. The Vision Pro, connected to Omniverse, portals you into Omniverse and because all of these CAD tools and all these different design tools are now integrated and connected to Omniverse, you can have this type of workflow really incredible. Let's talk about robotics. Everything that moves will be robotic. There's no question about that. It's safer, it's more convenient and one of the largest industries is going to be automobile, everything that moves will be.
01:49:06
robotic. We build products that would run autonomously from the top to the bottom, as I was mentioning, from the computer system. But in the case of self-driving cars, including the self-driving applications, at the end of this year or at the beginning of next year we will be shipping in Mercedes and, shortly after that, JLR. These autonomous robotic systems are software defined. They take a lot of work to do: artificial intelligence, control and planning, all kinds of very complicated technology, and it takes years to refine. We're building the entire stack. However, we open up our entire stack for all of the automotive industry.
01:49:42
This is just the way we work, the way we work in every single industry. We try to build as much of it as we can so that we understand it, but then we open it up so that everybody can access it, whether you would like to buy just our computer, which is the world's only functional-safe, ASIL-D system that can run AI, this functional-safe, ASIL-D quality computer, or the operating system on top, or, of course, our data centers, which are in basically every AV company in the world. However you would like to enjoy it, we're delighted by it. Today, we're announcing that BYD, the world's largest EV company, is adopting our next generation.
01:50:25
It's called Thor. Thor is designed for transformer engines. Thor, our next-generation AV computer, will be used by BYD. You probably don't know this fact: we have over a million robotics developers. We created Jetson, this robotics computer. We're so proud of it. The amount of software that goes on top of it is insane. But the reason why we can do it at all is because it's 100% CUDA compatible. Everything that we do, everything that we do in our company, is in service of our developers, and by us being able to maintain this rich ecosystem and make it compatible with everything that you access from us, we can bring all of that incredible capability to this little tiny computer we call Jetson, a robotics computer. We also today are announcing this incredibly advanced new SDK. We call it Isaac Perceptor.
01:51:25
Isaac Perceptor. Most of the robots today are pre-programmed. They're either following rails on the ground, digital rails, or they'd be following AprilTags. But in the future they're going to have perception. And the reason why you want that is so that you could easily program it. You say, would you like to go from point A to point B? And it will figure out a way to navigate its way there. So by only programming waypoints, the entire route could be adaptive, the entire environment could be reprogrammed, just as I showed you at the very beginning with the warehouse. You can't do that with pre-programmed AGVs. If those boxes fall down, they just all gum up and they just wait there for somebody to come clear it. And so now, with the Isaac Perceptor, we have incredible state-of-the-art visual odometry, 3D reconstruction and, in addition to 3D reconstruction, depth perception. The reason for that is so that you can have two modalities to keep an eye on what's happening in the world.
01:52:25 - Mikah Sargent (Host)
So like an advanced robotic vacuum?
01:52:28 - Jensen Huang (Other)
Because a lot of them are able to see stuff and move around.
01:52:32 - Mikah Sargent (Host)
Yeah, yeah.
01:52:32 - Jensen Huang (Other)
Manufacturing arms and they are also pre-programmed the computer vision algorithms, the AI algorithms, the control and path planning algorithms that are geometry aware, incredibly computational.
01:52:43 - Jeff Jarvis (Host)
This is where virtual worlds make sense to me. We make these really excellent.
01:52:47 - Mikah Sargent (Host)
We have the world's first CUDA accelerated motion planner.
01:52:52 - Jensen Huang (Other)
That is geometry aware.
01:52:55 - Mikah Sargent (Host)
You put something in your pocket, it comes up with an airplane and our daily plan is excellent for protection for pollution, as long as it does for a human to do it at the Starbucks that's next door and I wonder about the practicality of that yeah it's the pedantry of motion. Wow, you are those phrases I swear. They're so good.
01:53:23 - Jensen Huang (Other)
And they also just run on NVIDIA computers. We are starting to do some really great work in the next generation of robotics. The next generation of robotics will likely be humanoid robotics. We now have the necessary technology and, as I was describing earlier, the necessary technology to imagine generalized humanoid robotics. In a way, humanoid robotics is likely easier.
01:53:51 - Jeff Jarvis (Host)
This is the longest sermon in the biggest super church I've ever seen.
01:53:55 - Mikah Sargent (Host)
And you're so right about being a. Hey look, you may be getting all excited about this stuff, but we're the reason that it all exists. Kind of thing trying to suggest. And they're almost putting their flag into future changes that are coming down the pipeline too, to say yeah that robot, that's us.
01:54:18 - Jensen Huang (Other)
You want our chips, then use our stuff. While we're creating, just like we're doing with the others, the entire stack starting from the top a foundation model that learns from watching video. Human, human examples.
01:54:37 - Mikah Sargent (Host)
Hopefully it's not watching America's funniest home videos, because then we'll just learn how to fall a lot how?
01:54:41 - Jeff Jarvis (Host)
to fall a lot.
01:54:42 - Jensen Huang (Other)
What we call Isaac Reinforcement Learning Gym, which allows the humanoid robot to learn how to adapt to the physical world. And then an incredible computer, the same computer that's going to go into a robotic car. This computer will run inside a humanoid robot called Thor. It's designed for transformer engines. We've combined several of these into one video. This is something that you're going to really love. Take a look. Don't tell me how I feel.
01:55:13 - Keynote (Announcement)
It's not enough for humans to imagine. We have to invent and explore and push beyond what's been done.
01:55:36 - Mikah Sargent (Host)
The video we're watching is showing different clips of robotics. We push it to fail. An arm squeezing mustard, picking up mustard and squeezing mustard. An arm avoiding obstacles?
01:55:52 - Keynote (Announcement)
Then help it teach itself, we broaden its understanding.
01:55:57 - Mikah Sargent (Host)
Wow, a simulation of an arm doing some tricks with a pencil.
01:56:02 - Keynote (Announcement)
With absolute precision and succeed One of those weird robot dogs climbing stuff. We make it perceive and move and even reason.
01:56:23 - Jeff Jarvis (Host)
So it can show our world. They keep pushing that. Yeah, not yet, just kind of assuming that's the case. Fine reason.
01:56:43 - Mikah Sargent (Host)
Sorting tomatoes. What?
01:56:45 - Keynote (Announcement)
was that. This is where inspiration leads us. The next frontier.
01:56:50 - Mikah Sargent (Host)
They're showing a robot that's kind of mimicking a human video Project, project Groot, it's called a general purpose foundation model for humanoid robot learning.
01:57:05 - Keynote (Announcement)
The Groot model takes multimodal instructions and past interactions as input and produces the next action for the robot.
01:57:12 - Jeff Jarvis (Host)
I could ask a robot to the prom.
01:57:13 - Mikah Sargent (Host)
It would have had better luck if you wrote it in binary, at least your invitation, I think.
01:57:22 - Keynote (Announcement)
And we scale out with Osmo, a new compute orchestration service that coordinates workflows across DGX systems for training and OVX systems for simulation. With these tools we can train Groot in physically based simulation and transfer zero-shot to the real world. The Groot model will enable a robot to learn from a handful of human demonstrations so it can help with everyday tasks.
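A rough sketch of the interface the narration describes: a model that takes a multimodal instruction plus past interactions and emits the robot's next action. Every class, method and dimension here is a hypothetical stand-in for illustration, not NVIDIA's actual Project GR00T API.

```python
# Sketch: instruction + observation + history in, next action out.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    observation: List[float]   # e.g. flattened camera features and joint angles
    action: List[float]        # the joint commands that were executed

@dataclass
class GrootLikePolicy:
    history: List[Step] = field(default_factory=list)

    def next_action(self, instruction: str, observation: List[float]) -> List[float]:
        # A real model would fuse the text instruction, the observation and self.history
        # through a transformer; here we just return a zero command of the right size.
        return [0.0] * 7   # seven joint targets for a hypothetical arm

policy = GrootLikePolicy()
action = policy.next_action("pick up the mustard bottle", observation=[0.0] * 32)
policy.history.append(Step(observation=[0.0] * 32, action=action))
print(action)
```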
01:57:50 - Mikah Sargent (Host)
Wow. So they teleoperate at first and then let the robot use that training to actually do the activity.
01:57:58 - Keynote (Announcement)
This is made possible with NVIDIA's technologies that can understand humans from videos, train models and simulations. So mimicry from demonstration Directly to physical robots, connecting Groot to a large language model.
01:58:11 - Jeff Jarvis (Host)
We can extract every function that the brain or body does Let. It could be a bit of that. That's a lot of work. Yeah, let's high five.
01:58:22 - Keynote (Announcement)
Can you give us some cool moves?
01:58:24 - Mikah Sargent (Host)
Dirt, I suspect we're all gonna have to train our own robots if we want to use them.
01:58:30 - Keynote (Announcement)
All this incredible intelligence. This is how Normas will be set in the future. The little prepackage with them.
01:58:35 - Mikah Sargent (Host)
Designed for Groot, built for the future. Oh, and then you could probably get the add-ons, the dance package or something like that, providing the building blocks for the next generation of AI-powered robotics. The barista package. You'd pay like a monthly subscription for a robot that'll make you coffee at home. Now they're bringing out what appear to be actual... no, I think it's... no, that's just on the screen, I think. Robots they're showing.
01:59:04 - Jeff Jarvis (Host)
Could be worse what Elon did with the dancing human. Oh my yeah.
01:59:10 - Jensen Huang (Other)
The soul of NVIDIA, the intersection of computer graphics, physics, artificial intelligence. It all came to bear at this moment. The name of that project: General Robotics 003, Groot. I know, super good, super good.
01:59:32 - Mikah Sargent (Host)
So how did that become Groot?
01:59:33 - Jensen Huang (Other)
Well, I think we have some special guests.
01:59:36 - Mikah Sargent (Host)
Oh, three, three, isn't he? Oh, no, okay, these look like Star Wars robots, little Star Wars droids.
01:59:52 - Jensen Huang (Other)
So I understand, you guys are powered by Jetson, they're powered by Jetsons.
01:59:58 - Mikah Sargent (Host)
These are actually coming out on stage as opposed to the other robots that are just cute, oh they're speaking.
02:00:08 - Jensen Huang (Other)
Ladies and gentlemen, this is orange and this is the famous green. They are the BDX robots of Disney, amazing Disney research.
02:00:23 - Mikah Sargent (Host)
But they're being teleoperated.
02:00:24 - Jensen Huang (Other)
Come on, you guys, let's wrap up, let's go. Five things when are you going?
02:00:32 - Mikah Sargent (Host)
Maybe they're not being teleoperated.
02:00:33 - Jensen Huang (Other)
I sit right here.
02:00:38 - Mikah Sargent (Host)
Or they did that on purpose to make us think they weren't teleoperated. Hurry up.
02:00:45 - Jensen Huang (Other)
What are you saying? Oh, the green ones stopper. It's not time to eat. I'll give you a snack in a moment. Let me finish up real quick. Come on, green, hurry up. Is it coming? Please tell me it's coming. Wasting time. Five things, five things. First, a new industrial revolution. Every data center should be accelerated. A trillion dollars worth of installed data centers will become modernized over the next several years. Second, because of the computational capability we brought to bear, a new way of doing software has emerged.
02:01:25
This is five years, is the claim. It's going to create new infrastructure dedicated to doing one thing and one thing only, not for multi-user data centers, but AI generators. These AI generators will create incredibly valuable software, a new industrial revolution. Second, the computer of this revolution, the computer of this generation: generative AI, trillion parameters, Blackwell, insane amounts of computers and computing. Third... I'm trying to concentrate.
02:02:01 - Jeff Jarvis (Host)
The robot is making noise Good job.
02:02:05 - Jensen Huang (Other)
Third new computer.
02:02:07 - Mikah Sargent (Host)
It is behaving Rather realistically, I guess.
02:02:11 - Jensen Huang (Other)
Distributed in a new way. It reminds me of my dog.
02:02:14 - Mikah Sargent (Host)
That's what I'll say.
02:02:22 - Jensen Huang (Other)
Your intelligence should be packaged up in a way that allows you to take it with you. We call them NIMs. And third, these NIMs are going to help you create a new type of application for the future, not one that you wrote completely from scratch, but you're going to integrate them like teams to create these applications. We have a fantastic capability between NIMs, the AI technology, the tools, NeMo, and the infrastructure, DGX Cloud, in our AI foundry to help you create proprietary applications, proprietary chatbots. And then, lastly, everything that moves in the future will be robotic. You're not going to be the only one. And these robotic systems, whether they are humanoids, AMRs, self-driving cars, forklifts, manipulating arms, they will all need one thing. Giant stadiums, warehouses, factories: there can be factories that are robotic, orchestrating factories, manufacturing lines that are robotics, building cars that are robotics. These systems all need one thing: they need a platform, a digital platform, a digital twin platform, and we call that Omniverse, the operating system of the robotics world.
02:03:35
These are the five things that we talked about today. What does NVIDIA look like? What does NVIDIA look like when we talk about GPUs? There's a very different image that I have when people ask me about GPUs. First, I see a bunch of software stacks and things like that. And second, I see this this is what we announced to you today. This is Blackwell, this is the platform.
02:04:00 - Mikah Sargent (Host)
You have to do the Bush.
02:04:03 - Jensen Huang (Other)
Amazing processors, nvlink switches, networking systems, and the system design is a miracle. This is Blackwell, and this, to me, is what a GPU looks like in my mind. Listen, orange green. I think we have one more treat for everybody. What do you think Should we?
02:04:31 - Jeff Jarvis (Host)
We're now over two hours.
02:04:32 - Jensen Huang (Other)
We have one more thing to show you Roll it.
02:04:35 - Mikah Sargent (Host)
Will those robots actually get off stage? They're gonna end up staying there. Oh good, there Now the universe. We've created a digital twin of Mars.
02:04:54 - Keynote (Announcement)
Oh, this is dorky, so the alien typing.
02:04:59 - Jeff Jarvis (Host)
these aliens had fingers in the spaceship, so show me Nvidia.
02:05:07 - Mikah Sargent (Host)
And the Miiva platform. Oh, it's so little. It just came out inside of one of these Blackwell platform towers and it's a tiny, tiny, tiny little ship flying over the different things we've seen today: the Blackwell GPUs, the Grace CPU, as well as probably the NVLink switch chip. I think I was sleeping. What's a DPU? I missed that one. I don't know what a DPU is either. Digital... There's the NVLink switch. There's the spine, which of course has all the wires to save on electricity. Oh, it's an amphibious ship.
02:06:11 - Jeff Jarvis (Host)
Water, it's water because it's water-cooled, so it's gotta go through the water. Oh, there's a leak.
02:06:17 - Mikah Sargent (Host)
Don't worry, they're on it. There was a little person inside of the ship I suppose it was supposed to look like. So yeah, that was just a little overview of what we saw.
02:06:53 - Jensen Huang (Other)
Thank you, have a great, have a great GTC, thank you.
02:06:57 - Mikah Sargent (Host)
I wonder if he had a single sip of water. He just simulated the water.
02:07:03 - Jeff Jarvis (Host)
I mean, he did do pretty well, considering he did the whole show himself.
02:07:06 - Mikah Sargent (Host)
Did the whole show himself. That's two hours of almost consistent and constant communication, paired with no rehearsal, allegedly, or very little. Enough?
02:07:19
Yeah, not enough rehearsal. I think it was an impressive job. This has been the Nvidia GTC keynote. I think you nailed it at the beginning when you suggested that this was an opportunity for Nvidia to say: we've seen how y'all out there are getting very excited about what OpenAI is doing, what Microsoft is doing, what Google is doing, what all of these companies are doing. But don't forget, we're the behind-the-scenes folks helping make all of that possible. This technology that is going to power the next set of whatever happens in AI is something we may not have heard about otherwise, because those companies are buying it and then building the tools that we actually use on top of it, and that's the stuff that we as consumers see. So I think you're dead on about what this was meant to be, outside of the weird math and jargon that was used a lot of the time, which did not feel very much meant for us.
02:08:20 - Jeff Jarvis (Host)
No, it was not at all.
02:08:22 - Mikah Sargent (Host)
We were strangers in a strange land.
02:08:25
But I don't know. I appreciated seeing some of the technology in the sense that, as I've mentioned, I've heard about digital twins before on the Microsoft Azure side without really conceptualizing what that meant beyond some simulation to go, oh, if I were to build a factory here, what would that look like? This actually being the place where you're training these robots is a really cool idea. I think that makes sense. And then, just overall, again, for me as a person who has to talk about this stuff regularly, knowing some of the behind the scenes, knowing the spine, if we want to use one of the terms from the conference, is very helpful. I'm curious about your thoughts. In particular, you can mention or remind the listeners of the conversation that we had briefly about how the CEO of NVIDIA said this brings the end of the computer scientist to the forefront. What do you think about that?
02:09:34 - Jeff Jarvis (Host)
So Huang said on some stage somewhere, I can't remember what it was, there's a video up online, that we don't need to train computer scientists anymore, probably overstated, but that instead what matters is domain expertise: that if you're a doctor, that expertise is what matters, and that you can talk to the machine, and that's what is miraculous about AI. So it brings it to our level, which I think is real and is true. And I talk a lot about wanting to start a new program in Internet Studies, or call it Internet AI or Technology Humanities. We've got to bring the humanities back into this. It's about humanity and how we interact with this. It's less about the technology.
02:10:21
One thing I learned in, pardon me, I'm going to do a plug here, The Gutenberg Parenthesis (if you watch this show, there it is, there it is) is that after some time, the technologists and the technology fade into the scenery. What matters is what we do with it. That was 150 years for print before we got the novel and the essay and the newspaper, and I think it's going to happen here too, with both the Internet and AI. Of course we need technologists, you and I can't build this stuff, right, and they're there, but what matters is what we do with it. I think that's what he was really saying: here's an incredible tool set, and it's up to you what you do with it.
02:11:03 - Mikah Sargent (Host)
Go forth and create, that attitude. And do you think that this will happen at a more rapid pace than that 150 years, or do you think it will still take a long time?
02:11:13 - Jeff Jarvis (Host)
I think it'll still take a long time. This is what people argue with me about, because I say that I think we have time. I think we still don't know what the Internet is. We still don't know what AI is. We're seeing the future in the analog of the past. We haven't really reinvented things yet. If you look online, shows look like shows and books look like books and newspapers look like newspapers, and I think it takes some time to reinvent this. Now, what he was trying to argue, basically, is that we aren't going to generate that; his machines will. I don't know about that. That bothers me.
02:11:43
What struck me more than anything else is scale. I have been arguing that the Internet, the connected world, scales to such an extent that small matters now: because everyone can connect to everyone, small groups and small communities can find each other and do things, and I very much believe that. Yeah, and so scale matters to the extent that it makes small possible, big enough. So that's my cant that I'm using in the next book, The Web We Weave, out this fall from Basic Books.
02:12:20
Sorry, but here the constant, I don't want to say drumbeat, the constant kind of jet noise of scale was somewhat overwhelming to me. Well, we made it this big, and now it's n times that big, it is a thousand times that big, it's faster and faster and faster. And I made reference earlier to the stochastic parrots paper, written by Timnit Gebru and Margaret Mitchell and Emily Bender and a fourth person whose name, I always apologize, I forget. They complained at the time, a couple of years ago, that this rush to make ever-bigger models is a mistake, because they can't be audited, they can't be tracked and traced, and it's kind of macho, and that we should go to smaller models. And if you listen to Yann LeCun at Meta, he talks about smaller models. He talks about different ways to learn, imagining a three-year-old learning language or learning movement, and I think that brings it back into a human scale of conception, whereas today blew all that apart. It's just, my God, it is bigger upon bigger upon bigger upon bigger, and that's what he wants as a business.
02:13:34 - Mikah Sargent (Host)
Right, yes, it's very, what, capitalistically minded in the background. Even if it's not just ego, it's that idea that, yes, a company that makes its money by selling these is obviously going to try to imagine doing more and more and more, and getting you to invest in more and more and more.
02:13:53 - Jeff Jarvis (Host)
Right. The last thing that strikes me is the relationship with his customers; it somewhat befuddles me. Yeah, you know, he brags about the advances of the Transformer, but he gives no credit to Google for that. He talks about the digital double. What do we call it? The digital twin, thank you. Twin, close. And, as you pointed out many times, that's kind of Microsoft's phrase. And yes, he mentions the customers, and yes, he mentions what they're doing, but he also is competing with them in creating models. And I don't understand the business well enough to get the full dynamics of that. I kind of think there's some trouble here, that he thinks, they think at Nvidia, you all need us. But Google makes its own chips too, Apple makes its own chips too, I don't know. There are other competitors out there, and Nvidia, Lord knows, is doing extremely well. It's amazing. But at what point does hubris tear them down, right? I don't know.
02:14:59 - Mikah Sargent (Host)
Yeah, and I'm still very confused about it as well, specifically in that area. I went to some of the pages that were mentioned, ai.nvidia.com, to learn more about the NIMs in particular, which was that thing they were talking about with their own AI models. But what's interesting is that all of the models that they are going to, to use a strange verb here, NIM, are actually models from other companies. So it's Stability AI's SDXL Turbo, it's Google's Gemma 7B, it's Meta's Llama 2 that you're actually interacting with, it's Stability AI's Stable Video Diffusion. So it's almost as if what Nvidia is providing is the platform on which to sit some of these open models, as opposed to them creating models of their own. So that may have just been, because, again, we talked about how it was all kind of, what, couched in jargon, maybe a misunderstanding.
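For a rough sense of what interacting with one of these hosted NIMs might look like in practice, here is a minimal sketch in Python. It assumes an OpenAI-compatible chat endpoint; the base URL, environment variable, and model name below are illustrative assumptions for this sketch, not NVIDIA's documented values.

```python
# Minimal sketch: calling a hosted NIM through an OpenAI-compatible API.
# The base URL, API key variable, and model name are assumptions for
# illustration only, not confirmed NVIDIA documentation.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed endpoint
    api_key=os.environ["NVIDIA_API_KEY"],            # hypothetical env var
)

response = client.chat.completions.create(
    model="google/gemma-7b",  # one of the third-party models mentioned above
    messages=[
        {"role": "user", "content": "Summarize the GTC 2024 keynote in one sentence."}
    ],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

The point of the sketch is the packaging: the model itself comes from another company, while the container, runtime, and endpoint are what Nvidia appears to be selling.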
02:16:05 - Jeff Jarvis (Host)
Yeah, I don't know, I don't understand. It sounded to me at one point that he was talking about them making their own models, right, and so I don't understand it well enough to know what's going on and what their market position is versus others. And finally, it's amazing, too, to think about how the world used to be owned by Intel and "Intel Inside." Now it's Nvidia inside everything.
02:16:29 - Mikah Sargent (Host)
Yeah, and they want to remind us of that. I think that's what today was about. You saw all of the companies mentioned, SAP being one that, as Nvidia pointed out themselves, runs a lot of the stuff behind the scenes at Nvidia too. That's a company that has deep roots in so much, so learning that they're using Nvidia in many ways was notable. There were some different kinds of terms, verb phrases, that were used.
02:16:58
That's still very confusing to me, and I plan to learn more about what it means. Particularly early on, when they were talking about being able to, oh goodness, I'm forgetting what the terminology was, but essentially a company could accelerate their compute by working with Nvidia in these specific ways, and it wasn't quite clear what that meant and what it meant to go forth with what Nvidia was providing. So there was a lot of, kind of... CUDA-accelerate, yeah, yeah, that's it, CUDA-accelerate, CUDA-accelerate, right, right. So what that actually means in practice was not made clear, I feel, other than we'll make your compute even more computing. That was kind of the extent of it, just to say compute as a noun.
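To give a rough, hedged picture of what "CUDA-accelerating" a workload can mean in practice, here is a minimal sketch assuming an NVIDIA GPU and the CuPy library, which exposes a NumPy-like interface that runs CUDA kernels under the hood. The printed timings are illustrative, not a benchmark, and will vary with hardware and warm-up effects.

```python
# Minimal sketch: the same matrix multiply on the CPU (NumPy) and on an
# NVIDIA GPU (CuPy, which runs CUDA kernels behind a NumPy-like API).
# Requires a CUDA-capable GPU and the cupy package; timings are illustrative.
import time
import numpy as np
import cupy as cp

n = 4096
a_cpu = np.random.rand(n, n).astype(np.float32)
b_cpu = np.random.rand(n, n).astype(np.float32)

start = time.perf_counter()
np.matmul(a_cpu, b_cpu)
print(f"CPU (NumPy): {time.perf_counter() - start:.3f} s")

# Copy the inputs to GPU memory, then run the same operation there.
a_gpu = cp.asarray(a_cpu)
b_gpu = cp.asarray(b_cpu)

start = time.perf_counter()
cp.matmul(a_gpu, b_gpu)
cp.cuda.Stream.null.synchronize()  # wait for the GPU to finish before timing
print(f"GPU (CuPy):  {time.perf_counter() - start:.3f} s")
```

In that framing, "accelerating your compute" mostly means moving the heavy, parallelizable math onto the GPU and its libraries rather than changing what the application does.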
02:17:53 - Jeff Jarvis (Host)
It's like, I hate using "gift" as a verb. Yeah, it's "give." It's not "gift," it's "give." And so now compute becomes a noun: how much compute you have. I understand, it's shorter than saying computing power or something else.
02:18:06 - Mikah Sargent (Host)
The amount of compute in a thing.
02:18:08 - Jeff Jarvis (Host)
But to me, it's a very recently popular phrase. I'm sure they've been using it in computer science for a long time, but now that AI comes along, it's out there in the language.
02:18:19 - Mikah Sargent (Host)
Yeah, that's just not clear, right? At least in the sense of everything else that this presentation, this two-hour presentation, provided, there were times where I genuinely did feel a little lost by what was being said.
02:18:33 - Jeff Jarvis (Host)
Oh yeah, I fully admit it. Yeah, I was the wrong person to have here, except that I'm interested in this, but that's it.
02:18:39 - Mikah Sargent (Host)
That's what we both are. We're enthusiasts, right? Exactly, exactly.
02:18:43 - Jeff Jarvis (Host)
Omniverse, accelerated computing, guardrail, oh, the guardrails, AI foundry, everything that moves will be robotic. Those were some of the other phrases that I wrote down that struck me. And then, of course, the embarrassing moment, which I tweeted about, is when he said to Grace Hopper, "good girl." Oof, that was, oof, oof.
02:19:07 - Mikah Sargent (Host)
Yeah, that was a bad moment there. It was, and I'm sure that will play out too. Interestingly, in many cases there are still... So the Grace CPU is still a part of this new GPU architecture, despite the upgrades to the Blackwell GPUs. A lot of that technology is kind of being updated in componentized, yeah, componentized areas. And ultimately, I remain curious about what companies can even afford some of the stuff that they were showing off today, because, wow. That goes back to scale.
02:19:50 - Jeff Jarvis (Host)
Yeah, I want to believe that open-source models in AI, brought down to the level of researchers in schools and startups, will do amazing things, and I believe that will be the case. But there's this pressure for huge, and you see these pictures of them multiplying servers upon racks upon racks upon things that can all talk to each other, and that's fine. And he said you've got to fill it with data, and that creates an omnivorous, speaking of omnipotent, hunger for data that's going to freak people out.
02:20:24
Yes, it will. Data's become a bad word, and I think that's a problem, because data is knowledge at some point. But they're going to try to hoover up everything they possibly can, and the problem remains that what is available is available from those who had the power to publish and the power to create in the past. That is what makes these systems biased. What we really should be talking about is what we add to them, what's missing from them, not to feed their omnivorous omnivores, omniverse, omniverses, but instead to find more diversity and value in how we present the world.
02:21:08 - Mikah Sargent (Host)
Yeah, perhaps, to these machines. Perhaps this is where it is a good opportunity, given what we hear about what's upcoming and sort of what's underway right now. We just heard from OpenAI that they're training on publicly available video, and I think about all of the everyday people for whom that was kind of their first foray into online publishing, and how maybe that will better capture the zeitgeist. Perhaps TikTok publishing and YouTube videos and all that is a better representation of the state of the everyday person, versus what I think you're saying, which is that traditionally it's been things from the New York Times, which the New York Times has problems with, and all of those publications that exist on the web that hold and form a large archive of what the internet is. Yeah, maybe this is going to make things a little bit more even in terms of where the information's coming from. We have to do that intentionally.
02:22:21 - Jeff Jarvis (Host)
Yeah, I see what you mean.
02:22:22 - Mikah Sargent (Host)
Yeah, right, and make that choice, make that choice.
02:22:25 - Jeff Jarvis (Host)
Exactly, and recognize what's not there and say we've gotta go forward. If you listen to Wikipedia, they do that regularly. They say we have a bunch of mainly white, male, I think it's heavily male, Wikipedians who work their butts off and do amazing things, great things, yeah, but they recognize that because some people are missing from the core of Wikipedians, some things are missing in Wikipedia, and they have to intentionally go out and encourage other people to do this. And I think, you know, when we have this talk about generative AI and training sets and how the New York Times says take our stuff down, all the publishers say take our stuff down and put up paywalls for the machines, well, what's the world we're presenting to the AI that's going to be out there making decisions, oftentimes decisions it shouldn't make?
02:23:12
Yeah, but it will be a skewed view of the world. And so the responsibility, to me, that we don't talk about is what we have to add into all of that, and that again goes back to stochastic parrots. If you're dealing with smaller models, you can audit them, you can see what's there and you can see what's missing. You get to the scale he was talking about today...
02:23:31 - Mikah Sargent (Host)
There's no way. You have to build AI to even come close to doing the task of auditing, and then it still doesn't give you, the human being, the full understanding of how it's been audited.
02:23:42 - Jeff Jarvis (Host)
There's no reasoning and there's no judgment and there's no sense of fact. There's no sense of meaning. Yeah, there were other jumps here. He didn't talk about AGI per se. He did talk about how one of the scales here was to learn and understand everything, and first comes pattern and then meaning and then generation; that's kind of the sequence that they have here. Well, there are a few things missing in there. Do you really understand meaning, machine? No, you don't. There isn't a sense of meaning, and so when you generate, you generate from something that doesn't have meaning. That's what generative AI does, right? It doesn't have meaning, and because it sounds like us, we think it does. But, as Emily Bender points out, we impute the meaning in what it produces. It has no sense of meaning, and so this entire industry thinks that it can find meaning, and it can't. And so, if it slowed down and filled in those gaps, as well as the functionality... but instead they just go blasting ahead with scale.
02:24:41 - Mikah Sargent (Host)
So, do you think that it's even, is it possible for it to actually have that meaning, first and foremost? Secondarily, how do you determine that it does have meaning? And then, I guess, thirdly, it's almost a bit of a Schrödinger's cat situation, where, like, it's the observation by us as human beings, providing that meaning to it, that gives it the meaning. So then, can it even be possible for it to have meaning if we don't ascribe meaning to it by giving it meaning? Wow.
02:25:23 - Jeff Jarvis (Host)
Yeah, this is what fascinates me about all this, and what I like about this field is because it forces us to ask questions of ourselves as humans. Mm-hmm, we call it artificial intelligence. Well, what is intelligence? The New York Times starts to believe that AI has a self, which I know it doesn't have. But okay, what is a self? How would we know if it had it? Would we know if it has a sense of meaning? What's the test for that?
02:25:51
The other part that always interests me is how these people, and I didn't hear this today, but you hear it from the AI people, the AGI people, make the goal to be human, to duplicate a human function. I don't know that that's the right scale for compute. I don't know that that's the right judgment of its power and its ability. And so does it have to have a self? No, it doesn't. Can it do the things we want it to do? That's the real test here, right? And we saw a little bit of that at the very end today, where, look, here's a neat car, we can design it, we can show you stuff, but it's got to be done in the context of our desires. Damn it, we're in charge.
02:26:34 - Mikah Sargent (Host)
Yeah, I see what you're saying there. It's this idea that we're pushing for it to be human. For what purpose, exactly, when really all we want is a tool that can do the things that we need it to do, in a new conception? I guess that's a little egotistical, huh? Exactly, yes, human is what it needs to be, because that's the best thing it could be, right?
02:27:03 - Jeff Jarvis (Host)
But it could be in some ways immensely more, and in other ways immensely less. You're right, it's egotistical in two ways. One, that we become the basis of judgment: if you're as smart as us, then you've done it, right? However, the next piece is, if I'm the one who makes the machine that makes it smarter than us, then look how smart I was to do it. Yeah, ego squared, wow. That's why this is a fun field to talk about, and this is how I blather with our friend Jason Howell on the AI Inside show every Wednesday, right there before TWiG. And we admit today we don't know half of what he was talking about, right, but it still affects us, and so we still can have a seat at this table to discuss it, and we need to.
02:27:58 - Mikah Sargent (Host)
We have to.
02:27:59
Yeah, because, as you said, exactly, we are not those people. We need those people to create the stuff, but they're creating all of this stuff for the impact it has on us and the tools that we will eventually have at our disposal. And so understanding as much of it early on, being aware of it, and then, I think more importantly, helping to give other people that understanding, such that we're all able to make informed decisions and form informed opinions about it, is all very important. Because, you're right, this is foot-on-the-gas in so many ways, and when you just develop a base fear or misunderstanding of a thing, that results in you being avoidant, sort of, oh, that's over there and I'm not going to pay any attention to it, and then suddenly it smacks you in the face. No, we need more engagement. We need people to be aware of it, so that we're all part of how this goes forward, especially if this is the next industrial revolution, the next new thing. It's got to be informed by everyone, not just the people with the CS degrees.
02:29:16 - Jeff Jarvis (Host)
Amen to that, exactly. That's why I like Huang saying we can stop training CS people and start caring about domain expertise. That goes to my heart, that's right. And he says that we can do that because the machine can listen and talk to us. I think that's correct; it can get this right. We can use our own language to command it, and that puts us back in charge. And I felt today a little loss of that power that he gave me only two weeks ago in that talk, but I think we can get it back.
02:29:46 - Mikah Sargent (Host)
Yeah. Well, this was a really good conversation. I appreciate you being here, Jeff Jarvis, to join me in talking about the, yes, GTC keynote. If folks want to check out your other work, where should they go to do that?
02:30:05 - Jeff Jarvis (Host)
GutenbergParenthesis.com, with a discount code, for The Gutenberg Parenthesis and my other book, Magazine. Thank you very much for the plug opportunity.
02:30:12 - Mikah Sargent (Host)
You're very welcome. You can find me all this week on many a show, as Leo Laporte is on vacation, so tune in all throughout the week. Tomorrow it will be Security Now, Wednesday Windows Weekly, and Thursday will be... Wait, wait, wait, you left one out. Oh my gosh, TWiG is also on. Yes, TWiG as well.
02:30:33 - Jeff Jarvis (Host)
I'm so sorry. Are we chopped liver?
02:30:34 - Mikah Sargent (Host)
No, you're not chopped liver. I'm so sorry. I was thinking, because there's a special event on Thursday morning. Did you see me?
02:30:40 - Jeff Jarvis (Host)
I'm done with Jarvis now. I can forget him for the week. Oh, no more of him.
02:30:46 - Mikah Sargent (Host)
We will be on TWiG. I'm very much looking forward to it. Actually, the last time I was on TWiG I had a very, very good time, and that was a really re-energizing moment, so I appreciate that. Thursday morning we will be doing a news event for Microsoft, where they're talking about AI, and then, of course, Tech News Weekly on Thursday as well. So, yeah, tune in this week if you want to hear more of me. And thank you all for tuning in to this event, Nvidia GTC 2024. I'll catch you later. Bye-bye.