Talking to Jensen Huang should come with a warning label. The Nvidia CEO is so invested in where AI is headed that, after nearly 90 minutes of spirited conversation, I came away convinced the future will be a neural net nirvana. I could see it all: a robot renaissance, medical godsends, self-driving cars, chatbots that remember. The buildings on the company’s Santa Clara campus weren’t helping. Wherever my eyes landed I saw triangles within triangles, the shape that helped make Nvidia its first fortunes. No wonder I got sucked into a fractal vortex. I had been Jensen-pilled.
Huang is the man of the hour. The year. Maybe even the decade. Tech companies literally can’t get enough of Nvidia’s supercomputing GPUs. This is not the Nvidia of old, the supplier of Gen X video game graphics cards that made images come to life by efficiently rendering zillions of triangles. This is the Nvidia whose hardware has ushered in a world where we talk to computers, they talk back to us, and eventually, depending on which technologist you talk to, they overtake us.
For our meeting, Huang, who is now 61, showed up in his trademark leather jacket and minimalist black sneakers. He told me on that Monday morning that he hates Monday mornings, because he works all day Sunday and starts the official work week already tired. Not that you’d know it. Two days later, I attended a health care investment symposium—so many biotech nerds, so many blazers—and there onstage was Huang, energetic as ever.
“This is not my normal crowd. Biologists and scientists, it’s such an angry crowd,” Huang said into a microphone, eliciting laughter. “We use words like creation and improve and accelerate, and you use words like target and inhibit.” He worked his way up to his pitch: “If you want to do your drug design, your drug discovery, in silico, it is very likely that you’ll have to process an enormous amount of data. If you’re having a hard time with computation of artificial intelligence, you know, just send us an email.”
Huang has made a pattern of positioning Nvidia in front of every big tech trend. In 2012 a small group of researchers released a groundbreaking image recognition system, called AlexNet, that used GPUs, instead of CPUs, to crunch through its training data, and launched a new era of deep learning. Huang promptly directed the company to chase AI full steam ahead. When, in 2017, Google released the novel neural network architecture known as a transformer—the T in ChatGPT—and ignited the current AI gold rush, Nvidia was in a perfect position to start selling its AI-focused GPUs to hungry tech companies.
Nvidia now accounts for more than 70 percent of sales in the AI chip market and is approaching a $2 trillion valuation. Its revenue for the last quarter of 2023 was $22 billion—up 265 percent from the year prior. And its stock price has risen 231 percent in the last year. Huang is either uncannily good at what he does or ridiculously lucky—or both!—and everyone wants to know how he does it.
But no one reigns forever. He’s now in the crosshairs of the US-China tech war and at the mercy of regulators. Some of Huang’s challengers in the AI chip world are household names—Google, Amazon, Meta, and Microsoft—and have the deepest pockets in tech. In December the semiconductor company AMD rolled out a large processor for AI computing that is meant to compete with Nvidia. Startups are taking aim too. In last year’s third quarter alone, venture capitalists funneled more than $800 million into AI chips, according to the research firm PitchBook.
So Huang never rests. Not even during interviews, as I learned when, to my surprise, he started interviewing me, asking me where I was from and how I ended up living in the Bay Area.
Jensen Huang: You and I are both Stanford grads.
Lauren Goode: Yes. Well, I went to the journalism program, and you did not go to the journalism program.
I wish I had.
Why is that?
Well, somebody who I really admire, as a leader and a person, is Shantanu Narayen, the CEO of Adobe. He said he always wanted to be a journalist because he loved telling stories.
It seems like an important part of building a business, being able to tell its story effectively.
Yes. Strategy setting is storytelling. Culture building is storytelling.
You’ve said many times you didn’t sell the idea of Nvidia based on a pitch deck.
That’s right. It was really about telling the story.
So I want to start with something that another tech executive told me. He noted that Nvidia is one year older than Amazon, but in many ways Nvidia has more of a “day one” approach than Amazon does. How do you maintain that outlook?
That’s really a good phrase, frankly. I wake up every morning like it’s day one, and the reason is there’s always something we’re doing that has never been done before. There’s also the vulnerable side of it. We very well could fail. Just now, I was having a meeting where we’re doing something that is brand-new for our company, and we don’t know how to do it right.
What is the new thing?
We’re building a new type of data center. We call it an AI factory. The way data centers are built today, you have a lot of people sharing one cluster of computers and putting their files in this one large data center. An AI factory is much more like a power generator. It’s quite unique. We’ve been building it over the last several years, but now we have to turn this into a product.
What are you going to call it?
We haven’t given it a name yet. But it will be everywhere. Cloud service providers will build them, and we’ll build them. Every biotech company will have it. Every retail company, every logistics company. Every car company in the future will have a factory that builds the cars—the actual goods, the atoms—and a factory that builds the AI for the cars, the electrons. In fact, you see Elon Musk doing that as we speak. He’s well ahead of most in thinking about what industrial companies will look like in the future.
You’ve said before that you run a flat organization, with 30 to 40 executives who report directly to you, because you want to be in the information flow. What has piqued your interest lately that makes you think, “I may need to bet Nvidia on this eventually”?
Information doesn’t have to flow from the top to the bottom of an organization, as it did back in the Neanderthal days when we didn’t have email and texts and all those things. Information can flow a lot more quickly today. So a hierarchical tree, with information being interpreted from the top down to the bottom, is unnecessary. A flat network allows us to adapt a lot more quickly, which we need because our technology is moving so quickly.
If you look at the way Nvidia’s technology has moved, classically there was Moore’s law doubling every couple of years. Well, in the course of the last 10 years, we’ve advanced AI by about a million times. That’s many, many times Moore’s law. If you’re living in an exponential world, you don’t want information to be propagated from the top down one layer at a time.
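[To put rough numbers on Huang’s comparison, the arithmetic below is mine, not his: a doubling every two years compounds to about a 32-fold gain over a decade, while a million-fold advance over the same decade works out to roughly one doubling every six months.]

```latex
% Moore's law pace over a decade, at one doubling every two years:
%   2^{10/2} = 2^5 = 32x
% The million-fold AI advance Huang cites over the same decade:
%   10^6 \approx 2^{20}, i.e., about 20 doublings, or one every ~6 months
\[
\underbrace{2^{10/2} = 32\times}_{\text{Moore's law pace}}
\qquad\text{vs.}\qquad
\underbrace{10^{6} \approx 2^{20}\times}_{\text{cited AI advance}}
\]
```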
But I’m asking you, what’s your Roman Empire, to borrow the meme? What’s today’s version of the transformer paper? What’s happening right now that you feel is going to change everything?
There are a couple things. One of them doesn’t really have a name, but it’s some of the work that we’re doing in foundational robotics. If you could generate text, if you could generate images, can you also generate motion? The answer is probably yes. And then if you can generate motion, you can understand intent and generate a generalized version of articulation. Therefore, humanoid robotics should be right around the corner.
And I think the work around state-space models, or SSMs, that allow you to learn extremely long patterns and sequences without growing quadratically in computation, probably is the next transformer.
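[For readers who want to see what “without growing quadratically” means, here is a minimal sketch of the state-space recurrence in Python. It is illustrative only: the matrix names follow the standard SSM convention, the sizes and random initialization are invented, and real SSM architectures such as Mamba add input-dependent dynamics and hardware-aware scans on top of this skeleton.]

```python
# A minimal sketch of the state-space idea, not any production SSM (such as
# Mamba) and not Nvidia code. A fixed-size hidden state is updated once per
# token, so compute grows linearly with sequence length, instead of
# quadratically, as with attention's token-to-token score matrix.
# The sizes and random matrices (A, B, C) are purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
d_state, d_in = 16, 4                            # arbitrary toy sizes
A = rng.normal(size=(d_state, d_state)) * 0.1    # state transition
B = rng.normal(size=(d_state, d_in))             # input projection
C = rng.normal(size=(d_in, d_state))             # output projection

def ssm_scan(xs: np.ndarray) -> np.ndarray:
    """Linear recurrence h_t = A h_{t-1} + B x_t, y_t = C h_t: O(L) in length L."""
    h = np.zeros(d_state)
    ys = []
    for x in xs:              # one fixed-cost state update per token
        h = A @ h + B @ x
        ys.append(C @ h)
    return np.stack(ys)

def attention_scores(xs: np.ndarray) -> np.ndarray:
    """By contrast, self-attention materializes an L x L matrix: O(L^2)."""
    return xs @ xs.T

xs = rng.normal(size=(1000, d_in))    # a 1,000-token toy sequence
print(ssm_scan(xs).shape)             # (1000, 4): the state never grows
print(attention_scores(xs).shape)     # (1000, 1000): quadratic blowup
```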
What does that enable? What’s a real-life example?
You could have a conversation with a computer that lasts a very long time, and yet the context is never forgotten. You could even change topics for a while and come back to an earlier one, and that context could be retained. You might be able to understand the sequence of an extremely long chain, like a human genome. And just by looking at the genetic code, you understand its meaning.
How far away are we from that?
In the recent past, from the time that we had AlexNet to superhuman AlexNet, that was only about five years. A robotic foundation model is probably around the corner—I’ll call it next year sometime. From that point, five years down the road, you’re going to see some pretty amazing things.
Which industry stands to benefit the most from a broadly trained model for robot behavior?
Well, heavy industries represent the largest industries in the world. Moving electrons is not easy, but moving atoms is extremely hard. Transportation, logistics, moving heavy things from one place to another, discovering the next drug—all of that requires an understanding of atoms, molecules, proteins. Those are the large, incredible industries that AI hasn’t affected yet.
You mentioned Moore’s law. Is it irrelevant now?
Moore’s law is now much more of a systems problem than a chip problem. It’s much more about the interconnectivity of multiple chips. About 10, 15 years ago, we started down the journey of disaggregating the computer so that you could take multiple chips and connect them together.
Which is where your 2019 acquisition of the Israeli company Mellanox comes in. Nvidia said at the time that modern computing had put enormous demands on data centers and that Mellanox’s networking technology would make accelerated computing more efficient.
Right, exactly. We bought Mellanox so that we could take an extension of our chip and make an entire data center into a super chip, which enabled the modern AI supercomputer. That was really about recognizing that Moore’s law has come to an end and that if we want to continue to scale computing we have to do it at data center scale. We looked at the way Moore’s law was formulated, and we said, “Don’t be limited by that. Moore’s law is not a limiter to computing.” We have to leave Moore’s law behind so we can think about new ways of scaling.
Mellanox is now recognized as a really smart acquisition for Nvidia. More recently, you attempted to acquire Arm, one of the most important chip IP companies in the world, until you were thwarted by regulators.
That would’ve been wonderful!
I’m not sure the US government agrees, but yes, let’s put a pin in that. When you think about acquisitions now, what specific places are you looking at?
The operating system of these large systems is insanely complex. How do you create an operating system in a computing stack that orchestrates the tens of millions, hundreds of millions, and now coming up to billions of little tiny processors that are in our GPUs? That’s a very hard problem. If there are teams outside our company that do that, we can either partner with them or we could do more than that.
So what I hear you saying is that it’s crucial for Nvidia to have an operating system and to build it into more of a platform, really.
We are a platform company.
The more you become a platform, the more problems you face. People tend to put a lot more onus and responsibility on a platform for its output. How the self-driving car behaves, what the margin of error is on the health care device, whether there’s bias in an AI system. How do you address that?
We’re not an application company, though. That’s probably the easiest way to think about it. We will do as much as we have to, but as little as we can, to serve an industry. So in the case of health care, drug discovery is not our expertise, computing is. Building cars is not our expertise, but building computers for cars that are incredibly good at AI, that’s our expertise. It’s hard for a company to be good at all of those things, frankly, but we can be very good at the AI computing part of it.
Last year reports emerged that some of your customers were waiting several months for your AI GPUs. How are things looking now?
Well, I don’t think we’re going to catch up on supply this year. Not this year, and probably not next year.
What’s the current wait time?
I don’t know what the current lead time is. But, you know, this year is also the beginning of a new generation for us.
Do you mean Blackwell, your rumored new GPU?
That’s right. It’s a new generation of GPUs coming out, and the performance of Blackwell is off the charts. It’s going to be incredible.
Does that equate to customers needing fewer GPUs?
That’s the goal. The goal is to reduce the cost of training models tremendously. Then people can scale up the models they want to train.
Nvidia invests in a lot of AI startups. Last year it was reported that you invested in more than 30. Do those startups get bumped up in the waiting line for your hardware?
They face the same supply crunch as everyone else, because most of them use the public cloud, so they have to negotiate for themselves with the public cloud service providers. What they do get, though, is access to our AI technology, meaning they get access to our engineering capabilities and our special techniques for optimizing their AI models. We make it more efficient for them. If your throughput goes up by a factor of five, you’re essentially getting five times the GPUs. So that’s what they get from us.
Do you consider yourself a kingmaker in that regard?
No. We invest in these companies because they’re incredible at what they do. It’s a privilege for us to be investing in them, not the other way around. These are some of the brightest minds in the world. They don’t need us to support their credibility.
What happens as machine learning turns more toward inference rather than training—basically, if AI work becomes less computationally intensive? Does that reduce the demand for your GPUs?
We love inference. In fact, I would say that Nvidia’s business today is probably, if I were to guess, 70 percent inference, 30 percent training. The reason why that’s a good thing is because that’s when you realize AI is finally making it. If Nvidia’s business is 90 percent training and 10 percent inference, you could argue that AI is still in research. That was the case seven or eight years ago. But today, whenever you type a prompt into a cloud and it generates something—it could be a video, it could be an image, it could be 2D, it could be 3D, it could be text, it could be a graph—it’s most likely that there’s an Nvidia GPU behind it.
Do you see demand waning at any point for your GPUs for AI?
I think we’re at the beginning of the generative AI revolution. Today most of the computing that’s done in the world is still retrieval-based. Retrieval means you touch something on your phone and it sends a signal out to the cloud to retrieve a piece of information. It might compose a response with a few different things and, using Java, present it to you on your phone, on your nice screen. In the future, computing is going to be more RAG-based. [Retrieval-augmented generation is a framework that lets a large language model pull in information from sources outside its trained parameters.] The retrieval part of it will be less, and the personalized generation part will be much, much higher.
That generation will be done by a GPU somewhere. So I think we’re in the beginning of this retrieval-augmented, generative computing revolution, and generative AI is going to be integral to almost everything.
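[To make the retrieval-augmented pattern concrete, here is a toy sketch of the loop: score a small corpus against a query, pull the top passages, and hand them to a generator. Everything in it, the corpus, the bag-of-words scoring, and the stubbed generate function, is my illustration, not how any production RAG system, or Nvidia, does it; real systems use learned embeddings, a vector database, and an actual GPU-backed LLM call.]

```python
# A toy sketch of retrieval-augmented generation (RAG). The corpus, the
# bag-of-words similarity, and the stubbed generate() are all invented for
# illustration; production systems use learned embeddings, a vector store,
# and a real LLM behind the final call.
from collections import Counter
import math

CORPUS = [
    "Nvidia GPUs accelerate both training and inference for AI models.",
    "Mellanox networking links many chips into one data-center-scale computer.",
    "CoWoS packaging combines chip dies and memory in a single package.",
]

def bow(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """The retrieval step: pull the k most relevant passages."""
    q = bow(query)
    return sorted(CORPUS, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

def generate(prompt: str) -> str:
    """Placeholder for the GPU-backed generation step Huang points to."""
    return f"[an LLM would generate an answer here, conditioned on:\n{prompt}]"

query = "How does Nvidia scale computing beyond a single chip?"
context = "\n".join(retrieve(query))
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))
```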
The latest news is that you’ve been working with the US government to come up with sanctions-compliant chips that you can ship to China. My understanding is that these are not the most advanced chips. How closely were you working with the administration to ensure that you could still do business in China?
Well, to take a step back, it’s an export control, not sanctions. The United States has determined that Nvidia’s technology and this AI computing infrastructure are strategic to the nation and that export control would apply to it. We complied with the export control the first time—
In August 2022.
Yes. And the United States added more provisions to the export control in 2023, which caused us to have to reengineer our products again. So we did that. We’re in the process of coming up with a new set of products that are in compliance with today’s export control rules. We work closely with the administration to make sure that what we come up with is consistent with what they had in mind.
How big is your concern that these constraints will spur China to spin up competitive AI chips?
China has things that are competitive.
Right. This isn’t data-center scale, but the Huawei Mate 60 smartphone that came out last year got some attention for its homegrown 7-nanometer chip.
Really, really good company. They’re limited by whatever semiconductor processing technology they have, but they’ll still be able to build very large systems by aggregating many of those chips together.
How concerned are you in general, though, that China will be able to match the US in generative AI?
The regulation will limit China’s ability to access state-of-the-art technology, which means the Western world, the countries not limited by the export control, will have access to much better technology, which is moving fairly fast. So I think the limitation puts a lot of cost burden on China. You can always, technically, aggregate more of the chipmaking systems to do the job. But it just increases the cost per unit on those. That’s probably the easiest way to think about it.
Does the fact that you’re building compliant chips to keep selling in China affect your relationship with TSMC, Taiwan’s semiconductor pride and joy?
No. A regulation is specific. It’s no different than a speed limit.
You’ve said quite a few times that of the 35,000 components that are in your supercomputer, eight are from TSMC. When I hear that, I think that must be a tiny fraction. Are you downplaying your reliance on TSMC?
No, not at all. Not at all.
So what point are you trying to make with that?
I’m simply emphasizing that in order to build an AI supercomputer, a whole lot of other components are involved. In fact, in our AI supercomputers, just about the entire semiconductor industry partners with us. We already partner very closely with Samsung, SK Hynix, Intel, AMD, Broadcom, Marvell, and so on and so forth. In our AI supercomputers, when we succeed, a whole bunch of companies succeed with us, and we’re delighted by that.
How often do you talk to Morris Chang or Mark Liu at TSMC?
All the time. Continuously. Yeah. Continuously.
What are your conversations like?
These days we talk about advanced packaging, planning for capacity for the coming years, for advanced computing capacity. CoWoS [TSMC’s proprietary method for cramming chip dies and memory modules into a single package] requires new factories, new manufacturing lines, new equipment. So their support is really, really quite important.
I recently had a conversation with a generative-AI-focused CEO. I asked who Nvidia’s competitors might be down the road, and this person suggested Google’s TPU. Other people mention AMD. I imagine it’s not such a binary to you, but who do you see as your biggest competitor? Who keeps you up at night?
Lauren, they all do. The TPU team is extraordinary. The bottom line is, the TPU team is really great, the AWS Trainium team and the AWS Inferentia team are really extraordinary, really excellent. Microsoft has their internal ASIC development that’s ongoing, called Maia. Every cloud service provider in China is building internal chips, and then there’s a whole bunch of startups that are building great chips, as well as existing semiconductor companies. Everybody’s building chips.
That shouldn’t keep me up at night—because I should make sure that I’m sufficiently exhausted from working that no one can keep me up at night. That’s really the only thing I can control.
But what wakes me up in the morning is surely that we have to keep building on our promise, which is, we’re the only company in the world that everybody can partner with to build AI supercomputers at data-center scale and at the full stack.
I have some personal questions I wanted to ask you.
[Huang to a public relations representative.] She’s done her homework. Not to mention, I’m just enjoying the conversation.
I’m glad. I am as well. I did want to—
By the way, whenever Morris, or people who I’ve known a long time, ask me to be the moderator of interviews, the reason for that is because I’m not going to sit there and interview them by asking them questions. I’m just having a conversation with them. You have to be empathetic to the audience and what they might want to hear about.
So I asked ChatGPT a question about you. I wanted to know if you had any tattoos, because I was going to propose that for our next meetup, that we get you a tattoo.
If you get a tattoo, I’ll get one.
I already have one, but I’ve been looking to expand.
I have one too.
Yes. This is what I learned from ChatGPT. It said Jensen Huang got a tattoo of the company logo when the stock price reached $100. Then it said, “However, Huang has expressed that he’s unlikely to get any more tattoos, noting the pain was more intense than he anticipated.” It said you cried. Did you cry?
A little bit. My recommendation is you should have a shot of whiskey before you do it. Or take Advil. I also think that women can take a lot more pain, because my daughter has a fairly large tattoo.
So if you were down to get a tattoo, I was thinking a triangle might be nice, because who doesn’t like triangles? They’re perfect geometry.
Or the silhouette of Nvidia’s building! It’s composed of triangles.
That’s a commitment. I was wondering, how often do you personally use ChatGPT or Bard, or the like?
I’ve been using Perplexity. I enjoy ChatGPT as well. I use both almost every day.
For what?
Research. For example, computer-aided drug discovery. Maybe you would like to know about the recent advancements in computer-aided drug discovery. And so you want to frame the overall topic so that you could have a framework, and from that framework, you could ask more and more specific questions. I really love that about these large language models.
I heard you used to lift weights. Do you still do that?
No. I’ll try to do 40 push-ups a day. That doesn’t take any longer than a couple of minutes. I’m a lazy exerciser. I’ll do squats while I’m brushing my teeth.
Recently you made a comment on the Acquired podcast that went viral. The hosts asked, if you were 30 years old today and were thinking about starting a company, what would you start? And you said you wouldn’t start a company at all. Do you have any amendments to that?
That question could be answered in two ways, and I answered it this way, which is: If I knew then all the things that I know now, I would be too intimidated to do it. I would be too afraid. I wouldn’t have done it.
You have to be somewhat delusional to start a business.
That’s the advantage of ignorance. You don’t know how hard it’s going to be, you don’t know how much pain and suffering is involved. When I meet entrepreneurs these days, and they tell me how easy it’s going to be, I’m very supportive of them, and I don’t actually try to burst their bubble. But I know in the back of my mind: “Oh, boy, this is not going to turn out anything like they think.”
What would you say is the biggest sacrifice that you’ve had to make in running Nvidia?
The same sacrifices other entrepreneurs make. You work really, really hard. And for a long time, nobody thinks you’re going to succeed. You’re alone in believing that you’re going to make it. The insecurities, the vulnerability, sometimes humiliation, it’s all true. Nobody talks about it, but it’s all true. CEOs and entrepreneurs are human like anybody else. And when they fail publicly, it’s embarrassing.
So when somebody said, “Jensen, with everything you have today, you wouldn’t have started it?” Like, “No, no, no, of course not.” But if I had known then that Nvidia would become what it is today, would I have started the company? Are you kidding me? I would’ve sacrificed everything to do it.
Source: Wired