Why Amazon Web Services CEO Matt Garman Is Playing the Long Game on AI


Matt Garman took the helm at Amazon Web Services (AWS), the cloud computing arm of the U.S. tech giant, in June, but he joined the business around 19 years ago as an intern. He went on to become AWS’s first product manager and helped build and launch many of its core services before being named CEO last year.

Like many other tech companies, AWS, which is Amazon’s most profitable unit, is betting big on AI. In April 2023, the company launched Amazon Bedrock, which gives cloud customers access to foundation models built by AI companies including Anthropic and Mistral. At its re:Invent conference in Las Vegas in December, AWS made a series of announcements, including a new generation of foundation AI models, called Nova. It also said that it is building one of the world’s most powerful AI supercomputers with Anthropic, with which it has a strategic partnership, using a giant cluster of AWS’s Trainium 2 training chips.

TIME spoke with Garman a few days after the re:Invent conference about his AI ambitions, how he’s thinking about ensuring the technology is safe, and how the company is balancing its energy needs with its emissions targets.

This interview has been condensed and edited for clarity.

When you took over at AWS in June, there was a perception that Amazon had fallen behind somewhat in the AI race. What have your strategic priorities been for the business over the past few months?

We’ve had a long history of doing AI inside of AWS, and in fact, most of the most popular AI services that folks have used over the last decade, like SageMaker, have all been built on AWS. With generative AI we started to really lean in, particularly when ChatGPT came out. I think everybody was excited about that, and it sparked everyone’s imagination. We [had] been working on generative AI, actually, for a little while before that. And our belief at the time, and it still remains now, was that AI was going to be a transformational technology for every single industry, workflow, and user experience that’s out there. And because of who our customer base is, our strategy was always to build a robust, secure, performant, featureful platform that people could really integrate into their actual businesses. And so we didn’t rush to throw a chatbot up on our website. We really wanted to help people build a platform that could deeply integrate with their data and that would protect their data. That’s their IP, and it’s super important for them, so [we] had security front of mind, gave you choice across a whole bunch of models, gave you capabilities across a whole bunch of things, and really helped you build [AI] into your application and figure out how you could actually get inference and really leverage this technology on an ongoing basis as a key part of what you do in your enterprise. And so that’s what we’ve been building for the last couple of years. In the last year we started to see people realize that that is what they wanted to [do], as companies started moving from launching a hundred proofs of concept to really wanting to move to production. They realized that the platform is what they needed. They had to be able to leverage their data. They wanted to customize models. They wanted to use a bunch of different models. They wanted to have guardrails. They needed to integrate with their own enterprise data sources, a lot of which lived on AWS, and so their applications were on AWS.

We took that long-term view of: get the right build, the right platform, with the right security controls and the right capabilities, so that enterprises could build for the long term, as opposed to [trying to] get something out quickly. And so we’re willing to accept the perception that people thought we were behind, because we had the conviction that we were building the right thing. And I think our customers largely agree.

You’re offering $1 billion worth of cloud credits for startups, in addition to the millions offered previously. Do you see that opening up opportunities for closer tie-ups at an earlier stage with the next Anthropic or OpenAI?

Yeah, we’ve long invested in startups. It’s one of the core customer bases that AWS has built our business on. We view startups as important to the success of AWS. They give us a lot of great insight. They love using cutting-edge technologies. They give us feedback on how we can improve our products. And frankly, they’re the enterprises of tomorrow, so we want them to start building on AWS. And so from the very earliest days of AWS, startups have been critically important to us, and that’s just doubling down on our commitment to them to help them get going. We recognize that as a startup, getting some help early on, before you get your business going, can make a huge difference. That’s one of the things that we think helps us build that positive flywheel with that customer base. So we’re super excited about continuing to work deeply with startups, and that commitment is part of that. 

You’re also building one of the largest AI supercomputers in the world, with the Trainium 2 chips. Is building the hardware and infrastructure for AI development at the center of your AI strategy? 

It’s a core part of it, for sure. We have this idea that, across all of our AWS businesses, choice is incredibly important for our customers. We want them to be able to choose from the very best technology, whether it comes from us or from third parties. Customers can pick the absolute best product for their application, for their use case, and for what they’re looking for from a cost-performance trade-off. And so, on the AI side, we want to provide that same amount of choice. Building Trainium 2, which is our second generation of high-performance AI chip, we think is going to provide that choice.

Nvidia is an incredibly important partner of ours. Today, the vast majority of AI workloads run on Nvidia technology, and we expect that to continue for a very long time. They make great products, and the team executes really well. And we’re really excited about the choice that Trainium 2 brings. Cost is one of the things that a lot of people worry about when they think about some of these AI workloads, and we think that Trainium 2 can help lower the cost for a lot of customers. And so we’re really excited about that, both for AI companies who are looking to train on these massive clusters ([for example] Anthropic is going to be training their next-generation, industry-leading model on Trainium 2, and we’re building a giant cluster that’s five times the size of their last one) and for the broad swath of folks that are doing inference, using Bedrock, or making smaller clusters. I think there’s a good opportunity for customers to lower costs with Trainium.

Those clusters are said to be 30% to 40% cheaper than comparable Nvidia GPU clusters. What technical innovations are enabling these cost savings?

Number one is that the team has done a fantastic job and produced a really good chip that performs really well. And so on an absolute basis, it gives better performance for some workloads. It’s very workload-dependent, but even Apple [says that] in early testing, they see up to a 50% price-performance benefit. That’s massive, if you can really get 30%, 40%, even 50% gains. Some of that is pricing, where we focused on building a chip that we think can really materially lower the cost for customers. But it’s also increased performance: where we see bottlenecks in AI training and inference, the team has built innovations into the chips to improve the performance of particular functions. There are probably hundreds of thousands of things that go into delivering that type of performance, but we’re quite excited about it, and we’re invested long term in the Trainium line.

The company recently announced the Nova foundation model. Is that aimed at competing directly with the likes of GPT-4 and Gemini?

Yes. We think it’s important to have choice in the realm of these foundational models. Is it a direct competitor? We do think that we can deliver differentiated capabilities and performance. I think that this is such a big opportunity, and it has such material potential to change so many different workloads. Of these really large foundational models, I think there’ll be half a dozen to a dozen of them, probably fewer than 10. And I think they’ll each be good at different things. [With] our Nova models, we focused on: how do we deliver really low latency [and] great price performance? They’re actually quite good at doing RAG [retrieval-augmented generation] and agentic workflows. There are some other models that are better at other things today too. We’ll keep pushing on it. I think there’s room for a number of them, but we’re very excited about the models, and the customer reception has been really good.

How does your partnership with Anthropic fit into this strategy?

I think they have one of the strongest AI teams in the world, and they have the leading model in the world right now. I think most people consider Sonnet to be the top model for reasoning, for coding, and for a lot of other things as well. We get a lot of great feedback from customers on them. So we love that partnership, and we learn a lot from them too, as they build their models on top of Trainium. There’s a nice flywheel benefit where we get to learn from them building on top of us, our customers get to take advantage of their models inside of Bedrock, and we can grow the business together.

How are you thinking about ensuring safety and responsibility in the development of AI?

It’s super important, and it goes up and down the stack. One of the reasons customers are excited about models from us, in addition to them being very performant, is that we care a ton about safety. And so there are a couple of things. One is, you have to start from the beginning when you’re building the models: you think about how to have as many controls in there as possible and how to develop the models safely. And then I think you need belt and suspenders in this space, because you can, of course, make models say things and then say, “oh, look what they said.” Practically speaking, our customers are trying to integrate these into their applications. And beyond things like producing a recipe for a bomb, which we definitely want to have security controls around, safety and control of models actually extends to very specific use cases. If you’re building an insurance application, you don’t want your application to give out healthcare advice, whereas if you’re building a healthcare one, you may. So we give a lot of controls to the customers so that they can build guardrails around the responses from models, to really help guide how they want models to answer those questions. We launched a number of enhancements at re:Invent, including what we call automated reasoning checks, which can actually give you a mathematical proof, when we can be 100% sure, that an answer coming back is correct, based on the corpus of data that you have fed into the model. Eliminating hallucinations for a subset of answers is also super important. What’s unsafe in the context of a customer’s application can vary pretty widely, and so we try to give some really good controls for customers to be able to define that, because it’s going to depend on the use cases.

Energy requirements are a huge challenge for this business. Amazon is committed to a net zero emissions target by 2040 and you reported some progress there. How are you planning to continue reducing emissions while investing in large-scale infrastructure for AI?

Number one is you just have to have that long-term view as to how we ensure that the world has enough carbon-zero power. We’ve been the single biggest purchaser of renewable energy, commissioning new energy for the grid through new solar farms, wind farms, etc. We’ve been the biggest corporate purchaser in each of the last five years, and we will continue to do that. Even on that path, it may not be fast enough, and so we’ve actually started investing in nuclear. I do think that’s an important component; it’ll be part of that portfolio. That can be both large-scale nuclear plants and small modular reactor technology, which we’ve invested in and are very bullish about, and which is probably six or seven years out from really being in mass production. But we’re optimistic that that can be another solve as part of that portfolio as well.

On the path to carbon zero across the whole business, there’s a lot of invention that still needs to happen. And I won’t sit here and tell you we know all of the answers for how you’re going to have carbon-zero shipping across oceans and on airplanes for the retail side of it. There’s a whole bunch of challenges that the world has to go after, but that’s part of why we made that commitment. We’re putting together plans with milestones along the way, because it’s an incredibly important target for us. There’s a lot of work to do, but we’re committed to doing it.

And as part of that nuclear piece, you’re supporting the development of these nuclear energy projects. What are you doing to ensure that the projects are safe in the communities where they’re deployed?

Look, I actually think one of the worst things for the environment was the mistakes the nuclear industry made back in the ’50s, because they made everyone feel like the technology wasn’t that safe. It may not have been way back then, but it’s been 70 years, the technology has evolved, and it is actually an incredibly safe, secure technology now. A lot of these things are actually fully self-contained, and there is no risk of a big meltdown or those kinds of events that happened before. It’s a super safe technology that has been well tested and has been in production across the world safely for multiple decades now. There’s still some fear, I think, from people, but increasingly many geographies are realizing it’s quite a safe technology.

What do you want to see in terms of policy from the new presidential administration?

We consider the U.S. government to be one of our most important customers, one that we support up and down the board and will continue to support. So we’re very excited, and we know many of those folks and are excited to continue to work on that mission together, because we do view it as a mission. It’s a good business for us, but it’s also an opportunity to help our country move faster, control costs, and be more agile. And I think it’s super important, as you think about where the world is going, for our government to have access to the latest technologies. I do think AI and technology are increasingly becoming an incredibly important part of our national defense, probably as much so as guns and other things like that, and so we take that super seriously, and we’re excited to work with the administration. I’m optimistic that President Trump and his administration can help us loosen some of the restrictions on building data centers faster. I’m hopeful that they can help us cut through some of that bureaucratic red tape and move faster. I think that’ll be important, particularly as we want to maintain the U.S. AI lead over China and others.

What have you learned about leadership over the course of your career?

We’re fortunate at Amazon to be able to attract some of the most talented, most driven leaders and employees in the world, and I’ve been fortunate enough to get to work with some of those folks [and] to try to clear barriers for them so that they can go deliver outstanding results for our customers. If we have a smart team that is really focused on solving customer problems, versus growing their own scope of responsibility or internal goals, and if you can get those teams focused on that and get obstacles out of their way, then we can deliver a lot. And so that’s largely my job. I view myself as not the expert in any one particular thing; everyone on my team is usually better at whatever we’re trying to do than I am. And my job is to let them go do their job as much as possible, and occasionally connect dots for them: pointing to other parts of the company, other parts of the organization, or other customer input that they may not have, so they can integrate and incorporate it.

You’ve worked closely with Andy Jassy, is there anything in particular that you’ve learned from watching him as a leader?

I’ve learned a ton. He’s an exceptional leader. Andy is very good at holding very high standards and high expectations for the teams, and high standards for what we deliver for customers. He had a lot of the vision, together with some of the core folks who started AWS, of some important tenets of how we think about the business: focusing on security and operational excellence, and really focusing on how we go deliver for customers.

What are your priorities for 2025?

Our first priority always is to maintain outstanding security and operational excellence. We want to help customers get ready for the AI transformation that’s going to happen. Part of that, though, is also helping get all of their applications in a place where they can take advantage of AI. So it’s a hugely important priority for us to help customers continue their migration to the cloud, because if their data is stuck on premises in legacy data stores and other things, they won’t be able to take advantage of AI. Helping people modernize their data and analytics stacks, get them into the cloud, and get their data lakes into the cloud and organized in a way that they can really start to take advantage of AI is a big priority for us. And then it’s, how do we help scale the AI capabilities and bring the cost down for customers, while [we] keep adding value? For 2025, our goal is for customers to move AI workloads into production in ways that deliver great ROI for their businesses. That crosses over into making sure all their data is in the right place and making sure they have the right compute platforms; we think Trainium is going to be an important part of that. The last bit is helping add some applications on top. We think that we can add [the] extra benefit of helping employees and others get that effectiveness. Some of that is moving contact centers to the cloud. Some of that is helping get conversational assistants and AI assistants into the hands of employees, and so Amazon Q is a big part of that for us. And then it’s also just empowering our broad partner ecosystem to go fast and help customers evolve as well.
