AWS Deploys 1M Nvidia GPUs Through 2027
Inference now accounts for roughly two-thirds of AI compute, up from about one-third in 2023. Nvidia is embedding its full stack across compute, networking, and inference inside AWS data centers, shifting from chip vendor to infrastructure layer.
Key Takeaway
Nvidia's grip on cloud AI infrastructure is tightening: AWS cannot stay competitive without Nvidia's chips, even as it builds its own custom silicon.
An Nvidia executive confirmed to Reuters that AWS will deploy 1 million GPUs across global cloud regions by the end of 2027, with the rollout starting this year.
The scale of the deal reflects a shift in AI compute workloads. Inference now accounts for roughly two-thirds of AI compute, up from about one-third in 2023, according to ZenGen Labs Co-founder Dermot McGrath. Deloitte estimates the market for inference-focused chips will exceed ₱3 trillion ($50 billion) by 2026.
McGrath said Nvidia is becoming the infrastructure layer underneath cloud providers, not just a chip vendor to them. Boardy Ventures Deal Partner Berna Misa described the situation as an infrastructure flip, with Nvidia embedding its full stack across compute, networking, and inference inside AWS data centers that ran proprietary gear for years. When a vendor is embedded that deeply in a customer's stack, switching costs and the context layer become the moat, Misa said.
The partnership includes Nvidia's NVLink Fusion technology, which AWS is integrating into its custom silicon, including Trainium4, Graviton CPUs, and the Nitro System. The integration enables high-speed connections across non-Nvidia chips, tying AWS even closer to Nvidia's architecture.
Gather Beyond Founder Pichapen Prateepavanich said cloud providers want independence over the long term, but in the near term they need Nvidia to remain competitive. AWS is developing its own AI chips, but the company needs Nvidia's most advanced GPUs to keep pace with rivals. Exports of those chips have been tightly controlled since 2022 as part of U.S. strategy, even as prosecutors pursue a case alleging Nvidia chips were smuggled to China.
AWS and Nvidia have worked together for over 15 years, beginning with the launch of the world's first GPU cloud instance in 2011. The 1-million-GPU deployment is slated for completion by the end of 2027.
This article was written based on reporting from Decrypt.



