With Nvidia Groq 3, the Era of AI Inference Is (Probably) Here
TL;DR
At GTC in San Jose (30,000+ attendees), Nvidia CEO Jensen Huang unveiled the Vera Rubin chip line — Nvidia's first chip designed specifically for AI inference.
Key Points
- The Nvidia Groq 3 LPU (language processing unit) incorporates IP licensed from startup Groq for US $20 billion, a deal struck on Christmas Eve 2024.
- Huang declared that the inflection point of inference has arrived: AI must now 'think' and 'do,' both of which are inference workloads rather than training workloads.
- Training and inference demand distinct computational profiles; Nvidia has historically focused on training hardware.
Nauti's Take
Nvidia spends $20 billion licensing Groq's IP, brands the result 'Groq 3,' and calls it an inflection point — which is both a smart competitive move and an implicit admission that its training-focused architecture needed help for the inference era. Huang's soundbites about AI needing to 'think' and 'do' are vintage GTC theater, but the underlying point is real: inference is where the volume and the margin will be for the next decade.
The more interesting story is what this means for Groq as a standalone company, and for AMD and Intel, which now face an Nvidia that has shored up its last obvious weakness. The inference wars just got a lot more expensive.