A startup claims it broke through a bottleneck that’s holding back LLMs

TL;DR

Miami startup Subquadratic claims it has solved a core mathematical scaling problem for LLMs: the token-to-token comparisons in attention, whose cost grows quadratically with context length and makes long context windows slow and costly. The stealth launch came with a huge claim but thin technical detail. Many researchers stayed skeptical because AI efficiency breakthroughs often look cleaner in announcements than in independent benchmarks.

Nauti's Take

The interesting part is not the startup launch, but the attempt to attack the math underneath the hype. Anyone who truly weakens the attention bottleneck changes the cost equation for long documents, agent runs, and search across large working contexts.
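To make that cost equation concrete, here is a minimal sketch of how the comparison count grows. It assumes standard full self-attention, where every token is compared with every other token, and it ignores constant factors such as the number of heads and layers or causal masking. The function name `pairwise_comparisons` is ours for illustration; it says nothing about how Subquadratic's method works.

```python
def pairwise_comparisons(context_tokens: int) -> int:
    """Token-to-token comparisons for one full self-attention pass.

    Assumption: plain quadratic attention, constant factors ignored.
    """
    return context_tokens * context_tokens


# Print the comparison count for a few illustrative context lengths.
for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} tokens -> {pairwise_comparisons(n):>17,} comparisons")
```

Under this assumption, a context that is 10 times longer needs 100 times as many comparisons, which is why any method that grows more slowly than the square of the context length would matter most at the long end.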

The bar has to stay high: papers, reproducible benchmarks, real hardware, real models. Until then, Subquadratic is a promising candidate carrying a very heavy burden of proof.

Briefing

If the breakthrough holds, this is more than a faster inference trick. Cheaper long context could make agents, code analysis, research, and enterprise search much more practical. If it does not, it becomes another case study in how quickly mathematical progress turns into a PR story.

Sources