Anthropic’s most dangerous AI model just fell into the wrong hands
TL;DR
Anthropic's Mythos AI model, a powerful cybersecurity tool that the company said could be dangerous in the wrong hands, has been accessed by a "small group of unauthorized users," Bloomberg reports. An unnamed member of the group, identified only as "a third-party contractor for Anthropic," told the publication that members of a private online forum got into Mythos via a mix of tactics, utilizing the contractor's access and "commonly used internet sleuthing tools.
Nauti's Take
The Mythos leak exposes the core paradox of advanced security AI: its value comes precisely from how well it finds vulnerabilities, which makes it a high-value target the moment it exists. Unauthorized access via a contractor is a classic insider-risk vector — a reminder that model safety and access controls are separate problems.
Anthropic can preserve trust here, but the fix must address how contractor access was scoped, not just who exploited it.