Anthropic’s most dangerous AI model just fell into the wrong hands

TL;DR

Anthropic's Mythos AI model, a powerful cybersecurity tool that the company said could be dangerous in the wrong hands, has been accessed by a "small group of unauthorized users," Bloomberg reports.

Nauti's Take

The Mythos leak exposes the core paradox of advanced security AI: its value comes precisely from how well it finds vulnerabilities, which makes it a high-value target the moment it exists. Unauthorized access via a contractor is a classic insider-risk vector — a reminder that model safety and access controls are separate problems.

Anthropic can still preserve trust here, but the fix must address how contractor access was scoped in the first place, not just who exploited it.

Summary

Anthropic's Mythos AI model, a powerful cybersecurity tool that the company said could be dangerous in the wrong hands, has been accessed by a "small group of unauthorized users," Bloomberg reports. An unnamed member of the group, identified only as "a third-party contractor for Anthropic," told the publication that members of a private online forum got into Mythos through a mix of tactics, combining the contractor's access with "commonly used internet sleuthing tools."

Sources