Anthropic’s Mythos breach was humiliating

TL;DR

Anthropic's tightly controlled rollout of Claude Mythos has taken an awkward turn. After spending weeks insisting the AI model is so capable at cybersecurity that it is too dangerous to release publicly, it appears the model fell into the wrong hands anyway. According to Bloomberg, a "small group of unauthorized users" has had access to Mythos - whose existence was first revealed in a leak - since the day Anthropic announced plans to offer it to a select group of companies for testing.

Nauti's Take

Anthropic holding back Mythos over genuine cybersecurity capability concerns demonstrates that AI safety evaluations can have real consequences - that's the system working as intended. But a controlled rollout that leaks is a significant operational failure, raising questions about whether Anthropic can actually manage its most sensitive models internally.

Trust is the product here, and this incident costs some of it.

Sources