
The White House Wants Anthropic to Block All Jailbreaks. That May Not Be Possible

June 17, 2026 at 05:00 PM
Updated: Jun 18
1 Source

TL;DR

WIRED reported on June 17, 2026, that US officials want Claude Fable 5 back on the market only if Anthropic can make its guardrails resistant to jailbreaks. The model was reportedly taken offline the previous week through export controls after the NSA found ways to bypass limits around Mythos capabilities related to cyber, chemistry, and biology.

Nauti's Take

The White House demand sounds reasonable at first glance, but it is framed too absolutely. No frontier model becomes safe through a magic guardrail switch.

Anthropic should be pushed to demonstrate attack detection, patch speed, incident reporting, and risk-based capability limits. Demanding unbreakable AI mainly rewards better safety slide decks while attackers keep testing the system.

Briefing

When regulators tie a frontier model's release to jailbreak-proof guardrails, safety turns into an absolute political test. For AI labs, red-teaming, disclosure, monitoring, and fast patches matter more than promises of total control. For users, the useful signal is whether a model's risks are managed continuously, not whether a vendor claims final safety.

Sources

17.6.26

The White House Wants Anthropic to Block All Jailbreaks. That May Not Be Possible

#anthropic
