The White House Wants Anthropic to Block All Jailbreaks. That May Not Be Possible
TL;DR
WIRED reports that US officials want Anthropic to rerelease Claude Fable 5 only if it can ensure the model's guardrails cannot be bypassed through jailbreaks. The model was reportedly taken offline last week through export controls after officials raised concerns about jailbreak risks. Anthropic says the concerns are overstated and has told the Commerce Department and the Office of the National Cyber Director that the effects are minimal.
Nauti's Take
The political instinct is understandable: frontier models with cyber, chemistry, and biology capabilities should not be treated like ordinary software updates. But demanding that all jailbreaks be blocked sounds more like a control fantasy than a workable safety program.
A stronger path would be mandatory red teaming, reporting duties, usage monitoring, and clear limits around high-risk capabilities. Anyone promising absolute safety is usually selling better PR.
Briefing
The dispute shows how wide the gap has become between political safety demands and technical reality. If regulators require absolute jailbreak resistance, they move the debate from measurable risk reduction to a standard that current models may not be able to meet. That can slow releases without automatically improving safety.