Show HN: ACE – A dynamic benchmark measuring the cost to break AI agents

TL;DR

The team built 'Adversarial Cost to Exploit' (ACE), a benchmark that replaces binary pass/fail metrics with a price: the number of tokens, expressed in dollars, that an autonomous adversary must spend to breach an LLM agent. Six budget-tier models were tested under identical agent configurations: Gemini Flash-Lite, DeepSeek v3.2, Mistral Small 4, Grok 4.1 Fast, GPT-5.4 Nano, and Claude Haiku 4.5.
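To make the tokens-to-dollars idea concrete, here is a minimal sketch of how such a figure could be computed. It is not ACE's actual methodology, which the summary does not describe. The `Attempt` record, the function names, and the per-million-token prices are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Attempt:
    """One adversary attempt against the agent (hypothetical record shape)."""
    input_tokens: int    # tokens the adversary model consumed
    output_tokens: int   # tokens the adversary model generated
    breached: bool       # True if this attempt broke the agent


def attempt_cost_usd(a: Attempt, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Convert one attempt's token usage to dollars at per-million-token prices."""
    return (a.input_tokens * usd_per_m_input + a.output_tokens * usd_per_m_output) / 1_000_000


def cost_to_first_breach(
    attempts: Iterable[Attempt],
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> Optional[float]:
    """Sum adversary spend up to and including the first successful attempt.

    Returns None if no attempt succeeded, i.e. the agent held for the whole run.
    """
    total = 0.0
    for a in attempts:
        total += attempt_cost_usd(a, usd_per_m_input, usd_per_m_output)
        if a.breached:
            return total
    return None


if __name__ == "__main__":
    # Made-up numbers, not results from the benchmark.
    run = [
        Attempt(input_tokens=200_000, output_tokens=50_000, breached=False),
        Attempt(input_tokens=300_000, output_tokens=50_000, breached=True),
    ]
    cost = cost_to_first_breach(run, usd_per_m_input=1.0, usd_per_m_output=2.0)
    print(f"cost to first breach: ${cost:.2f}")
```

Returning `None` rather than a large number for an unbroken run keeps "never breached within budget" distinct from "expensive to breach", a distinction that a single dollar figure can otherwise blur.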

Nauti's Take

A benchmark that prices security in dollars is not a gimmick – it speaks the language that budget holders actually understand. The fact that four of six tested models can be broken for under a dollar should alarm anyone running agents with real permissions and real data access.

The Haiku 4.5 outlier is fascinating, but caution is warranted: six models, one setup, early methodology – this is a promising first swing, not a definitive verdict. What the community needs now is independent replication and an honest debate about whether 'adversarial cost' truly holds up across different attack strategies.

Briefing

Most agent security benchmarks return binary verdicts, which are nearly useless for real-world risk decisions. ACE translates resilience into money, giving developers, security teams, and decision-makers a shared language. The dramatic lead of Haiku 4.5 is both a signal and an open question: does it come from model training, RLHF specifics, or a measurement artifact?

Until the methodology matures, the numbers should be treated as directional indicators, not ground truth.
