Entity · product

HackAgent

product · active · hackagent-84a9de51 · 1 event · first seen Jun 17, 2026

Aliases: HackAgent

Co-occurring entities

tree-of-attacks · Anthropic Fable 5 · Claude Opus 4.8 · Anthropic

More like this (12)

Baseline Agent · PentestAgent · Agent-S · Computer-Using Agent · GridDebugAgent · VideoAgent · AgentSnare · agent sandboxing · ProjAgent · RD-Agent · SafeAgentBench · OptiAgent

Recent events (1)

7 · arXiv · cs.AI · Jun 17, 2026

Red-team study finds Anthropic Fable 5 and Opus 4.8 remain reliably breakable under automated jailbreak attacks

A preprint evaluates the adversarial robustness of two Anthropic frontier models, Fable 5 and Opus 4.8, against four families of automated jailbreak attacks across 7,826 harmful intents. Using the HackAgent framework, the study generated hundreds of thousands of adversarial attempts, and a three-judge panel confirmed 1,620 harmful completions from Opus 4.8 and 702 from Fable 5. Tree-of-attacks adaptive search achieved 11.5% intent-level success against Opus 4.8 and 6.1% against Fable 5, while static obfuscation was almost entirely neutralized. The authors conclude that even the most hardened frontier models remain reliably breakable under sustained automated pressure, and they caution against reading aggregate resistance rates as reassurance.
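
To illustrate why a low per-attempt rate can coexist with a much higher intent-level rate, here is a minimal sketch in Python. It is not the preprint's or HackAgent's code: the function names, the record layout, and the rule that two of three judges must agree are all assumptions made for illustration, and the toy data is invented.

```python
from collections import defaultdict

def confirmed(judge_votes, required=2):
    """Return True if at least `required` judges flagged the completion.

    The 2-of-3 threshold is an assumption; the summary only says a
    three-judge panel confirmed harmful completions.
    """
    return sum(judge_votes) >= required

def success_rates(attempts, required=2):
    """Compute (attempt-level rate, intent-level rate).

    `attempts` is an iterable of (intent_id, judge_votes) pairs, a
    hypothetical layout. An intent counts as broken if any one of its
    attempts is confirmed harmful.
    """
    broken = defaultdict(bool)  # intent_id -> was it ever broken?
    n_attempts = 0
    n_confirmed = 0
    for intent_id, votes in attempts:
        hit = confirmed(votes, required)
        n_attempts += 1
        n_confirmed += hit
        broken[intent_id] |= hit
    attempt_rate = n_confirmed / n_attempts
    intent_rate = sum(broken.values()) / len(broken)
    return attempt_rate, intent_rate

# Invented toy data: 2 intents, 4 attempts each, one confirmed completion.
attempts = [
    ("intent-a", (True, True, False)),    # confirmed by 2 of 3 judges
    ("intent-a", (False, False, False)),
    ("intent-a", (False, False, False)),
    ("intent-a", (False, False, False)),
    ("intent-b", (False, True, False)),   # only 1 judge, not confirmed
    ("intent-b", (False, False, False)),
    ("intent-b", (False, False, False)),
    ("intent-b", (False, False, False)),
]

attempt_rate, intent_rate = success_rates(attempts)
print(f"attempt-level: {attempt_rate:.3f}, intent-level: {intent_rate:.3f}")
```

In this toy case the model resists 7 of 8 attempts, yet half of the intents are broken, which is the kind of gap the authors' caution about aggregate resistance rates points to.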

Frontier Model Releases · Evaluation and Benchmarking · tree-of-attacks · Anthropic Fable 5 · Claude Opus 4.8