HackerOne
hackerone-1c72f46c · 2 events · first seen 14d ago
Aliases: HackerOne
Recent events (2)
Anthropic expands model safety bug bounty to target universal jailbreaks in CBRN and cybersecurity domains
Anthropic is expanding its bug bounty program, run in partnership with HackerOne, to offer up to $15,000 for novel universal jailbreak attacks against a next-generation safety mitigation system that has not yet been publicly deployed. The program targets high-risk domains including CBRN (chemical, biological, radiological, nuclear) and cybersecurity, and participants receive early access to test the new safeguards before release. The initiative begins as invite-only and aligns with Anthropic's commitments under the White House Voluntary AI Commitments and the G7 Hiroshima Process Code of Conduct.
Anthropic launches bug bounty program to stress-test ASL-3 Constitutional Classifiers
Anthropic launched an invite-only bug bounty program in partnership with HackerOne to find universal jailbreaks in its Constitutional Classifiers system before public deployment, offering up to $25,000 per verified vulnerability. The program targets CBRN-related safety bypasses on Claude 3.7 Sonnet and is part of Anthropic's work to meet its AI Safety Level-3 (ASL-3) Deployment Standard under its Responsible Scaling Policy. A follow-up update extended the program to test Constitutional Classifiers on the new Claude Opus 4 model and began accepting reports of universal jailbreaks found on public platforms. The initiative reflects Anthropic's structured approach to pre-deployment safety validation for increasingly capable models.