Entity · product

Nuclear Proliferation Risk Classifier

product · active · nuclear-proliferation-risk-classifier-dcc087fb · 1 event · first seen Jun 2, 2026

Aliases: Nuclear Proliferation Risk Classifier

Co-occurring entities

Claude · Anthropic Policy · Frontier Red Team · U.S. Department of Energy · Frontier Model Forum · Anthropic · U.S. Department of Energy National Nuclear Security Administration

More like this (12)

CBRN (Chemical, Biological, Radiological, Nuclear) risk category · nuclear norm · National Nuclear Security Administration · NuclearQAv2 · U.S. Nuclear Regulatory Commission · Beyond Aggregate Risk: Role-Stratified Conformal Risk Control for LLM Tool Calls · U.S. Department of Energy National Nuclear Security Administration · TrustX Agent Risk Classification Framework · PRNet · NIST AI RMF · Conformal Risk Control · NURC-SP

Recent events (1)

7 · Anthropic News · Jun 2, 2026

Anthropic and NNSA Co-Develop Nuclear Safeguards Classifier for Claude Traffic

Anthropic, in partnership with the U.S. Department of Energy's National Nuclear Security Administration (NNSA) and DOE national laboratories, has co-developed an AI classifier that distinguishes between concerning and benign nuclear-related conversations with 96% accuracy in preliminary testing. The classifier has already been deployed on live Claude traffic as part of Anthropic's misuse-detection infrastructure. Anthropic plans to share the approach with the Frontier Model Forum as a replicable blueprint for other AI developers. This represents the first public-private partnership of this kind for nuclear proliferation risk monitoring in frontier AI systems.

Evaluation and Benchmarking · AI Safety Research · Nuclear Proliferation Risk Classifier · Claude · Anthropic Policy · Frontier Red Team · +5 more