7Anthropic News·16d ago

Dario Amodei's AI Safety Summit remarks detail Anthropic's Responsible Scaling Policy and ASL framework

Dario Amodei delivered prepared remarks at the UK AI Safety Summit (November 2023) explaining Anthropic's Responsible Scaling Policy (RSP), the first such policy published by a major AI lab. The RSP introduces AI Safety Levels (ASL-1 through ASL-4), modeled on biosafety level frameworks, in which capability thresholds trigger mandatory safeguards before further training or deployment. Key implementation lessons include deep executive involvement, integrating RSP requirements into product roadmaps, and formal accountability through Anthropic's board and Long-Term Benefit Trust. The remarks outline specific ASL-3 requirements around CBRN misuse prevention and security, and preview ASL-4 criteria, which involve a model reaching near-human autonomy or becoming a primary source of global security threats.

Frontier Model Releases AI Safety Research Regulatory Developments Dario Amodei UK AI Safety Summit Alignment Research Center Responsible Scaling Policy Long-Term Benefit Trust AI Safety Level (ASL) Anthropic

Related guides (4)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Anthropic

Anthropic: The AI Safety Company at the Center of the Frontier

AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Regulatory Developments · Topic guide

AI Regulatory Developments: From Voluntary Frameworks to Government Enforcement

Related events (8)

8Anthropic News·17d ago·source ↗

Anthropic publishes Responsible Scaling Policy with AI Safety Level framework

Anthropic released its Responsible Scaling Policy (RSP), a formal framework of technical and organizational protocols for managing catastrophic risks from increasingly capable AI systems. The policy introduces AI Safety Levels (ASL-1 through ASL-5+), modeled on US biosafety level standards, requiring progressively stricter safety, security, and operational standards as models become more capable. Current Claude models are classified as ASL-2; ASL-3 triggers stricter deployment constraints including adversarial red-teaming requirements. The policy has been approved by Anthropic's board and is intended as a template for industry-wide adoption.

Frontier Model Releases AI Safety Research ARC Evals Claude Responsible Scaling Policy +3 more

8Anthropic News·19d ago·source ↗

Anthropic Releases Responsible Scaling Policy Version 3.0

Anthropic has published the third version of its Responsible Scaling Policy (RSP), a voluntary framework for mitigating catastrophic risks from increasingly capable AI systems. The update reflects two-plus years of experience with the original RSP, reinforcing what worked (ASL-3 safeguards activated in May 2025, industry adoption by OpenAI and Google DeepMind, informing early AI policy) while addressing shortcomings in accountability and transparency. The new version refines the AI Safety Level (ASL) framework and introduces new measures for decision-making transparency. Anthropic acknowledges that some elements of its original theory of change—particularly multilateral coordination and government action at higher capability thresholds—have not fully materialized as hoped.

Frontier Model Releases Evaluation and Benchmarking RAISE Act EU AI Act California SB 53 +8 more

7Anthropic News·16d ago·source ↗

Anthropic publishes major update to Responsible Scaling Policy with new capability thresholds and ASL standards

Anthropic released a significant revision to its Responsible Scaling Policy (RSP), its risk governance framework for managing catastrophic risks from frontier AI. The update introduces two explicit capability thresholds—autonomous AI R&D and CBRN weapons uplift—that trigger mandatory upgrades to AI Safety Level (ASL) standards, with current models operating under ASL-2. New elements include safety-case-inspired documentation processes, internal governance stress-testing, and external expert input mechanisms, drawing on risk management practices from high-consequence industries like biosafety.

Frontier Model Releases AI Safety Research AI Safety Level Standards Responsible Scaling Policy Anthropic

6Anthropic News·17d ago·source ↗

Dario Amodei calls for stronger AI safety focus at Paris AI Action Summit

Anthropic CEO Dario Amodei issued a statement following the Paris AI Action Summit, expressing concern that the event underweighted critical issues including democratic leadership in AI, CBRN and autonomous-risk governance, and labor market disruption. Amodei forecasts that by 2026-2027 AI capabilities may be equivalent to 'a country of geniuses in a datacenter,' framing this as both an opportunity and an urgent governance challenge. He called for governments to enforce transparency of frontier lab safety plans, fund third-party evaluations, and monitor economic impacts—pointing to Anthropic's newly released Economic Index as a model. The statement also reaffirmed Anthropic's Responsible Scaling Policy as the first of its kind among frontier labs.

AI Safety Research Regulatory Developments Dario Amodei French Government Responsible Scaling Policy +3 more

7Anthropic News·16d ago·source ↗

Anthropic reflects on Responsible Scaling Policy implementation and previews updated framework

Anthropic published a retrospective on operationalizing its Responsible Scaling Policy (RSP), originally released in summer 2023, sharing lessons learned and announcing that an updated RSP is forthcoming. The post outlines five high-level commitments: establishing Red Line Capabilities, conducting Frontier Risk Evaluations, responding to Red Line Capabilities via an ASL-3 Standard, iteratively extending the policy toward ASL-4, and implementing Assurance Mechanisms. Key reflections include the difficulty of anticipating emergent capabilities in future models, expert disagreement on CBRN risk prioritization, and the value of quantitative threat modeling. Anthropic signals intent to move from voluntary commitments toward industry best practices and eventual regulation.

AI Safety Research Regulatory Developments Anthropic Long-Term Benefit Trust Responsible Scaling Policy Anthropic

6Anthropic News·17d ago·source ↗

Anthropic policy recap: US Executive Order, G7 Code of Conduct, and Bletchley Park AI Safety Summit

Anthropic published a policy commentary summarizing three major AI governance events from late October/early November 2023: the US Executive Order on AI, the G7 International Code of Conduct for advanced AI developers, and the UK-hosted Bletchley Park AI Safety Summit. The post covers Anthropic's positions on each, including support for NIST capacity-building, the G7 Code of Conduct, and the newly announced UK and US AI Safety Institutes. Dario Amodei presented Anthropic's Responsible Scaling Policy at Bletchley as a potential regulatory prototype, and the 28-country Bletchley Declaration notably included China among its signatories.

AI Safety Research Regulatory Developments Dario Amodei Trump Administration Executive Order on AI Bletchley Declaration +6 more

7Anthropic News·16d ago·source ↗

Anthropic launches initiative to fund third-party AI safety evaluations

Anthropic announced a funded initiative to source third-party evaluations measuring advanced AI capabilities and safety risks, with priority areas including cybersecurity, CBRN threats, model autonomy, national security risks, social manipulation, and misalignment. The initiative is tied to Anthropic's Responsible Scaling Policy and AI Safety Level (ASL) framework, aiming to address a gap between demand and supply of high-quality safety-relevant evals. Proposals are solicited via an application form, with Anthropic framing the effort as benefiting the broader AI safety ecosystem rather than just internal use.

Evaluation and Benchmarking AI Safety Research METR Google-Proof Q&A Responsible Scaling Policy +1 more

6Anthropic News·17d ago·source ↗

Anthropic commits to signing the EU General-Purpose AI Code of Practice

Anthropic announced its intention to sign the EU's General-Purpose AI Code of Practice, citing alignment with its existing Responsible Scaling Policy on transparency, safety, and accountability. The company frames the Code's mandatory Safety and Security Frameworks—including CBRN risk assessment—as complementary to its own internal standards. Anthropic also signals continued collaboration with the EU AI Office and third-party bodies like the Frontier Model Forum to keep standards adaptive as the technology evolves.

AI Safety Research Regulatory Developments EU AI Act EU General-Purpose AI Code of Practice Frontier Model Forum +3 more