5 · Anthropic News · 17d ago

Anthropic launches Transparency Hub with safety metrics and governance documentation

Anthropic launched a Transparency Hub consolidating documentation on model evaluation methodologies, platform abuse enforcement, internal governance, societal impact assessment, and safety research. The hub includes a first periodic report covering banned accounts, appeals, NCMEC reports, and government requests. The initiative is framed as a unified response to fragmented transparency requirements across regulatory frameworks and voluntary commitments, with plans for ongoing expansion.

AI Safety Research · Regulatory Developments · National Center for Missing and Exploited Children · Anthropic

Related guides (3)

Anthropic

Anthropic: The AI Safety Company at the Center of the Frontier

AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Regulatory Developments · Topic guide

AI Regulatory Developments: From Voluntary Frameworks to Government Enforcement

Related events (8)

7 · Anthropic News · 16d ago

Anthropic launches initiative to fund third-party AI safety evaluations

Anthropic announced a funded initiative to source third-party evaluations measuring advanced AI capabilities and safety risks, with priority areas including cybersecurity, CBRN threats, model autonomy, national security risks, social manipulation, and misalignment. The initiative is tied to Anthropic's Responsible Scaling Policy and AI Safety Level (ASL) framework, aiming to address a gap between demand and supply of high-quality safety-relevant evals. Proposals are solicited via an application form, with Anthropic framing the effort as benefiting the broader AI safety ecosystem rather than just internal use.

Evaluation and Benchmarking · AI Safety Research · METR · Google-Proof Q&A · Responsible Scaling Policy

5 · Anthropic News · 18d ago

Anthropic responds to California Governor Newsom's AI working group draft report

Anthropic published a formal response to the draft report of the California Governor's Working Group on AI Frontier Models, endorsing its emphasis on transparency and evidence-based policy. Anthropic argues that light-touch mandatory disclosure of safety and security practices would be beneficial without impeding innovation, noting that current voluntary practices are uneven across frontier labs. The response also references Anthropic's Responsible Scaling Policy and Economic Index as examples of existing transparency efforts, and signals urgency given Anthropic's view that powerful AI systems may arrive as early as the end of 2026.

AI Safety Research · Regulatory Developments · Gavin Newsom · California Working Group on AI Frontier Models · Responsible Scaling Policy

5 · Anthropic News · 16d ago

Anthropic submits AI accountability recommendations to NTIA, covering evals, red teaming, and pre-registration

Anthropic submitted a formal response to the NTIA's Request for Comment on AI Accountability, outlining a multi-part policy framework for governing advanced AI systems. Key recommendations include increased government funding for evaluation research, mandatory disclosure of evaluation methods, pre-registration of large training runs with national governments, mandated external red teaming before model release, and antitrust guidance to enable industry safety collaboration. The submission reflects Anthropic's core policy positions and advocates for risk-tiered oversight proportional to model capabilities.

Evaluation and Benchmarking · AI Safety Research · National Institute of Standards and Technology · National Telecommunications and Information Administration · Anthropic

5 · Anthropic News · 17d ago

Anthropic achieves ISO/IEC 42001:2023 certification for AI management systems

Anthropic has received accredited certification under ISO/IEC 42001:2023, the first international standard for AI governance and management systems, issued by Schellman Compliance LLC. The certification covers Anthropic's policies, testing, monitoring, transparency measures, and oversight structures for responsible AI development. Anthropic claims to be among the first frontier AI labs to achieve this certification, positioning it as external validation of its safety commitments alongside existing frameworks like its Responsible Scaling Policy and Constitutional AI.

AI Safety Research · Regulatory Developments · Schellman Compliance LLC · Constitutional AI · ANSI National Accreditation Board

5 · Anthropic News · 18d ago

Anthropic publishes structured harm assessment framework covering physical, psychological, economic, and societal impacts

Anthropic has released a policy document describing its evolving framework for assessing and mitigating AI harms across five dimensions: physical, psychological, economic, societal, and individual autonomy impacts. The framework complements its existing Responsible Scaling Policy and informs decisions on usage policies, red-teaming, detection, and enforcement. Concrete examples include safeguards for computer use capabilities (fraud, phishing) and a reported 45% reduction in unnecessary refusals in Claude 3.7 Sonnet through improved handling of ambiguous prompts. Anthropic frames this as a work in progress and invites collaboration from the broader AI ecosystem.

AI Safety Research · Alignment and RLHF · Responsible Scaling Policy · Claude 3.7 Sonnet · Anthropic

6 · Anthropic News · 17d ago

Anthropic commits to signing the EU General-Purpose AI Code of Practice

Anthropic announced its intention to sign the EU's General-Purpose AI Code of Practice, citing alignment with its existing Responsible Scaling Policy on transparency, safety, and accountability. The company frames the Code's mandatory Safety and Security Frameworks—including CBRN risk assessment—as complementary to its own internal standards. Anthropic also signals continued collaboration with the EU AI Office and third-party bodies like the Frontier Model Forum to keep standards adaptive as the technology evolves.

AI Safety Research · Regulatory Developments · EU AI Act · EU General-Purpose AI Code of Practice · Frontier Model Forum

5 · Anthropic News · 16d ago

Anthropic updates Usage Policy with election integrity, high-risk use case, and privacy rules

Anthropic revised its Acceptable Use Policy (renamed Usage Policy), effective June 6, 2024, consolidating prohibited-use categories into 'Universal Usage Standards.' Key changes include explicit bans on AI-assisted election interference and political campaigning, new safety requirements for high-risk use cases (healthcare, legal), expanded access for minors via API partners with safety disclosures, and stronger privacy protections including prohibitions on biometric inference and government-directed censorship. The update reflects both evolving regulatory context and Anthropic's stated safety mission.

AI Safety Research · Regulatory Developments · Anthropic Usage Policy · Anthropic

6 · Anthropic News · 19d ago

Anthropic Responds to White House AI Action Plan, Calls for Transparency Standards and Export Controls

Anthropic published a policy response to the White House's 'Winning the Race: America's AI Action Plan,' endorsing its focus on AI infrastructure, federal adoption, and safety research while urging additional steps on export controls and mandatory AI development transparency standards. The company highlighted alignment between the plan and its prior OSTP submissions, and noted its proactive activation of ASL-3 protections with Claude Opus 4 as evidence that safety and innovation are compatible. Anthropic called for a single national standard for frontier model transparency rather than a state-by-state patchwork, and encouraged continued investment in NIST's CAISI for evaluating frontier models on national security risks including CBRN capabilities.

Frontier Model Releases · AI Safety Research · Claude Opus 4.6 · Center for AI Standards and Innovation · Office of Management and Budget