6 · OpenAI Blog · 22d ago

A shared playbook for trustworthy third party evaluations

OpenAI has published guidance outlining a shared framework for conducting trustworthy third-party evaluations of frontier AI systems. The playbook covers methodology for assessing model capabilities, safeguards, and evaluation validity. This represents OpenAI's attempt to standardize and legitimize external auditing practices for frontier models.

Evaluation and Benchmarking · AI Safety Research · Regulatory Developments · frontier model evaluation · OpenAI · third-party AI evaluations

Related guides (3)

OpenAI

OpenAI: The Lab That Made AI a Household Word


AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints


Regulatory Developments · Topic guide

AI Regulatory Developments: From Voluntary Frameworks to Government Enforcement


Related events (8)

5 · OpenAI Blog · 1mo ago

OpenAI Expands External Safety Testing Ecosystem

OpenAI published a post describing its use of independent experts to evaluate frontier AI systems through third-party testing. The initiative aims to strengthen safety validation, verify safeguards, and increase transparency around capability and risk assessments. The announcement signals a continued push toward external accountability mechanisms for frontier model evaluation.

Evaluation and Benchmarking · AI Safety Research · OpenAI

6 · Anthropic News · 17d ago

Anthropic advocates for third-party testing regime as core AI policy infrastructure

Anthropic published a policy position paper arguing that frontier AI systems require a third-party testing and oversight regime, distinct from self-governance approaches like its own Responsible Scaling Policy (RSP). The post outlines what such a regime should include: trusted third-party auditors, precisely scoped tests targeting only the most computationally intensive systems, and international coordination via shared standards and Mutual Recognition agreements. Anthropic acknowledges that its RSP is insufficient on its own because it relies on individual private-sector actors, and calls for industry-wide mandatory testing that would eventually become a legal requirement for wide deployment.

AI Safety Research · Regulatory Developments · ChatGPT · Claude · Responsible Scaling Policy

8 · OpenAI Blog · 1mo ago

OpenAI and Anthropic Share Findings from Joint Safety Evaluation

OpenAI and Anthropic conducted a first-of-its-kind cross-lab safety evaluation, testing each other's frontier models across dimensions including misalignment, instruction following, hallucinations, and jailbreaking resistance. The collaboration represents a novel form of inter-lab safety research cooperation. Findings highlight both progress and ongoing challenges in AI safety, and establish a potential template for future cross-organizational evaluations.

Frontier Model Releases · Evaluation and Benchmarking · joint safety evaluation · OpenAI · Anthropic

6 · OpenAI Blog · 17d ago

OpenAI proposes federal governance blueprint for frontier AI safety and national security

OpenAI published a policy blueprint calling for a U.S. federal framework to govern frontier AI, covering safety, resilience, and national security dimensions. The proposal outlines OpenAI's vision for democratic oversight of the most capable AI systems. Because it comes from a Tier 1 primary source at a leading lab, the blueprint is a significant public policy position that will likely influence regulatory discussions.

AI Safety Research · Regulatory Developments · OpenAI

5 · OpenAI Blog · 1mo ago

Improving Verifiability in AI Development: Multi-Stakeholder Report

OpenAI contributed to a multi-stakeholder report co-authored by 58 researchers across 30 organizations, including Mila, CSET, and the Schwartz Reisman Institute. The report identifies 10 mechanisms for improving the verifiability of claims about AI systems. These tools are intended to help developers demonstrate safety, security, fairness, and privacy properties, while enabling policymakers and civil society to evaluate AI development processes.

Evaluation and Benchmarking · AI Safety Research · Centre for the Future of Intelligence · Center for Security and Emerging Technology · Mila

7 · OpenAI Blog · 23d ago

OpenAI's Frontier Governance Framework

OpenAI has published its Frontier Governance Framework, a document outlining the company's AI safety, security, and risk management practices. The framework is explicitly positioned to align with emerging regulatory requirements from the EU and California. As an announcement from a Tier 1 source, it represents OpenAI's formal public stance on frontier model governance and its regulatory compliance strategy.

AI Safety Research · Regulatory Developments · EU AI Act · OpenAI · OpenAI Frontier Governance Framework

5 · OpenAI Blog · 1mo ago

Frontier AI regulation: Managing emerging risks to public safety

OpenAI published a policy position on regulating frontier AI systems, focusing on managing emerging risks to public safety. The piece outlines OpenAI's perspective on how governments and regulatory bodies should approach oversight of the most capable AI models. This represents a formal public stance from a leading AI lab on the shape of future AI governance frameworks.

AI Safety Research · Regulatory Developments · OpenAI

6 · OpenAI Blog · 1mo ago

OpenAI Introduces Trusted Access for Cyber Framework

OpenAI has announced Trusted Access for Cyber, a tiered trust-based framework designed to expand access to frontier AI capabilities relevant to cybersecurity while implementing stronger safeguards against misuse. The framework appears to govern how security researchers, organizations, and other actors can access more powerful cyber-relevant AI features. This represents a policy and access-control development at the intersection of AI safety and offensive/defensive cyber capabilities.

AI Safety Research · Enterprise Deployment Patterns · Trusted Access for Cyber · OpenAI