5·OpenAI Blog·1mo ago

OpenAI Expands External Safety Testing Ecosystem

OpenAI published a post describing its use of independent experts to evaluate frontier AI systems through third-party testing. The initiative aims to strengthen safety validation, verify safeguards, and increase transparency around capability and risk assessments. The announcement signals a continued push toward external accountability mechanisms for frontier model evaluation.

Evaluation and Benchmarking AI Safety Research OpenAI

Related guides (3)

OpenAI

OpenAI: The Lab That Made AI a Household Word

AI Safety Research·Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Evaluation and Benchmarking·Topic guide

Evaluation and Benchmarking: How We Measure AI — and Why It Keeps Getting Harder

Related events (8)

6·OpenAI Blog·22d ago

A shared playbook for trustworthy third party evaluations

OpenAI has published guidance outlining a shared framework for conducting trustworthy third-party evaluations of frontier AI systems. The playbook covers methodology for assessing model capabilities, safeguards, and evaluation validity. This represents OpenAI's attempt to standardize and legitimize external auditing practices for frontier models.

Evaluation and Benchmarking AI Safety Research frontier model evaluation OpenAI third-party AI evaluations

8·OpenAI Blog·1mo ago

OpenAI and Anthropic Share Findings from Joint Safety Evaluation

OpenAI and Anthropic conducted a first-of-its-kind cross-lab safety evaluation, testing each other's frontier models across dimensions including misalignment, instruction following, hallucinations, and jailbreaking resistance. The collaboration represents a novel form of inter-lab safety research cooperation. Findings highlight both progress and ongoing challenges in AI safety, and establish a potential template for future cross-organizational evaluations.

Frontier Model Releases Evaluation and Benchmarking joint safety evaluation OpenAI Anthropic

7·Anthropic News·16d ago

Anthropic launches initiative to fund third-party AI safety evaluations

Anthropic announced a funded initiative to source third-party evaluations measuring advanced AI capabilities and safety risks, with priority areas including cybersecurity, CBRN threats, model autonomy, national security risks, social manipulation, and misalignment. The initiative is tied to Anthropic's Responsible Scaling Policy and AI Safety Level (ASL) framework, aiming to address a gap between demand and supply of high-quality safety-relevant evals. Proposals are solicited via an application form, with Anthropic framing the effort as benefiting the broader AI safety ecosystem rather than just internal use.

Evaluation and Benchmarking AI Safety Research METR Google-Proof Q&A Responsible Scaling Policy

5·OpenAI Blog·1mo ago

Announcing the OpenAI Safety Fellowship

OpenAI has announced a Safety Fellowship, described as a pilot program aimed at supporting independent safety and alignment research while developing the next generation of AI safety talent. The announcement is sparse on details but signals a structured investment in external safety research capacity. This follows broader industry trends of labs funding independent safety work to build the research ecosystem.

AI Safety Research Alignment and RLHF OpenAI Safety Fellowship AI alignment OpenAI

6·OpenAI Blog·1mo ago

Moving AI Governance Forward: OpenAI and Leading Labs Make Voluntary Safety Commitments

OpenAI and other leading AI laboratories announced voluntary commitments aimed at reinforcing AI safety, security, and trustworthiness. The commitments represent a coordinated industry response to governance concerns ahead of anticipated regulatory action. This move signals alignment among frontier labs on baseline safety standards, though the voluntary nature leaves enforcement questions open.

AI Safety Research Regulatory Developments Voluntary AI Safety Commitments OpenAI

4·OpenAI Blog·1mo ago

OpenAI Safety Practices Update

OpenAI published a safety update reaffirming its commitment to responsible development and deployment of AGI. The post is a high-level statement from a Tier 1 lab on its safety posture. The body excerpt is brief and does not detail specific new policies, evaluations, or technical measures.

AI Safety Research AGI (Artificial General Intelligence) OpenAI

5·OpenAI Blog·1mo ago

OpenAI Reports Progress with US CAISI and UK AISI on AI Safety and Security

OpenAI has published an update on its ongoing partnership with the US Center for AI Standards and Innovation (CAISI) and the UK AI Security Institute (AISI). The collaboration focuses on strengthening AI safety and security practices. The announcement signals continued institutional engagement between OpenAI and government AI safety bodies in both countries.

AI Safety Research Regulatory Developments UK AI Security Institute US Center for AI Standards and Innovation OpenAI

5·OpenAI Blog·1mo ago

Frontier AI regulation: Managing emerging risks to public safety

OpenAI published a policy position on regulating frontier AI systems, focusing on managing emerging risks to public safety. The piece outlines OpenAI's perspective on how governments and regulatory bodies should approach oversight of the most capable AI models. This represents a formal public stance from a leading AI lab on the shape of future AI governance frameworks.

AI Safety Research Regulatory Developments OpenAI