Almanac
8·OpenAI Blog·1mo ago

OpenAI and Anthropic Share Findings from Joint Safety Evaluation

OpenAI and Anthropic conducted a first-of-its-kind cross-lab safety evaluation, testing each other's frontier models across dimensions including misalignment, instruction following, hallucinations, and jailbreaking resistance. The collaboration represents a novel form of inter-lab safety research cooperation. Findings highlight both progress and ongoing challenges in AI safety, and establish a potential template for future cross-organizational evaluations.

Related events (8)

5·OpenAI Blog·1mo ago

OpenAI Expands External Safety Testing Ecosystem

OpenAI published a post describing its use of independent experts to evaluate frontier AI systems through third-party testing. The initiative aims to strengthen safety validation, verify safeguards, and increase transparency around capability and risk assessments. The announcement signals a continued push toward external accountability mechanisms for frontier model evaluation.

7·Anthropic News·16d ago

Anthropic launches initiative to fund third-party AI safety evaluations

Anthropic announced a funded initiative to source third-party evaluations measuring advanced AI capabilities and safety risks, with priority areas including cybersecurity, CBRN threats, model autonomy, national security risks, social manipulation, and misalignment. The initiative is tied to Anthropic's Responsible Scaling Policy and AI Safety Level (ASL) framework, aiming to address a gap between demand and supply of high-quality safety-relevant evals. Proposals are solicited via an application form, with Anthropic framing the effort as benefiting the broader AI safety ecosystem rather than just internal use.

7·OpenAI Blog·1mo ago

OpenAI and Los Alamos National Laboratory Announce Research Partnership on Biosafety Evaluations

OpenAI and Los Alamos National Laboratory (LANL) have announced a research partnership focused on developing safety evaluations for frontier AI models. The collaboration specifically targets assessing and measuring biological capabilities and risks. LANL brings national-lab-level biosecurity expertise to the effort, which aligns with OpenAI's broader preparedness framework for catastrophic risk domains.

4·OpenAI Blog·1mo ago

OpenAI Safety Practices Update

OpenAI published a safety update reaffirming its commitment to responsible development and deployment of AGI. The post is a high-level statement from a Tier 1 lab on its safety posture. The body excerpt is brief and does not detail specific new policies, evaluations, or technical measures.

6·Anthropic News·18d ago

Anthropic publishes foundational 'Core Views on AI Safety' position paper

Anthropic released a detailed position paper outlining its core views on AI safety, arguing that transformative AI could arrive within a decade driven by predictable scaling laws, and that no one currently knows how to train powerful AI systems to robustly behave well. The document explains Anthropic's founding rationale and research strategy, highlighting four priority areas: scaling supervision, mechanistic interpretability, process-oriented learning, and understanding AI generalization. Originally published in March 2023, the paper is Anthropic's canonical public statement of its safety philosophy and strategic priorities.

5·OpenAI Blog·1mo ago

Announcing the OpenAI Safety Fellowship

OpenAI has announced a Safety Fellowship, described as a pilot program aimed at supporting independent safety and alignment research while developing the next generation of AI safety talent. The announcement is sparse on details but signals a structured investment in external safety research capacity. This follows broader industry trends of labs funding independent safety work to build the research ecosystem.

4·OpenAI Blog·1mo ago

AI Safety Needs Social Scientists

OpenAI published a paper arguing that long-term AI safety research requires social scientists to address uncertainties in human psychology, rationality, emotion, and biases that affect alignment algorithms. The paper contends that aligning advanced AI with human values cannot be solved by machine learning alone. OpenAI announced plans to hire social scientists full-time to work on these problems.

5·OpenAI Blog·1mo ago

OpenAI Reports Progress with US CAISI and UK AISI on AI Safety and Security

OpenAI has published an update on its ongoing partnership with the US Center for AI Standards and Innovation (CAISI) and the UK AI Security Institute (AISI). The collaboration focuses on strengthening AI safety and security practices. The announcement signals continued institutional engagement between OpenAI and government AI safety bodies in both countries.