joint safety evaluation

other·active·joint-safety-evaluation-9de9ba7b·1 event·first seen 28d ago

Aliases: joint safety evaluation

Recent events (1)

8·OpenAI Blog·28d ago

OpenAI and Anthropic Share Findings from Joint Safety Evaluation

OpenAI and Anthropic conducted a first-of-its-kind cross-lab safety evaluation in which each lab tested the other's frontier models on dimensions including misalignment, instruction following, hallucinations, and jailbreak resistance. The findings highlight both progress and ongoing challenges in AI safety, and the collaboration offers a potential template for future cross-organizational evaluations.