technique
frontier model evaluation
technique·active·provisional
frontier-model-evaluation-a9b0a16c·1 event·first seen 18d ago
Aliases: frontier model evaluation
Recent events (1)
A shared playbook for trustworthy third party evaluations
OpenAI has published guidance outlining a shared framework for conducting trustworthy third-party evaluations of frontier AI systems. The playbook covers methodology for assessing model capabilities, safeguards, and evaluation validity. It represents OpenAI's attempt to standardize and legitimize external auditing practices for frontier models.