third-party AI evaluations
other · active · provisional
third-party-ai-evaluations-a09ca5f1 · 1 event · first seen 18d ago
Aliases: third-party AI evaluations
Co-occurring entities
More like this (12)
Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting
OpenAI Evals
multi-level agent evaluation
AI-Assisted Systematization for Evaluating GenAI Systems
Trustworthy AI
Arena AI
AI-assisted human evaluation
Reflection AI
OpAI-Bench
Bayesian Inference and Decision Audits for Public Archives of Frontier AI Evaluations
White House Voluntary AI Commitments
ProducerAI
Recent events (1)
A shared playbook for trustworthy third party evaluations
OpenAI has published guidance outlining a shared framework for conducting trustworthy third-party evaluations of frontier AI systems. The playbook covers methodology for assessing model capabilities, safeguards, and evaluation validity. This represents OpenAI's attempt to standardize and legitimize external auditing practices for frontier models.