Entity · benchmark

Google-Proof Q&A

benchmark · active · google-proof-q-a-dd20bb27 · 1 event · first seen Jun 4, 2026

Aliases: Google-Proof Q&A

Co-occurring entities

METR · Responsible Scaling Policy · Anthropic

More like this (12)

Search-QA · ResearchQA · Evidence-Backed Video Question Answering · Quora · Protocol QA · TruthfulQA · PubMedQA · StrategyQA · Google Document AI · SimpleQA · FreshQA · StoryAD-QA

Recent events (1)

7 · Anthropic News · Jun 4, 2026

Anthropic launches initiative to fund third-party AI safety evaluations

Anthropic announced a funded initiative to source third-party evaluations measuring advanced AI capabilities and safety risks, with priority areas including cybersecurity, CBRN threats, model autonomy, national security risks, social manipulation, and misalignment. The initiative is tied to Anthropic's Responsible Scaling Policy and AI Safety Level (ASL) framework, aiming to address a gap between demand and supply of high-quality safety-relevant evals. Proposals are solicited via an application form, with Anthropic framing the effort as benefiting the broader AI safety ecosystem rather than just internal use.

Evaluation and Benchmarking · AI Safety Research · METR · Google-Proof Q&A · Responsible Scaling Policy