Every Eval Ever

product · active · provisional · every-eval-ever-a6317c3d · 1 event · first seen 2d ago

Aliases: Every Eval Ever

Recent events (1)

6 · arXiv · cs.CL · 2d ago

Every Eval Ever: unified schema and community repository for AI evaluation results

Researchers introduce Every Eval Ever, a shared schema and crowdsourced repository designed to standardize AI evaluation results across incompatible formats, frameworks, and sources. The system ingests results from evaluation harnesses, papers, leaderboards, and custom repositories into a single JSON document format, with optional per-instance output storage. The repository, hosted on Hugging Face, currently covers 22,235 models, 2,273 unique benchmarks, and 31 evaluation formats. The work addresses a persistent infrastructure problem in AI evaluation science: divergent scores for nominally identical evaluations and scattered, incomparable metadata.
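
To make the idea of a single result format concrete, here is a minimal Python sketch of how results from two incompatible sources might be normalized into one JSON document with optional per-instance outputs. It does not reproduce the actual Every Eval Ever schema: the `EvalRecord` fields, the `from_harness` and `from_leaderboard` helpers, and both input layouts are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class EvalRecord:
    # Hypothetical field names; the real schema may differ.
    model: str
    benchmark: str
    metric: str
    score: float                      # stored as a fraction in [0, 1]
    source_format: str                # which kind of source the record came from
    instances: Optional[list] = None  # optional per-instance outputs


def from_harness(raw: dict) -> EvalRecord:
    """Convert a hypothetical harness result dict into an EvalRecord."""
    # Assumes exactly one metric per result, for simplicity.
    metric, score = next(iter(raw["results"].items()))
    return EvalRecord(
        model=raw["model_name"],
        benchmark=raw["task"],
        metric=metric,
        score=float(score),
        source_format="harness",
        instances=raw.get("samples"),
    )


def from_leaderboard(row: dict) -> EvalRecord:
    """Convert a hypothetical leaderboard row (score in percent) into an EvalRecord."""
    return EvalRecord(
        model=row["Model"],
        benchmark=row["Benchmark"],
        metric=row["Metric"],
        score=float(row["Score (%)"]) / 100.0,  # percent to fraction
        source_format="leaderboard",
    )


def to_json(record: EvalRecord) -> str:
    """Serialize a record, omitting per-instance outputs when absent."""
    doc = asdict(record)
    if doc["instances"] is None:
        del doc["instances"]
    return json.dumps(doc, sort_keys=True)


harness_result = {
    "model_name": "model-a",
    "task": "bench-x",
    "results": {"acc": 0.71},
    "samples": [{"id": 1, "output": "B"}],
}
leaderboard_row = {
    "Model": "model-a",
    "Benchmark": "bench-x",
    "Metric": "acc",
    "Score (%)": "71.0",
}

for record in (from_harness(harness_result), from_leaderboard(leaderboard_row)):
    print(to_json(record))
```

Once both sources share the same keys and the same score scale, a divergence between nominally identical evaluations shows up as a plain difference in `score` rather than being hidden behind formatting differences.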