Almanac
Open Agent Leaderboard

benchmark · active · open-agent-leaderboard-5d862fb4 · 1 event · first seen 29d ago

Aliases: Open Agent Leaderboard

Recent events (1)

5 · Hugging Face Blog · 29d ago

The Open Agent Leaderboard

IBM Research and Hugging Face have launched the Open Agent Leaderboard, a public benchmark for evaluating AI agents across standardized tasks. The leaderboard aims to provide transparent, reproducible comparisons of open and proprietary agent systems. This initiative addresses the growing need for rigorous evaluation infrastructure as the agent ecosystem matures.