LiveCodeBench Leaderboard

product · active · livecodebench-leaderboard-e2840aa4 · 1 event · first seen 28d ago

Aliases: LiveCodeBench Leaderboard

Recent events (1)

Hugging Face Blog · 28d ago

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

Hugging Face introduces a leaderboard built on LiveCodeBench, a benchmark for holistic, contamination-free evaluation of code LLMs. Static benchmarks risk contamination because their problems can leak into training data. LiveCodeBench instead continuously collects new problems from competitive programming platforms (LeetCode, AtCoder, and Codeforces), so a model can be scored on problems released after its training data was gathered. Beyond code generation, it evaluates other code-related tasks such as self-repair, code execution, and test output prediction, aiming to give a more reliable signal of model capability.
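
One way continuously collected problems can guard against contamination is to score a model only on problems published after its training cutoff. The sketch below illustrates that filtering step under stated assumptions: the `Problem` class, the `contamination_free_subset` function, and the sample dates are hypothetical and are not taken from LiveCodeBench's code or data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    """Hypothetical benchmark problem tagged with its publication date."""
    problem_id: str
    released: date


def contamination_free_subset(problems, training_cutoff):
    """Keep only problems published strictly after the training cutoff.

    A problem released after the cutoff cannot have appeared in the
    model's training data, so it is safe to evaluate on.
    """
    return [p for p in problems if p.released > training_cutoff]


# Illustrative data only; these are not real benchmark problems.
problems = [
    Problem("a", date(2023, 5, 1)),
    Problem("b", date(2023, 9, 15)),
    Problem("c", date(2024, 1, 10)),
]

# Assumed cutoff for a hypothetical model.
eligible = contamination_free_subset(problems, date(2023, 9, 1))
print([p.problem_id for p in eligible])
```

With a cutoff of 2023-09-01, only problems "b" and "c" remain. Because the problem pool keeps growing, a later cutoff shrinks the eligible set but never empties it for long.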