benchmark
Multi-LCB
benchmark · active · provisional
multi-lcb-c94a015a · 1 event · first seen 47h ago · Aliases: Multi-LCB
Recent events (1)
Multi-LCB extends LiveCodeBench to twelve programming languages for cross-language code evaluation
Researchers introduce Multi-LCB, a benchmark that extends the widely used LiveCodeBench (LCB) to twelve programming languages by transforming its Python tasks into equivalent tasks in other languages while preserving LCB's contamination controls. An evaluation of 24 LLMs on the benchmark uncovers Python overfitting, language-specific contamination, and large performance disparities across languages. Multi-LCB is designed to update automatically with future LCB releases, making it a living benchmark for multilingual code generation.