benchmark
Big Bench
benchmark · active
big-bench-6cd38b23 · 1 event · first seen 28d ago
Aliases: Big Bench
Recent events (1)
Evaluating Audio Reasoning with Big Bench Audio
Hugging Face introduces Big Bench Audio, a new benchmark designed to evaluate audio reasoning capabilities in AI models. The benchmark appears to extend the Big Bench evaluation framework into the audio domain, targeting multimodal models that process and reason over audio inputs. This release addresses a gap in evaluation tooling for audio-capable language models.