benchmark
MMAE
benchmark · active · provisional
mmae-bb1bee81 · 1 event · first seen 9d ago
Aliases: MMAE
Recent events (1)
MMAE: First comprehensive benchmark for instruction-based audio editing across 7 modalities
Researchers introduce MMAE, a 2,000-sample benchmark for evaluating general-purpose, instruction-based audio editing systems. It covers 7 audio modalities (sound, speech, music, and their mixtures) and 6 levels of task complexity. Its rubric-based evaluation framework decomposes the tasks into 17,741 verifiable criteria that assess instruction following and context consistency. Evaluation of leading models reveals severe limitations: Exact Match Rate falls below 5% overall and drops to 0% on complex mixed-modality tasks, exposing fundamental gaps in current audio editing systems.
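To show how a rubric-based score of this kind could be aggregated, here is a minimal Python sketch. It is not the benchmark's actual scoring code. It assumes that Exact Match Rate counts a sample only when every one of its criteria passes, which the summary above does not confirm. The names `SampleResult`, `criterion_pass_rate`, and `exact_match_rate` are hypothetical, and the toy data is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SampleResult:
    """Rubric outcome for one edited sample (hypothetical structure)."""
    sample_id: str
    criteria_passed: list[bool]  # one pass/fail flag per verifiable criterion


def criterion_pass_rate(results: list[SampleResult]) -> float:
    """Fraction of all individual criteria that passed, pooled across samples."""
    total = sum(len(r.criteria_passed) for r in results)
    passed = sum(sum(r.criteria_passed) for r in results)
    return passed / total if total else 0.0


def exact_match_rate(results: list[SampleResult]) -> float:
    """Fraction of samples whose criteria ALL passed.

    Assumption: this all-or-nothing reading of "Exact Match Rate" is a guess,
    not a definition taken from the benchmark.
    """
    if not results:
        return 0.0
    exact = sum(1 for r in results if r.criteria_passed and all(r.criteria_passed))
    return exact / len(results)


# Invented toy data: four samples with differing numbers of criteria.
results = [
    SampleResult("a", [True, True, True]),
    SampleResult("b", [True, False, True, True]),
    SampleResult("c", [False, False]),
    SampleResult("d", [True, True, False]),
]

print(f"criterion pass rate: {criterion_pass_rate(results):.3f}")
print(f"exact match rate:    {exact_match_rate(results):.3f}")

# Average rubric size implied by the reported totals (17,741 criteria, 2,000 samples).
print(f"criteria per sample: {17741 / 2000:.2f}")
```

Under this assumed definition, a system can pass most individual criteria and still score a low exact match, because the reported totals imply roughly 8.9 criteria per sample on average and a single failure disqualifies the whole sample.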