SpeechMatrix

dataset · active · provisional · speechmatrix-8cde13e6 · 1 event · first seen 5d ago

Aliases: SpeechMatrix

Recent events (1)

4 · arXiv · cs.CL · 5d ago

Audio-LLM-based data filtering for speech-to-speech translation via Rank-to-Distill

A new arXiv paper proposes using audio large language models (audio-LLMs) to filter noisy training data for end-to-end speech-to-speech translation (S2ST). The authors introduce a two-stage Rank-to-Distill strategy: first, a lightweight ranker generates pseudo-labels from noisy speech pairs; second, those pseudo-labels supervise an audio-LLM that makes keep/drop decisions directly from raw audio. Experiments on the CVSS-C and SpeechMatrix benchmarks show gains of up to 1.4 ASR-BLEU over unfiltered baselines.
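The first stage described above (ranking noisy pairs and turning the ranking into keep/drop pseudo-labels) might look roughly like the sketch below. This is an illustration only, not the paper's implementation: the summary does not say how ranks become labels, so the top-fraction cutoff is an assumption, and `SpeechPair`, `pseudo_label`, `score_fn`, and `keep_fraction` are hypothetical names. The second stage, fine-tuning an audio-LLM on these labels, is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class SpeechPair:
    """A candidate source/target speech pair (hypothetical structure)."""
    pair_id: str
    source_audio: str  # path to the source-language audio
    target_audio: str  # path to the target-language audio


def pseudo_label(
    pairs: Sequence[SpeechPair],
    score_fn: Callable[[SpeechPair], float],
    keep_fraction: float = 0.5,
) -> List[Tuple[SpeechPair, bool]]:
    """Rank pairs by a lightweight quality score and label the top fraction 'keep'.

    Assumption: higher scores mean cleaner pairs, and a fixed top fraction
    is kept. The real ranker and labeling rule may differ.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    # Best-scoring pairs first.
    ranked = sorted(pairs, key=score_fn, reverse=True)
    # Truncate toward zero so the keep count never exceeds the requested fraction.
    n_keep = int(len(ranked) * keep_fraction)
    return [(pair, index < n_keep) for index, pair in enumerate(ranked)]


if __name__ == "__main__":
    # Toy scores standing in for the lightweight ranker's output.
    toy_scores = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.7}
    toy_pairs = [
        SpeechPair(pid, f"{pid}.src.wav", f"{pid}.tgt.wav") for pid in toy_scores
    ]
    for pair, keep in pseudo_label(toy_pairs, lambda p: toy_scores[p.pair_id]):
        print(pair.pair_id, "keep" if keep else "drop")
```

In a pipeline of this shape, the resulting (pair, keep) tuples would serve as supervision targets for the audio-LLM, which then decides keep/drop from the audio alone.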