Learning to Hear Hesitation: Continual Learning for Disfluency-Aware ASR
paper · active · provisional
learning-to-hear-hesitation-continual-learning-for-disfluency-aware-asr-3cd7bf2f · 1 event · first seen 2d ago
Aliases: Learning to Hear Hesitation: Continual Learning for Disfluency-Aware ASR
More like this (12)
Efficient ASR Training with Conversations that Never Happened
Leveraging Audio-LLMs to Filter Speech-to-Speech Training Data
Context-Driven Incremental Compression for Multi-Turn Dialogue Generation
Context-Driven Incremental Compression for Multi-Turn Dialogue Generation
Continual Learning
DirectAudioEdit: Inversion-Free Text-Guided Audio Editing via Diffusion Prediction Contrast
Adaptive Turn-Taking for Real-time Multi-Party Voice Agents
Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models
On The Effectiveness-Fluency Trade-Off In LLM Conditioning: A Systematic Study
speaker diarization
Beyond Fully Random Masking: Attention-Guided Denoising and Optimization for Diffusion Language Models
Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation
Recent events (1)
Continual learning approach for disfluency-aware ASR with explicit disfluency tokens
A new arXiv preprint addresses the transcription of disfluent speech (hesitations, repetitions, fillers) by ASR systems, which typically omit such markers and so lose information. The authors add explicit disfluency tokens to a pretrained ASR model and apply continual learning to adapt it across datasets with differing disfluency distributions while mitigating catastrophic forgetting. The work identifies a trade-off between learning disfluency markers and general ASR performance, and finds a consistent cross-attention head mechanism shared across continual learning methods.
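
To make the two ideas in the summary concrete, here is a minimal, dependency-free Python sketch. It is illustrative only and not the authors' implementation. The token names, the mean-initialisation of new embedding rows, and the use of experience replay as the continual learning method are all assumptions; the summary does not say which tokens or which methods the paper uses.

```python
import random

# Hypothetical marker names; the source does not list the actual tokens.
DISFLUENCY_TOKENS = ["<filler>", "<repetition>", "<hesitation>"]


def extend_vocab(vocab, embeddings, new_tokens):
    """Return copies of vocab and embeddings with rows added for new tokens.

    New rows start at the mean of the existing embeddings, a common
    heuristic that is assumed here rather than taken from the paper.
    """
    vocab = dict(vocab)
    embeddings = [row[:] for row in embeddings]
    dim = len(embeddings[0])
    # Mean is computed over the original rows only, before anything is appended.
    mean_row = [sum(row[d] for row in embeddings) / len(embeddings) for d in range(dim)]
    for token in new_tokens:
        if token in vocab:
            continue  # already present, keep its existing embedding
        vocab[token] = len(embeddings)
        embeddings.append(mean_row[:])
    return vocab, embeddings


def replay_batch(new_data, old_data, batch_size, replay_fraction, rng):
    """Build one batch mixing new-dataset examples with replayed old ones.

    Experience replay is one standard way to reduce catastrophic forgetting;
    it stands in here for whichever methods the paper actually compares.
    """
    n_old = min(int(round(batch_size * replay_fraction)), len(old_data))
    n_new = min(batch_size - n_old, len(new_data))
    return rng.sample(new_data, n_new) + rng.sample(old_data, n_old)
```

Raising `replay_fraction` shifts each batch toward the earlier data, which is one simple way to explore the kind of trade-off between disfluency marker learning and general ASR performance that the summary mentions.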