Probing Low Frame Rate Degradation in Neural Audio Codecs
paper · active · provisional
probing-low-frame-rate-degradation-in-neural-audio-codecs-7716a0eb · 1 event · first seen 30h ago
Aliases: Probing Low Frame Rate Degradation in Neural Audio Codecs
More like this (12)
quantization-induced degradation
sparse frame sampling
DirectAudioEdit: Inversion-Free Text-Guided Audio Editing via Diffusion Prediction Contrast
AdaCodec
Leveraging Audio-LLMs to Filter Speech-to-Speech Training Data
neural-ODE temporal decay
AudioDER
defer-to-resample
Min-p sampling
FishAudio-S2-Pro
Neural Safety Filters
Unified Progressive Frequency Bridging
Recent events (1)
Controlled ablation reveals training artifact behind low frame rate degradation in neural audio codecs
A new arXiv preprint investigates why neural audio codecs degrade sharply at low frame rates (≤6.25 Hz). The question matters for autoregressive speech synthesis, where generation cost scales with sequence length, so a lower frame rate means fewer tokens to generate. The authors reproduce a previously reported quality cliff at 6.25 Hz and show that it stems from a suboptimal training configuration rather than from fundamental phonemic or codebook limits: with a fixed clip duration, each training clip holds fewer tokens as the frame rate drops, which starves the decoder of inter-token context. After correcting the training setup, word error rate degrades smoothly down to 1.6 Hz, suggesting low frame rate codecs are more practically accessible than prior work implied.
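The clip-duration effect described above comes down to simple arithmetic, sketched here under the assumption that the number of tokens in a training clip is just clip duration times frame rate. The 2-second clip length and the 50 Hz and 12.5 Hz rates are illustrative values, not figures from the preprint; only 6.25 Hz and 1.6 Hz appear in the summary. The function names are hypothetical.

```python
import math


def tokens_per_clip(clip_seconds: float, frame_rate_hz: float) -> int:
    """Whole codec frames (tokens) contained in one clip of the given length."""
    return math.floor(clip_seconds * frame_rate_hz)


def clip_seconds_for_tokens(target_tokens: int, frame_rate_hz: float) -> float:
    """Clip length needed so the clip contains `target_tokens` frames."""
    return target_tokens / frame_rate_hz


# Hypothetical fixed clip duration, chosen only for illustration.
CLIP_SECONDS = 2.0

# 6.25 Hz and 1.6 Hz come from the summary; 50 Hz and 12.5 Hz are assumed
# reference points for comparison.
FRAME_RATES_HZ = [50.0, 12.5, 6.25, 1.6]

# Token budget the highest-rate codec gets from the fixed-length clip.
reference_tokens = tokens_per_clip(CLIP_SECONDS, FRAME_RATES_HZ[0])

for rate in FRAME_RATES_HZ:
    n_tokens = tokens_per_clip(CLIP_SECONDS, rate)
    needed = clip_seconds_for_tokens(reference_tokens, rate)
    print(
        f"{rate:>5} Hz: {n_tokens:>3} tokens in a {CLIP_SECONDS:.0f} s clip; "
        f"{needed:.1f} s needed for {reference_tokens} tokens"
    )
```

With these assumed numbers, the same 2-second clip that holds 100 tokens at 50 Hz holds 12 tokens at 6.25 Hz and 3 at 1.6 Hz. This illustrates why a fixed clip duration leaves the decoder with little inter-token context at low frame rates. The summary does not say how the authors changed the training setup, so the second function shows only one possible way to equalize context.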