DirectAudioEdit: Inversion-Free Text-Guided Audio Editing via Diffusion Prediction Contrast
paper · active · provisional
directaudioedit-inversion-free-text-guided-audio-editing-via-diffusion-prediction-contrast-570fe179 · 1 event · first seen 9d ago
Aliases: DirectAudioEdit: Inversion-Free Text-Guided Audio Editing via Diffusion Prediction Contrast
More like this (12)
DirectAudioEdit
Beyond Fully Random Masking: Attention-Guided Denoising and Optimization for Diffusion Language Models
diffusion-based inpainting
AudioDER
Audio Interaction Model
Self-Augmenting Retrieval for Diffusion Language Models
Denoising Diffusion Policy Optimization
Learning to Hear Hesitation: Continual Learning for Disfluency-Aware ASR
Corpus-Grounded Feature Diffusion
Denoising Diffusion Probabilistic Models
Leveraging Audio-LLMs to Filter Speech-to-Speech Training Data
Probing Low Frame Rate Degradation in Neural Audio Codecs
Recent events (1)
DirectAudioEdit: Training-free, inversion-free text-guided audio editing via diffusion prediction contrast
Researchers introduce DirectAudioEdit, which they present as the first training-free and inversion-free method for text-guided audio editing based on diffusion denoising dynamics. The approach constructs a source-to-target editing path without requiring DDPM inversion. Compared to inversion-based baselines, it reduces macro-averaged FAD and KL divergence by about 16% and achieves a speedup of up to 64.5%. Experiments cover music and event-level benchmarks on two backbone architectures.