FlowEdit: Associative Memory for Lifelong Pronunciation Adaptation in Flow-Matching TTS

arXiv · cs.AI · 47h ago

FlowEdit: Lifelong pronunciation adaptation for flow-matching TTS via associative memory

FlowEdit is a new framework enabling lifelong pronunciation correction in frozen flow-matching text-to-speech systems without retraining model weights. Corrections are stored as token-level perturbations in text embedding space within a Modern Hopfield Network, retrieved at inference via soft attention with fuzzy morphological matching. On a curated benchmark of 312 multilingual proper nouns across 18 language families, the method reduces target-word Phoneme Error Rate by 92.7% relative to the zero-shot baseline, with each correction completing in ~15 seconds on a single GPU.
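
The retrieval step described above can be illustrated with a minimal sketch of Modern Hopfield-style soft-attention lookup. This is not the paper's implementation: the class name `CorrectionMemory`, the inverse temperature `beta`, and the use of plain dot-product similarity are all assumptions made for illustration, and the fuzzy morphological matching mentioned in the summary is not modeled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class CorrectionMemory:
    """Hypothetical store of (key embedding, perturbation) pairs.

    Retrieval follows the Modern Hopfield update rule:
        delta = softmax(beta * K q)^T V
    where K holds the stored keys and V the stored perturbations.
    """

    def __init__(self, beta=8.0):
        self.beta = beta      # inverse temperature (assumed value)
        self.keys = []        # token embeddings of corrected words
        self.deltas = []      # perturbations learned for those tokens

    def store(self, key, delta):
        self.keys.append(list(key))
        self.deltas.append(list(delta))

    def retrieve(self, query):
        # An empty memory contributes no perturbation.
        if not self.keys:
            return [0.0] * len(query)
        weights = softmax([self.beta * dot(query, k) for k in self.keys])
        dim = len(self.deltas[0])
        return [
            sum(w * d[i] for w, d in zip(weights, self.deltas))
            for i in range(dim)
        ]

    def apply(self, embedding):
        # The frozen TTS model would consume the perturbed embedding.
        delta = self.retrieve(embedding)
        return [e + d for e, d in zip(embedding, delta)]


memory = CorrectionMemory(beta=50.0)
memory.store([1.0, 0.0], [0.5, 0.0])
memory.store([0.0, 1.0], [0.0, -0.5])
corrected = memory.apply([1.0, 0.0])
```

A larger `beta` makes retrieval behave more like an exact lookup of the closest stored key, while a smaller one blends several stored corrections. Note that this sketch perturbs every query to some degree; a practical system would presumably gate retrieval so that uncorrected words pass through unchanged, which is an assumption rather than something the summary states.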