
Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models

paper · active · provisional · multi-faceted-interactivity-alignment-in-full-duplex-speech-models-e42a2924 · 1 event · first seen 7d ago

Aliases: Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models

Recent events (1)

5 · arXiv · cs.CL · 7d ago

RL-based alignment improves interactivity in full-duplex spoken dialogue models

Researchers propose a post-training alignment method that uses reinforcement learning to improve interactivity in full-duplex spoken dialogue models, which listen and speak simultaneously. The method targets four canonical axes of interactivity: pause handling, turn-taking, backchanneling, and user interruption. Each axis has its own reward function, and an additional LLM-based reward guards against semantic degradation. Applied to two open-source models, Moshi and PersonaPlex, the approach shows consistent improvements in both offline and real-time multi-turn evaluation.
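
As a rough illustration of how an axis-specific reward might be paired with an LLM-based semantic reward, the sketch below blends the two with a fixed weight. The summary does not say how the paper combines them, so the weighted sum, the `semantic_weight` parameter, the function name, and the assumption that both scores lie in [0, 1] are all hypothetical, not the authors' formulation.

```python
# Hypothetical sketch: the combination rule and all names are assumptions,
# not taken from the paper.

# The four interactivity axes named in the summary.
AXES = ("pause_handling", "turn_taking", "backchanneling", "user_interruption")


def combined_reward(axis: str, axis_score: float, semantic_score: float,
                    semantic_weight: float = 0.5) -> float:
    """Blend an axis-specific score with a semantic-quality score.

    Both scores are assumed to be in [0, 1]. `semantic_weight` controls how
    strongly the semantic term counteracts degradation of response content.
    """
    if axis not in AXES:
        raise ValueError(f"unknown axis: {axis!r}")
    if not 0.0 <= semantic_weight <= 1.0:
        raise ValueError("semantic_weight must be in [0, 1]")
    return (1.0 - semantic_weight) * axis_score + semantic_weight * semantic_score


if __name__ == "__main__":
    # A response that handles turn-taking well but loses some meaning.
    print(combined_reward("turn_taking", axis_score=1.0, semantic_score=0.5))
```

Under this assumed scheme, a response that scores perfectly on an interactivity axis still receives a reduced reward if its content has degraded, which is the role the summary attributes to the LLM-based reward.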