StreamAudio-2M
streamaudio-2m-371863d8 · 1 event · first seen 13d ago
Aliases: StreamAudio-2M
Recent events (1)
Audio Interaction Model: Unified Streaming LALM with Always-On Perceive-Decide-Respond Loop
Researchers introduce the Audio Interaction Model framework and a concrete implementation called Audio-Interaction, a unified streaming Large Audio Language Model that handles both offline tasks and real-time audio interaction through a continuous perceive-decide-respond loop. The system is built on SoundFlow, a framework covering data construction, training, and asynchronous low-latency inference. The authors also release StreamAudio-2M, a 2.6M-item streaming corpus spanning 28 sub-tasks, and Proactive-Sound-Bench for evaluating proactive audio intervention. Evaluated across 8 benchmarks, the model preserves competitive offline performance while enabling real-time ASR, streaming instruction following, and proactive response capabilities not available in prior offline LALMs.
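The perceive-decide-respond loop described above can be pictured with a minimal sketch. This is not the authors' implementation: every name here (`LoopState`, `perceive`, `decide`, `respond`, `run_loop`) is hypothetical, and a simple energy threshold stands in for whatever learned decision the model actually makes. It only illustrates the control flow of an always-on loop that ingests each audio chunk, decides whether to speak, and stays silent otherwise.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class LoopState:
    """Running context accumulated while audio streams in (hypothetical)."""
    heard: List[float] = field(default_factory=list)


def perceive(state: LoopState, chunk: List[float]) -> float:
    """Ingest one chunk and return its mean absolute amplitude."""
    energy = sum(abs(x) for x in chunk) / len(chunk) if chunk else 0.0
    state.heard.append(energy)
    return energy


def decide(energy: float, threshold: float = 0.5) -> bool:
    """Toy decision rule (an assumption): respond only to loud chunks."""
    return energy >= threshold


def respond(index: int, energy: float) -> str:
    """Placeholder response; a real model would generate text or speech."""
    return f"chunk {index}: event detected (energy={energy:.2f})"


def run_loop(chunks: Iterable[List[float]],
             threshold: float = 0.5) -> List[Optional[str]]:
    """Run perceive -> decide -> respond once per incoming chunk.

    Returns one entry per chunk: a response string, or None when the
    loop chose to stay silent.
    """
    state = LoopState()
    outputs: List[Optional[str]] = []
    for i, chunk in enumerate(chunks):
        energy = perceive(state, chunk)
        outputs.append(respond(i, energy) if decide(energy, threshold) else None)
    return outputs


if __name__ == "__main__":
    stream = [[0.0, 0.1], [0.9, -0.8], []]
    for out in run_loop(stream):
        print(out)
```

In this sketch the decision runs on every chunk, so silence is an explicit outcome rather than the absence of a request. That is the property that separates an always-on loop from an offline model that responds only after receiving a complete input.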