Entity · model

GLM-4-Voice

model · active · glm-4-voice-bb129efe · 1 event · first seen Jun 15, 2026

Aliases: GLM-4-Voice

Co-occurring entities

BayLing-Duplex · InstructS2S-Eval · Direct Preference Optimization (DPO) · Moshi · LLaMA-Omni

More like this (12)

GLM-4.5-Air · GLM-4.7 · GLM-5.1 · GLM · MAI-Voice-1 · GLM-4.7-Flash · GLM-OCR · Advanced Voice Mode · GLM-Z1-9B-0414 · Multimodal Voice Activity Projection · GLM-RAG · Common Voice

Recent events (1)

6 · arXiv · cs.CL · Jun 15, 2026

BayLing-Duplex: Native full-duplex speech dialogue using a single autoregressive LLM

Researchers introduce BayLing-Duplex, a speech language model that achieves native full-duplex interaction (simultaneous listening and speaking) using a single autoregressive LLM with no auxiliary voice activity detection (VAD) or turn-taking module. Built by fine-tuning GLM-4-Voice on 400K samples plus a lightweight DPO stage, it reaches 92% turn-taking success and 100% interruption success on InstructS2S-Eval and substantially improves speech-response quality over Moshi. Because the approach only adds special tokens to the standard vocabulary, it requires no architectural changes and is portable across LLM architectures.

Frontier Model Releases · Multimodal Progress · BayLing-Duplex · InstructS2S-Eval · Direct Preference Optimization (DPO) · +3 more