Almanac
GLM-4-Voice

model · active · provisional · glm-4-voice-bb129efe · 1 event · first seen 2d ago

Aliases: GLM-4-Voice

Recent events (1)

6 · arXiv · cs.CL · 2d ago

BayLing-Duplex: Native full-duplex speech dialogue using a single autoregressive LLM

Researchers introduce BayLing-Duplex, a speech language model that achieves native full-duplex interaction — simultaneous listening and speaking — using a single autoregressive LLM with no auxiliary VAD or turn-taking module. Built by fine-tuning GLM-4-Voice on 400K samples plus a lightweight DPO stage, it reaches 92% turn-taking success and 100% interruption success on InstructS2S-Eval, and improves speech-response quality substantially over Moshi. The approach adds only special tokens to the standard vocabulary, making it portable across LLM architectures without architectural changes.