technique
K-means
technique · active · provisional
k-means-8a417612 · 1 event · first seen 6d ago
Aliases: K-means
Recent events (1)
Study finds lower bitrate discrete speech representations sufficient for generative spoken language modeling
Researchers investigate how segmentation width and cluster size affect speech resynthesis and continuation quality in Generative Spoken Language Models (GSLM), which train language models on discrete speech units without text. They find that intelligible, natural speech can be synthesized at lower bitrates than the standard baseline, and that continuation quality remains stable at reduced bitrates, suggesting conventional GSLM settings may be over-specified. The paper also notes that LLM-based evaluation metrics correlate better with human judgments than conventional metrics, but correlation remains low, pointing to a gap in automatic evaluation for speech generation.
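To make the link between cluster size, segmentation width, and bitrate concrete, here is a minimal sketch of the usual upper-bound calculation for a discrete unit stream. It is not taken from the paper. The function name `unit_bitrate`, the frame rates in the usage note, and the reading of "segmentation width" as the number of encoder frames pooled into one unit are all assumptions made for illustration. The bound treats every K-means cluster as equally likely, so a real unit stream would typically have a lower entropy-based bitrate.

```python
import math


def unit_bitrate(num_clusters: int, frame_rate_hz: float, segment_width: int = 1) -> float:
    """Upper bound on the bitrate (bits per second) of a discrete unit stream.

    Assumptions (illustrative, not from the source):
      - num_clusters is the K-means codebook size K,
      - frame_rate_hz is the encoder's frame rate before segmentation,
      - segment_width is how many consecutive frames are merged into one unit,
      - every cluster is equally likely, so each unit costs log2(K) bits.
    """
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    if frame_rate_hz <= 0:
        raise ValueError("frame_rate_hz must be positive")
    if segment_width < 1:
        raise ValueError("segment_width must be at least 1")

    units_per_second = frame_rate_hz / segment_width
    bits_per_unit = math.log2(num_clusters)
    return units_per_second * bits_per_unit
```

Under these assumptions, `unit_bitrate(256, 50.0)` gives 400 bits per second, and doubling the segmentation width with `unit_bitrate(256, 50.0, 2)` halves that to 200. Reducing the cluster count lowers the bitrate only logarithmically, which is one reason the two settings are worth varying separately.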