Sumi
model · active · provisional
sumi-81bb6eb5 · 1 event · first seen 2d ago
Aliases: Sumi
Recent events (1)
Sumi: First open 7B uniform diffusion language model pretrained from scratch at scale
Researchers introduce Sumi, a fully open 7B uniform diffusion language model (UDLM) pretrained from scratch on 1.5 trillion tokens, making it the first UDLM at both large parameter scale and large token budget. Sumi performs competitively with autoregressive models on knowledge, reasoning, and coding benchmarks, but it underperforms on commonsense tasks, a gap attributed partly to an education-heavy data mixture. The model weights, checkpoints, and full training recipe, including the data mixture specification, are released publicly. The work fills a gap in the diffusion language model landscape by providing a reference point for studying scaling behavior and generation dynamics in uniform diffusion.