Almanac
paper

LESS: Mutual-Stability Sampling for Diffusion Language Models

paper · active · provisional · less-mutual-stability-sampling-for-diffusion-language-models-38d141e7 · 1 event · first seen 30h ago

Recent events (1)

5 · arXiv · cs.CL · 30h ago

LESS: Adaptive mutual-stability sampling cuts diffusion LLM decoding steps by 72%

Researchers introduce LESS, a training-free adaptive sampler for diffusion large language models that treats token commitment as an online stopping problem. The method uses a joint stability rule combining confidence, persistence, and distributional stability to decide when to unmask tokens, avoiding wasted computation on already-stable positions. Evaluated on Dream-7B, LLaDA-8B, and LLaDA-1.5-8B across seven benchmarks, LESS reduces reverse denoising steps by 72.1% versus fixed-budget decoding while improving accuracy over prior adaptive samplers. The step reductions translate directly to fewer Transformer forward passes and lower wall-clock latency.
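To make the joint stability rule concrete, here is a minimal sketch of a per-position commit test. It is not the authors' implementation: the summary names the three criteria but does not define them, so this sketch assumes confidence is the top-token probability, persistence is the number of consecutive steps with the same top token, and distributional stability is the total variation distance between consecutive predictive distributions. The names PositionState and update_and_decide and the thresholds conf_min, persist_min, and tv_max are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PositionState:
    """Bookkeeping for one masked position (hypothetical structure)."""
    prev_probs: Optional[List[float]] = None  # distribution from the previous step
    streak: int = 0                           # consecutive steps with the same top token
    committed: bool = False


def total_variation(p: List[float], q: List[float]) -> float:
    """Total variation distance between two distributions over the same vocabulary."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def argmax(probs: List[float]) -> int:
    return max(range(len(probs)), key=probs.__getitem__)


def update_and_decide(
    state: PositionState,
    probs: List[float],
    conf_min: float = 0.9,   # assumed confidence threshold
    persist_min: int = 2,    # assumed number of steps the top token must persist
    tv_max: float = 0.05,    # assumed bound on step-to-step distribution drift
) -> bool:
    """Update the state with this step's distribution and return whether to commit."""
    top = argmax(probs)
    if state.prev_probs is None:
        # First observation: nothing to compare against, so drift is unbounded.
        state.streak = 1
        drift = float("inf")
    else:
        same_top = top == argmax(state.prev_probs)
        state.streak = state.streak + 1 if same_top else 1
        drift = total_variation(state.prev_probs, probs)
    state.prev_probs = list(probs)

    confident = probs[top] >= conf_min
    persistent = state.streak >= persist_min
    stable = drift <= tv_max
    # Joint rule: all three criteria must hold before the token is unmasked.
    state.committed = confident and persistent and stable
    return state.committed
```

In a decoding loop under these assumptions, each still-masked position would call update_and_decide once per denoising step, and the sampler would stop once every position has committed, which is how early stopping would reduce the number of forward passes.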