Entity · model

Dream-7B-Base

model · active · dream-7b-base-cb3ead4b · 2 events · first seen Jun 10, 2026

Aliases: Dream-7B-Base, Dream-7B

Co-occurring entities

LESS: Mutual-Stability Sampling for Diffusion Language Models · Jensen-Shannon divergence · LLaDA-1.5-8B · LLaDA-8B · LLaDA-8B-Base · MATH500 · EB-Sampler · ADAS · HumanEval · MBPP · GSM8K · Fast-dLLM

More like this (12)

Dream-7B-Instruct · LLaDA-8B-Base · Breeze-7B · Falcon-7B · Aya-Expanse-8B-Base · Qwen3-30B-A3B-Base · Fara-7B · LLaMA-7B · BERT-base · Qwen3.5-35B-A3B-Base · SaulLM-7B · Mistral-7B-v0.3

Recent events (2)

5 · arXiv · cs.CL · Jun 16, 2026

LESS: Adaptive mutual-stability sampling cuts diffusion LLM decoding steps by 72%

Researchers introduce LESS, a training-free adaptive sampler for diffusion large language models that treats token commitment as an online stopping problem. The method uses a joint stability rule combining confidence, persistence, and distributional stability to decide when to unmask tokens, avoiding wasted computation on already-stable positions. Evaluated on Dream-7B, LLaDA-8B, and LLaDA-1.5-8B across seven benchmarks, LESS reduces reverse denoising steps by 72.1% versus fixed-budget decoding while improving accuracy over prior adaptive samplers. The step reductions translate directly to fewer Transformer forward passes and lower wall-clock latency.

Frontier Model Releases · Inference Economics · LESS: Mutual-Stability Sampling for Diffusion Language Models · Jensen-Shannon divergence · LLaDA-1.5-8B · +2 more
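
The joint stability rule summarized in this event can be illustrated with a minimal sketch. This is not the LESS implementation: the function names, the thresholds (`conf_min`, `persist_steps`, `js_max`), and the choice of Jensen-Shannon divergence between consecutive steps as the distributional-stability measure are all assumptions made for illustration. The sketch only shows how three conditions (confidence, persistence, distributional stability) might be combined into one commit decision for a single masked position.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def should_commit(history, conf_min=0.9, persist_steps=2, js_max=0.01):
    """Decide whether to unmask one position (hypothetical rule, not the paper's).

    history: per-step probability vectors for the position, oldest first.
    All three thresholds are illustrative assumptions.
    """
    if len(history) < max(persist_steps, 2):
        return False  # not enough steps observed yet
    latest = history[-1]
    # Confidence: the top token's probability is high enough.
    confident = max(latest) >= conf_min
    # Persistence: the top token has not changed over the last few steps.
    tops = [dist.index(max(dist)) for dist in history[-persist_steps:]]
    persistent = len(set(tops)) == 1
    # Distributional stability: the full distribution has stopped moving.
    stable = js_divergence(history[-2], latest) <= js_max
    return confident and persistent and stable
```

Under these assumptions, a position whose top token flips between steps is held back even when its latest confidence is high, which is the kind of premature commitment a joint rule is meant to avoid.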

5 · arXiv · cs.CL · Jun 10, 2026

ADAS: Attention-Discounted Adaptive Sampler improves parallel decoding for masked diffusion language models

Researchers propose ADAS, a training-free reranking rule for masked diffusion language model decoding that addresses token interaction failures in parallel token commitment. The method greedily penalizes candidates that attend strongly to already-selected uncertain positions, using attention weights as soft marginal penalties rather than hard constraints. Evaluated on LLaDA-8B-Base and Dream-7B-Base across GSM8K, MATH500, HumanEval, and MBPP, ADAS improves performance at low NFE (number of function evaluations, i.e. few model forward passes) by 9–10 percentage points on average when plugged into existing samplers, with only 3.1% runtime overhead.

Frontier Model Releases · Inference Economics · LLaDA-8B-Base · MATH500 · EB-Sampler · +6 more
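
The greedy, attention-discounted selection summarized in this event can be sketched as follows. This is a rough illustration under stated assumptions, not the ADAS algorithm itself: the scoring formula (confidence minus a weighted sum of attention to already-selected positions, scaled by their uncertainty `1 - conf`), the weight `lam`, and the function name are all hypothetical.

```python
def greedy_discounted_select(conf, attn, k, lam=1.0):
    """Pick up to k positions to commit in parallel (illustrative sketch).

    conf: dict mapping position -> model confidence in its top token.
    attn: dict of dicts, attn[i][j] = attention weight from position i to j.
    lam:  penalty strength (hypothetical parameter; lam=0 is plain top-k).
    """
    selected = []
    remaining = set(conf)
    while remaining and len(selected) < k:
        def score(i):
            # Soft penalty: attending to an uncertain, already-selected
            # position lowers the candidate's score but never forbids it.
            penalty = sum(
                attn.get(i, {}).get(j, 0.0) * (1.0 - conf[j]) for j in selected
            )
            return conf[i] - lam * penalty
        # sorted() makes tie-breaking deterministic (lowest position wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

conf = {0: 0.70, 1: 0.65, 2: 0.60}
attn = {1: {0: 0.9}, 2: {0: 0.0}}  # position 1 leans heavily on position 0
picked = greedy_discounted_select(conf, attn, k=2)
```

In this toy input, position 1 has higher raw confidence than position 2 but depends strongly on the still-uncertain position 0, so the discounted ranking prefers position 2 for the second slot, while `lam=0` recovers ordinary confidence-based top-k.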