Block-Size Curriculum Learning

technique · active · provisional · block-size-curriculum-learning-a868a55b · 1 event · first seen 2d ago

Aliases: Block-Size Curriculum Learning

Recent events (1)

6 · arXiv · cs.CL · 2d ago

DreamReasoner-8B: Block-size curriculum learning enables long-CoT reasoning in diffusion language models

Researchers introduce DreamReasoner-8B, an open-source block diffusion language model trained with a block-size curriculum learning strategy that gradually transitions from fine-grained to coarse-grained block sizes during training. The work identifies a critical failure mode: training with large block sizes severely degrades reasoning, while small block sizes preserve it. The proposed curriculum bridges this gap, achieving math and code reasoning performance competitive with Qwen3-8B while retaining the parallel decoding efficiency of block diffusion models. The model and code are publicly released.
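
As a rough illustration of the fine-to-coarse idea described above, the following minimal sketch maps a training step to a block size. The summary does not give the actual schedule, so the stage sizes (4, 8, 16, 32), the equal-length stages, and the helper names `block_size_at` and `split_into_blocks` are all assumptions made for illustration, not the authors' method.

```python
from typing import Sequence


def block_size_at(step: int, total_steps: int,
                  sizes: Sequence[int] = (4, 8, 16, 32)) -> int:
    """Return the block size to train with at `step` (0-indexed).

    Hypothetical schedule: training is split into len(sizes) equal stages,
    moving from the smallest (fine-grained) to the largest (coarse-grained)
    block size. The default sizes are illustrative, not from the paper.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not sizes or list(sizes) != sorted(sizes):
        raise ValueError("sizes must be non-empty and ascending")
    # Clamp so out-of-range steps map to the first or last stage.
    step = min(max(step, 0), total_steps - 1)
    # Integer arithmetic picks the stage index in [0, len(sizes) - 1].
    stage = step * len(sizes) // total_steps
    return sizes[stage]


def split_into_blocks(seq_len: int, block_size: int) -> list[tuple[int, int]]:
    """Half-open (start, end) token ranges for each block of a sequence."""
    return [(start, min(start + block_size, seq_len))
            for start in range(0, seq_len, block_size)]


if __name__ == "__main__":
    total = 1000
    for step in (0, 250, 500, 999):
        size = block_size_at(step, total)
        print(step, size, split_into_blocks(10, size))
```

Under these assumptions, early steps use small blocks (closer to token-by-token generation, which the summary says preserves reasoning) and later steps use large blocks (which allow more parallel decoding). A real training loop would use the returned size to build the block-wise masking for each batch.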