Interpretability study of DiffusionGemma reveals novel diffusion-specific reasoning phenomena
Researchers investigate the reasoning transparency of DiffusionGemma, a diffusion-based language model, decomposing transparency into variable and algorithmic components. They show that mapping information through an interpretable token bottleneck reduces DiffusionGemma's opaque serial depth from 28.6X to just 1.1X that of autoregressive Gemma 4, with no performance loss. Interpretability case studies uncover diffusion-specific phenomena including non-chronological reasoning, token smearing, and intermediate-context reasoning. Monitorability tests find DiffusionGemma comparable to Gemma 4, suggesting diffusion LMs are not inherently less amenable to safety oversight.
Related events (8)
DeepMind announces DiffusionGemma with 4x faster text generation
DeepMind published a blog post introducing DiffusionGemma, a diffusion-based variant of the Gemma model family claiming 4x faster text generation. The announcement suggests a departure from standard autoregressive decoding in favor of diffusion-based generation. If the claims hold, this could represent a meaningful inference efficiency advance for the Gemma line.
Simon Willison on DiffusionGemma
Simon Willison covers DiffusionGemma, a diffusion-based language model in the Gemma family from Google. The post appears to be commentary or a brief note on the model's release or capabilities. Diffusion-based LLMs represent an active area of research as an alternative to autoregressive generation.
Gemma Scope 2: Interpretability Tools Released Across Entire Gemma 3 Family
DeepMind has released Gemma Scope 2, an open interpretability toolkit covering the full Gemma 3 model family. The release extends the original Gemma Scope effort to provide the AI safety community with tools for understanding complex language model behavior. By making these tools openly available across all Gemma 3 variants, DeepMind aims to support mechanistic interpretability research at scale.
DiffusionGemma hits 1,000+ tokens/sec; Claude Fable 5 export controls; Agents' Last Exam benchmark launch
Google introduced DiffusionGemma, an experimental 26B MoE model using diffusion-based text generation that produces 256-token blocks simultaneously, achieving over 1,000 tokens/second on H100 hardware at the cost of lower output quality than standard Gemma 4. Separately, the US government issued an export control directive forcing Anthropic to suspend Claude Fable 5 and Claude Mythos 5 globally, while Anthropic also reversed a controversial silent-degradation safeguard on Fable 5 after researcher backlash. UC Berkeley's Center for RDI launched Agents' Last Exam (ALE), a 1,500+ task agentic benchmark using deterministic grading, where GPT-5.5 topped the leaderboard with a pass rate of only 24%, highlighting the gap between current models and professional-grade workflows.
AGDO: Attention-guided denoising and optimization framework improves diffusion language model reasoning
Researchers propose AGDO, a framework that replaces random masking in diffusion large language models (dLLMs) with attention-guided denoising order and token weighting during fine-tuning and reinforcement learning. The work is motivated by an empirical finding that tokens with stronger attention to unmasked context are more stable and critical for reasoning. Experiments on math and coding benchmarks show AGDO outperforms existing post-training methods for dLLMs, advancing the case for attention-aware training in parallel-decoding language models.
SARDI: Self-Augmenting Retrieval for Diffusion Language Models using lookahead tokens
Researchers introduce SARDI, a training-free RAG framework for discrete diffusion language models that repurposes discarded low-confidence tokens during denoising as lookahead signals to guide retrieval before output is finalized. The method is retriever-agnostic and applicable to any reasoning-capable discrete diffusion LM. Evaluated across five multi-hop QA benchmarks, SARDI outperforms training-free diffusion and autoregressive retrieval baselines at up to 8x higher throughput.
Diffusion-Proof: First framework applying diffusion LLMs to formal theorem proving
Researchers introduce Diffusion-Proof, the first framework to train and apply diffusion language models (dLLMs) for formal theorem proving, addressing limitations of autoregressive models in long-range coherence. The framework includes dLLM-Prover-7B for whole-proof generation and dLLM-Corrector-7B for local proof correction via bidirectional infilling. Diffusion-Proof achieves absolute improvements of 1.61% on ProofNet-Test and 6.14% on MiniF2F-Test over an AR baseline, and solves one IMO problem that DeepSeek-Prover-V2-7B could not. The result suggests dLLMs may have structural advantages over AR models for tasks requiring long-range logical coherence.
DreamReasoner-8B: Block-size curriculum learning enables long-CoT reasoning in diffusion language models
Researchers introduce DreamReasoner-8B, an open-source block diffusion language model trained with a block-size curriculum learning strategy that gradually transitions from fine-grained to coarse-grained block sizes during training. The work identifies a critical failure mode: training with large block sizes severely degrades reasoning, while small block sizes preserve it. The proposed curriculum bridges this gap, achieving math and code reasoning performance competitive with Qwen3-8B while retaining the parallel decoding efficiency of block diffusion models. The model and code are publicly released.
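To make the block-size curriculum idea concrete, here is a minimal sketch of a schedule that moves from fine-grained to coarse-grained blocks as training progresses. The function name, the stage sizes (4, 8, 16, 32), and the equal-length stages are illustrative assumptions; the summary above does not state the schedule DreamReasoner-8B actually uses.

```python
def block_size_schedule(step, total_steps, sizes=(4, 8, 16, 32)):
    """Return the block size to use at a given training step.

    Hypothetical schedule: training is split into equal-length stages,
    and each stage uses the next (larger) block size in `sizes`.
    The sizes are placeholders, not values from the DreamReasoner work.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    # Map the step to a stage index in [0, len(sizes) - 1].
    stage = step * len(sizes) // total_steps
    return sizes[stage]


if __name__ == "__main__":
    # Print the block size at a few points in a 100-step run.
    for s in (0, 24, 25, 50, 75, 99):
        print(s, block_size_schedule(s, 100))
```

A real curriculum might instead switch stages on a validation metric or blend adjacent block sizes; the stepwise version is only the simplest form of the small-to-large transition described above.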


