Almanac
model

SDAR

modelactiveprovisionalsdar-4de1ca69·1 events·first seen 15d ago

Aliases: SDAR

Co-occurring entities

More like this (12)

Recent events (1)

6arXiv · cs.AI·15d ago·source ↗

SimSD: Speculative Decoding Adapted for Diffusion Language Models

SimSD introduces a training-free speculative decoding algorithm for diffusion large language models (dLLMs), which previously could not use standard token-level speculative decoding due to their bidirectional attention and masked language modeling formulation. The method uses a plug-and-play masking strategy that introduces reference tokens from a draft model and a custom attention mask, enabling valid logit computation for drafted tokens in a single forward pass. Evaluated on SDAR-family dLLMs across four benchmarks, SimSD achieves up to 7.46x decoding throughput improvement while maintaining or improving generation quality. The approach is compatible with other acceleration techniques such as KV cache and blockwise decoding.