paper
Knowledge Editing in Masked Diffusion Language Models
paper · active · provisional
knowledge-editing-in-masked-diffusion-language-models-50ec3b05 · 1 event · first seen 13d ago
Aliases: Knowledge Editing in Masked Diffusion Language Models
Co-occurring entities
More like this (12)
Masked Diffusion Models
Self-Augmenting Retrieval for Diffusion Language Models
Beyond Fully Random Masking: Attention-Guided Denoising and Optimization for Diffusion Language Models
Diffusion Language Models
knowledge editing
LESS: Mutual-Stability Sampling for Diffusion Language Models
Denoising Diffusion Probabilistic Models
Tying the Loop -- Tied Expert Layers in Mixture-of-Experts Language Models
masked-token modeling
continuous diffusion language model
A Diffusion Approximation for Temporal-Difference Learning with Linear Features under Markovian Noise
Representation-Conditioned Diffusion Models
Recent events (1)
Knowledge editing via locate-then-edit transferred to masked diffusion language models, revealing multi-token failure mode
A new arXiv paper investigates whether locate-then-edit knowledge editing methods, developed for autoregressive models, transfer to masked diffusion language models (MDMs) such as LLaDA and Dream. The authors find that causal tracing identifies the same early-to-mid-layer MLP location in both paradigms, but that MDMs degrade systematically on multi-token edits, because decoding passes through partially unmasked intermediate states that the edit was never optimized for. A correction targeting these intermediate states substantially restores multi-token editing performance. The work is the first systematic comparison of knowledge editing across autoregressive and diffusion-based language model paradigms.
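To make "partially unmasked intermediate states" concrete, the toy sketch below enumerates the mixed masked/revealed versions of an edit target that a masked diffusion decoder could pass through. It is an illustration only, not the paper's method or its correction: the `[MASK]` string, the function name `intermediate_states`, and the example tokens are all hypothetical, and a real sampler visits only some of these states, in an order set by its unmasking schedule.

```python
from itertools import combinations

MASK = "[MASK]"  # placeholder mask symbol (assumption, not a real tokenizer's token)


def intermediate_states(target_tokens):
    """Enumerate partially unmasked versions of an edit target.

    A state counts as "intermediate" when at least one target token is
    revealed and at least one is still masked.
    """
    n = len(target_tokens)
    states = []
    for k in range(1, n):  # k = number of revealed positions
        for revealed in combinations(range(n), k):
            states.append([
                tok if i in revealed else MASK
                for i, tok in enumerate(target_tokens)
            ])
    return states


# A single-token target has no intermediate states at all.
single = intermediate_states(["Paris"])

# A three-token target has 2**3 - 2 = 6 of them.
multi = intermediate_states(["New", "York", "City"])

for state in multi:
    print(" ".join(state))
```

Under this toy view, a single-token target is only ever fully masked or fully revealed, while a target of n tokens has up to 2^n - 2 mixed states. That is consistent with the reported pattern that degradation appears on multi-token edits.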