SURGE: Approximation-free Training-Free Particle Filter for Diffusion Surrogate
The paper introduces URGE (Unbiased Resampling via Girsanov Estimation), a derivative-free inference-time scaling algorithm for diffusion models that performs path-wise importance reweighting using a Girsanov change of measure. Unlike existing inference-time guidance methods, URGE requires no score, Hessian, or PDE evaluations; instead, it attaches multiplicative weights to simulated trajectories and periodically resamples them. The authors establish a theoretical equivalence between path-wise and particle-wise sequential Monte Carlo (SMC), which guarantees unbiased terminal distributions. Empirically, URGE outperforms existing inference-time guidance baselines on synthetic tests and diffusion-model benchmarks while being simpler to implement.
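The reweight-and-resample idea can be illustrated with a toy sketch. This is not the paper's URGE algorithm. It is a minimal one-dimensional example under stated assumptions: the reference process is plain Brownian motion, the target process differs from it by a constant drift (`drift_shift`), and resampling is triggered by an effective-sample-size threshold. All function and parameter names are hypothetical. Each particle accumulates the discretized Girsanov log-weight `drift_shift * dW - 0.5 * drift_shift**2 * dt`, which uses only the simulated noise increments.

```python
import math
import random


def normalize(log_weights):
    """Turn log-weights into normalized weights (stable via max-shift)."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return [wi / total for wi in w]


def effective_sample_size(weights):
    """ESS of normalized weights: 1 / sum(w_i^2)."""
    return 1.0 / sum(w * w for w in weights)


def girsanov_particle_filter(n_particles=2000, n_steps=50, horizon=1.0,
                             drift_shift=1.0, ess_fraction=0.5, seed=0):
    """Toy path-wise reweighting (all names here are illustrative).

    Reference dynamics: dX = dW.  Target dynamics: dX = drift_shift dt + dW.
    Returns terminal particle positions and their normalized weights.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    xs = [0.0] * n_particles
    log_w = [0.0] * n_particles
    for _ in range(n_steps):
        for i in range(n_particles):
            dw = rng.gauss(0.0, math.sqrt(dt))
            xs[i] += dw  # simulate the reference process only
            # Discretized Girsanov log-density ratio for this step.
            log_w[i] += drift_shift * dw - 0.5 * drift_shift ** 2 * dt
        w = normalize(log_w)
        if effective_sample_size(w) < ess_fraction * n_particles:
            # Multinomial resampling, then reset to uniform weights.
            xs = rng.choices(xs, weights=w, k=n_particles)
            log_w = [0.0] * n_particles
    return xs, normalize(log_w)


if __name__ == "__main__":
    xs, w = girsanov_particle_filter()
    weighted_mean = sum(wi * xi for wi, xi in zip(w, xs))
    # Under the target process the terminal mean is drift_shift * horizon.
    print(f"weighted terminal mean: {weighted_mean:.3f}")
```

In this toy setting the weighted terminal mean should land near `drift_shift * horizon`, even though every particle was simulated under the drift-free reference process.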
Related events (8)
GADD: Gibbs-Accelerated Discrete Diffusion Achieves Polylog Sampling Complexity
This paper introduces Gibbs-Accelerated Discrete Diffusion (GADD), a corrector method for uniform-rate discrete diffusion models that constructs Gibbs posterior likelihoods directly from the concrete score function without additional training. GADD achieves O(polylog(ε⁻¹)) sampling complexity, the first such rate for diffusion-based samplers in this setting. Experiments on synthetic data, zero-shot text sampling, and zero-shot conditional music generation show consistent improvements in sample quality and wall-clock efficiency over Euler and CTMC baselines. The work also introduces a novel induction-based theoretical framework for analyzing predictor-corrector methods in discrete diffusion.
Kolmogorov Regression lifts diffusion policies to Cameron-Martin space for robust long-horizon control
Researchers introduce a backward Kolmogorov equation framework that reformulates diffusion policy training as a deterministic boundary-value PDE problem in Cameron-Martin space, replacing stochastic score matching. The approach uses a precision-weighted Cameron-Martin loss and a Kolmogorov residual as an inference-time failure detector, yielding convergence guarantees tied to kernel effective rank rather than action dimension. Validation on the PushT manipulation benchmark shows a 17% improvement in episode reward and a 67.6% reduction in inter-step drift; a 6-station manufacturing scheduling task shows 28.4% lower RMSE than LSTM baselines and a 96% reduction in deadlock events via Hamilton-Jacobi reachability certification.
Finite-Sample Lens for Understanding Diffusion Posterior Sampler Failures
This paper introduces a finite-sample theoretical framework for analyzing diffusion model posterior samplers used in imaging inverse problems. The authors show that popular likelihood approximations at intermediate timesteps systematically under- or over-estimate posterior spread, leading to failure modes including sensitivity to early stopping, incorrect weighting of posterior modes, and hallucination of prior or likelihood modes. Crucially, they demonstrate these failures can arise from a multimodal prior alone, without requiring nonlinear measurement models or multimodal posteriors. The framework is model-agnostic and can serve as a diagnostic tool for evaluating existing and future posterior samplers.
CARV: Compute-Aware Variance Reduction for Diffusion Teacher Gradient Estimation
CARV is a hierarchical Monte Carlo estimation framework that reduces gradient variance when using frozen pretrained diffusion models as teachers in downstream pipelines such as text-to-3D distillation and data attribution. The approach amortizes expensive upstream computation (rendering, simulation, encoding) over cheap diffusion-noise resamples, augmented by timestep importance sampling and stratified-inverse-CDF construction. In text-to-3D experiments, CARV delivers 2–3× effective compute multipliers; in single-step distillation, it cuts gradient variance by an order of magnitude but does not improve FID, revealing that MC variance is not the bottleneck in that regime.
Ambient Diffusion Policy: imitation learning from suboptimal robot data via noise-dependent co-training
Researchers introduce Ambient Diffusion Policy, a method for robot imitation learning that extracts useful features from suboptimal demonstrations by restricting their contribution to specific diffusion timesteps (high and low noise levels). The approach is grounded in the observation that robot action data follows a spectral power law, inducing global-to-local hierarchy and locality properties in diffusion models. Evaluated across six tasks and four types of suboptimal data, it outperforms co-training baselines by up to 33% when scaled to the Open X-Embodiment dataset.
Exact Posterior Score (EPS): Closed-form posterior sampling for linear inverse problems with diffusion models
A new arXiv preprint derives the exact posterior score in closed form for linear Gaussian inverse problems under general Gaussian interpolants, showing that posterior sampling reduces to a denoising problem at an operator-dependent shifted pivot under anisotropic noise covariance. The authors convert this identity into a training objective called Exact Posterior Score (EPS) that preserves the input/output structure of standard diffusion pretraining, enabling training from scratch or fine-tuning from a pretrained denoiser. EPS is evaluated on five linear inverse problems across FFHQ and ImageNet, outperforming both training-free and training-based baselines while requiring roughly an order of magnitude fewer denoiser evaluations than gradient-based posterior samplers.
SARDI: Self-Augmenting Retrieval for Diffusion Language Models using lookahead tokens
Researchers introduce SARDI, a training-free RAG framework for discrete diffusion language models that repurposes discarded low-confidence tokens during denoising as lookahead signals to guide retrieval before output is finalized. The method is retriever-agnostic and applicable to any reasoning-capable discrete diffusion LM. Evaluated across five multi-hop QA benchmarks, SARDI outperforms training-free diffusion and autoregressive retrieval baselines at up to 8× higher throughput.
Optimizing Stable Diffusion for Intel CPUs with NNCF and Hugging Face Optimum
This Hugging Face blog post details techniques for optimizing Stable Diffusion inference on Intel CPUs using Neural Network Compression Framework (NNCF) and the Optimum library. The workflow covers quantization and other compression methods to reduce latency and memory footprint on CPU hardware. This is relevant to the inference-economics and enterprise-deployment threads as it addresses running diffusion models without dedicated GPU hardware.

