6 · arXiv cs.LG (Machine Learning) · 1mo ago

SURGE: Approximation-free Training-Free Particle Filter for Diffusion Surrogate

The paper introduces URGE (Unbiased Resampling via Girsanov Estimation), a derivative-free inference-time scaling algorithm for diffusion models that performs path-wise importance reweighting using a Girsanov change of measure. Unlike existing inference-time guidance methods, URGE requires no score, Hessian, or PDE evaluations: it attaches multiplicative weights to simulated trajectories and periodically resamples them. The authors establish a theoretical equivalence between path-wise and particle-wise sequential Monte Carlo (SMC), which guarantees an unbiased terminal distribution. Empirically, URGE outperforms existing inference-time guidance baselines on synthetic tests and diffusion-model benchmarks while being simpler to implement.

Frontier Model Releases · Inference Economics · diffusion-based generative models · URGE (Unbiased Resampling via Girsanov Estimation) · Girsanov change of measure · Sequential Monte Carlo (SMC)
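The weight-and-resample loop described in the summary can be illustrated with a generic particle filter. This is a minimal sketch under stated assumptions, not the paper's algorithm: the Ornstein-Uhlenbeck reference dynamics, the function names `systematic_resample` and `run_particle_filter`, and the user-supplied `log_incremental_weight` callback are all hypothetical stand-ins, and the Girsanov-based weights used by URGE are not reproduced here.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Return particle indices drawn by systematic resampling.

    `weights` must be non-negative and sum to 1.
    """
    n = len(weights)
    # One uniform offset shared by n evenly spaced positions in [0, 1).
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    return np.searchsorted(cumulative, positions, side="right")

def run_particle_filter(log_incremental_weight, n_particles=256, n_steps=50,
                        dt=0.02, resample_every=10, seed=0):
    """Simulate particles, accumulate multiplicative weights, resample periodically.

    `log_incremental_weight(x_old, x_new, dt)` is a hypothetical callback that
    returns the per-step log-weight increment for each particle.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)
    log_w = np.zeros(n_particles)
    for step in range(1, n_steps + 1):
        # Toy reference dynamics (assumption): Euler-Maruyama step of an OU process.
        noise = rng.standard_normal(n_particles)
        x_new = x - x * dt + np.sqrt(2.0 * dt) * noise
        # Multiplicative weights are accumulated in log space for stability.
        log_w += log_incremental_weight(x, x_new, dt)
        x = x_new
        if step % resample_every == 0:
            w = np.exp(log_w - log_w.max())
            w /= w.sum()
            x = x[systematic_resample(w, rng)]
            log_w = np.zeros(n_particles)  # weights reset after resampling
    w = np.exp(log_w - log_w.max())
    return x, w / w.sum()

# Usage: tilt the reference process toward positive terminal values.
particles, weights = run_particle_filter(lambda x_old, x_new, dt: dt * x_new)
```

The callback only needs simulated trajectory values, which is what makes a scheme of this shape derivative-free; how the increment is actually derived from a change of measure is left to the paper.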

Related guides (2)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Inference Economics · Topic guide

Inference Economics: The Cost of Running AI in Production

Related events (8)

6 · arXiv · cs.LG · 24d ago

GADD: Gibbs-Accelerated Discrete Diffusion Achieves Polylog Sampling Complexity

This paper introduces Gibbs-Accelerated Discrete Diffusion (GADD), a corrector method for uniform-rate discrete diffusion models that constructs Gibbs posterior likelihoods directly from the concrete score function without additional training. GADD achieves O(polylog(ε⁻¹)) sampling complexity, the first such rate for diffusion-based samplers in this setting. Experiments on synthetic data, zero-shot text sampling, and zero-shot conditional music generation show consistent improvements in sample quality and wall-clock efficiency over Euler and CTMC baselines. The work also introduces a novel induction-based theoretical framework for analyzing predictor-corrector methods in discrete diffusion.

Evaluation and Benchmarking · Inference Economics · Gibbs-Accelerated Discrete Diffusion (GADD) · predictor-corrector methods · discrete diffusion models

5 · arXiv · cs.LG · 3d ago

Kolmogorov Regression lifts diffusion policies to Cameron-Martin space for robust long-horizon control

Researchers introduce a backward Kolmogorov equation framework that reformulates diffusion policy training as a deterministic boundary-value PDE problem in Cameron-Martin space, replacing stochastic score matching. The approach uses a precision-weighted Cameron-Martin loss and a Kolmogorov residual as an inference-time failure detector, yielding convergence guarantees tied to kernel effective rank rather than action dimension. Validation on the PushT manipulation benchmark shows 17% improvement in episode reward and 67.6% reduction in inter-step drift; a 6-station manufacturing scheduling task shows 28.4% lower RMSE than LSTM baselines and 96% reduction in deadlock events via Hamilton-Jacobi reachability certification.

Agent and Tool Ecosystem · Hamilton-Jacobi reachability · Kolmogorov Regression for Robust Diffusion Policies · PushT

6 · arXiv · cs.LG · 22d ago

Finite-Sample Lens for Understanding Diffusion Posterior Sampler Failures

This paper introduces a finite-sample theoretical framework for analyzing diffusion model posterior samplers used in imaging inverse problems. The authors show that popular likelihood approximations at intermediate timesteps systematically under- or over-estimate posterior spread, leading to failure modes including sensitivity to early stopping, incorrect weighting of posterior modes, and hallucination of prior or likelihood modes. Crucially, they demonstrate these failures can arise from a multimodal prior alone, without requiring nonlinear measurement models or multimodal posteriors. The framework is model-agnostic and can serve as a diagnostic tool for evaluating existing and future posterior samplers.

Evaluation and Benchmarking · AI Safety Research · finite-sample posterior sampling framework · likelihood approximation · imaging inverse problems

5 · arXiv · cs.AI · 1mo ago

CARV: Compute-Aware Variance Reduction for Diffusion Teacher Gradient Estimation

CARV is a hierarchical Monte Carlo estimation framework that reduces gradient variance when using frozen pretrained diffusion models as teachers in downstream pipelines such as text-to-3D distillation and data attribution. The approach amortizes expensive upstream computation (rendering, simulation, encoding) over cheap diffusion-noise resamples, augmented by timestep importance sampling and stratified-inverse-CDF construction. In text-to-3D experiments, CARV delivers 2–3× effective compute multipliers; in single-step distillation, it cuts gradient variance by an order of magnitude but does not improve FID, revealing that MC variance is not the bottleneck in that regime.

Inference Economics · Multimodal Progress · Model Distillation · CARV · importance sampling
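Timestep importance sampling with a stratified inverse-CDF draw, two of the ingredients named in the CARV summary, can be sketched generically as follows. This is an illustration under stated assumptions rather than CARV's construction: the function name `stratified_timestep_sample`, the proposal distribution `probs`, and the choice to express importance weights relative to uniform timestep sampling are all hypothetical.

```python
import numpy as np

def stratified_timestep_sample(probs, n_samples, rng):
    """Draw timestep indices from `probs` using stratified inverse-CDF sampling.

    Returns the sampled indices and importance weights that correct for
    sampling from `probs` instead of the uniform distribution over timesteps.
    """
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()
    cdf = np.cumsum(probs)
    cdf[-1] = 1.0  # guard against floating-point round-off
    # One uniform draw per equal-width stratum of [0, 1).
    u = (np.arange(n_samples) + rng.random(n_samples)) / n_samples
    # Inverse CDF: first index whose cumulative mass exceeds u.
    t = np.searchsorted(cdf, u, side="right")
    # Ratio of the uniform target density to the proposal density.
    importance_weights = 1.0 / (len(probs) * probs[t])
    return t, importance_weights

# Usage: oversample later timesteps with a hypothetical proposal.
rng = np.random.default_rng(0)
timesteps, iw = stratified_timestep_sample([0.1, 0.2, 0.3, 0.4], 8, rng)
```

Stratifying the uniform draws spreads samples evenly across the CDF, which typically lowers variance compared with independent draws at the same sample count.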

6 · arXiv · cs.AI · 9d ago

Ambient Diffusion Policy: imitation learning from suboptimal robot data via noise-dependent co-training

Researchers introduce Ambient Diffusion Policy, a method for robot imitation learning that extracts useful features from suboptimal demonstrations by restricting their contribution to specific diffusion timesteps (high and low noise levels). The approach is grounded in the observation that robot action data follows a spectral power law, inducing global-to-local hierarchy and locality properties in diffusion models. Evaluated across six tasks and four types of suboptimal data, it outperforms co-training baselines by up to 33% when scaled to the Open X-Embodiment dataset.

Training Infrastructure · Diffusion Policy · Ambient Diffusion Policy · Open X-Embodiment

5 · arXiv · cs.LG · 4d ago

Exact Posterior Score (EPS): Closed-form posterior sampling for linear inverse problems with diffusion models

A new arXiv preprint derives the exact posterior score in closed form for linear Gaussian inverse problems under general Gaussian interpolants, showing that posterior sampling reduces to a denoising problem at an operator-dependent shifted pivot under anisotropic noise covariance. The authors convert this identity into a training objective called Exact Posterior Score (EPS) that preserves the input/output structure of standard diffusion pretraining, enabling training from scratch or fine-tuning from a pretrained denoiser. EPS is evaluated on five linear inverse problems across FFHQ and ImageNet, outperforming both training-free and training-based baselines while requiring roughly an order of magnitude fewer denoiser evaluations than gradient-based posterior samplers.

Evaluation and Benchmarking · Exact Posterior Score Estimation for Solving Linear Inverse Problems · FFHQ · ImageNet

5 · arXiv · cs.LG · 15d ago

SARDI: Self-Augmenting Retrieval for Diffusion Language Models using lookahead tokens

Researchers introduce SARDI, a training-free RAG framework for discrete diffusion language models that repurposes discarded low-confidence tokens during denoising as lookahead signals to guide retrieval before output is finalized. The method is retriever-agnostic and applicable to any reasoning-capable discrete diffusion LM. Evaluated across five multi-hop QA benchmarks, SARDI outperforms training-free diffusion and autoregressive retrieval baselines at up to 8x higher throughput.

Evaluation and Benchmarking · Agent and Tool Ecosystem · Self-Augmenting Retrieval for Diffusion Language Models · SARDI

4 · Hugging Face Blog · 1mo ago

Optimizing Stable Diffusion for Intel CPUs with NNCF and Hugging Face Optimum

This Hugging Face blog post details techniques for optimizing Stable Diffusion inference on Intel CPUs using Neural Network Compression Framework (NNCF) and the Optimum library. The workflow covers quantization and other compression methods to reduce latency and memory footprint on CPU hardware. This is relevant to the inference-economics and enterprise-deployment threads as it addresses running diffusion models without dedicated GPU hardware.

Inference Economics · Enterprise Deployment Patterns · Stable Diffusion 3 · Hugging Face · Hugging Face Optimum