PTL-Diffusion: Diffusion framework with periodic terminal laws for manifold-aware generation
PTL-Diffusion is a new diffusion modeling framework that replaces the standard single Gaussian terminal distribution with a periodic family of Gaussian terminal laws, embedding phase structure directly into the forward noising dynamics rather than only in the denoising network. The authors derive closed-form forward marginals and reverse posteriors for a periodically forced Ornstein-Uhlenbeck process, enabling standard noise-prediction training. Experiments on torus, cylinder, and face datasets show improvements in manifold-level distributional matching over DDPM baselines. The work is a proof-of-concept motivating structured terminal reference laws as a direction for geometry-aware generative modeling.
Related guides (1)
Related events (8)
The Annotated Diffusion Model
A Hugging Face blog post providing a detailed, annotated walkthrough of diffusion models for image generation, likely covering the mathematical foundations and implementation details of denoising diffusion probabilistic models (DDPMs). The post serves as an educational deep-dive into the architecture and training process of diffusion-based generative models. Published in mid-2022, it coincides with the period of rapid growth in diffusion model adoption.
Finetune Stable Diffusion Models with DDPO via TRL
Hugging Face's TRL library adds support for DDPO (Denoising Diffusion Policy Optimization), enabling reinforcement learning-based finetuning of Stable Diffusion models. This extends TRL's RLHF tooling beyond language models to image generation, allowing reward-driven optimization of diffusion models. The post demonstrates practical usage of the new DDPO trainer within the TRL ecosystem.
Kolmogorov Regression lifts diffusion policies to Cameron-Martin space for robust long-horizon control
Researchers introduce a backward Kolmogorov equation framework that reformulates diffusion policy training as a deterministic boundary-value PDE problem in Cameron-Martin space, replacing stochastic score matching. The approach uses a precision-weighted Cameron-Martin loss and a Kolmogorov residual as an inference-time failure detector, yielding convergence guarantees tied to kernel effective rank rather than action dimension. Validation on the PushT manipulation benchmark shows 17% improvement in episode reward and 67.6% reduction in inter-step drift; a 6-station manufacturing scheduling task shows 28.4% lower RMSE than LSTM baselines and 96% reduction in deadlock events via Hamilton-Jacobi reachability certification.
RePlaid: Continuous Diffusion Language Models Scale Competitively with Discrete Diffusion
This paper revisits continuous diffusion language models (DLMs) by introducing RePlaid, an updated version of Plaid that aligns its architecture with modern discrete DLMs. RePlaid establishes the first scaling law for continuous DLMs competitive with discrete approaches, achieving a compute gap of only 20× versus autoregressive models and a state-of-the-art perplexity bound of 22.1 on OpenWebText among continuous DLMs. The authors provide theoretical analysis showing that likelihood-based training naturally yields linear cross-entropy over time and creates structured embedding geometries, explaining the performance gains.
Diffusion-Proof: First framework applying diffusion LLMs to formal theorem proving
Researchers introduce Diffusion-Proof, the first framework to train and apply diffusion language models (dLLMs) for formal theorem proving, addressing limitations of autoregressive models in long-range coherence. The framework includes dLLM-Prover-7B for whole-proof generation and dLLM-Corrector-7B for local proof correction via bidirectional infilling. Diffusion-Proof achieves absolute improvements of 1.61% on ProofNet-Test and 6.14% on MiniF2F-Test over an AR baseline, and solves one IMO problem that DeepSeek-Prover-V2-7B could not. The result suggests dLLMs may have structural advantages over AR models for tasks requiring long-range logical coherence.
Finite-Sample Lens for Understanding Diffusion Posterior Sampler Failures
This paper introduces a finite-sample theoretical framework for analyzing diffusion model posterior samplers used in imaging inverse problems. The authors show that popular likelihood approximations at intermediate timesteps systematically under- or over-estimate posterior spread, leading to failure modes including sensitivity to early stopping, incorrect weighting of posterior modes, and hallucination of prior or likelihood modes. Crucially, they demonstrate these failures can arise from a multimodal prior alone, without requiring nonlinear measurement models or multimodal posteriors. The framework is model-agnostic and can serve as a diagnostic tool for evaluating existing and future posterior samplers.
PLAID: Repurposing Protein Folding Models for Multimodal Protein Generation with Latent Diffusion
PLAID is a generative model that simultaneously produces protein 1D sequences and 3D all-atom structures by learning a diffusion model over the latent space of ESMFold, a protein folding model. It requires only sequence data for training—leveraging databases 2-4 orders of magnitude larger than structure databases—and decodes structure at inference via frozen folding model weights. The approach supports compositional prompting by function and organism, addressing practical drug-design constraints like humanization and solubility. A companion compression model, CHEAP, addresses the high-dimensionality of transformer latent spaces to make the diffusion training tractable.
Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models
NVIDIA's Nemotron-Labs introduces diffusion-based language models targeting extremely fast text generation, published as a Hugging Face blog post. The piece covers the approach of using diffusion processes for language modeling as an alternative to autoregressive generation, with a focus on inference speed. This represents a continued push by NVIDIA's research arm into non-autoregressive generation paradigms.
