6 · Berkeley AI Research (BAIR) Blog · 1mo ago

PLAID: Repurposing Protein Folding Models for Multimodal Protein Generation with Latent Diffusion

PLAID is a generative model that simultaneously produces protein 1D sequences and 3D all-atom structures by learning a diffusion model over the latent space of ESMFold, a protein folding model. It requires only sequence data for training, which lets it draw on databases 2-4 orders of magnitude larger than structure databases, and it decodes structure at inference using the frozen folding-model weights. The approach supports compositional prompting by function and organism, which addresses practical drug-design constraints such as humanization and solubility. A companion compression model, CHEAP, reduces the high dimensionality of transformer latent spaces so that diffusion training is tractable.
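The training recipe in this summary (frozen encoder, noised latent, noise-prediction loss, sequence-only data) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `frozen_encoder`, `alpha_bar`, the 8-dimensional latent, and the cosine-style schedule are hypothetical stand-ins, not ESMFold, CHEAP, or the authors' code.

```python
import math
import random

random.seed(0)

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
LATENT_DIM = 8  # hypothetical; real folding-model latents are far wider


def frozen_encoder(sequence):
    """Stand-in for a frozen folding-model trunk: one fixed vector per residue."""
    latents = []
    for residue in sequence:
        # Seeded per residue type, so the mapping is deterministic and never trained.
        rng = random.Random(AMINO_ACIDS.index(residue))
        latents.append([rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)])
    return latents


def alpha_bar(t, num_steps):
    """Cumulative signal level in (0, 1]; an assumed cosine-style schedule."""
    return math.cos(0.5 * math.pi * t / (num_steps + 1)) ** 2


def noise_latents(latents, t, num_steps):
    """Forward diffusion: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps."""
    a = alpha_bar(t, num_steps)
    noise = [[random.gauss(0.0, 1.0) for _ in row] for row in latents]
    noised = [
        [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(row, eps_row)]
        for row, eps_row in zip(latents, noise)
    ]
    return noised, noise


def noise_prediction_loss(predicted, target):
    """Mean squared error between predicted and true noise."""
    squared = [(p - q) ** 2 for pr, tr in zip(predicted, target) for p, q in zip(pr, tr)]
    return sum(squared) / len(squared)


sequence = "MKTAYIAKQR"  # the training example is a sequence only, no structure
x0 = frozen_encoder(sequence)
xt, eps = noise_latents(x0, t=50, num_steps=100)
zero_guess = [[0.0] * LATENT_DIM for _ in sequence]  # placeholder for an untrained denoiser
loss = noise_prediction_loss(zero_guess, eps)
```

In a real system the placeholder `zero_guess` would be the output of a trained denoising network, and a frozen structure decoder would map sampled latents back to sequence and coordinates; neither is modeled here.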

Frontier Model Releases · Multimodal Progress · CHEAP · Berkeley AI Research (BAIR) · AlphaFold2 · ESMFold · AlphaFold3 · PLAID · latent diffusion

Related guides (2)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Multimodal Progress · Topic guide

Multimodal Progress: How AI Learned to See, Hear, and Act

Related events (8)

5 · arXiv · cs.LG · 1mo ago

EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CDR Design via Protein Language Model Adaptation

EvoStruct addresses vocabulary collapse in GNN-based antibody CDR design by combining a frozen protein language model with an E(3)-equivariant GNN through a cross-attention adapter. The method introduces progressive PLM unfreezing and R-Drop consistency regularization to recover functionally important amino acid diversity. On CHIMERA-Bench, EvoStruct improves sequence recovery by 16%, reduces perplexity by 43%, and achieves 2.3x greater amino acid diversity compared to the best GNN baselines.
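The R-Drop consistency regularization mentioned above can be sketched for a single residue position. This follows the general R-Drop recipe (two stochastic forward passes, summed cross-entropy, plus a weighted symmetric KL term); how EvoStruct weights or applies it is not stated in this summary, so `kl_weight` and the function names are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def r_drop_loss(logits_a, logits_b, target_index, kl_weight=1.0):
    """Loss for two dropout-perturbed passes over the same input.

    logits_a and logits_b are the outputs of the two passes; target_index is
    the index of the true class (for example, the native amino acid).
    """
    p, q = softmax(logits_a), softmax(logits_b)
    # Negative log-likelihood of the target under both passes.
    cross_entropy = -(math.log(p[target_index]) + math.log(q[target_index]))
    # Symmetric KL penalizes disagreement between the two passes.
    consistency = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return cross_entropy + kl_weight * consistency
```

When both passes agree exactly, the consistency term vanishes and the loss reduces to the summed cross-entropy; any disagreement between the passes adds a positive penalty.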

Evaluation and Benchmarking · Multimodal Progress · EvoStruct · protein language models · E(3)-equivariant GNN · +4 more

6 · Latent Space · 24d ago

ESMFold2: The Bitter Lesson is Coming for Proteins — Alex Rives, BioHub

A Latent Space interview/commentary piece featuring Alex Rives of BioHub discussing ESMFold2 and the application of the 'bitter lesson' (scale and general methods beating hand-crafted inductive bias) to protein structure prediction and biology. The piece covers the tension between dataset scale versus domain-specific inductive bias in biological ML, and touches on world models and programmable biology. This represents a significant perspective from a leading researcher in protein language models on the next generation of biological foundation models.

Frontier Model Releases · Open Weights Progress · Chan Zuckerberg Biohub · ESMFold · Alex Rives · +3 more

4 · Hugging Face Blog · 1mo ago

Deep Learning with Proteins

A Hugging Face blog post on applying deep learning techniques to protein science, likely covering protein language models, structure prediction, and related tooling. Published in late 2022, it sits in the context of AlphaFold2's impact and the emerging ecosystem of protein ML models. The post likely surveys the models, datasets, and frameworks available for computational biology on the Hugging Face platform.

Open Weights Progress · Agent and Tool Ecosystem · protein language models · AlphaFold2 · ESM (Evolutionary Scale Modeling) · +1 more

6 · arXiv · cs.CL · 1mo ago

RePlaid: Continuous Diffusion Language Models Scale Competitively with Discrete Diffusion

This paper revisits continuous diffusion language models (DLMs) by introducing RePlaid, an updated version of Plaid that aligns its architecture with modern discrete DLMs. RePlaid establishes the first scaling law for continuous DLMs competitive with discrete approaches, achieving a compute gap of only 20× versus autoregressive models and a state-of-the-art perplexity bound of 22.1 on OpenWebText among continuous DLMs. The authors provide theoretical analysis showing that likelihood-based training naturally yields linear cross-entropy over time and creates structured embedding geometries, explaining the performance gains.

Frontier Model Releases · Evaluation and Benchmarking · RePlaid · PLAID · OpenWebText · +4 more

4 · arXiv · cs.AI · 11d ago

PTL-Diffusion: Diffusion framework with periodic terminal laws for manifold-aware generation

PTL-Diffusion is a new diffusion modeling framework that replaces the standard single Gaussian terminal distribution with a periodic family of Gaussian terminal laws, embedding phase structure directly into the forward noising dynamics rather than only in the denoising network. The authors derive closed-form forward marginals and reverse posteriors for a periodically forced Ornstein-Uhlenbeck process, enabling standard noise-prediction training. Experiments on torus, cylinder, and face datasets show improvements in manifold-level distributional matching over DDPM baselines. The work is a proof-of-concept motivating structured terminal reference laws as a direction for geometry-aware generative modeling.

Evaluation and Benchmarking · Denoising Diffusion Probabilistic Models · Olivetti Faces Dataset · PTL-Diffusion

4 · arXiv · cs.AI · 11d ago

Pose-ICL: 3D-aware in-context learning for pose-controllable image generation of custom subjects

Researchers introduce Pose-ICL, a tuning-free framework for generating images of user-specified subjects with accurate pose control. The method uses Surface-Anchored Position Embedding (SAPE) to give 2D diffusion models explicit 3D awareness by anchoring image tokens to volumetric bounding box surface coordinates. Evaluations on 3D assets and real-world subjects show improvements over existing methods in both pose accuracy and identity consistency. The framework is designed for compatibility with existing Diffusion Transformer (DiT) models.

Multimodal Progress · Surface-Anchored Position Embedding · Pose-ICL · Pose-ICL: 3D-Aware In-Context Learning for Pose-Controllable Subject Customization

4 · Hugging Face Blog · 1mo ago

The Annotated Diffusion Model

A Hugging Face blog post providing a detailed, annotated walkthrough of diffusion models for image generation, likely covering the mathematical foundations and implementation details of denoising diffusion probabilistic models (DDPMs). The post serves as an educational deep-dive into the architecture and training process of diffusion-based generative models. Published in mid-2022, it coincides with the period of rapid growth in diffusion model adoption.
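For readers who want the core DDPM arithmetic in one place, here is a minimal sketch of the closed-form forward noising step and its inversion. It is not taken from the blog post; the linear beta schedule with endpoints 1e-4 and 0.02 and 1000 steps is a commonly used default assumed here, and the function names are illustrative.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced per-step noise variances (an assumed, common default)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def q_sample(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0: sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


def predict_x0(xt, eps_hat, t, alpha_bars):
    """Invert the forward step given a noise estimate eps_hat."""
    a = alpha_bars[t]
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a) for x, e in zip(xt, eps_hat)]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_betas(1000))
x0 = [0.5, -1.0, 2.0]
xt, eps = q_sample(x0, t=500, alpha_bars=alpha_bars, rng=rng)
recovered = predict_x0(xt, eps, t=500, alpha_bars=alpha_bars)  # oracle noise
```

With the true noise supplied, `predict_x0` recovers the clean input up to floating-point error, which is why training a network to predict the noise is enough to drive denoising.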

Multimodal Progress · DDPM · Denoising Diffusion Probabilistic Models · Hugging Face

5 · Hugging Face Blog · 1mo ago

Finetune Stable Diffusion Models with DDPO via TRL

Hugging Face's TRL library adds support for DDPO (Denoising Diffusion Policy Optimization), enabling reinforcement learning-based finetuning of Stable Diffusion models. This extends TRL's RLHF tooling beyond language models to image generation, allowing reward-driven optimization of diffusion models. The post demonstrates practical usage of the new DDPO trainer within the TRL ecosystem.

Agent and Tool Ecosystem · Alignment and RLHF · DDPO · Denoising Diffusion Policy Optimization · Stable Diffusion 3 · +3 more