6 · Latent Space (swyx) · 24d ago

ESMFold2: The Bitter Lesson is Coming for Proteins — Alex Rives, BioHub

A Latent Space interview and commentary piece featuring Alex Rives of BioHub, who discusses ESMFold2 and how the 'bitter lesson' (scale and general methods beating hand-crafted inductive bias) applies to protein structure prediction and biology. The piece covers the tension between dataset scale and domain-specific inductive bias in biological ML, and touches on world models and programmable biology. It offers the perspective of a leading protein language model researcher on the next generation of biological foundation models.

Frontier Model Releases · Open Weights Progress · Chan Zuckerberg Biohub · ESMFold · Alex Rives · Latent Space · The Bitter Lesson · ESMFold2

Related guides (2)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Open Weights Progress · Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier

Related events (8)

4 · Hugging Face Blog · 1mo ago

Deep Learning with Proteins

A Hugging Face blog post on applying deep learning to protein science, likely covering protein language models, structure prediction, and related tooling. Published in late 2022, it sits in the context of AlphaFold2's impact and the emerging ecosystem of protein ML models, and likely surveys the models, datasets, and frameworks available for computational biology on the Hugging Face platform.

Open Weights Progress · Agent and Tool Ecosystem · protein language models · AlphaFold2 · ESM (Evolutionary Scale Modeling) · +1 more

6 · Berkeley AI Research (BAIR) Blog · 1mo ago

PLAID: Repurposing Protein Folding Models for Multimodal Protein Generation with Latent Diffusion

PLAID is a generative model that simultaneously produces protein 1D sequences and 3D all-atom structures by learning a diffusion model over the latent space of ESMFold, a protein folding model. It requires only sequence data for training, which lets it draw on databases 2 to 4 orders of magnitude larger than structure databases, and it decodes structure at inference time using frozen folding model weights. The approach supports compositional prompting by function and organism, addressing practical drug-design constraints such as humanization and solubility. A companion compression model, CHEAP, addresses the high dimensionality of transformer latent spaces to make diffusion training tractable.

Frontier Model Releases · Multimodal Progress · CHEAP · Berkeley AI Research (BAIR) · AlphaFold2 · +4 more

5 · arXiv · cs.LG · 1mo ago

EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CDR Design via Protein Language Model Adaptation

EvoStruct addresses vocabulary collapse in GNN-based antibody CDR design by combining a frozen protein language model with an E(3)-equivariant GNN through a cross-attention adapter. The method introduces progressive PLM unfreezing and R-Drop consistency regularization to recover functionally important amino acid diversity. On CHIMERA-Bench, EvoStruct improves sequence recovery by 16%, reduces perplexity by 43%, and achieves 2.3x greater amino acid diversity compared to the best GNN baselines.

Evaluation and Benchmarking · Multimodal Progress · EvoStruct · protein language models · E(3)-equivariant GNN · +4 more

6 · Google DeepMind Blog · 1mo ago

AlphaFold: Five Years of Impact

DeepMind published a retrospective on AlphaFold's five-year impact on biological research and scientific discovery. The post surveys how the protein structure prediction system has accelerated science globally since its initial release. As a tier-1 source anniversary piece, it likely highlights cumulative usage statistics, downstream research enabled, and future directions.

Frontier Model Releases · Evaluation and Benchmarking · Google DeepMind · AlphaFold

6 · Google DeepMind Blog · 1mo ago

AlphaFold Reveals Structure of Key Heart Disease Protein

DeepMind has used AlphaFold to determine the structure of a key protein implicated in heart disease. The announcement highlights a new scientific application of AlphaFold's protein structure prediction capabilities to cardiovascular research. This represents a continued expansion of AlphaFold's impact on biomedical discovery beyond its initial structural biology applications.

Frontier Model Releases · DeepMind · AlphaFold

6 · OpenAI Blog · 1mo ago

OpenAI and Retro Biosciences Deploy GPT-4b micro for Protein Engineering in Longevity Research

OpenAI collaborated with Retro Biosciences to apply a specialized model called GPT-4b micro to protein engineering tasks relevant to stem cell therapy and longevity research. The work represents a concrete application of a fine-tuned or specialized variant of GPT-4 to life sciences, specifically improving protein design effectiveness. This is a notable example of frontier AI models being deployed in wet-lab-adjacent scientific research contexts.

Frontier Model Releases · Enterprise Deployment Patterns · GPT-4b micro · protein engineering · stem cell therapy · +2 more

5 · arXiv · cs.CL · 11d ago

BODHI: Contrastive embedding training for causal discovery in Large Behavioural Models

Researchers identify a critical failure mode in biomedical language model embeddings: off-the-shelf encoders (BioBERT, PubMedBERT, BioM-ELECTRA) assign high cosine similarity (0.76–0.92) to causally unrelated cross-domain pairs, achieving 0% accuracy on cross-domain discrimination. The paper introduces BODHI, a contrastive training approach using hard negatives mined from a biomedical knowledge graph, which improves within-vs-across-domain separation from 1.05x to 2.30x and raises the discrimination gap by 0.392. The work targets Large Behavioural Models (LBMs), foundation models that reason over personal life graphs, where false embedding proximity directly produces false causal edges. Additional contributions include an OpenVINO inference optimization achieving a 133x latency reduction (1367ms to 10ms) on Intel AMX hardware, plus a counterintuitive finding that FP16 outperforms INT8 on this silicon.

Evaluation and Benchmarking · Inference Economics · BIOSSES · BioBERT · PubMedBERT · +4 more

4 · Hugging Face Blog · 1mo ago

SAIR: Accelerating Pharma R&D with AI-Powered Structural Intelligence

SandboxAQ has published a blog post on Hugging Face describing SAIR (Structural AI for Research), a system applying AI to structural biology data for drug discovery acceleration. The post outlines how structural intelligence—likely leveraging protein structure prediction or molecular modeling—is being applied to pharmaceutical R&D pipelines. This represents an enterprise deployment of AI in the life sciences domain, combining structural biology with machine learning.

Enterprise Deployment Patterns · Hugging Face · SAIR · SandboxAQ