Almanac
7 · The Batch (DeepLearning.AI) · 2h ago

ESMFold2 matches AlphaFold 3 performance with open, LLM-inspired architecture for molecular structure prediction

Biohub and EvolutionaryScale released ESMFold2, a 6.2-billion-parameter open-weights model that predicts the 3D shapes of proteins, DNA, RNA, and small molecules by treating molecular sequences as language. Unlike AlphaFold 3, ESMFold2 can operate without multiple sequence alignments (MSAs): it relies on a transformer-based embedding model (ESMC) trained on 2.8 billion sequences. It outperforms Chai-1 in MSA-free settings and matches AlphaFold 3 when MSAs are provided. The model weights are freely available on Hugging Face and via API through Biohub, making frontier-level structural biology accessible without proprietary infrastructure. The release matters for drug discovery involving novel or synthetic molecules, where MSA databases may be sparse.

Related events (8)

6 · Latent Space · 1mo ago

ESMFold2: The Bitter Lesson is Coming for Proteins — Alex Rives, BioHub

A Latent Space interview and commentary piece in which Alex Rives of Biohub discusses ESMFold2 and how the 'bitter lesson' (scale and general methods beating hand-crafted inductive bias) applies to protein structure prediction and biology. The piece covers the tension between dataset scale and domain-specific inductive bias in biological ML, and touches on world models and programmable biology. It offers the perspective of a leading protein language model researcher on the next generation of biological foundation models.

6 · Berkeley AI Research (BAIR) Blog · 1mo ago

PLAID: Repurposing Protein Folding Models for Multimodal Protein Generation with Latent Diffusion

PLAID is a generative model that simultaneously produces protein 1D sequences and 3D all-atom structures by learning a diffusion model over the latent space of ESMFold, a protein folding model. It requires only sequence data for training, drawing on databases 2-4 orders of magnitude larger than structure databases, and decodes structure at inference time using frozen folding model weights. The approach supports compositional prompting by function and organism, addressing practical drug-design constraints like humanization and solubility. A companion compression model, CHEAP, reduces the high dimensionality of transformer latent spaces to make diffusion training tractable.

5 · arXiv · cs.LG · 1mo ago

EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CDR Design via Protein Language Model Adaptation

EvoStruct addresses vocabulary collapse in GNN-based antibody CDR design by combining a frozen protein language model with an E(3)-equivariant GNN through a cross-attention adapter. The method introduces progressive PLM unfreezing and R-Drop consistency regularization to recover functionally important amino acid diversity. On CHIMERA-Bench, EvoStruct improves sequence recovery by 16%, reduces perplexity by 43%, and achieves 2.3x greater amino acid diversity compared to the best GNN baselines.
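The R-Drop consistency regularization mentioned above can be sketched in a few lines. This is a minimal illustration of the general R-Drop objective (two stochastic forward passes on the same input, a negative log-likelihood term for each, plus a symmetric KL penalty between the two predicted distributions). It is not EvoStruct's actual loss; the function names, the `alpha` weight, and the toy logits are assumptions made for illustration only.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]


def kl_divergence(log_p, log_q):
    """KL(p || q) computed from log-probabilities."""
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def r_drop_loss(logits_a, logits_b, target, alpha=1.0):
    """R-Drop style loss for one example (hypothetical helper).

    logits_a and logits_b are the outputs of two forward passes of the
    same model on the same input, each with a different dropout mask.
    target is the index of the correct class (e.g. an amino acid).
    alpha weights the consistency term; its value here is an assumption.
    """
    log_p = log_softmax(logits_a)
    log_q = log_softmax(logits_b)
    # Negative log-likelihood summed over both passes.
    nll = -(log_p[target] + log_q[target])
    # Symmetric KL pushes the two predicted distributions to agree.
    consistency = 0.5 * (kl_divergence(log_p, log_q) + kl_divergence(log_q, log_p))
    return nll + alpha * consistency


if __name__ == "__main__":
    # Toy logits over three classes from two dropout passes.
    print(r_drop_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.8], target=0))
```

When the two passes produce identical logits the consistency term vanishes and only the likelihood term remains, so the penalty only acts where dropout makes the predictions disagree.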

4 · Hugging Face Blog · 1mo ago

Deep Learning with Proteins

A Hugging Face blog post on applying deep learning to protein science, likely covering protein language models, structure prediction, and related tooling. Published in late 2022, it sits in the context of AlphaFold2's impact and the emerging ecosystem of protein ML models, and likely surveys the models, datasets, and frameworks available for computational biology on the Hugging Face platform.

6 · Google DeepMind Blog · 1mo ago

AlphaFold Reveals Structure of Key Heart Disease Protein

DeepMind has used AlphaFold to determine the structure of a key protein implicated in heart disease. The announcement highlights a new scientific application of AlphaFold's protein structure prediction capabilities to cardiovascular research. This represents a continued expansion of AlphaFold's impact on biomedical discovery beyond its initial structural biology applications.

6 · Google DeepMind Blog · 1mo ago

AlphaFold: Five Years of Impact

DeepMind published a retrospective on AlphaFold's five-year impact on biological research and scientific discovery. The post surveys how the protein structure prediction system has accelerated science globally since its initial release. As a tier-1 source anniversary piece, it likely highlights cumulative usage statistics, downstream research enabled, and future directions.

7 · Google DeepMind Blog · 1mo ago

AlphaGenome: DeepMind's Unified DNA Sequence Model for Regulatory Variant-Effect Prediction

DeepMind has introduced AlphaGenome, a new unified DNA sequence model designed to advance regulatory variant-effect prediction and improve understanding of genome function. The model is now available via API, making it accessible to researchers. AlphaGenome represents a significant step in applying large-scale AI to genomics, particularly for interpreting non-coding regulatory regions of the genome.

7 · The Batch · 28d ago

Google's AlphaGenome Interprets Non-Coding DNA That Regulates Genetic Expression

Google has released AlphaGenome, an open-weights model that interprets the ~98% of the human and mouse genomes that does not code for proteins, including the regions that regulate gene expression. The model takes up to 1 million DNA base pairs as input and outputs roughly 6,000 human and 1,000 mouse gene properties, using a CNN-transformer-CNN architecture trained via ensemble distillation from 64 pretrained models. Across 50 evaluations, AlphaGenome matched or exceeded prior models in 47 cases, and it correctly predicted expression changes associated with T-cell acute lymphoblastic leukemia. Weights, API, and inference code are freely available for noncommercial use.
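To make the input size concrete, here is a minimal sketch of turning a DNA string into a one-hot matrix, a common way to feed sequence to convolutional and transformer models. The summary does not describe AlphaGenome's actual preprocessing, so the encoding scheme, the `one_hot_encode` helper, the base ordering, and the handling of unknown bases are all assumptions; only the 1 million base-pair limit comes from the summary above.

```python
BASES = "ACGT"            # assumed channel order, for illustration only
MAX_INPUT_BP = 1_000_000  # input limit stated in the summary


def one_hot_encode(sequence, max_len=MAX_INPUT_BP):
    """Encode a DNA string as a list of 4-element one-hot rows.

    Hypothetical helper: unknown bases such as N become all-zero rows,
    and sequences longer than max_len are rejected rather than cropped.
    """
    seq = sequence.upper()
    if len(seq) > max_len:
        raise ValueError(f"sequence has {len(seq)} bp, limit is {max_len}")
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in seq:
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    for row in one_hot_encode("ACGTN"):
        print(row)
```

At the full 1 million base-pair length this representation is a 1,000,000 x 4 matrix per sequence, which is one reason long-context genomic models tend to downsample with convolutions before applying attention.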