Tuesday, February 25, 2025

Microsoft Researchers Introduce BioEmu-1: A Deep Learning Model that can Generate Thousands of Protein Structures Per Hour on a Single GPU

Good news! Generating protein structures becomes child's play! 😊

"... Microsoft Researchers have introduced BioEmu-1, a deep learning model designed to generate thousands of protein structures per hour. Rather than relying solely on traditional MD [molecular dynamics] simulations, BioEmu-1 employs a diffusion-based generative framework to emulate the equilibrium ensemble of protein conformations. The model combines data from static structural databases, extensive MD simulations, and experimental measurements of protein stability. This approach allows BioEmu-1 to produce a diverse set of protein structures, capturing both large-scale rearrangements and subtle conformational shifts. Importantly, the model generates these structures with a computational efficiency that makes it practical for everyday use, offering a new tool to study protein dynamics without overwhelming computational demands. ..."

From the abstract:
"Following the sequence and structure revolutions, predicting the dynamical mechanisms of proteins that implement biological function remains an outstanding scientific challenge. Several experimental techniques and molecular dynamics (MD) simulations can, in principle, determine conformational states, binding configurations and their probabilities, but suffer from low throughput.
Here we develop a Biomolecular Emulator (BioEmu), a generative deep learning system that can generate thousands of statistically independent samples from the protein structure ensemble per hour on a single graphical processing unit.
By leveraging novel training methods and vast data of protein structures, over 200 milliseconds of MD simulation, and experimental protein stabilities, BioEmu’s protein ensembles represent equilibrium in a range of challenging and practically relevant metrics.
Qualitatively, BioEmu samples many functionally relevant conformational changes, ranging from formation of cryptic pockets, over unfolding of specific protein regions, to large-scale domain rearrangements.
Quantitatively, BioEmu samples protein conformations with relative free energy errors around 1 kcal/mol, as validated against millisecond-timescale MD simulation and experimentally-measured protein stabilities.
By simultaneously emulating structural ensembles and thermodynamic properties, BioEmu reveals mechanistic insights, such as the causes for fold destabilization of mutants, and can efficiently provide experimentally-testable hypotheses."
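
To put the "around 1 kcal/mol" figure in context, here is a minimal sketch of how a relative free energy can be estimated from an ensemble of statistically independent samples, using the standard relation ΔG = -RT ln(p_folded / p_unfolded). This is not BioEmu's API: the function name, the folded/unfolded labels, and the sample counts are hypothetical, and it assumes each sampled structure has already been assigned to one of two states by some criterion of your choice.

```python
import math

# Molar gas constant in kcal/(mol*K)
GAS_CONSTANT_KCAL = 0.0019872041


def folding_free_energy(n_folded, n_unfolded, temperature_k=300.0):
    """Estimate the folding free energy (kcal/mol) from sample counts.

    Hypothetical helper: assumes the samples are independent draws from
    the equilibrium ensemble, so that count ratios approximate the
    population ratio p_folded / p_unfolded.
    """
    if n_folded <= 0 or n_unfolded <= 0:
        raise ValueError("both states need at least one sample")
    ratio = n_folded / n_unfolded
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(ratio)


# Illustrative counts only (not data from the paper):
# 900 of 1000 samples classified as folded, 100 as unfolded.
dg = folding_free_energy(900, 100)
print(f"Estimated folding free energy: {dg:.2f} kcal/mol")
```

Under this relation, an error of 1 kcal/mol at 300 K corresponds to roughly a factor of five in the population ratio between the two states, which gives a feel for what the quoted accuracy means in terms of sampled probabilities.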

Microsoft Researchers Introduces BioEmu-1: A Deep Learning Model that can Generate Thousands of Protein Structures Per Hour on a Single GPU - MarkTechPost




