Saturday, May 24, 2025

With AI, researchers predict the location of virtually any protein within any human cell

Amazing stuff!

"A protein located in the wrong part of a cell can contribute to several diseases, such as Alzheimer’s, cystic fibrosis, and cancer. But there are about 70,000 different proteins and protein variants in a single human cell, and since scientists can typically only test for a handful in one experiment, it is extremely costly and time-consuming to identify proteins’ locations manually.

A new generation of computational techniques seeks to streamline the process using machine-learning models that often leverage datasets containing thousands of proteins and their locations, measured across multiple cell lines. One of the largest such datasets is the Human Protein Atlas, which catalogs the subcellular behavior of over 13,000 proteins in more than 40 cell lines. But as enormous as it is, the Human Protein Atlas has only explored about 0.25 percent of all possible pairings of all proteins and cell lines within the database.

Now ... Their method can predict the location of any protein in any human cell line, even when both protein and cell have never been tested before. ..."

From the abstract:
"The subcellular localization of a protein is important for its function, and its mislocalization is linked to numerous diseases. Existing datasets capture limited pairs of proteins and cell lines, and existing protein localization prediction models either miss cell-type specificity or cannot generalize to unseen proteins.
Here we present a method for Prediction of Unseen Proteins’ Subcellular localization (PUPS). PUPS combines a protein language model and an image inpainting model to utilize both protein sequence and cellular images.
We demonstrate that the protein sequence input enables generalization to unseen proteins, and the cellular image input captures single-cell variability, enabling cell-type-specific predictions. Experimental validation shows that PUPS can predict protein localization in newly performed experiments outside the Human Protein Atlas used for training. Collectively, PUPS provides a framework for predicting differential protein localization across cell lines and single cells within a cell line, including changes in protein localization driven by mutations."

With AI, researchers predict the location of virtually any protein within a human cell (MIT News, Massachusetts Institute of Technology): "Trained with a joint understanding of protein and cell behavior, the model could help with diagnosing disease and developing new drugs."

Fig. 1 Our model enables the prediction of subcellular localization of unseen proteins in unseen cell lines.