Common Sense: proteomics

Tuesday, April 28, 2026

Research improves molecular probe of protein binding sites for drug discovery

Good news!

"... The invention works with an existing lab method called photo-crosslinking. Leaving behind a clean, uniform chemical signature, the technology allowed the team to directly compare how different molecules compete for the same binding site on a protein, all in a single experiment. Because most small-molecule drugs act by binding to specific protein targets, finding precisely where these molecules bind is a major benefit for drug discovery.

As proof of concept, the team analyzed the activity of dasatinib and asciminib, two cancer drugs that target different sites on the same protein, a type of enzyme called a kinase that, when mutated, causes leukemia.

The results coincided with known interactions for each drug and revealed previously unknown interactions. The newer drug, asciminib, which has a more favorable safety profile and fewer side effects, showed fewer off-target kinase interactions. ...

new technology, called SEE-CITE, is giving the molecule being studied the ability to detach from its payload so that each tagged molecule leaves behind a consistent calling card. This makes possible quantitative measurements and comparisons of how strongly different molecules engage a given binding site. The team also upgraded a widely used software tool to better interpret the complex data this method generates. ..."

From the abstract:

"For chemical probe and drug discovery campaigns, the pairing of mass spectrometry-based chemoproteomics with photoaffinity labelling has emerged as a favoured approach for target discovery and mode of action assignment. However, photocrosslinked peptide-compound adducts raise analytic challenges for quantitative binding site discovery.

Here, to address these challenges, we establish the Silyl Ether Enables Chemoproteomic Interaction and Target Engagement (SEE-CITE) method. SEE-CITE incorporates a fully functionalized chemically cleavable photocrosslinking handle that enables precise site-of-labelling identification and head-to-head comparisons of relative binding site engagement by chemically diverse compounds.

To ensure high-confidence localization of labelled residues, we extended the MSFragger algorithm of the FragPipe computational platform to report localization scores customized for photoaffinity labelling and SEE-CITE data.

When applied to scout fragments and analogues of select FDA-approved kinase inhibitors, SEE-CITE delineates known drug binding sites and uncovers small-molecule binding sites that affect the protein activity of RTN4 and COX5A."

UCLA research improves molecular probe for drug discovery | UCLA

Small-molecule binding-site discovery using silyl ether-enabled chemoproteomics (open access)

Fig. 1: Establishing the SEE-CITE interaction site mapping platform using scout SEE-CITE probes.

Fig. 3: SEE-CITE mapping of ABL1 binding sites.

Monday, April 20, 2026

Discovery could lead to new therapies for blood disorders

Good news!

"... investigators have revealed the detailed workings of a cell membrane protein that has essential roles in all animals. The discovery could lead to new therapeutic strategies for blood coagulation disorders, cancers and other conditions in which the protein, called a TMEM16 scramblase, works abnormally.

Scramblases operate within cell membranes, where they alter or “scramble” the normal layered arrangement of lipid molecules – an essential step in many biological processes. The scramblase TMEM16F also works as an ion channel, allowing small, charged molecules such as potassium or chloride ions through the membrane. ...

the researchers at last attained this goal by embedding the protein in liposomes – tiny lipid capsules – which allowed them to image its active and inactive structures at near-atomic-scale resolution. ...

TMEM16F’s rearrangement of cell membrane lipids enables platelet cells to clump together to make blood coagulate, and a mutation affecting the scramblase underlies Scott syndrome, a hemophilia-like bleeding disorder.

The protein is also involved in the formation of the placenta in pregnancy, bone development and immune functions; and it is exploited or suppressed in various cancers and infections. ..."

From the abstract:

"The ubiquitous transmembrane protein 16F (TMEM16F) Ca2+-activated channel and scramblase catalyzes phosphatidylserine externalization to enable blood coagulation, membrane fusion and brain immune surveillance.

Despite its importance, the molecular mechanisms underlying TMEM16F activation remain poorly understood.

Here, we obtained high-resolution cryo-electron microscopy structures of TMEM16F active in liposomes. In high-activity conditions, TMEM16F adopts two conformations, the canonical Ca2+-bound closed state and one where the upward rotation of the cytosolic domain leads to an X-shaped groove that forms a transmembrane pore and locally thins the membrane.

Using mutagenesis, functional assays and molecular dynamics simulations, we show that the X-shaped groove is active and mediates nonselective ion flux and lipid scrambling through distinct pathways; ions move within the protein-delimited pore, whereas lipids skirt the X-shaped groove.

Our findings provide a complete picture of TMEM16F Ca2+-dependent gating and demonstrate that imaging membrane proteins in a native-like environment can allow capturing otherwise inaccessible active states."

Discovery could lead to new therapies for blood disorders | Cornell Chronicle

Calcium dependent activation of the TMEM16F scramblase and ion channel (open access)

Fig. 1: Structure of purified TMEM16F reconstituted in liposomes.

Fig. 5: Activation of TMEM16F.

Thursday, March 26, 2026

How huntingtin proteins travel in the brain

Amazing stuff!

"Mutant huntingtin protein, which causes the neurodegenerative disorder Huntington’s disease, travels through the brain using tiny ‘tunneling nanotubes’, research has revealed. Researchers found that a protein called Rhes binds with a protein for cellular acidity regulation, SLC4A7, to build tube-like structures creating a highway to shuttle huntingtin from neuron to neuron. Interrupting this pathway minimised the spread of huntingtin in the brain and offers a potential druggable target."

From the abstract:

"Tunneling nanotubes (TNTs) are membranous structures that mediate intercellular transfer of proteins, including the pathogenic mutant Huntingtin (mHTT) protein in Huntington disease (HD).

We previously identified the ras homolog enriched in the striatum (Rhes) as a key regulator of TNT formation and mHTT transmission; however, the molecular components underlying this process remained unknown.

Here, using unbiased liquid chromatography–tandem mass spectrometry analysis of membrane-associated Rhes complexes, we identify Slc4a7 (solute carrier family 4 member 7), an intracellular pH sensor, as a top membrane-binding partner of Rhes.

Functional studies revealed that small interfering RNA–mediated depletion or pharmacological inhibition of Slc4a7 substantially reduced Rhes-induced TNT formation and suppressed mHTT intercellular transfer.

Mechanistically, Rhes directly interacts with Slc4a7 through both its amino- and carboxyl-terminal domains and modulates intracellular pH to facilitate TNT formation. This interaction does not depend on the transporter activity of Slc4a7. However, inhibition of Rhes farnesylation—a lipid modification that anchors Rhes to the membrane—disrupts its binding to Slc4a7 and abolishes TNT formation.

Slc4a7 knock-out mice showed markedly reduced cell-to-cell transmission of mHTT in the striatum in vivo.

Together, these findings uncover a previously unrecognized Rhes-Slc4a7 signaling axis critical for TNT-mediated mHTT transmission and highlight Slc4a7 as a potential therapeutic target to limit disease spread in HD."

Targeting Tunneling Nanotubes Reduces Spread of Mutant Huntington’s Protein

Membrane-associated Rhes-Slc4a7 complex orchestrates tunneling nanotube formation and mutant Huntingtin spread (open access)

Fig. 1. Membrane-anchored Rhes drives the formation of TNTs between cells.

Tunneling nanotubes connect Rhes expressing striatal neuronal cells

Wednesday, February 04, 2026

Aging slows breakdown of synaptic proteins, raising disease risk

Recommendable!

"In brief

A new study reveals mechanisms linking synapse loss to cognitive decline and dementia in aging brains, highlighting critical changes during this process.
Researchers found that aging slows the breakdown of synaptic proteins, leading to accumulation that may contribute to diseases like Alzheimer’s.
Stanford’s innovative tagging method could enable tracking of neuronal proteins, facilitating the identification of new biomarkers for assessing brain health.

...

Now, researchers ... have discovered clues that may tie synapse loss to another hallmark of brain aging: the declining ability of brain cells to break down and recycle damaged proteins. ...

that synaptic proteins are particularly susceptible to this age-related garbage-disposal problem: In old age, synaptic proteins break down much more slowly, they become more likely to pile up into the tangled clumps of protein characteristic of neurodegenerative disease, and they are more likely to make their way into microglia, immune cells that prune away damaged synapses. ..."

From the abstract:

"Neurodegenerative diseases affect 1 in 12 people globally and remain incurable. Central to their pathogenesis is a loss of neuronal protein maintenance and the accumulation of protein aggregates with ageing.

Here we engineered bioorthogonal tools that enabled us to tag the nascent neuronal proteome and study its turnover with ageing, its propensity to aggregate and its interaction with microglia.

We show that neuronal protein half-life approximately doubles on average between 4-month-old and 24-month-old mice, with the stability of individual proteins differing among brain regions.

Furthermore, we describe the aged neuronal 'aggregome', which encompasses 1,726 proteins, nearly half of which show reduced degradation with age. The aggregome includes well-known proteins linked to diseases and numerous proteins previously not associated with neurodegeneration.

Notably, we demonstrate that neuronal proteins accumulate in aged microglia, with 54% also displaying reduced degradation and/or aggregation with age. Among these proteins, synaptic proteins are highly enriched, which suggests that there is a cascade of events that emerge from impaired synaptic protein turnover and aggregation to the disposal of these proteins, possibly through microglial engulfment of synapses. These findings reveal the substantial loss of neuronal proteome maintenance with ageing, which could be causal for age-related synapse loss and cognitive decline."
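
My own illustration, not from the paper: to get a feel for what a doubled half-life means, here is a minimal sketch that assumes simple first-order (exponential) decay of a labelled protein pool. That decay model and the half-life values of 5 and 10 days are my assumptions, chosen only to make the arithmetic easy; the study reports an approximate average doubling, not these numbers.

```python
import math


def degradation_rate(half_life_days: float) -> float:
    """First-order rate constant k (per day) implied by a half-life."""
    return math.log(2) / half_life_days


def fraction_remaining(days: float, half_life_days: float) -> float:
    """Fraction of an initially labelled protein pool still present after `days`."""
    return 0.5 ** (days / half_life_days)


# Hypothetical half-lives, for illustration only (not values from the paper).
young_half_life = 5.0                  # days
old_half_life = 2 * young_half_life    # "approximately doubles" with age

young_left = fraction_remaining(10.0, young_half_life)
old_left = fraction_remaining(10.0, old_half_life)

print(f"rate constant, young: {degradation_rate(young_half_life):.3f} per day")
print(f"rate constant, old:   {degradation_rate(old_half_life):.3f} per day")
print(f"labelled pool left after 10 days, young: {young_left:.2f}")
print(f"labelled pool left after 10 days, old:   {old_left:.2f}")
```

Under these assumptions a doubled half-life halves the degradation rate constant, so after the same chase period twice as much of the labelled pool is still around in the old animal (one half versus one quarter in this toy case).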

Aging slows breakdown of synaptic proteins, raising disease risk | Stanford Report "Recent research unveils new links between the brain’s waste management systems and neurodegeneration. The findings may provide insights for early disease identification."

Ageing promotes microglial accumulation of slow-degrading synaptic proteins (open access)

Images of mouse brain cells, with neurons in red and protein aggregates in green. Those protein aggregates are far more likely to form in older mice (right) compared with younger mice (left), which may contribute to slower breakdown of damaged proteins.

Tuesday, January 06, 2026

Machine learning reveals hidden dimensions of functional similarity in proteins

Recommendable!

Caveat: I did not read the entire, long article.

"Large language models trained on biological sequences, rather than natural language, are transforming biology, from predicting human genetic disease (1, 2) to the design of new-to-nature proteins (3–5). In this issue of PNAS, Cao et al. (6) extend these applications to detect the molecular underpinnings of phenotypic convergence by decoding patterns invisible to traditional sequence analysis approaches ...

Protein Language Models: Decoding Molecular Convergence

Protein language models (PLMs)—deep neural networks trained on millions of protein sequences—have emerged as powerful approaches for capturing the complex relationships between protein sequence, structure, and function (15–18). These models, adapted from natural language processing architectures, like transformers, learn to represent proteins as high-dimensional embeddings, numerical vectors that encode information about structural propensities, functional annotations, mutational effects, and evolutionary constraints.

PLMs are trained through self-supervised learning on large databases of protein sequences without explicit structural or functional labels. By learning to predict hidden amino acids (masked learning) or the next residue in a sequence (autoregressive), these models develop a representation of the “grammar” of proteins—which combinations of amino acids are permissible, which residues tend to co-occur, and which patterns correlate with specific structural and functional properties (3, 19–21). ...

They show that ACEP [Adaptive Convergence by Embedding of Protein] successfully identifies embedding-level convergence across the three test cases, demonstrating that PLMs have learned to recognize functional similarities. ..."
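
My own illustration of the two self-supervised training objectives mentioned above, as a minimal sketch in plain Python. The mask symbol, the 15% masking fraction, and the example sequence are my assumptions; real PLMs use their own tokenizers and masking schemes, and no model is trained here. The code only shows how training examples could be built from an unlabeled sequence.

```python
import random

MASK = "#"  # hypothetical mask symbol, not a real tokenizer token


def mask_sequence(seq: str, mask_fraction: float = 0.15, seed: int = 0):
    """Masked learning: hide some residues; the model must predict them."""
    rng = random.Random(seed)
    n_mask = max(1, round(len(seq) * mask_fraction))
    positions = sorted(rng.sample(range(len(seq)), n_mask))
    targets = {i: seq[i] for i in positions}  # position -> hidden residue
    masked = "".join(MASK if i in targets else aa for i, aa in enumerate(seq))
    return masked, targets


def next_residue_pairs(seq: str):
    """Autoregressive learning: each prefix is paired with the residue that follows."""
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]


sequence = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary example sequence
masked, targets = mask_sequence(sequence)

print(masked)                           # sequence with hidden positions
print(targets)                          # residues the model would have to recover
print(next_residue_pairs(sequence)[:3])
```

In both cases the "labels" come from the sequence itself, which is why no structural or functional annotation is needed.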

From the significance and abstract:

"Significance

In biology, repeated emergence of the same functional trait in evolution is important as it provides opportunity to decode the relations between genome or protein sequences to specific functions. Such functional convergence has been largely linked to sequence convergence at the level of single sites, because conventional methods cannot measure similarity of high-order features of sequences. This study reveals that the recent protein language models can extract embeddings from protein sequences reflecting high-order features, and develops statistical tests to evaluate the adaptive convergence of such features. The findings emphasize an underrated sequence basis for functional trait convergence in evolution, provide corresponding detection framework, and demonstrate potential power of deep learning in investigating the complex sequence–function mapping in evolutionary biology.

Abstract

Convergent evolution, or convergence, refers to repeated, independent emergences of the same trait in two or more lineages of species during evolution, often indicating functional adaptation to specific environmental factors.

Many computational methods have been proposed to investigate the genetic basis for organismal functional convergence, as an important way to decode the complex sequence–function map of proteins. These methods mostly focus on the convergence of amino acid states at the level of individual sites in functionally related proteins.

However, even without site-level sequence similarity, protein function similarity may also stem from convergence of high-order protein features, which cannot be captured by the conventional methods.

To fill this gap, we first derived numerical embeddings from protein sequences by pretrained protein language models (PLM).

In four previously reported cases, we found that functionally convergent proteins have similar embeddings despite no site-level convergence, indicating that PLM embeddings can reflect convergence of high-order protein features.

We then designed a pipeline to detect Adaptive Convergence by Embedding of Protein (ACEP). ACEP tests were significant on known and additional candidate genes with putative adaptive convergence like echolocation and crassulacean acid metabolism.

Genome-wide application showed that the ACEP framework can effectively enrich such candidates. Relations between convergences of PLM embeddings and specific protein physicochemical features were further examined.

In conclusion, PLM embeddings can indicate adaptive convergence of high-order protein features beyond site identities, demonstrating the power of deep learning tools for investigating the complex mapping between molecular sequences and functions."
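
My own illustration of what "similar embeddings" can mean in practice: a minimal sketch that averages per-residue vectors into one vector per protein and compares two proteins by cosine similarity. Mean pooling, cosine similarity, and the tiny 3-dimensional toy vectors are my assumptions; this is not the ACEP pipeline or its statistical test, and real PLM embeddings have hundreds or thousands of dimensions.

```python
import math


def mean_pool(per_residue):
    """Average per-residue vectors into a single fixed-length protein vector."""
    n = len(per_residue)
    return [sum(column) / n for column in zip(*per_residue)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy per-residue embeddings (made up for illustration, not model output).
protein_a = [[0.9, 0.1, 0.0], [0.7, 0.3, 0.0]]
protein_b = [[0.8, 0.2, 0.1], [0.8, 0.2, 0.0], [0.8, 0.2, 0.0]]
protein_c = [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]]

emb_a, emb_b, emb_c = (mean_pool(p) for p in (protein_a, protein_b, protein_c))

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
print(f"similarity a vs b: {sim_ab:.3f}")
print(f"similarity a vs c: {sim_ac:.3f}")
```

Note that the two proteins being compared do not need the same length or any aligned sites, which is the point of comparing embeddings rather than individual residues.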

Machine learning reveals hidden dimensions of functional similarity in proteins | PNAS

Language models reveal a complex sequence basis for adaptive convergent evolution of protein functions (no public access)

Fig. 1 Detecting molecular convergence using protein language model embeddings.

Saturday, December 06, 2025

Protein-prediction tools based on AI (AlphaFold family) revolutionize science

Good news! Amazing stuff!

"AlphaFold2 — the revolutionary artificial intelligence (AI) tool that predicts highly accurate 3D structures of proteins from sequence data — was unveiled by its creators at Google DeepMind five years ago this month.

An update of the tool, AlphaFold3, was released last year. It can predict how potential therapeutics could interact with proteins. It may take a while for AlphaFold’s biological insights to translate to drug discovery, but the platform has already established itself as top dog in the protein-structure prediction arena.

One analysis identified more than 200,000 studies that used AlphaFold directly or indirectly, encompassing work by nearly 800,000 scientists. ..."

"... The 2021 release of AlphaFold2’s code and a database that has swelled to hundreds of millions of predicted structures mean that scientists can now get a reliable prediction for almost any protein. ...

Some 3.3 million users in more than 190 countries have accessed the AlphaFold database (AFDB), which is hosted by EMBL–EBI and contains more than 240 million structural predictions, encompassing most known proteins. More than one million AFDB users come from low- and middle-income countries, including China and India ..."

Nature Briefing: Translational Research

AlphaFold is five years old — these charts show how it revolutionized science "Since it was unveiled in 2020, Google DeepMind’s game-changing AI tool has helped researchers all over the world to predict the 3D structures of hundreds of millions of proteins."

Very impressive!

Thursday, December 04, 2025

A tiny protein complex controls fat cell size and lipid storage

Good news!

"Scientists have made a major breakthrough in understanding how fat cells grow in size, in response to accommodating larger droplets of fat. The findings unlock a new path in tackling obesity, by reducing the amount of fat our cells can store away. ...

Earlier ... research had identified a protein known as seipin that was critical for healthy lipid storage across organisms, including humans. But how seipin was facilitating this remained unknown, and despite some studies naming another protein – adipogenin – in the process, scientists didn't know how it was involved.

Using cryo-electron microscopy, the researchers found that adipogenin was more than a bystander in the process, reinforcing seipin's structural integrity to enhance its ability to form and deliver lipid droplets to cells. The result is adipocytes accommodating larger lipid droplets – and increasing the size of these fat cells. ..."

From the abstract of the Perspective:

"Obesity is characterized by the accumulation of triacylglycerols in lipid droplets of adipocytes (fat cells) and the expansion of adipose tissue. Adipocytes arise from stem cells through adipogenesis, a process driven by several transcription factors ... Li et al. (3) identify adipogenin as a molecular switch that shifts the emphasis from generating new lipid droplets to expanding existing ones during adipogenesis."

From the editor's summary and abstract:

"Editor’s summary

Fat storage in the body relies on specialized structures called lipid droplets (LDs). Li et al. identified the microprotein adipogenin as a regulator of adipocyte LD size ... Adipogenin interacts with the membrane protein seipin and stabilizes the assembly of seipin dodecamers by bridging adjacent subunits. Functionally, seipin-adipogenin complexes promote the formation of fewer but larger LDs. In mice, adipocyte-specific adipogenin overexpression results in increased fat mass and larger LDs, whereas adipogenin deletion reduces fat accumulation and LD size, particularly in brown adipose tissue. Thus, adipogenin represents a modulator of adipocyte lipid storage that acts through a structural and functional partnership with seipin. ...

Abstract

INTRODUCTION

Adipogenin (Adig) is an 80–amino acid microprotein that is highly expressed in adipose tissues and steatotic liver. A previous genome-wide association study suggested that human ADIG is associated with blood leptin levels, highlighting its importance in energy metabolism. At the molecular level, Adig’s function is largely unknown: No interacting proteins have been identified. ...

RATIONALE

Microproteins typically exert their functions by binding to larger proteins and regulating their activities. We pulled down Adig from adipocytes and identified its interacting proteins by mass spectrometry. Upon the identification of a seipin-Adig complex, we resolved its structure using cryo–electron microscopy (cryo-EM), enabling us to determine Adig’s effect on seipin configuration at an atomic scale. Because seipin plays a vital role in lipid droplet (LD) formation and growth, we explored the function of the seipin-Adig complex in these processes. Moreover, we generated adipocyte-specific Adig overexpression and deletion mice to investigate Adig’s effect on adipose tissue expansion and lipid metabolism in vivo.

RESULTS

We found that Adig is a highly conserved protein with a single transmembrane (TM) segment that localizes to the endoplasmic reticulum (ER). Notably, Adig and seipin can form a complex and stabilize each other.

Cryo-EM analysis revealed two distinct oligomers: an undecameric seipin-alone complex at ~3.2-Å overall resolution and a dodecameric seipin-Adig complex at ~3.0-Å overall resolution.

In the seipin-Adig complex map, extra densities, corresponding to seipin and Adig TM domains, were observed. Multiple approaches, including high-resolution imaging, gel filtration, and molecular dynamics simulations, revealed that Adig could facilitate the assembly of dodecameric seipin complexes. Seipin complexes with varying Adig contents modulated LD formation and growth. The presence of the seipin-Adig complex altered triacylglycerol (TAG) flux in the ER, leading to the formation of fewer, but larger, LDs.

Additionally, the ER-to-LD trafficking of select lipid-synthesizing enzymes was accelerated in Adig-expressing cells.

In mice, Adig overexpression in adipocytes promoted LD enlargement and adipose tissue expansion, whereas Adig deletion decreased the amount of the seipin complexes in adipocytes and impaired TAG accumulation in brown adipose tissues.

CONCLUSION

In this study, we demonstrate that Adig complexes with seipin, forming a previously unrecognized dodecameric seipin complex. Furthermore, Adig stabilizes and promotes the assembly of this complex, thereby supporting LD growth in cells. In mice, modulating the expression of seipin-Adig complexes in adipose tissues by Adig overexpression or deletion substantially affects LD formation and expansion as well as lipid absorption by adipose tissues. This study reveals Adig as a key cofactor that modulates seipin function and fat storage in adipose tissue. We conclude that the oligomerization and function of seipin complexes can be modulated by Adig expression."

A key protein controls fat cell size and lipid storage

Seipin-adipogenin controls lipid storage in fat cells (Perspective, no public access) "A protein complex promotes the expansion of lipid droplets during the formation of mature adipocytes"

Microprotein plays vital role in fat accumulation (original news release) "Findings from UTSW researchers, colleagues could lead to new treatments to improve metabolic health and reduce risks of obesity, diabetes"

Adipogenin promotes the development of lipid droplets by binding a dodecameric seipin complex (no public access)

Adipogenin Dictates Adipose Tissue Expansion by Facilitating the Assembly of a Dodecameric Seipin Complex (preprint, open access, but seems to be dated and does not match the journal article)

Seipin-Adig complex promotes the development of lipid droplets.

Saturday, November 29, 2025

New biosensor technology maps enzyme mystery inside cells

Amazing stuff!

"... researchers have developed a powerful new biosensor that reveals, in unprecedented detail, how and where kinases – enzymes that control nearly all cellular processes – turn on and off inside living cells.

The advance provides scientists with a new way to study the molecular switches that regulate cellular processes, including cell growth and DNA repair, as well as cellular responses to chemotherapy drugs and pathological conditions such as cancer.

Cells rely on kinases to control processes from cellular metabolism and growth to stress responses. Unraveling how the more than 500 kinases in human cells all work together is one of biology’s biggest puzzles. Until now, researchers lacked robust tools to see exactly where and how these enzymes act inside cells. ..."

From the abstract:

"Understanding kinase action requires precise quantitative measurements of their activity in vivo. In addition, the ability to capture spatial information of kinase activity is crucial to deconvolute complex signaling networks, interrogate multifaceted kinase actions, and assess drug effects or genetic perturbations.

Here we develop a proteomic kinase activity sensor technique (ProKAS) for the analysis of kinase signaling using mass spectrometry.

ProKAS is based on a tandem array of peptide sensors with amino acid barcodes that allow multiplexed analysis for spatial, kinetic, and screening applications.

We engineered a ProKAS module to simultaneously monitor the activities of the DNA damage response kinases ATR, ATM, and CHK1 in response to genotoxic drugs, while also uncovering differences between these signaling responses in the nucleus, cytosol, and replication factories.

Furthermore, we developed an in silico approach for the rational design of specific substrate peptides expandable to other kinases.

Overall, ProKAS is a versatile system for systematically and spatially probing kinase action in cells."

New biosensor technology maps enzyme mystery inside cells | Cornell Chronicle

Proteomic sensors for quantitative multiplexed and spatial monitoring of kinase signaling (open access)

Fig. 1: Design and rationale of ProKAS, a modular technique for multiplexed analysis of kinase activity using mass spectrometry.

Fig. 2: Development and validation of a ProKAS sensor specific for ATR using phosphoproteomic data.

Thursday, November 06, 2025

Study finds targets for a new tuberculosis vaccine

Good news! My high school buddies liked to tease me from time to time reminding me that my initials TB stand for tuberculosis! 😊

It seems to be extraordinarily difficult to develop new vaccines for TB!

"There is currently only one vaccine for tuberculosis — the world’s deadliest infectious disease, killing more than 1 million people annually — and it was approved over 100 years ago."

"A large-scale screen of tuberculosis proteins has revealed several possible antigens that could be developed as a new vaccine for TB, the world’s deadliest infectious disease.

In the new study, a team of MIT biological engineers was able to identify a handful of immunogenic peptides, out of more than 4,000 bacterial proteins, that appear to stimulate a strong response from a type of T cells responsible for orchestrating immune cells’ response to infection.

There is currently only one vaccine for tuberculosis, known as BCG, which is a weakened version of a bacterium that causes TB in cows. This vaccine is widely administered in some parts of the world, but it poorly protects adults against pulmonary TB. ...

identifying TB proteins presented on the surface of infected human cells. When an immune cell such as a phagocyte is infected with Mycobacterium tuberculosis, some of the bacterial proteins get chopped into fragments called peptides, which are then displayed on the surface of the cell by MHC proteins. These MHC-peptide complexes act as a signal that can activate T cells.

MHCs, or major histocompatibility complexes, come in two types known as class I and class II. Class I MHCs activate killer T cells, while class II MHCs stimulate helper T cells. In human cells, there are three genes that can encode MHC-II proteins, and each of these comes in hundreds of variants. This means that any two people can have a very different repertoire of MHC-II molecules, which present different antigens. ...

To try to answer the question, the researchers infected human phagocytes with Mycobacterium tuberculosis. After three days, they extracted MHC-peptide complexes from the cell surfaces, then identified the peptides using mass spectrometry.

Focusing on peptides bound to MHC-II, the researchers found 27 TB peptides, from 13 proteins, that appeared most often in the infected cells. Then, they further tested those peptides by exposing them to T cells donated by people who had previously been infected with TB.

They found that 24 of these peptides did elicit a T cell response in at least some of the samples. None of the proteins from which these peptides came worked for every single donor, but Bryson believes that a vaccine using a combination of these peptides would likely work for most people. ...

To evaluate whether the proteins they identified could make a good vaccine, the researchers created mRNA vaccines encoding two protein sequences — EsxB and EsxG. The researchers designed several versions of the vaccine, which were targeted to different compartments within the cells.

The researchers then delivered this vaccine into human phagocytes, where they found that vaccines that targeted cell lysosomes — organelles that break down molecules — were the most effective. These vaccines induced 1,000 times more MHC presentation of TB peptides than any of the others.

They later found that the presentation was even higher if they added EsxA to the vaccine, because it allows the formation of the heterodimers that can poke through the lysosomal membrane.

The researchers currently have a mix of eight proteins that they believe could offer protection against TB for most people, but they are continuing to test the combination with blood samples from people around the world. ..."

From the editor's summary and abstract:

"Editor’s summary

Despite decades of research, we are still awaiting an effective vaccine to prevent infection with tuberculosis (TB). CD4+ T cells are essential to respond to infection with Mycobacterium tuberculosis (Mtb), the causative agent of TB, and vaccines for TB may need to be targeted toward this population.

Here, Leddy et al. did just that. The authors performed immunopeptidomics to identify candidate peptides expressed by Mtb that, when presented in the context of major histocompatibility complex class II (MHC-II) on phagocytic cells, activated CD4+ T cells in culture. The authors then developed several mRNA immunogens targeting different subcellular regions to show that such localization greatly affects antigen presentation capacity by phagocytic cells in vitro. These data highlight the promise of this CD4+ T cell–focused mRNA vaccine for TB and demonstrate the utility of this immunopeptidomics approach. ...

Abstract

No currently licensed vaccine reliably prevents pulmonary tuberculosis (TB), a leading cause of infectious disease mortality. Developing effective new vaccines requires identifying which Mycobacterium tuberculosis (Mtb) proteins are presented on major histocompatibility complex class II (MHC-II) by infected human phagocytes (target cells) and defining their capacity for recognition by CD4+ T cells. Vaccine designs must elicit T cell responses recognizing the same peptide-MHC complexes presented by infected cells. Although many human CD4+ T cell Mtb epitopes have been described, presentation on MHC-II by infected cells in most cases has not been directly evaluated.

Using mass spectrometry (MS), we demonstrated that Mtb type VII secretion system (T7SS) substrates are enriched in the MHC-II repertoire of Mtb-infected human monocyte-derived phagocytes and that many of these antigens are immunogenic in people with prior evidence of Mtb infection.

We next used MS to guide TB messenger RNA (mRNA) vaccine design, increasing the presentation of target MHC-II epitopes by orders of magnitude by incorporating design features that mirror aspects of antigen presentation dynamics in infected phagocytes.

Our results provide a strategy for TB vaccine design that is guided by bottom-up unbiased discovery. Our approach combines targeted evaluation of antigen presentation in human cells paired with rapid iterative testing of mRNA vaccine designs to optimize antigen presentation before animal studies or human clinical trials."

MIT study finds targets for a new tuberculosis vaccine | MIT News | Massachusetts Institute of Technology "Using these antigens, researchers plan to develop vaccine candidates that they hope would stimulate a strong immune response against the world’s deadliest pathogen."

Immunopeptidomics can inform the design of mRNA vaccines for the delivery of Mycobacterium tuberculosis MHC class II antigens (no public access)

Immunopeptidomics informs discovery and delivery of Mycobacterium tuberculosis MHC-II antigens for vaccine design (preprint, open access)

Fig. 1 Immunopeptidomics identifies potential vaccine targets presented on MHC-II in Mtb-infected human dendritic cells

Saturday, October 25, 2025

Anthrogen Introduces Odyssey: A 102B Parameter Protein Language Model

Good news! It is getting crowded in the space of protein models! Here is another one! Expect rapid advances in medicine and biology!

The emphasis here seems to be on modeling the 3D structure of proteins.

"[Protein Model] Anthrogen Introduces Odyssey: A 102B Parameter Protein Language Model that Replaces Attention with Consensus and Trains with Discrete Diffusion. Odyssey is Anthrogen’s multimodal protein language model family that fuses sequence tokens, FSQ structure tokens, and functional context for generation, editing, and conditional design, it replaces self attention with Consensus that scales as O(L) and reports improved training stability, it trains and samples with discrete diffusion for joint sequence and structure denoising, it ships in production variants from 1.2B to 102B parameters, it claims about 10x data efficiency versus competing models in matched evaluations, and API access is opening for external users to test real design workflows"

"... Odyssey is a frontier, multimodal protein model family that learns jointly from sequence, 3D structure, and functional context. It supports conditional generation, editing, and sequence + structure co-design. We scale our production-ready models from 1.2B to 102B parameters.

At input, Odyssey treats proteins as more than strings. Amino acid sequences are used as usual, while 3D shape is turned into compact structure tokens using a finite scalar quantizer (FSQ)—think of it as a simple alphabet for 3D geometry so the model can “read” shapes as easily as letters.

Alongside these, we include light-weight functional cues—domain tags, secondary-structure hints, orthologous group labels, or short text descriptors—so the model can reason about what a region does, not just what it looks like. The three streams are embedded separately, then fused, so local sequence patterns and long-range geometric relationships end up in one shared representation. ..."
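To make the "alphabet for 3D geometry" idea a little more concrete, here is a minimal sketch of generic finite scalar quantization. It is not Anthropen's or Anthrogen's code and makes no claim about how Odyssey implements it: the function names, the tanh bounding, and the choice of five levels per dimension are my own assumptions for illustration. Each dimension of a continuous vector is squashed into a fixed range and rounded to one of a few levels, and the per-dimension codes are combined into a single token id.

```python
import math

def fsq_quantize(z, levels):
    """Round each dimension of z to one of a small number of integer levels.

    z      -- list of floats (a hypothetical latent vector for one residue)
    levels -- list of ints, number of allowed levels per dimension (assumed)
    """
    codes = []
    for value, n_levels in zip(z, levels):
        # Squash the unbounded value into (0, 1) ...
        bounded = (math.tanh(value) + 1.0) / 2.0
        # ... then scale to [0, n_levels - 1] and round to the nearest level.
        codes.append(round(bounded * (n_levels - 1)))
    return codes

def codes_to_token(codes, levels):
    """Combine per-dimension codes into one token id (mixed-radix index)."""
    token = 0
    for code, n_levels in zip(codes, levels):
        token = token * n_levels + code
    return token

LEVELS = [5, 5, 5]                 # assumed: 3 dimensions, 5 levels each
VOCAB_SIZE = math.prod(LEVELS)     # 125 possible structure tokens

codes = fsq_quantize([10.0, -10.0, 0.0], LEVELS)
token = codes_to_token(codes, LEVELS)
print(codes, token, VOCAB_SIZE)
```

The point of the design is that the codebook is implicit: the vocabulary is just the product of the per-dimension level counts, so nothing has to be learned or looked up to turn a vector into a token.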

From the abstract:

"We present Odyssey, a family of multimodal protein language models for sequence and structure generation, protein editing and design.

We scale Odyssey to more than 102 billion parameters, trained over 1.1 × 10^23 FLOPs. The Odyssey architecture uses context modalities, categorized as structural cues, semantic descriptions, and orthologous group metadata, and comprises two main components:

a finite scalar quantizer for tokenizing continuous atomic coordinates, and

a transformer stack for multimodal representation learning.

Odyssey is trained via discrete diffusion, and characterizes the generative process as a time-dependent unmasking procedure.

The finite scalar quantizer and transformer stack leverage the consensus mechanism, a replacement for attention that uses an iterative propagation scheme informed by local agreements between residues.

Across various benchmarks, Odyssey achieves landmark performance for protein generation and protein structure discretization. Our empirical findings are supported by theoretical analysis."

Interesting AI Releases: Salesforce WALT, Apple UltraCUA, Google VISTA, and many more...

Anthrogen introduces Odyssey, the world's largest and most powerful protein language model. (original news release)

Odyssey: reconstructing evolution through emergent consensus in the global proteome (open access)

Figure 2: Overall architecture schematic of Odyssey.

Figure 2: Illustration of self-consensus for d = 2, depicting the local neighborhood principle and the operative steps of self-consensus.

Sunday, September 28, 2025

On SimpleFold: Folding Proteins is Simpler than You Think

Good news! An Apple research team has a take on protein folding! 😊

Caveat: I have not yet read this paper.

From the abstract:

"Protein folding models have achieved groundbreaking results typically via a combination of integrating domain knowledge into the architectural blocks and training pipelines.

Nonetheless, given the success of generative models across different but related problems, it is natural to question whether these architectural designs are a necessary condition to build performant models.

In this paper, we introduce SimpleFold, the first flow-matching based protein folding model that solely uses general purpose transformer blocks.

Protein folding models typically employ computationally expensive modules involving triangular updates, explicit pair representations or multiple training objectives curated for this specific domain.

Instead, SimpleFold employs standard transformer blocks with adaptive layers and is trained via a generative flow-matching objective with an additional structural term.

We scale SimpleFold to 3B parameters and train it on approximately 9M distilled protein structures together with experimental PDB data.

On standard folding benchmarks, SimpleFold-3B achieves competitive performance compared to state-of-the-art baselines, in addition SimpleFold demonstrates strong performance in ensemble prediction which is typically difficult for models trained via deterministic reconstruction objectives.

Due to its general-purpose architecture, SimpleFold shows efficiency in deployment and inference on consumer-level hardware.

SimpleFold challenges the reliance on complex domain-specific architectures designs in protein folding, opening up an alternative design space for future progress."

[2509.18480] SimpleFold: Folding Proteins is Simpler than You Think

Tuesday, September 09, 2025

Thioesters could explain how proteins first formed on early Earth

Amazing stuff! Could this be a breakthrough?

"A sulfurous intermediate could be the missing puzzle piece that explains how simple molecules on early Earth first assembled into proteins. New research shows that aminoacyl–thiols can react with RNA molecules to initiate the first steps of protein synthesis, while avoiding competing side reactions, all without the need for enzymes. ...

In particular, the initial activating step – an aminoacylation reaction – has proven challenging to replicate without enzymes. Previous studies have proposed various electrophiles including phosphates, imidazoles, and N-carboxyanhydrides as chemical activating agents but the resulting species are highly reactive, leading to uncontrolled background reactions and poor stability in water.

According to ... [a team] ... the solution must therefore involve a milder mechanism of activation. The researchers chose to focus their attention on thioesters, which are prevalent motifs in metabolic processes. Through a panel of experiments, they demonstrated that aminoacyl–thiols derived from prebiotic precursors provided sufficient activation to enable these units to selectively bind to tRNA molecules. The water-based reaction seamlessly tolerated 15 different amino acid units while also suppressing the uncontrolled competing reactions of these amino acids directly with each other. ..."

From the abstract:

"To orchestrate ribosomal peptide synthesis, transfer RNAs (tRNAs) must be aminoacylated, with activated amino acids, at their 2′,3′-diol moiety, and so the selective aminoacylation of RNA in water is a key challenge that must be resolved to explain the origin of protein biosynthesis.

So far, there have been no chemical methods to effectively and selectively aminoacylate RNA-2′,3′-diols with the breadth of proteinogenic amino acids in water.

Here we demonstrate that (biological) aminoacyl-thiols (1) react selectively with RNA diols over amine nucleophiles, promoting aminoacylation over adventitious (non-coded) peptide bond formation.

Broad side-chain scope is demonstrated, including Ala, Arg, Asp, Glu, Gln, Gly, His, Leu, Lys, Met, Phe, Pro, Ser and Val, and Arg aminoacylation is enhanced by unprecedented side-chain nucleophilic catalysis.

Duplex formation directs chemoselective 2′,3′-aminoacylation of RNA.

We demonstrate that prebiotic nitriles, N-carboxyanhydrides and amino acid anhydrides, as well as biological aminoacyl-adenylates, all react with thiols (including coenzymes A and M) to selectively yield aminoacyl-thiols (1) in water. Finally, we demonstrate that the switch from thioester to thioacid activation inverts diol/amine selectivity, promoting peptide synthesis in excellent yield. Two-step, one-pot, chemically controlled formation of peptidyl-RNA is observed in water at neutral pH. Our results indicate an important role for thiol cofactors in RNA aminoacylation before the evolution of proteinaceous synthetase enzymes."

Thioesters could explain how proteins first formed on early Earth | Research | Chemistry World

Thioester-mediated RNA aminoacylation and peptidyl-RNA synthesis in water (open access)

[T]eam has shown that aminoacyl–thiols are sufficiently reactive to load amino acid units onto RNA molecules without the help of enzymes

Thursday, August 28, 2025

Scientists build an “evolution engine” to rapidly reprogram proteins

Good news! Are we on the cusp of outdoing evolution? You bet!

"... The system, named T7-ORACLE ... represents a breakthrough in how researchers can engineer therapeutic proteins for cancer, neurodegeneration and essentially any other disease area.

“This is like giving evolution a fast-forward button,” ... “You can now evolve proteins continuously and precisely inside cells without damaging the cell’s genome or requiring labor-intensive steps.” ..."

From the editor's summary and abstract:

"Editor’s summary

Continuous evolution of proteins in the lab is often slow because normal mutation rates in bacteria are very low. Diercks et al. investigated whether a highly mutagenic DNA replication system could speed up evolution in Escherichia coli without harming the host genome. They engineered an orthogonal T7 replisome that replicates only target plasmids at mutation rates 100,000 times higher than normal while leaving the rest of the genome unchanged. Using this system, the authors rapidly evolved TEM-1 β-lactamase to gain much stronger resistance to several antibiotics in under a week. This approach could greatly accelerate protein engineering and antibiotic resistance studies. ...

Abstract

Systems that perform continuous hypermutation of designated genes without compromising the integrity of the host genome can substantially accelerate the evolution of new or enhanced protein functions.

We describe an orthogonal DNA replication system in Escherichia coli based on the controlled expression of the replisome of bacteriophage T7 (T7-ORACLE).

The system replicates circular plasmids that enable high transformation efficiencies and seamless integration into standard molecular biology workflows. Engineering of T7 DNA polymerase yielded variant proteins with mutation rates of 1.7 × 10^−5 substitutions per base in vivo—100,000-fold above the genomic mutation rate. We demonstrated continuous evolution using the T7 replisome by expanding the substrate scope of TEM-1 β-lactamase and increasing activity 5000-fold against clinically relevant monobactam and cephalosporin antibiotics in less than 1 week."
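A quick back-of-the-envelope check on the numbers in the abstract, as a rough sketch only. The two quoted figures (1.7 × 10^−5 substitutions per base and the 100,000-fold increase) come from the abstract. The gene length of 1,000 bp is a hypothetical round number I picked, and treating the rate as applying once per copy of the gene is my own simplifying assumption.

```python
# Figures quoted in the abstract
ENGINEERED_RATE = 1.7e-5     # substitutions per base, in vivo
FOLD_INCREASE = 100_000      # relative to the genomic mutation rate

# Genomic rate implied by the two quoted numbers (about 1.7e-10 per base)
implied_genomic_rate = ENGINEERED_RATE / FOLD_INCREASE

def expected_substitutions(rate_per_base, gene_length_bp):
    """Expected number of substitutions in one copy of a gene."""
    return rate_per_base * gene_length_bp

GENE_LENGTH_BP = 1_000       # hypothetical gene length, not from the paper

per_copy_engineered = expected_substitutions(ENGINEERED_RATE, GENE_LENGTH_BP)
per_copy_genomic = expected_substitutions(implied_genomic_rate, GENE_LENGTH_BP)

print(f"implied genomic rate: {implied_genomic_rate:.2e} per base")
print(f"engineered: {per_copy_engineered:.3g} substitutions per gene copy")
print(f"genomic:    {per_copy_genomic:.3g} substitutions per gene copy")
```

Under these assumptions, roughly one in sixty copies of the target gene would carry a new substitution, compared with about one in six million at the implied genomic rate. That gap is what makes the "fast-forward button" description plausible.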

Scientists build an “evolution engine” to rapidly reprogram proteins | Scripps Research "A new platform developed at Scripps Research enables fast, scalable protein evolution—opening the door to new therapies and diagnostics, and to predicting resistance mutations across many disease areas."

An orthogonal T7 replisome for continuous hypermutation and accelerated evolution in E. coli (no public access)

An Orthogonal T7 Replisome for Continuous Hypermutation and Accelerated Evolution in E. coli (preprint, open access)

Fig. 1 Establishing an orthogonal replication system in E. coli based on the bacteriophage T7 replisome.

Wednesday, July 16, 2025

More than 200 misfolded proteins, much more than previously known, may contribute to Alzheimer's and dementia

Good news! Amazing stuff!

"For decades, the story of Alzheimer's research has been dominated by a battle between A-beta and tau amyloids, both of which can kill neurons and impact the brain's ability to function. A new study suggests, however, that these sticky brain plaques may not be operating alone. ...

researchers have identified more than 200 types of misfolded proteins in rats that could be associated with age-related cognitive decline.

Key Takeaways

More than 200 types of misshapen proteins in the brain could be associated with age-related cognitive decline in rats.
Researchers believe these proteins slip past a surveillance system in the cell that specifically targets and destroys malfunctioning proteins.
A better understanding of these protein deformities in the brain could lead to better treatments and preventative measures.

...

"Our research is showing that amyloids are just the tip of the iceberg." ..."

From the abstract:

"Cognitive decline during aging represents a major societal burden, causing both personal and economic hardship in an increasingly aging population. Many studies have found that the proteostasis network, which functions to keep proteins properly folded, is impaired with age, suggesting that there may be many proteins that incur structural alterations with age.

Here, we used limited proteolysis mass spectrometry, a structural proteomic method, to globally interrogate protein conformational changes in a rat model of cognitive aging.

Specifically, we compared soluble hippocampal proteins from aged rats with preserved cognition to those from aged rats with impaired cognition.

We identified a couple hundred proteins as having undergone cognition-associated structural changes (CASCs).

We report that CASC proteins are substantially more likely to be nonrefoldable than non-CASC proteins, meaning that they typically cannot spontaneously refold to their native conformations after being chemically denatured.

These findings suggest that noncovalent, conformational alterations may be general features in cognitive decline."

More misfolded proteins than previously known may contribute to Alzheimer's and dementia | Hub "New Johns Hopkins study of rats suggests more than 200 types of misfolded proteins could be associated with age-related cognitive decline"

Proteins with cognition-associated structural changes in a rat model of aging exhibit reduced refolding capacity (open access)

Fig. 1. Measuring protein conformational changes associated with age-associated changes in cognition with LiP-MS.

Sunday, June 15, 2025

New antibacterial coating prevents bacteria on surfaces made from flea jumping protein

Good news! Amazing stuff!

"Researchers have developed a new method for preventing bacteria from adhering to surfaces, such as medical devices. It relies on the unique properties of resilin, a natural insect protein that enables fleas to jump hundreds of times their body length.

Resilin is a super-elastic protein produced by many insects, which enables them to jump and stretch their wings. It’s what enables some species of fleas, for example, to jump up to 200 times their body length. ...

One of the coatings, a coacervate, repelled 100% of bacteria, stopping them from attaching to surfaces while being non-toxic to human cells. A coacervate is a soft, spherical, nano-sized droplet made from proteins (resilin in this case) that clump together in water, forming a separate phase (like tiny blobs), which coats surfaces and influences how cells or bacteria interact with them. ..."

"The collaborative study led by researchers at RMIT University is the first reported use of antibacterial coatings made from resilin-mimetic proteins to fully block bacteria from attaching to a surface. ..."

From the abstract:

"The applications of responsive biomaterials for tuning cell-surface interactions have been recently explored due to their unique switchable characteristics. However, rational design of surfaces using suitable biomacromolecules to attain optimal physicochemical performance, biocompatibility, cell adhesion and anti-fouling properties is quite challenging.

Resilin-mimetic polypeptides (RMPs) are intrinsically disordered biomacromolecules that exhibit multi-stimuli responsive behaviour, including reversible dual-phase thermal behaviour forming self-assembled nano- to microstructures. However, there is a limited understanding of the effect of morphological features of RMP-based nanostructures, and their influence on surface properties.

Therefore, in this study, a family of responsive RMP-based nanostructured coatings (nano-coacervates, nanogels and nano-bioconjugates) are fabricated to investigate their various surface properties that influence cell-surface interactions. The effects of their physicochemical properties, such as conformation, packing density, charge, roughness, and stiffness, are investigated using atomic force microscopy, neutron scattering and reflectometry techniques. Biocompatibility and microbiological testing show that these nanostructured switchable responsive coatings can be applied to a wide range of substrates to modulate biofilm formation and attribute antimicrobial characteristics. The developed nanocoatings have the potential to find applications in many areas, including implantable medical devices, and drug delivery."

New antibacterial coating prevents bacteria on surfaces

Insect protein blocks bacterial infection (original news release) "A protein that gives fleas their bounce has been used to boot out bacteria cells, with lab results demonstrating the material’s potential for preventing medical implant infection."

Nano-structured antibiofilm coatings based on recombinant resilin (no public access)

Graphical abstract

The coacervate resilin-mimetic coating on the base scaffold, magnified 4,000 times under a scanning electron microscope (SEM)

Tuesday, June 03, 2025

Proteins in ancient enamel from South Africa challenge assumptions about early human relatives

Amazing stuff! It took from the July 2023 preprint until May 2025 to publish this paper in a journal! This is almost unacceptable! Slower than a snail!

"... Instead, the authors of a new study turned to proteins, which survive much longer than DNA. By analyzing enamel from four P. robustus ["a heavy-jawed, thick-molared human relative that lived in southern Africa roughly two million years ago"] tooth fossils, researchers were able to determine the biological sex of each specimen—and were surprised to find that one tooth, initially thought to come from a female due to its small size, actually belonged to a male. “Paleoanthropologists have long known that our use of tooth size to estimate sex was fraught with uncertainty ...

The analysis also revealed variation in a protein called enamelin and showed that one P. robustus individual was more distantly related to the other three specimens than they were to one another. Those findings may point to hidden genetic diversity within the Paranthropus genus, supporting the possibility of multiple distinct species. ..."

From the editor's summary and abstract:

"Editor’s summary

It is now well known that the early hominin fauna was species rich, with many overlapping lineages existing in the African Pleistocene. However, our knowledge of diversity within many of these lineages has been limited because current ancient DNA technologies have not been able to reveal genetic sequences older than around 0.2 million years. Madupe et al. examined protein sequences from approximately 2-million-year-old Paranthropus robustus teeth that were particularly well preserved. Using proteomics approaches, the authors were able to assign the individual teeth to sex and to identify patterns of diversity suggesting the existence of multiple populations.

Abstract

Paranthropus robustus is a morphologically well-documented Early Pleistocene hominin species from southern Africa with no genetic evidence reported so far. In this work, we describe the mass spectrometric sequencing of enamel peptides from four ~2 million–year-old dental specimens attributed morphologically to P. robustus from the site of Swartkrans in South Africa. The identification of AMELY-specific peptides enabled us to assign two specimens to male individuals, whereas semiquantitative mass spectrometric data analysis attributed the other two to females. A single amino acid polymorphism and the enamel-dentine junction shape variation indicated potential subgroups present within southern African Paranthropus. This study demonstrates how palaeoproteomics can help distinguish sexual dimorphism from other sources of variation in African Early Pleistocene hominins."

ScienceAdviser

Males of this ancient human cousin weren’t always bigger than females "Proteins from a collection of fossils hint at sex and genetic differences in P. robustus"

Proteins from two-million-year-old teeth reveal unprecedented insights into extinct human relative (original news release)

Enamel proteins reveal biological sex and genetic variability in southern African Paranthropus (no public access)

Enamel proteins reveal biological sex and genetic variability within southern African Paranthropus (preprint, open access)

Fig. 1 Location and cave structure of the site of Swartkrans, South Africa. Plus teeth