Saturday, January 27, 2024

The Forces That Drive Evolution May Not Be as Random as We Thought

Amazing stuff!

"... They were able to test renowned evolutionary biologist Stephen J. Gould's thought experiment: replaying a tape of evolutionary history would result in a different, unpredictable outcome each time, since evolutionary paths depend on unpredictable events. ..."

"A groundbreaking study has found that evolution is not as unpredictable as previously thought, which could allow scientists to explore which genes could be useful to tackle real-world issues such as antibiotic resistance, disease and climate change. ...
The study ... has found that the evolutionary trajectory of a genome may be influenced by its evolutionary history, rather than determined by numerous factors and historical accidents. ...
Using a machine learning approach known as Random Forest, along with a dataset of 2,500 complete genomes from a single bacterial species, the team carried out several hundred thousand hours of computer processing to address the question. ...
In effect, the researchers discovered an invisible ecosystem where genes can cooperate or can be in conflict with one another. ..."

From the study's significance statement and abstract:
"Significance
Different strains of the same prokaryotic species often show significant variation in gene content. Whether this variation is due to genetic drift or selection is not well understood. If the latter, we expect sets of genes to be consistently and repeatedly gained or lost together, or sequentially. We used machine learning to predict the presence of variable genes in a large set of Escherichia coli strains, using other variable genes as predictors. We find a large proportion of genes are predictable, suggesting selection plays a role in their acquisition, loss, and maintenance. We show that some genes are consistently associated with the presence or absence of others. These results have implications for understanding evolutionary dynamics in prokaryotic genomes.
Abstract
Pangenomes exhibit remarkable variability in many prokaryotic species, much of which is maintained through the processes of horizontal gene transfer and gene loss. Repeated acquisitions of near-identical homologs can easily be observed across pangenomes, leading to the question of whether these parallel events potentiate similar evolutionary trajectories, or whether the remarkably different genetic backgrounds of the recipients mean that postacquisition evolutionary trajectories end up being quite different. In this study, we present a machine learning method that predicts the presence or absence of genes in the Escherichia coli pangenome based on complex patterns of the presence or absence of other accessory genes within a genome. Our analysis leverages the repeated transfer of genes through the E. coli pangenome to observe patterns of repeated evolution following similar events. We find that the presence or absence of a substantial set of genes is highly predictable from other genes alone, indicating that selection potentiates and maintains gene–gene co-occurrence and avoidance relationships deterministically over long-term bacterial evolution and is robust to differences in host evolutionary history. We propose that at least part of the pangenome can be understood as a set of genes with relationships that govern their likely cohabitants, analogous to an ecosystem’s set of interacting organisms. Our findings indicate that intragenomic gene fitness effects may be key drivers of prokaryotic evolution, influencing the repeated emergence of complex gene–gene relationships across the pangenome."
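To make the idea of gene–gene co-occurrence and avoidance a bit more concrete, here is a minimal sketch in Python. It is not the study's Random Forest pipeline. It swaps in a much simpler statistic, the phi coefficient, and runs it on a made-up presence-absence matrix. The gene names (geneA to geneD), the six toy genomes, and the helper functions are all hypothetical and chosen only for illustration. A score near +1 means two genes tend to appear together, and a score near -1 means they tend to avoid each other.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data, not taken from the study.
# Rows are genomes (strains), columns are accessory genes.
# 1 = gene present, 0 = gene absent.
GENES = ["geneA", "geneB", "geneC", "geneD"]
GENOMES = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]


def phi(x, y):
    """Phi coefficient between two binary presence-absence vectors."""
    pairs = list(zip(x, y))
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)  # both present
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)  # only first present
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)  # only second present
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)  # both absent
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # A gene that is always present or always absent carries no signal.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


def gene_column(name):
    """Return the presence-absence pattern of one gene across all genomes."""
    idx = GENES.index(name)
    return [row[idx] for row in GENOMES]


def pairwise_associations():
    """Compute the phi coefficient for every pair of genes."""
    return {
        (g1, g2): phi(gene_column(g1), gene_column(g2))
        for g1, g2 in combinations(GENES, 2)
    }


if __name__ == "__main__":
    for (g1, g2), score in pairwise_associations().items():
        print(f"{g1} vs {g2}: phi = {score:+.2f}")
```

In this toy matrix geneA and geneB always travel together, geneA and geneC never share a genome, and geneD is only weakly tied to the others. The study itself goes further than pairwise scores: it predicts each gene from the combined pattern of many other genes.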

Source: The Forces That Drive Evolution May Not Be as Random as We Thought (ScienceAlert)



Fig. 2 Relationships between selected presence–absence patterns in the E. coli pangenome. On the top are a network of nodes that represent the presence–absence patterns of the columns directly beneath, as well as the connections between the nodes that represent significant co-occurrence and avoidance relationships. Below left, the backbone phylogeny of the genomes in this study is positioned such that the rows of the heatmap to its right represent the presence or absence of nodes according to the label above each column.

