Thursday, February 18, 2021

A wealth of discovery built on the Human Genome Project — by the numbers

A recommendable comment article that raises a number of interesting and concerning questions! Are we missing opportunities, e.g. for better cures?

" ... the trends we note in how genome research has changed over time.  ...
It shows an intense focus on a small number of ‘superstar’ protein-coding genes, potentially to the detriment of interesting work that could be done on others. There has been a pivot towards non-protein-coding sections of the genome, and to understanding interactions between genetic material and proteins. And drug discovery has been grounded in just a few protein targets. ...
Our analysis shows that, between the start of the HGP [Human Genome Project] in 1990 and its completion in 2003 (after the [Human Genome Project] draft was published in 2001), the number of discovered (or ‘annotated’) human genes grew drastically. It levelled out suddenly in the mid-2000s at about 20,000 protein-coding genes (see ‘Twenty years of junk, stars and drugs: Non-coding elements’), far short of the 100,000-strong estimate previously adopted by many in the scientific community ...
Each year since 2001, between 10,000 and 20,000 papers mentioning protein-coding genes have been published ....
However, that interest has focused largely on just a few genes. ... Some superstar genes ... became the subject of hundreds of publications a year, with most other genes receiving scant attention ... We find that, by 2017, 22% of gene-related publications referenced just 1% of genes. ...
Intense study is, of course, justified for genes that have profound biological importance. A good example is TP53 — it is crucial to cell growth and death, and leads to cancer when inactivated or altered. Variations in this gene are found in more than 50% of tumour sequences. It is mentioned in 9,232 publications between 1976 and 2017 ...
... during the past two decades: more attention was lavished on a select few. Despite this being flagged as a potential problem on the tenth anniversary of the draft genome’s publication [2011], there has been no course correction. ...
... that this vast imbalance can be explained by a ‘rich-gets-richer’ dynamic rooted in social factors. It is likely that as the number of papers focusing on TP53 increases, the easier it is to secure funding, mentorship, tools and citations for further TP53 work because it is a safe bet ... In network science, this phenomenon is called preferential attachment ...
A great debate pre-dated the start of the HGP: was it worth mapping the vast non-coding regions of genome that were called junk DNA, or the dark matter of the genome? Thanks in large part to the HGP, it is now appreciated that the majority of functional sequences in the human genome do not encode proteins. Rather, elements such as long non-coding RNAs, promoters, enhancers and countless gene-regulatory motifs work together to bring the genome to life. ...
Before about the 1980s, drugs were found largely by serendipity. Their molecular and protein targets were usually unknown. Until 2001, the probability of knowing all of a drug’s protein targets was less than 50% in any given year. The HGP changed this. Now, the targets are known for almost all drugs licensed in the United States each year ...
Of the roughly 20,000 proteins revealed by the HGP as potential drug targets, we show that only about 10% — 2,149 — have so far been targeted by approved drugs ... That leaves 90% of the proteome untouched by pharmacology. Experimental drugs in our data set increase this number to 3,119 ... Again, the attention given to these is highly uneven. Five per cent of all currently approved drugs (99 distinct molecules) target the protein ADRA1A, which is involved in cell growth and proliferation. ..."
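As a quick sanity check on the drug-target figures quoted above, here is a minimal Python sketch. It uses only the numbers from the quote. The variable names are my own, and the total number of approved drugs is not stated in the excerpt: it is merely what the "five per cent = 99 molecules" figure would imply, so treat it as a rough inference.

```python
# Figures taken from the quoted passage; "roughly 20,000" is approximate.
TOTAL_PROTEINS = 20_000        # potential protein targets (approximate)
APPROVED_TARGETS = 2_149       # proteins targeted by approved drugs
WITH_EXPERIMENTAL = 3_119      # ... including experimental drugs
ADRA1A_DRUGS = 99              # approved molecules targeting ADRA1A
ADRA1A_SHARE_PCT = 5           # quoted share of all approved drugs


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


approved_pct = percent(APPROVED_TARGETS, TOTAL_PROTEINS)
experimental_pct = percent(WITH_EXPERIMENTAL, TOTAL_PROTEINS)

# Inference only: if 99 molecules are 5% of approved drugs,
# the data set would hold about this many approved drugs.
implied_approved_drugs = ADRA1A_DRUGS * 100 / ADRA1A_SHARE_PCT

print(f"Targeted by approved drugs: {approved_pct:.1f}%")
print(f"Including experimental drugs: {experimental_pct:.1f}%")
print(f"Implied number of approved drugs: {implied_approved_drugs:.0f}")
```

So "about 10%" is closer to 11%, and even with experimental drugs included, roughly 84% of the potential protein targets remain untouched.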

Source: A wealth of discovery built on the Human Genome Project — by the numbers. A new analysis traces the story of the draft genome’s impact on genomics since 2001, linking its effects on publications, drug approvals and understanding of disease.
