Comparative Genomics and Integrated Network Approach Unveiled Undirected Phylogeny Patterns, Co-mutational Hotspots, Functional Crosstalk and Regulatory Interactions in SARS-CoV-2
Abstract
The SARS-CoV-2 pandemic resulted in 92 million cases in the span of one year. This study focuses on understanding the population-specific variations that contribute to the high rate of infection in specific geographical regions, particularly the USA. Rigorous phylogenomic network analysis of 245 complete SARS-CoV-2 genomes inferred five central clades, named a (ancestral), b, c, d and e (subtypes e1 and e2). Clades d and e2 consisted exclusively of strains from the USA. The clades were distinguished by 10 co-mutational combinations in Nsp3, ORF8, Nsp13, S, Nsp12, Nsp2 and Nsp6. Our analysis revealed that only 67.46% of SNP mutations produced changes at the amino acid level. The T1103P mutation in Nsp3 was predicted to increase protein stability in 238 strains, the exceptions being 6 strains marked as ancestral type, whereas the co-mutation (P409L and Y446C) in Nsp13 was found in 64 genomes from the USA, highlighting its 100% co-occurrence. Docking showed that the D614G mutation reduced the binding of Spike protein to ACE2 but improved its interaction with TMPRSS2, contributing to high transmissibility among USA strains. We also found that the host proteins MYO5A, MYO5B and MYO5C had the maximum number of interactions with viral proteins (N, S, M). Thus, blocking the internalization pathway by inhibiting MYO5 proteins could be an effective strategy for COVID-19 treatment. The functional annotations of the host-pathogen interaction (HPI) network were closely associated with hypoxia and thrombotic conditions, confirming the vulnerability to and severity of infection. We also screened for CpG islands and found them in Nsp1 and N, conferring on SARS-CoV-2 the ability to enter the host cell and trigger ZAP activity.
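A 100% co-occurrence of two mutations can be read as: every genome that carries either mutation carries both. The Python sketch below illustrates one way such a figure could be computed. The source does not state how co-occurrence was calculated, so the definition used here (genomes with both mutations divided by genomes with at least one) is an assumption, the function name `co_occurrence` is hypothetical, and the strains in `toy_genomes` are invented toy data, not the study's 245 genomes.

```python
def co_occurrence(genomes, mut_a, mut_b):
    """Return (n_both, n_either, fraction) for two mutations.

    genomes: dict mapping a strain name to the set of mutations it carries.
    fraction = genomes carrying both / genomes carrying at least one.
    This definition is an assumption made for illustration.
    """
    n_both = sum(1 for muts in genomes.values() if mut_a in muts and mut_b in muts)
    n_either = sum(1 for muts in genomes.values() if mut_a in muts or mut_b in muts)
    fraction = n_both / n_either if n_either else 0.0
    return n_both, n_either, fraction


# Hypothetical toy data: strain names and mutation sets are invented.
toy_genomes = {
    "strain_1": {"Nsp13:P409L", "Nsp13:Y446C", "S:D614G"},
    "strain_2": {"Nsp13:P409L", "Nsp13:Y446C"},
    "strain_3": {"S:D614G"},
    "strain_4": set(),
}

both, either, frac = co_occurrence(toy_genomes, "Nsp13:P409L", "Nsp13:Y446C")
print(f"{both}/{either} genomes carry both mutations ({frac:.0%})")
```

In this toy set the two Nsp13 mutations always appear together, so the fraction is 1.0. A pair such as P409L and D614G would score lower because strains carry one without the other.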
Importance
In the current study, we present a global view of the mutational patterns observed in SARS-CoV-2 transmission. This provides a who-infected-whom geographical model covering the period since the early pandemic. It is hitherto the most comprehensive comparative genomics analysis of full-length genomes for co-mutations across different geographical regions, especially in USA strains. Compositional structural biology results suggest that mutations exert a balance of opposing effects on pathogenicity, indicating that only a few mutations, not all, are effective at the translational level. Novel HPI analysis and CpG predictions provide proof of concept for the hypoxia and thrombotic conditions seen in several patients. Thus, the current study focuses on understanding the population-specific variations that contribute to the high rate of SARS-CoV-2 infection in specific geographical regions, which may eventually be vital for the rapid development of custom-made vindication strategies in the most severely affected countries and regions.