Genome-wide data inferring the evolution and population demography of the novel pneumonia coronavirus (SARS-CoV-2)
Abstract
As a high-risk, highly infectious disease, coronavirus disease 2019 (COVID-19) poses unprecedented challenges to global health. As of March 3, 2020, SARS-CoV-2 had infected more than 89,000 people in China and 66 other countries across six continents. In this study, we combined 10 newly sequenced SARS-CoV-2 genomes with 136 genomes from the GISAID database to investigate genetic variation and population demography using several analytical approaches (e.g., haplotype network analysis, extended Bayesian skyline plots (EBSP), mismatch distribution analysis, and neutrality tests). The genomes comprised 80 haplotypes with 183 substitution sites, of which 27 were parsimony-informative and 156 were singletons. Sliding-window analyses of genetic diversity suggested a certain abundance of mutations in the SARS-CoV-2 genomes, which may help explain the current wide spread of the virus. Phylogenetic analysis showed that the virus carried by bats (bat-RaTG13-CoV) is more closely related to SARS-CoV-2 than the coronavirus carried by pangolins (Pangolin-CoV). The network results showed that SARS-CoV-2 had diverse haplotypes around the world by February 11, 2020. Additionally, 16 genomes collected from the Huanan seafood market were assigned to 10 haplotypes, indicating that the infection circulated within the market over a short period. The EBSP results estimated that the first expansion of SARS-CoV-2 began on 7 December 2019.
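To make the site categories reported above concrete, the following minimal Python sketch counts haplotypes and labels each alignment column as invariant, singleton, or parsimony-informative. It uses a made-up five-sequence toy alignment, not the study's data, and the function names `classify_sites` and `count_haplotypes` are illustrative assumptions rather than the software used by the authors. A column is treated here as parsimony-informative when at least two different nucleotides each occur in at least two sequences; other variable columns are counted as singletons.

```python
from collections import Counter


def classify_sites(alignment):
    """Label each alignment column as 'invariant', 'singleton',
    or 'parsimony-informative' (hypothetical helper for illustration)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to the same length")

    labels = []
    for column in zip(*alignment):
        # Ignore gaps and ambiguity codes; count only unambiguous bases.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) < 2:
            labels.append("invariant")
        elif sum(1 for n in counts.values() if n >= 2) >= 2:
            # At least two states, each seen in two or more sequences.
            labels.append("parsimony-informative")
        else:
            labels.append("singleton")
    return labels


def count_haplotypes(alignment):
    """Number of distinct sequences in the alignment."""
    return len(set(alignment))


# Toy alignment (assumed data, for illustration only).
toy_alignment = [
    "ACGTAC",
    "ACGTAC",
    "ACCTAT",
    "ACCTAC",
    "TCGTAC",
]

labels = classify_sites(toy_alignment)
summary = Counter(labels)
print(count_haplotypes(toy_alignment), dict(summary))
```

In this toy case, two columns are singletons, one is parsimony-informative, and the five sequences collapse into four haplotypes. Real analyses would also need an explicit policy for gaps and ambiguous bases, which this sketch simply skips.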