Modeling the SARS-CoV-2 mutation based on geographical regions and time

Abstract

Coronavirus Disease 2019 (COVID-19) was first detected in late December 2019. So far, it has caused 203,815,431 confirmed cases and 4,310,623 deaths worldwide. We collected sequences from 150,659 COVID-19 patients. Building on previous phylogenomic analysis, we found three major branches of viral RNA genome mutations, located in Asia, America, and Europe, which is consistent with other studies. We selected sites with high mutation frequencies from each of these three regions and found only 13 mutation sites common to all three. This suggests that viral mutations depend strongly on location and that different locations have specific mutations. Most mutations can lead to amino acid substitutions; the mutations occurred in the 3'/5' UTRs, the S, N, and M proteins, and ORF1ab, ORF3a, ORF8, and ORF10. Thus, the mutations may affect the pathogenesis of the virus. In addition, we applied an autoregressive integrated moving average (ARIMA) model to predict short-term frequency changes of these top mutation sites during the spread of the disease. We tested a variety of ARIMA settings to optimize prediction performance for three patterns. The model can help predict short-term changes in mutation frequency.
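To make the forecasting idea concrete, the sketch below fits a very small ARIMA(1,1,0) model by hand and projects a mutation-site frequency a few steps ahead. It is only an illustration under stated assumptions: the abstract does not give the model orders, the fitting procedure, or the data layout, so the order (1,1,0), the least-squares estimate without a constant, the function names `fit_arima_110` and `forecast_frequency`, and the weekly frequency values are all hypothetical. Clipping forecasts to the range 0 to 1 is also an added convenience, not part of ARIMA itself.

```python
def fit_arima_110(series):
    """Estimate the AR(1) coefficient on the first-differenced series.

    Assumption: plain least squares with no constant term.
    Returns (phi, last_difference).
    """
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    # d = 1: work on first differences of the frequency series.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # p = 1: regress each difference on the previous one.
    num = sum(prev * cur for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    phi = num / den if den else 0.0
    return phi, diffs[-1]


def forecast_frequency(series, steps):
    """Forecast `steps` future values, clipped to the valid frequency range [0, 1]."""
    phi, last_diff = fit_arima_110(series)
    level = series[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = phi * last_diff          # predicted next difference
        level = min(1.0, max(0.0, level + last_diff))  # integrate back and clip
        forecasts.append(level)
    return forecasts


# Hypothetical weekly frequencies of one mutation site (made-up values).
weekly_freq = [0.02, 0.05, 0.09, 0.15, 0.22, 0.30, 0.37, 0.43]
print(forecast_frequency(weekly_freq, 3))
```

In practice, comparing several (p, d, q) orders on held-out time points, as the abstract describes for its "variety of settings", would more likely rely on an established time-series library than on a hand-written fit like this one.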
