Large-scale analysis of SARS-CoV-2 spike-glycoprotein mutants demonstrates the need for continuous screening of virus isolates
Abstract
Due to the widespread of the COVID-19 pandemic, the SARS-CoV-2 genome is evolving in diverse human populations. Several studies already reported different strains and an increase in the mutation rate. Particularly, mutations in SARS-CoV-2 spike-glycoprotein are of great interest as it mediates infection in human and recently approved mRNA vaccines are designed to induce immune responses against it.
We analyzed 146,917 SARS-CoV-2 genome assemblies and 2,393 NGS datasets from GISAID, NCBI Virus and NCBI SRA archives focusing on non-synonymous mutations in the spike protein.
Only around 13.8% of the samples contained the wild-type spike protein with no variation from the reference. Among the spike protein mutants, we confirmed a low mutation rate exhibiting less than 10 non-synonymous mutations in 99.98% of the analyzed sequences, but the mean and median number of spike protein mutations per sample increased over time. 2,592 distinct variants were found in total. The majority of the observed variants were recurrent, but only nine and 23 recurrent variants were found in at least 0.5% of the mutant genome assemblies and NGS samples, respectively. Further, we found high-confidence subclonal variants in about 15.1% of the NGS data sets with mutant spike protein, which might indicate co-infection with various SARS-CoV-2 strains and/or intra-host evolution. Lastly, some variants might have an effect on antibody binding or T-cell recognition.
These findings demonstrate the increasing importance of monitoring SARS-CoV-2 sequences for an early detection of variants that require adaptations in preventive and therapeutic strategies.
Related articles
Related articles are currently not available for this article.