Exploring the natural origins of SARS-CoV-2 in the light of recombination
Abstract
The lack of an identifiable intermediate host species for the proximal animal ancestor of SARS-CoV-2, and the large geographical distance between Wuhan and the locations where the most closely related coronaviruses circulating in horseshoe bats (Sarbecoviruses) have been identified, are fuelling speculation on the natural origins of SARS-CoV-2. We have comprehensively analysed the phylogenetic relations between SARS-CoV-2 and the related bat and pangolin Sarbecoviruses sampled so far. Determining the likely recombination events reveals a highly reticulate evolutionary history within this group of coronaviruses. The inferred recombination events cluster non-randomly, with evidence that Spike, the main target for humoral immunity, is beside a recombination hotspot likely driving antigenic shift in the ancestry of bat Sarbecoviruses. Taking these results together with the geographic ranges of their hosts and the sampling locations, across southern China and into Southeast Asia, we confirm that horseshoe bats, Rhinolophus, are the likely SARS-CoV-2 progenitor reservoir species. By tracing the recombinant sequence patterns, we conclude that there has been relatively recent geographic movement and co-circulation of these viruses’ ancestors, extending across their bat host ranges in China and Southeast Asia over the last 100 years or so. We confirm that a direct proximal ancestor to SARS-CoV-2 is yet to be sampled, since the closest known relative shared a common ancestor with SARS-CoV-2 approximately 40 years ago. Our analysis highlights the need for more wildlife sampling to (i) pinpoint the exact origins of SARS-CoV-2’s animal progenitor, and (ii) survey the extent of the diversity in the phylogeny of the related Sarbecoviruses that present a high risk for future spillover.
Highlights
The origin of SARS-CoV-2 can be traced to horseshoe bats, genus Rhinolophus, with ranges in both China and Southeast Asia.
The closest known relatives of SARS-CoV-2 exhibit frequent transmission among their Rhinolophus host species.
Sarbecoviruses have undergone extensive recombination throughout their evolutionary history.
Accounting for the mosaic patterns of these recombinants is important when inferring relatedness to SARS-CoV-2.
Breakpoint patterns are consistent with recombination hotspots in the coronavirus genome, particularly upstream of the Spike open reading frame, with a coldspot in S1.