GENESPACE: syntenic pan-genome annotations for eukaryotes

John T. Lovell
Avinash Sreedasyam
M. Eric Schranz
Melissa A. Wilson
Joseph W. Carlson
Alex Harkess
David Emms
David Goodstein
Jeremy Schmutz

3 evaluations Published on Mar 11, 2022

This article on Sciety

Abstract

The development of multiple high-quality reference genome sequences in many taxonomic groups has yielded a high-resolution view of the patterns and processes of molecular evolution. Nonetheless, leveraging information across multiple reference haplotypes remains a significant challenge in nearly all eukaryotic systems. These challenges range from studying the evolution of chromosome structure, to finding candidate genes for quantitative trait loci, to testing hypotheses about speciation and adaptation in nature. Here, we address these challenges through the concept of a pan-genome annotation, where conserved gene order is used to restrict gene families and define the expected physical position of all genes that share a common ancestor among multiple genome annotations. By leveraging pan-genome annotations and exploring the underlying syntenic relationships among genomes, we dissect presence-absence and structural variation at four levels of biological organization: among three tetraploid cotton species, across 300 million years of vertebrate sex chromosome evolution, across the diversity of the Poaceae (grass) plant family, and among 26 maize cultivars. The methods to build and visualize syntenic pan-genome annotations in the GENESPACE R package offer a significant addition to existing gene family and synteny programs, especially in polyploid, outbred and other complex genomes.

Related articles are currently not available for this article.