Efficient evidence-based genome annotation with EviAnn

Aleksey V. Zimin
Daniela Puiu
Mihaela Pertea
James A. Yorke
Steven L. Salzberg

0 evaluations Published on Sep 15, 2025

This article on Sciety

Abstract

For many years, machine learning-based ab initio gene finding approaches have been central components of eukaryotic genome annotation pipelines, and they remain so today. The reliance on these approaches was originally sustained by the high cost and low availability of gene expression data, a primary source of evidence for gene annotation along with protein homology. However, innovations in modern sequencing technologies have revolutionized the acquisition of gene expression data, allowing scientists to rely more heavily on this class of evidence. In addition, proteins found in a multitude of well-annotated genomes represent another invaluable resource for gene annotation. Existing annotation packages often underutilize these data sources, which prompted us to develop EviAnn ( <underline>Evi</underline> dence-based <underline>Ann</underline> otator), a novel evidence-based eukaryotic gene annotation system. EviAnn takes a strongly data-driven approach, building the exon-intron structure of genes from transcript alignments or protein-sequence homology rather than from purely ab initio gene finding techniques. We show that when provided with the same input data, EviAnn consistently outperforms current state-of-the-art packages including BRAKER3, MAKER2, and FINDER, while utilizing considerably less computer time. Annotation of a mammalian genome can be completed in less than an hour on a single multi-core server. EviAnn is freely available under an open-source license from <ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://github.com/alekseyzimin/EviAnn_release">https://github.com/alekseyzimin/EviAnn_release</ext-link> and from Bioconda as “eviann”.

Related articles are currently not available for this article.