IPEV: Identification of Prokaryotic and Eukaryotic Virus-derived sequences in virome using deep learning

This article has 2 evaluations Published on
Read the full article Related papers
This article on Sciety

Abstract

Background

The virome obtained through virus-like particle enrichment contain a mixture of prokaryotic and eukaryotic virus-derived fragments. Accurate identification and classification of these elements are crucial for understanding their roles and functions in microbial communities. However, the rapid mutation rates of viral genomes pose challenges in developing high-performance tools for classification, potentially limiting downstream analyses.

Findings

We present IPEV, a novel method that combines trinucleotide pair relative distance and frequency with a 2D convolutional neural network for distinguishing prokaryotic and eukaryotic viruses in viromes. Cross-validation assessments of IPEV demonstrate its state-of-the-art precision, significantly improving the F1-score by approximately 22% on an independent test set compared to existing methods when query viruses share less than 30% sequence similarity with known viruses. Furthermore, IPEV outperforms other methods in terms of accuracy on most real virome samples when using sequence alignments as annotations. Notably, IPEV reduces runtime by 50 times compared to existing methods under the same computing configuration. We utilized IPEV to reanalyze longitudinal samples and found that the gut virome exhibits a higher degree of temporal stability than previously observed in persistent personal viromes, providing novel insights into the resilience of the gut virome in individuals.

Conclusions

IPEV is a high-performance, user-friendly tool that assists biologists in identifying and classifying prokaryotic and eukaryotic viruses within viromes. The tool is available at<ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://github.com/basehc/IPEV">https://github.com/basehc/IPEV</ext-link>.

Related articles

Related articles are currently not available for this article.