Panalyze: automated virus pangenome variation graph construction, analysis and annotation
Abstract
Motivation
Constructing and studying pangenome variation graphs (PVGs) supports new insights into viral genomic diversity, because such pangenomes are less prone to the reference bias that affects mutation detection. However, interpreting the resulting information is challenging, so automating these processes to allow exploratory investigations for PVG optimisation is essential. Moreover, existing methods scale down poorly to the smaller genome sizes of viruses and do not facilitate analysis in laptop environments. To address this, we developed an easily deployable pipeline that rapidly creates virus PVGs and applies a broad range of analyses to them.
Results
We present Panalyze, a scalable and unbiased virus PVG construction, analysis and annotation tool implemented in NextFlow and containerised in Docker. Panalyze uses NextFlow to efficiently complete tasks across multiple compute nodes and in diverse high-performance computing environments. Panalyze can also operate on a single thread on a standard laptop and can analyse sequences of any length. We illustrate how Panalyze works and the valuable outputs it can generate using a range of common viral pathogens.
Availability
Panalyze is released under an MIT open-source license and is available on GitHub, with documentation, at <ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://github.com/downingtim/Panalyze/">https://github.com/downingtim/Panalyze/</ext-link>.
Contact
<email>Chandana.tennakoon@pirbright.ac.uk</email>, <email>tim.downing@pirbright.ac.uk</email>
Supplementary information
Supplementary data are available online.