ToRQuEMaDA: Tool for Retrieving Queried Eubacteria, Metadata and Dereplicating Assemblies
Abstract
TQMD is a tool that downloads and stores prokaryotic genomes and produces lists of dereplicated genomes. It was developed to cope with the ever-growing number of prokaryotic genomes and their uneven taxonomic distribution. It relies on word-based, alignment-free methods (k-mers), an iterative single-linkage approach and a divide-and-conquer strategy to remain both efficient and scalable. We studied the performance of TQMD by assessing the influence of its parameters and heuristics on the clustering outcome. We further compared TQMD to two other dereplication tools (dRep and Assembly-Dereplicator). Our results showed that TQMD is optimized to dereplicate at high taxonomic levels (phylum/class), whereas the other dereplication tools are optimized for lower taxonomic levels (species/strain), making TQMD complementary to the existing dereplication tools. TQMD is available at https://bitbucket.org/phylogeno/tqmd.
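To make the general idea of k-mer-based, single-linkage dereplication concrete, the following is a minimal sketch and not TQMD's actual implementation. It assumes a Jaccard distance on k-mer sets, a union-find structure for single linkage, and "first member listed" as the cluster representative; the function names, the default `k` and the default `threshold` are all hypothetical choices made for illustration. The divide-and-conquer batching mentioned in the abstract is left out for brevity.

```python
from itertools import combinations


def kmer_set(sequence, k):
    """Return the set of all overlapping k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(a, b):
    """Jaccard distance between two k-mer sets (0 = identical, 1 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def single_linkage_clusters(genomes, k, threshold):
    """Group genomes linked, directly or transitively, by distance <= threshold."""
    names = list(genomes)
    profiles = {name: kmer_set(genomes[name], k) for name in names}
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Single linkage: one sufficiently close pair is enough to merge clusters.
    for a, b in combinations(names, 2):
        if jaccard_distance(profiles[a], profiles[b]) <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return list(clusters.values())


def dereplicate(genomes, k=4, threshold=0.5):
    """Keep one representative per cluster (assumption: the first member)."""
    clusters = single_linkage_clusters(genomes, k, threshold)
    return [cluster[0] for cluster in clusters]


if __name__ == "__main__":
    # Toy sequences, purely illustrative.
    toy = {
        "g1": "ACGTACGTAC",
        "g2": "ACGTACGTAA",
        "g3": "TTTTGGGGCC",
    }
    print(dereplicate(toy))
```

In this toy input, `g1` and `g2` share four of their five distinct 4-mers and fall into one cluster, while `g3` shares none and stays alone. A looser threshold merges more distant genomes, which is the kind of setting one would expect when dereplicating at higher taxonomic levels.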