Prot-SpaM : Fast alignment-free phylogeny reconstruction based on whole-proteome sequences

Chris-Andre Leimeister
Jendrik Schellhorn
Svenja Schöbel
Michael Gerth
Christoph Bleidorn
Burkhard Morgenstern

1 evaluations Published on Sep 7, 2018

This article on Sciety

Abstract

Word-based or ‘alignment-free’ sequence comparison has become an active area of research in bioinformatics. While previous word-frequency approaches calculated rough measures of sequence similarity or dissimilarity, some new alignment-free methods are able to accurately estimate phylogenetic distances between genomic sequences. One of these approaches is Filtered Spaced Word Matches . Herein, we extend this approach to estimate evolutionary distances between complete or incomplete proteomes; our implementation of this approach is called Prot-SpaM . We compare the performance of Prot-SpaM to other alignment-free methods on simulated sequences and on various groups of eukaryotic and prokaryotic taxa. Prot-SpaM can be used to calculate high-quality phylogenetic trees from whole-proteome sequences in a matter of seconds or minutes and often outperforms other alignment-free approaches. The source code of our software is available through Github :

<ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://github.com/jschellh/ProtSpaM">https://github.com/jschellh/ProtSpaM</ext-link>

Related articles are currently not available for this article.