Atria: An Ultra-fast and Accurate Trimmer for Adapter and Quality Trimming
Abstract
Background
As Next Generation Sequencing comes to dominate in output capacity and sequence length, adapters attached to reads and low-quality bases degrade downstream analysis both directly and indirectly, for example by producing false-positive single nucleotide polymorphisms (SNPs) and fragmented assemblies. A fast trimming algorithm is needed to remove adapters precisely, especially in read tails, where quality is relatively low.
Findings
We present a trimming program named Atria. Atria matches the adapters in paired reads and finds possible overlapped regions with a fast, carefully designed byte-based matching algorithm (O(n) time with O(1) space). Atria also implements multi-threading in both sequence processing and file compression, and it supports single-end reads.
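To illustrate how a bit-level matcher can reach O(n) time with O(1) extra space, here is a minimal sketch. It is not Atria's actual implementation. It assumes a one-hot 4-bit code per base, a seed of at most 16 bases (so it fits in 64 bits), and a rolling window compared with a bitwise AND and a bit count. The names `CODE`, `pack`, `find_seed`, and `max_mismatches` are hypothetical and chosen only for this example.

```python
# Assumed one-hot 4-bit code per base; any other symbol (e.g. N) maps to 0
# and therefore never counts as a match in this sketch.
CODE = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000}
MAX_BASES = 16  # 16 bases x 4 bits = 64 bits, one machine word


def pack(seq):
    """Pack a short DNA string into an integer, 4 bits per base."""
    word = 0
    for base in seq:
        word = (word << 4) | CODE.get(base, 0)
    return word


def find_seed(read, seed, max_mismatches=1):
    """Return the 0-based start of the first window of `read` that matches
    `seed` with at most `max_mismatches` mismatches, or -1 if none does."""
    k = len(seed)
    if not 0 < k <= MAX_BASES:
        raise ValueError("seed must contain between 1 and 16 bases")
    mask = (1 << (4 * k)) - 1      # keeps only the last k bases in the window
    seed_word = pack(seed)
    window = 0                     # rolling window: constant extra space
    for i, base in enumerate(read):
        window = ((window << 4) | CODE.get(base, 0)) & mask
        if i + 1 >= k:
            # With one-hot codes, each matching base leaves exactly one set bit.
            matches = bin(window & seed_word).count("1")
            if k - matches <= max_mismatches:
                return i + 1 - k
    return -1


if __name__ == "__main__":
    read = "ACGTACGT" + "AGATCGGTAGAGC"   # adapter prefix with one substitution
    print(find_seed(read, "AGATCGGAAGAGC", max_mismatches=1))
```

Because the seed length is bounded, each window comparison costs constant time, so the scan is linear in the read length. The same pattern applies to other short-sequence searches, such as primer lookup.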
Conclusions
Atria performs favorably against other cutting-edge trimmers in various trimming and runtime benchmarks on both simulated and real data. We also provide an ultra-fast and lightweight byte-based matching algorithm that can be used in a broad range of short-sequence matching applications, such as primer search and seed scanning before alignment.
Availability & Implementation
The Atria executables, source code, and benchmark scripts are available at https://github.com/cihga39871/Atria under the MIT license.