Atria: An Ultra-fast and Accurate Trimmer for Adapter and Quality Trimming

Abstract

Background

As next-generation sequencing grows in output capacity and read length, adapters attached to the reads and low-quality bases degrade downstream analyses both directly and indirectly, for example by producing false-positive single nucleotide polymorphisms (SNPs) and fragmented assemblies. A fast trimming algorithm is needed to remove adapters precisely, especially in read tails, where base quality is relatively low.

Findings

We present a trimming program named Atria. Atria matches the adapters in paired reads and finds possible overlapped regions with a fast, carefully designed byte-based matching algorithm (O(n) time with O(1) space). Atria also implements multi-threading in both sequence processing and file compression, and supports single-end reads.
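
To give a feel for how a fixed-width, bit-packed comparison can scan a read in O(n) time with O(1) extra space, here is a minimal Python sketch. It is an illustration under stated assumptions, not Atria's actual implementation: the 4-bit one-hot base encoding, the function names `pack` and `find_adapter`, and the parameters `k` and `min_matches` are all hypothetical choices made for this example. The adapter string in the usage note is likewise only an example input.

```python
# Illustrative sketch only; NOT Atria's actual algorithm or API.
# Assumption: each base is one-hot encoded in 4 bits, so a k-base window
# fits in one integer (64 bits for k = 16) and matching bases can be
# counted with a single AND followed by a population count.

ENCODE = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000}


def pack(seq):
    """Pack a DNA string into an int, 4 bits per base (N/unknown -> 0)."""
    value = 0
    for base in seq:
        value = (value << 4) | ENCODE.get(base, 0)
    return value


def find_adapter(read, adapter, k=16, min_matches=14):
    """Slide a k-base window over `read` and compare it with the first
    k bases of `adapter`.

    Returns (position, matches) for the first window with the highest
    match count, or (-1, 0) if no window reaches `min_matches`.
    """
    k = min(k, len(adapter))
    if k == 0 or len(read) < k:
        return (-1, 0)

    probe = pack(adapter[:k])
    mask = (1 << (4 * k)) - 1          # keeps only the last k bases
    window = pack(read[:k - 1])        # primed with the first k-1 bases

    best_pos, best_matches = -1, 0
    for i in range(k - 1, len(read)):
        # Shift in one base and drop the oldest: constant work per base.
        window = ((window << 4) | ENCODE.get(read[i], 0)) & mask
        # One set bit survives the AND for every position where bases agree.
        matches = bin(window & probe).count("1")
        if matches > best_matches:
            best_pos, best_matches = i - k + 1, matches

    if best_matches >= min_matches:
        return (best_pos, best_matches)
    return (-1, 0)
```

For example, with `adapter = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"` and a read made of the 19-base insert `"ACGTACGTACGTTTGACCA"` followed by `adapter[:20]`, `find_adapter(read, adapter)` reports the adapter starting at position 19, so the read would be trimmed to its first 19 bases. Because unknown bases encode to zero, an `N` never counts as a match in this sketch; a real trimmer would also weigh base quality and, for paired reads, the overlap between mates.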

Conclusions

Atria performs favorably against other cutting-edge trimmers in various trimming and runtime benchmarks on both simulated and real data. We also provide an ultra-fast and lightweight byte-based matching algorithm that can be used in a broad range of short-sequence matching applications, such as primer search and seed scanning before alignment.

Availability & Implementation

The Atria executables, source code, and benchmark scripts are available at https://github.com/cihga39871/Atria under the MIT license.
