A practical DNA data storage using expanded alphabet introducing 5-methylcytosine

Abstract

DNA is a promising next-generation data storage medium. Recently, it has been proposed in theory that non-natural or modified bases can serve as extra molecular letters to increase information density. However, the strategy is difficult to realize in practice because non-natural DNA sequences are hard to synthesize and structurally complex. Here, we describe R+, a practical DNA data storage transcoding scheme based on a molecular alphabet expanded with 5-methylcytosine (5mC). We also validate the scheme experimentally by encoding one representative file into several in vitro DNA fragments of 1.3 to 1.6 kbp for nanopore sequencing. The results show average data recovery rates of 98.97% with a reference and 86.91% without one. This work validates the practicability of 5mC in DNA storage systems, with a potentially wide range of applications.
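To illustrate why an extra letter raises information density, the following is a minimal sketch and not the R+ transcoding scheme itself. It assumes a hypothetical five-letter alphabet in which `M` stands for 5mC, and it ignores biochemical constraints such as homopolymer runs and GC content. With four letters each base carries at most log2(4) = 2 bits; with five letters the bound rises to log2(5), about 2.32 bits.

```python
import math

# Hypothetical five-letter alphabet: the four natural bases plus "M" for 5mC.
# This mapping is an illustrative assumption, not the R+ transcoding scheme.
ALPHABET = "ACGTM"


def bits_per_base(alphabet_size: int) -> float:
    """Upper bound on information carried by one base for a given alphabet size."""
    return math.log2(alphabet_size)


def encode(data: bytes, alphabet: str = ALPHABET) -> str:
    """Write the bytes as a single integer in base len(alphabet)."""
    base = len(alphabet)
    # A leading sentinel byte preserves any leading zero bytes in the payload.
    value = int.from_bytes(b"\x01" + data, "big")
    letters = []
    while value:
        value, digit = divmod(value, base)
        letters.append(alphabet[digit])
    return "".join(reversed(letters))


def decode(sequence: str, alphabet: str = ALPHABET) -> bytes:
    """Invert encode(): read the letters as base-len(alphabet) digits."""
    base = len(alphabet)
    value = 0
    for letter in sequence:
        value = value * base + alphabet.index(letter)
    raw = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return raw[1:]  # drop the sentinel byte


if __name__ == "__main__":
    payload = b"DNA storage"
    print(bits_per_base(4), bits_per_base(5))
    print(len(encode(payload, "ACGT")), len(encode(payload, "ACGTM")))
    print(decode(encode(payload)) == payload)
```

Encoding the same payload with `"ACGT"` and with `"ACGTM"` shows the shorter sequence obtained from the larger alphabet. A practical scheme would additionally need to respect synthesis and sequencing constraints, which this sketch does not attempt.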

Availability & Implementation

R+ is implemented in Python and the code is available under the MIT license at https://github.com/Incpink-Liu/DNA-storage-R_plus
