Achievable Rates of Nanopore-Based DNA Storage

Bibliographic Details
Published in: IEEE Journal on Selected Areas in Information Theory, Vol. 6, pp. 261-269
Main Authors: McBain, Brendon; Viterbo, Emanuele
Format: Journal Article
Language: English
Published: Piscataway, IEEE, 2025
Publisher: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2641-8770
DOI: 10.1109/JSAIT.2025.3598756

Summary: This paper studies achievable rates of nanopore-based DNA storage when nanopore signals are decoded using a tractable channel model that does not rely on a basecalling algorithm. Specifically, the noisy nanopore channel (NNC) with the Scrappie pore model generates average output levels via i.i.d. geometric sample duplications corrupted by i.i.d. Gaussian noise (NNC-Scrappie). Simplified message passing algorithms are derived for efficient soft decoding of nanopore signals using NNC-Scrappie. Previously, evaluation of this channel model was limited by the lack of DNA storage datasets that include nanopore signals. This limitation is addressed by deriving an achievable rate based on the dynamic time-warping (DTW) algorithm that can be applied to genomic sequencing datasets, subject to constraints that make the resulting rate applicable to DNA storage. Using a publicly available dataset from Oxford Nanopore Technologies (ONT), it is demonstrated that coding over multiple DNA strands of 100 bases in length and decoding with the NNC-Scrappie decoder can achieve rates of at least 0.64 to 1.18 bits per base, depending on the channel quality of the nanopore that is chosen in the sequencing device per channel-use, and 0.96 bits per base on average assuming uniformly chosen nanopores. These rates are pessimistic since they only apply to single reads and do not include calibration of the pore model to specific nanopores.
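As a rough illustration of the channel described in the summary, the sketch below simulates per-level geometric sample duplication with additive Gaussian noise and scores an alignment with a basic DTW recursion. It is not the authors' implementation: the function names, the toy 3-mer level table (which stands in for the Scrappie pore model), and the parameters `p_stay` and `sigma` are all assumptions made for illustration, and the DTW cost shown here is a plain alignment cost rather than the paper's achievable-rate derivation.

```python
import random


def toy_pore_levels(bases, k=3):
    """Map each k-mer to a made-up mean level in [0, 1].

    Hypothetical stand-in for a pore model; NOT the Scrappie table.
    """
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    levels = []
    for i in range(len(bases) - k + 1):
        value = 0
        for base in bases[i:i + k]:
            value = 4 * value + index[base]
        levels.append(value / (4 ** k - 1))
    return levels


def simulate_nnc(levels, p_stay, sigma, rng):
    """Emit each level a geometric number of times (at least once),
    adding independent Gaussian noise to every sample.

    p_stay (probability of one more duplicate) and sigma are assumed
    parameters, not values taken from the paper.
    """
    samples = []
    for level in levels:
        while True:
            samples.append(level + rng.gauss(0.0, sigma))
            if rng.random() >= p_stay:
                break
    return samples


def dtw_cost(levels, samples):
    """Squared-error DTW cost where every level covers one or more
    consecutive samples, in order. Returns inf if no such alignment exists.
    """
    inf = float("inf")
    prev = [inf] * (len(samples) + 1)
    prev[0] = 0.0  # zero levels aligned with zero samples
    for level in levels:
        cur = [inf] * (len(samples) + 1)
        for j in range(1, len(samples) + 1):
            dist = (level - samples[j - 1]) ** 2
            # Either this sample starts the level (prev) or extends it (cur).
            cur[j] = dist + min(prev[j - 1], cur[j - 1])
        prev = cur
    return prev[len(samples)]


rng = random.Random(0)
true_levels = toy_pore_levels("ACGTACGT")
noisy_signal = simulate_nnc(true_levels, p_stay=0.6, sigma=0.05, rng=rng)
cost_true = dtw_cost(true_levels, noisy_signal)
cost_wrong = dtw_cost(toy_pore_levels("TTTTTTTT"), noisy_signal)
print(len(noisy_signal), cost_true, cost_wrong)
```

Comparing the cost of the transmitted strand with that of a different candidate strand gives an intuition for decoding directly from signal samples without a basecaller, although the paper's decoder uses message passing rather than this hard-alignment cost.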