Achievable Rates of Nanopore-Based DNA Storage

Bibliographic Details
Published in IEEE Journal on Selected Areas in Information Theory, Vol. 6; pp. 261-269
Main Authors McBain, Brendon; Viterbo, Emanuele
Format Journal Article
Language English
Published Piscataway IEEE 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 2641-8770
DOI 10.1109/JSAIT.2025.3598756

Abstract This paper studies achievable rates of nanopore-based DNA storage when nanopore signals are decoded using a tractable channel model that does not rely on a basecalling algorithm. Specifically, the noisy nanopore channel (NNC) with the Scrappie pore model generates average output levels via i.i.d. geometric sample duplications corrupted by i.i.d. Gaussian noise (NNC-Scrappie). Simplified message passing algorithms are derived for efficient soft decoding of nanopore signals using NNC-Scrappie. Previously, evaluation of this channel model was limited by the lack of DNA storage datasets with nanopore signals included. This is solved by deriving an achievable rate based on the dynamic time-warping (DTW) algorithm that can be applied to genomic sequencing datasets subject to constraints that make the resulting rate applicable to DNA storage. Using a publicly available dataset from Oxford Nanopore Technologies (ONT), it is demonstrated that coding over multiple DNA strands of 100 bases in length and decoding with the NNC-Scrappie decoder can achieve rates of at least 0.64 to 1.18 bits per base, depending on the channel quality of the nanopore that is chosen in the sequencing device per channel-use, and 0.96 bits per base on average assuming uniformly chosen nanopores. These rates are pessimistic since they only apply to single reads and do not include calibration of the pore model to specific nanopores.
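Illustration (not part of the published record): the channel described in the abstract, where each average output level is repeated a geometrically distributed number of times and every sample is corrupted by Gaussian noise, can be sketched roughly as below. This is a minimal sketch under stated assumptions: the function name `simulate_nnc`, the parameters `p_stay` and `sigma`, the geometric parameterization on {1, 2, ...}, and the example level values are all hypothetical and are not taken from the paper or from the Scrappie pore model.

```python
import random


def simulate_nnc(levels, p_stay=0.5, sigma=0.1, rng=None):
    """Duplicate each level a geometric number of times, then add Gaussian noise.

    Assumptions (hypothetical, for illustration only): durations are i.i.d.
    geometric on {1, 2, ...} with continuation probability p_stay < 1, and
    noise is i.i.d. zero-mean Gaussian with standard deviation sigma.
    """
    if not 0.0 <= p_stay < 1.0:
        raise ValueError("p_stay must lie in [0, 1)")
    rng = rng or random.Random()
    samples = []
    for level in levels:
        # Draw the number of duplicated samples for this level.
        duration = 1
        while rng.random() < p_stay:
            duration += 1
        # Each duplicated sample receives its own independent noise term.
        samples.extend(level + rng.gauss(0.0, sigma) for _ in range(duration))
    return samples


# Placeholder average levels for three consecutive k-mers (not Scrappie values).
levels = [0.3, -1.1, 0.8]
signal = simulate_nnc(levels, p_stay=0.6, sigma=0.05, rng=random.Random(0))
```

The output `signal` is at least as long as `levels`, which is why a decoder for such a channel has to handle an unknown alignment between samples and levels.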
Author Viterbo, Emanuele
McBain, Brendon
Author_xml – sequence: 1
  givenname: Brendon
  orcidid: 0000-0002-1073-2948
  surname: McBain
  fullname: McBain, Brendon
  email: brendon.mcbain@monash.edu
  organization: ECSE Department, Monash University, Melbourne, VIC, Australia
– sequence: 2
  givenname: Emanuele
  orcidid: 0000-0002-5861-2873
  surname: Viterbo
  fullname: Viterbo, Emanuele
  email: emanuele.viterbo@monash.edu
  organization: ECSE Department, Monash University, Melbourne, VIC, Australia
CODEN IJSTL5
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025
Discipline Computer Science
EISSN 2641-8770
EndPage 269
ExternalDocumentID 10_1109_JSAIT_2025_3598756
11124570
Genre orig-research
ISSN 2641-8770
IngestDate Wed Oct 08 14:10:38 EDT 2025
Wed Oct 01 05:35:32 EDT 2025
Wed Sep 10 07:40:58 EDT 2025
IsPeerReviewed true
IsScholarly true
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0002-1073-2948
0000-0002-5861-2873
PQID 3247482941
PQPubID 5075791
PageCount 9
PublicationCentury 2000
PublicationDate 20250000
2025-00-00
20250101
PublicationDateYYYYMMDD 2025-01-01
PublicationDate_xml – year: 2025
  text: 20250000
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE journal on selected areas in information theory
PublicationTitleAbbrev JSAIT
PublicationYear 2025
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 261
SubjectTerms Accuracy
Algorithms
Backtracking
Channel models
Datasets
Decoding
DNA
Encoding
information rates
mathematical models
Message passing
nanobioscience
nanopores
Noise
Noise measurement
Random noise
Sequential analysis
Vectors
Title Achievable Rates of Nanopore-Based DNA Storage
URI https://ieeexplore.ieee.org/document/11124570
https://www.proquest.com/docview/3247482941
Volume 6