A systematic benchmark of Nanopore long-read RNA sequencing for transcript-level analysis in human cell lines

Bibliographic Details
Published in Nature methods Vol. 22; no. 4; pp. 801-812
Main Authors Chen, Ying; Davidson, Nadia M.; Wan, Yuk Kei; Yao, Fei; Su, Yan; Gamaarachchi, Hasindu; Sim, Andre; Patel, Harshil; Low, Hwee Meng; Hendra, Christopher; Wratten, Laura; Hakkaart, Christopher; Sawyer, Chelsea; Iakovleva, Viktoriia; Lee, Puay Leng; Xin, Lixia; Ng, Hui En Vanessa; Loo, Jia Min; Ong, Xuewen; Ng, Hui Qi Amanda; Wang, Jiaxu; Koh, Wei Qian Casslynn; Poon, Suk Yeah Polly; Stanojevic, Dominik; Tran, Hoang-Dai; Lim, Kok Hao Edwin; Toh, Shen Yon; Ewels, Philip Andrew; Ng, Huck-Hui; Iyer, N. Gopalakrishna; Thiery, Alexandre; Chng, Wee Joo; Chen, Leilei; DasGupta, Ramanuj; Sikic, Mile; Chan, Yun-Shen; Tan, Boon Ooi Patrick; Wan, Yue; Tam, Wai Leong; Yu, Qiang; Khor, Chiea Chuan; Wüstefeld, Torsten; Lezhava, Alexander; Pratanwanich, Ploy N.; Love, Michael I.; Goh, Wee Siong Sho; Ng, Sarah B.; Oshlack, Alicia; Göke, Jonathan
Format Journal Article
Language English
Published New York Nature Publishing Group US 01.04.2025
Nature Publishing Group
ISSN 1548-7091
1548-7105
DOI 10.1038/s41592-025-02623-4

Summary: The human genome contains instructions to transcribe more than 200,000 RNAs. However, many RNA transcripts are generated from the same gene, resulting in alternative isoforms that are highly similar and that remain difficult to quantify. To evaluate the ability to study RNA transcript expression, we profiled seven human cell lines with five different RNA-sequencing protocols, including short-read cDNA, Nanopore long-read direct RNA, amplification-free direct cDNA and PCR-amplified cDNA sequencing, and PacBio IsoSeq, with multiple spike-in controls, and additional transcriptome-wide N6-methyladenosine profiling data. We describe differences in read length, coverage, throughput and transcript expression, reporting that long-read RNA sequencing more robustly identifies major isoforms. We illustrate the value of the SG-NEx data to identify alternative isoforms, novel transcripts, fusion transcripts and N6-methyladenosine RNA modifications. Together, the SG-NEx data provide a comprehensive resource enabling the development and benchmarking of computational methods for profiling complex transcriptional events at isoform-level resolution. This analysis provides a collection of sequencing datasets generated from long-read and short-read RNA sequencing, serving as a valuable resource for transcriptome profiling.