APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data

Bibliographic Details
Published in	BMC bioinformatics Vol. 23; no. Suppl 3; pp. 396 - 14
Main Authors	Fahmi, Naima Ahmed, Ahmed, Khandakar Tanvir, Chang, Jae-Woong, Nassereddeen, Heba, Fan, Deliang, Yong, Jeongsik, Zhang, Wei
Format	Journal Article
Language	English
Published	London BioMed Central 28.09.2022 BioMed Central Ltd Springer Nature B.V BMC
Subjects	3' Untranslated regions; 3' Untranslated Regions - genetics; 3′-End-seq; Algorithms; Alternative polyadenylation; Analysis; Animal experimentation; Animals; Annotations; Binding; Binding sites; Bioinformatics; Biomedical and Life Sciences; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Computer applications; Computer graphics; Datasets; Embryo fibroblasts; Experiments; Fibroblasts - metabolism; Gene expression; Genomes; Graphical representations; Isoforms; Life Sciences; Mathematical analysis; Messenger RNA; Methods; Mice; Microarrays; MicroRNAs; MicroRNAs - metabolism; miRNA; mRNA processing; Performance evaluation; Pipelines; Polyadenylation; Post-transcription; Protein Isoforms - genetics; Ribonucleic acid; RNA; RNA Precursors - metabolism; RNA sequencing; RNA, Messenger - genetics; RNA, Messenger - metabolism; RNA-binding protein; RNA-Seq; Simulation; Source code; Transcriptome; Transcriptomes; United States
ISSN	1471-2105
DOI	10.1186/s12859-022-04939-w

Summary:	Background The eukaryotic genome can produce multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3′-untranslated region (3′-UTR) of mRNA produces transcripts with shorter or longer 3′-UTRs. The 3′-UTR often serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3′-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3′-UTR APA events because of incomplete annotations and low analytical resolution: widely available pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3′-UTR APA using only RNA-seq read coverage, which causes false-positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3′-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations. Methods APA-Scan uses either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3′-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculate the read coverage of the 3′-UTR regions of genes; (ii) identify the potential APA sites and evaluate the significance of the events between two biological conditions; (iii) generate a graphical representation of a user-specified event with 3′-UTR annotation and read coverage on the 3′-UTR regions. APA-Scan is implemented in Python3. Source code and a comprehensive user's manual are freely available at https://github.com/compbiolabucf/APA-Scan . Results APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3′-UTR APA identification compared with both baselines. The performance of APA-Scan was also validated with 3′-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3′-UTR APA events and improve genome annotation. Conclusion APA-Scan is a comprehensive computational pipeline for detecting transcriptome-wide 3′-UTR APA events. The pipeline integrates both RNA-seq and 3′-end-seq data and can efficiently identify significant events, presenting them with high-resolution short-read coverage plots.
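
Illustration: steps (i) and (ii) of the Methods can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not APA-Scan's actual code: the function names, the toy coverage arrays, the use of mean per-base depth as the quantity for each 3′-UTR segment, and the choice of a Pearson chi-squared test on a 2x2 table are all hypothetical, since the summary does not say which statistic the tool uses.

```python
import math

def mean_coverage(coverage, start, end):
    """Mean per-base read depth over the half-open interval [start, end)."""
    if end <= start:
        raise ValueError("interval must be non-empty")
    return sum(coverage[start:end]) / (end - start)

def short_long_signal(coverage, utr_start, site, utr_end):
    """Split a 3'-UTR at a candidate cleavage site (hypothetical helper).

    Returns (upstream mean depth, downstream mean depth). Upstream depth
    reflects short plus long isoforms; downstream depth reflects only the
    long isoform.
    """
    upstream = mean_coverage(coverage, utr_start, site)
    downstream = mean_coverage(coverage, site, utr_end)
    return upstream, downstream

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 d.o.f., no continuity correction)
    on the table [[a, b], [c, d]]. Returns (statistic, p-value)."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence either way
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Toy data (assumed): a 100-base 3'-UTR with a candidate site at offset 60.
cov_a = [20] * 100             # condition A: uniform depth, mostly long isoform
cov_b = [20] * 60 + [5] * 40   # condition B: depth drops after the site

up_a, down_a = short_long_signal(cov_a, 0, 60, 100)
up_b, down_b = short_long_signal(cov_b, 0, 60, 100)

# Rows are conditions, columns are (upstream, downstream) depth.
stat, p_value = chi2_2x2(up_a, down_a, up_b, down_b)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
```

Treating mean depths as counts is a simplification for illustration. A real analysis would work from read counts, handle replicates, and correct for multiple testing across genes; `scipy.stats.chi2_contingency` offers a library implementation of the same test.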