EXFI: Exon and splice graph prediction without a reference genome

Bibliographic Details
Published in Ecology and evolution Vol. 10; no. 16; pp. 8880-8893
Main Authors Langa, Jorge, Estonba, Andone, Conklin, Darrell
Format Journal Article
Language English
Published England John Wiley & Sons, Inc 01.08.2020
ISSN 2045-7758
DOI 10.1002/ece3.6587

Summary: For population genetic studies in nonmodel organisms, it is important to use every available source of genomic information. This paper presents EXFI, a Python pipeline that predicts the splice graph and exon sequences from an assembled transcriptome and raw whole‐genome sequencing (WGS) reads. The main algorithm uses Bloom filters to remove reads that are not part of the transcriptome and to predict the intron–exon boundaries; it then calls exons from the assembly and generates the underlying splice graph. The results are returned in GFA1 format, which encodes both the predicted exon sequences and how they are connected to form transcripts. EXFI is written in Python, tested on Linux platforms, and the source code is available under the MIT License at https://github.com/jlanga/exfi. EXFI predicts the splice graph from a transcriptome and selected WGS reads. Transcripts are split into exons given the information present in the WGS experiment. Predictions are suitable for downstream bioinformatic analyses and new experimental designs.
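
The idea in the summary can be illustrated with a toy sketch. This is not EXFI's code or API: the `BloomFilter` class, the `call_exons` and `to_gfa1` helpers, the sequences, and the choice of k = 7 are all assumptions made for illustration. The sketch relies on one observation: k-mers of a transcript that span an exon-exon junction do not occur in the genome, so they should be missing from a Bloom filter built from WGS reads. Runs of k-mers that are present are reported as exons and written out as GFA1 segments, links, and a path. The read-filtering step is omitted.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (hypothetical helper): no false negatives, rare false positives."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def call_exons(transcript, bloom, k):
    """Return half-open (start, end) intervals covered by runs of present k-mers."""
    exons, run_start = [], None
    for i in range(len(transcript) - k + 1):
        present = transcript[i:i + k] in bloom
        if present and run_start is None:
            run_start = i
        elif not present and run_start is not None:
            # The last present k-mer started at i - 1 and covers bases up to i - 1 + k.
            exons.append((run_start, i - 1 + k))
            run_start = None
    if run_start is not None:
        exons.append((run_start, len(transcript)))
    return exons


def to_gfa1(name, transcript, exons):
    """Encode exons as GFA1 segments (S), consecutive links (L), and one path (P)."""
    ids = [f"{name}:{start}-{end}" for start, end in exons]
    lines = ["H\tVN:Z:1.0"]
    lines += [f"S\t{i}\t{transcript[s:e]}" for i, (s, e) in zip(ids, exons)]
    lines += [f"L\t{a}\t+\t{b}\t+\t0M" for a, b in zip(ids, ids[1:])]
    lines.append(f"P\t{name}\t" + ",".join(i + "+" for i in ids) + "\t*")
    return "\n".join(lines)


# Made-up toy data: two exons separated by an intron in the "genome".
K = 7
exon1, intron, exon2 = "ACGTTGCAAGGCTTAC", "GTAAGTTTTTCTCTCAG", "TGGACCGATTAGCCATA"
genome = exon1 + intron + exon2
transcript = exon1 + exon2  # the spliced transcript lacks the intron

# Stand-in for WGS reads: every k-mer of the genome goes into the filter.
wgs_kmers = BloomFilter()
for i in range(len(genome) - K + 1):
    wgs_kmers.add(genome[i:i + K])

exons = call_exons(transcript, wgs_kmers, K)
gfa = to_gfa1("tr1", transcript, exons)
print(gfa)
```

With this toy input the transcript should be split at the exon1/exon2 junction. On real data, sequencing errors, uneven coverage, and Bloom filter false positives all blur the boundaries, so a practical tool needs extra handling that this sketch leaves out.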