Global analysis of repetitive DNA from unassembled sequence reads using RepeatExplorer2

Bibliographic Details
Published in Nature protocols Vol. 15; no. 11; pp. 3745 - 3776
Main Authors Novák, Petr, Neumann, Pavel, Macas, Jiří
Format Journal Article
Language English
Published London Nature Publishing Group UK 01.11.2020
Nature Publishing Group
ISSN 1754-2189
1750-2799
DOI 10.1038/s41596-020-0400-y

Summary: RepeatExplorer2 is a novel version of a computational pipeline that uses graph-based clustering of next-generation sequencing reads for characterization of repetitive DNA in eukaryotes. The clustering algorithm facilitates repeat identification in any genome by using relatively small quantities of short sequence reads, and additional tools within the pipeline perform automatic annotation and quantification of the identified repeats. The pipeline is integrated into the Galaxy platform, which provides a user-friendly web interface for script execution and documentation of the results. Compared to the original version of the pipeline, RepeatExplorer2 provides automated annotation of transposable elements, identification of tandem repeats and enhanced visualization of analysis results. Here, we present an overview of the RepeatExplorer2 workflow and provide procedures for its application to (i) de novo repeat identification in a single species, (ii) comparative repeat analysis in a set of species, (iii) development of satellite DNA probes for cytogenetic experiments and (iv) identification of centromeric repeats based on ChIP-seq data. Each procedure takes approximately 2 d to complete. RepeatExplorer2 is available at https://repeatexplorer-elixir.cerit-sc.cz. RepeatExplorer is a software tool for repeat identification and quantification using unassembled sequencing reads. The authors describe four pipelines implemented on the Galaxy platform, highlighting different applications.