GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments

Bibliographic Details
Published in: Bioinformatics, Vol. 27, no. 2, pp. 270-271
Main Authors: van Heeringen, Simon J.; Veenstra, Gert Jan C.
Format: Journal Article
Language: English
Published: Oxford, Oxford University Press, 15.01.2011
ISSN: 1367-4803, 1367-4811, 1460-2059
DOI: 10.1093/bioinformatics/btq636

Summary: Accurate prediction of transcription factor binding motifs that are enriched in a collection of sequences remains a computational challenge. Here we report on GimmeMotifs, a pipeline that incorporates an ensemble of computational tools to predict motifs de novo from ChIP-sequencing (ChIP-seq) data. Similar redundant motifs are compared using the weighted information content (WIC) similarity score and clustered using an iterative procedure. A comprehensive output report is generated with several different evaluation metrics to compare and evaluate the results. Benchmarks show that the method performs well on human and mouse ChIP-seq datasets. GimmeMotifs consists of a suite of command-line scripts that can be easily implemented in a ChIP-seq analysis pipeline.
Availability: GimmeMotifs is implemented in Python and runs on Linux. The source code is freely available for download at http://www.ncmls.eu/bioinfo/gimmemotifs/.
Contact: s.vanheeringen@ncmls.ru.nl
Supplementary Information: Supplementary data are available at Bioinformatics online.
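Illustration: the summary states that redundant motifs are compared with a similarity score and clustered iteratively, but this record does not give the WIC formula or the clustering procedure. The Python sketch below only illustrates the general idea and is not the GimmeMotifs implementation. It assumes equal-width motifs stored as position frequency matrices, swaps in a placeholder similarity (1 minus the mean total-variation distance per position) where WIC would go, and uses a greedy merge loop with a hypothetical `threshold` parameter. The names `stand_in_similarity` and `cluster_motifs` are invented for this example.

```python
import itertools

import numpy as np


def stand_in_similarity(a, b):
    """Placeholder similarity in [0, 1]; NOT the WIC score from the paper.

    Computes 1 minus the mean total-variation distance between
    corresponding positions of two equal-width frequency matrices.
    """
    return 1.0 - 0.5 * np.abs(a - b).sum(axis=1).mean()


def cluster_motifs(motifs, threshold=0.8):
    """Greedily merge the most similar pair of clusters until none reach threshold.

    motifs: dict mapping name -> (width x 4) per-position base frequencies.
    All motifs are assumed to share one width; real motifs would first
    need to be aligned. Returns a list of (member_names, averaged_matrix).
    """
    clusters = [([name], np.asarray(m, dtype=float)) for name, m in motifs.items()]
    while len(clusters) > 1:
        # Score every pair of cluster representatives and keep the best one.
        (i, j), best = max(
            (
                ((i, j), stand_in_similarity(clusters[i][1], clusters[j][1]))
                for i, j in itertools.combinations(range(len(clusters)), 2)
            ),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break  # no remaining pair is similar enough to merge
        names_i, mat_i = clusters[i]
        names_j, mat_j = clusters[j]
        # Weight by member count so each original motif contributes equally.
        total = len(names_i) + len(names_j)
        merged = (mat_i * len(names_i) + mat_j * len(names_j)) / total
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((names_i + names_j, merged))
    return clusters


# Toy input: m1 and m2 are near-duplicates, m3 is a different motif.
example_motifs = {
    "m1": [[0.90, 0.05, 0.03, 0.02], [0.1, 0.1, 0.7, 0.1]],
    "m2": [[0.85, 0.05, 0.05, 0.05], [0.1, 0.1, 0.7, 0.1]],
    "m3": [[0.05, 0.05, 0.05, 0.85], [0.7, 0.1, 0.1, 0.1]],
}
example_clusters = cluster_motifs(example_motifs)
```

With the toy input, the two near-duplicate motifs end up in one cluster and the third stays on its own. Merging into an averaged matrix after each step is one design choice among several; comparing all members pairwise would be an equally valid reading of "iterative".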
Associate Editor: Alfonso Valencia