Computing exact P-values for DNA motifs

Bibliographic Details
Published in Bioinformatics Vol. 23; no. 5; pp. 531-537
Main Authors Zhang, Jing; Jiang, Bo; Li, Ming; Tromp, John; Zhang, Xuegong; Zhang, Michael Q.
Format Journal Article
Language English
Published Oxford Oxford University Press 01.03.2007
Oxford Publishing Limited (England)
ISSN 1367-4803, 1367-4811, 1460-2059
DOI 10.1093/bioinformatics/btl662

Summary:
Motivation: Many heuristic algorithms have been designed to approximate P-values of DNA motifs described by position weight matrices, for evaluating their statistical significance. They often significantly deviate from the true P-value by orders of magnitude. Exact P-value computation is needed for ranking the motifs. Furthermore, surprisingly, the complexity of the problem is unknown.
Results: We show the problem to be NP-hard, and present MotifRank, software based on dynamic programming, to calculate exact P-values of motifs. We define the exact P-value on a general and more precise model. Asymptotically, MotifRank is faster than the best exact P-value computing algorithm, and is in fact practical. Our experiments clearly demonstrate that MotifRank significantly improves the accuracy of existing approximation algorithms.
Availability: MotifRank is available from http://bio.dlg.cn
Contact: mzhang@cshl.edu, mli@uwaterloo.ca
Supplementary information: Supplementary data are available at Bioinformatics online.
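As a rough illustration of how dynamic programming can produce an exact tail probability for a position weight matrix, the sketch below computes the full score distribution of a random sequence one column at a time. It is not MotifRank and does not use the more general P-value model the summary refers to; it is the textbook score-distribution recurrence, assuming integer scores and an i.i.d. background. The function name `pwm_score_pvalue`, the toy matrix, and the uniform background are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def pwm_score_pvalue(pwm, background, threshold):
    """Exact P(score >= threshold) for a random sequence of motif length.

    Assumptions (not taken from the article): scores are integers and
    bases are drawn independently from `background`.

    pwm:        list of dicts, one per motif position, mapping base -> int score
    background: dict mapping base -> probability (should sum to 1)
    """
    # dist[s] = probability that the columns processed so far sum to exactly s
    dist = {0: 1.0}
    for column in pwm:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for base, p_base in background.items():
                # Extend every partial score by every possible base
                nxt[score + column[base]] += prob * p_base
        dist = dict(nxt)
    # Tail mass at or above the threshold is the P-value of that score
    return sum(p for s, p in dist.items() if s >= threshold)


# Hypothetical 2-column matrix: column 1 favours A, column 2 favours C
toy_pwm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

p_value = pwm_score_pvalue(toy_pwm, uniform, threshold=4)
print(p_value)
```

In this toy case only the sequence AC reaches a score of 4, so the tail probability is 1/16. The work per column is proportional to the number of distinct partial scores times the alphabet size, which is why this kind of recurrence depends on the score range rather than on enumerating all sequences.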