KnotSeeker: Heuristic pseudoknot detection in long RNA sequences

Bibliographic Details
Published in: RNA (Cambridge), Vol. 14, no. 4, pp. 630-640
Main Authors: Sperschneider, Jana; Datta, Amitava
Format: Journal Article
Language: English
Published: United States, Cold Spring Harbor Laboratory Press, 01.04.2008
ISSN: 1355-8382, 1469-9001
DOI: 10.1261/rna.968808

Summary: Pseudoknots are folded structures in RNA molecules that perform essential functions as part of cellular transcription machinery and regulatory processes. Predicting these structures has important implications for antiviral drug design. Pseudoknot prediction has been shown to be an NP-complete problem. Practical structure prediction algorithms based on free energy minimization therefore restrict the class of structures considered and use dynamic programming. However, these algorithms are computationally very expensive, and their accuracy deteriorates if the input sequence containing the pseudoknot is too long. Heuristic methods can be more efficient, but do not guarantee an optimal solution with regard to the minimum free energy model. We present KnotSeeker, a new heuristic algorithm for the detection of pseudoknots in RNA sequences as a preliminary step for structure prediction. Our method uses a hybrid sequence matching and free energy minimization approach to screen the primary sequence. We select short sequence fragments as candidates that may contain pseudoknots and verify them using an existing dynamic programming algorithm and a minimum weight independent set calculation. KnotSeeker is significantly more accurate in detecting pseudoknots than other common methods reported in the literature. It is very efficient and therefore a practical tool, especially for long sequences. The algorithm has been implemented in Python and also uses C/C++ code from several other known techniques. The code is available from http://www.csse.uwa.edu.au/~datta/pseudoknot.
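
The minimum weight independent set step mentioned in the summary can be illustrated with a small sketch. This is not KnotSeeker's code, and the summary does not say how the underlying graph is built. The sketch assumes that each candidate fragment is an interval on the sequence with a free energy as its weight, that two candidates conflict when their intervals overlap, and that the goal is the conflict-free subset with the lowest total energy. Under those assumptions the problem reduces to weighted interval scheduling, which a short dynamic program solves exactly. The function name, the tuple layout and the energy values are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical candidate layout: (start, end, energy), with inclusive
# sequence positions and a free energy in kcal/mol (lower is better).
Candidate = Tuple[int, int, float]


def min_weight_independent_set(
    candidates: List[Candidate],
) -> Tuple[float, List[Candidate]]:
    """Return the non-overlapping subset of candidates with minimum total energy.

    Assumes candidates conflict exactly when their intervals overlap, so the
    conflict graph is an interval graph and a dynamic program over candidates
    sorted by end position is sufficient.
    """
    items = sorted(candidates, key=lambda c: c[1])
    ends = [c[1] for c in items]
    # best[i] holds (total energy, chosen candidates) for the first i items.
    best: List[Tuple[float, List[Candidate]]] = [(0.0, [])]
    for i, (start, _end, energy) in enumerate(items):
        # j = number of earlier items that end strictly before this one starts.
        j = bisect_right(ends, start - 1, 0, i)
        take_energy = best[j][0] + energy
        if take_energy < best[i][0]:
            best.append((take_energy, best[j][1] + [items[i]]))
        else:
            best.append(best[i])
    return best[-1]


if __name__ == "__main__":
    # Made-up candidates for illustration only.
    fragments = [(0, 30, -8.0), (25, 60, -12.5), (61, 90, -6.0), (55, 95, -7.0)]
    total, chosen = min_weight_independent_set(fragments)
    print(total, chosen)
```

Because the empty selection has energy zero, a candidate with non-negative energy is never chosen in this sketch. A general conflict graph (rather than intervals) would need a different, typically more expensive, algorithm.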
Reprint requests to: Jana Sperschneider, School of Computer Science and Software Engineering, University of Western Australia, Perth, WA 6009, Australia; e-mail: janaspe@csse.uwa.edu.au; fax: 61-8-6488-1089; or Amitava Datta, School of Computer Science and Software Engineering, University of Western Australia, Perth, WA 6009, Australia; e-mail: datta@csse.uwa.edu.au.