Three-stage prediction of protein β-sheets by neural networks, alignments and graph algorithms


Bibliographic Details
Published in Bioinformatics Vol. 21; no. suppl-1; pp. i75-i84
Main Authors Cheng, Jianlin; Baldi, Pierre
Format Journal Article
Language English
Published England Oxford University Press 01.06.2005
ISSN 1367-4803, 1460-2059
DOI 10.1093/bioinformatics/bti1004

Summary:
Motivation: Protein β-sheets play a fundamental role in protein structure, function, evolution and bioengineering. Accurate prediction and assembly of protein β-sheets, however, remains challenging because protein β-sheets require formation of hydrogen bonds between linearly distant residues. Previous approaches for predicting β-sheet topological features, such as β-strand alignments, in general have not exploited the global covariation and constraints characteristic of β-sheet architectures.
Results: We propose a modular approach to the problem of predicting/assembling protein β-sheets in a chain by integrating both local and global constraints in three steps. The first step uses recursive neural networks to predict pairing probabilities for all pairs of interstrand β-residues from profile, secondary structure and solvent accessibility information. The second step applies dynamic programming techniques to these probabilities to derive binding pseudoenergies and optimal alignments between all pairs of β-strands. Finally, the third step uses graph matching algorithms to predict the β-sheet architecture of the protein by optimizing the global pseudoenergy while enforcing strong global β-strand pairing constraints. The approach is evaluated using cross-validation methods on a large non-homologous dataset and yields significant improvements over previous methods.
Availability: http://www.igb.uci.edu/servers/psss.html
Contact: pfbaldi@ics.uci.edu
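Illustration: the second and third steps described in the summary can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a residue-by-residue pairing-probability matrix `P` is already available (the output of the first step), uses a generic local-alignment recurrence with a hypothetical `gap` penalty to score each strand pair, and replaces the paper's graph matching with a simple greedy selection. The limit of two partners per strand (`max_partners`) is an assumption made for the example, and all function names are hypothetical.

```python
from itertools import combinations


def align_score(P, strand_a, strand_b, gap=0.5):
    """Best local-alignment score of two strands (lists of residue indices).

    The substitution score for aligning residue a with residue b is the
    pairing probability P[a][b]; `gap` is a hypothetical penalty.
    """
    n, m = len(strand_a), len(strand_b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = P[strand_a[i - 1]][strand_b[j - 1]]
            H[i][j] = max(
                0.0,                  # start a new local alignment
                H[i - 1][j - 1] + s,  # pair residue i with residue j
                H[i - 1][j] - gap,    # skip a residue in strand_a
                H[i][j - 1] - gap,    # skip a residue in strand_b
            )
            best = max(best, H[i][j])
    return best


def pseudoenergy(P, strand_a, strand_b, gap=0.5):
    """Score both orientations and return (score, orientation)."""
    parallel = align_score(P, strand_a, strand_b, gap)
    antiparallel = align_score(P, strand_a, strand_b[::-1], gap)
    if antiparallel >= parallel:
        return antiparallel, "antiparallel"
    return parallel, "parallel"


def greedy_sheet(P, strands, max_partners=2, min_score=0.0):
    """Greedy stand-in for graph matching: accept strand pairs in order of
    decreasing score while no strand exceeds max_partners partners."""
    scored = []
    for a, b in combinations(range(len(strands)), 2):
        score, orientation = pseudoenergy(P, strands[a], strands[b])
        scored.append((score, a, b, orientation))
    scored.sort(reverse=True)

    partners = [0] * len(strands)
    chosen = []
    for score, a, b, orientation in scored:
        if score <= min_score:
            break  # remaining pairs have no supporting evidence
        if partners[a] < max_partners and partners[b] < max_partners:
            chosen.append((a, b, orientation, score))
            partners[a] += 1
            partners[b] += 1
    return chosen
```

Calling `greedy_sheet(P, strands)` with `strands` given as lists of residue indices returns the accepted strand pairs with their orientation and score. A greedy pass is easy to follow but can miss the globally best assignment, which is why the summary describes optimizing a global pseudoenergy instead.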