Modeling peptide fragmentation with dynamic Bayesian networks for peptide identification

Bibliographic Details
Published in	Bioinformatics Vol. 24; no. 13; pp. i348 - i356
Main Authors	Klammer, Aaron A.; Reynolds, Sheila M.; Bilmes, Jeff A.; MacCoss, Michael J.; Noble, William Stafford
Format	Journal Article
Language	English
Published	England: Oxford University Press, 01.07.2008; Oxford Publishing Limited (England)
Subjects	Algorithms; Amino Acid Sequence; Artificial Intelligence; Bayes Theorem; Mass spectrometry; Mass Spectrometry - methods; Molecular Sequence Data; Pattern Recognition, Automated - methods; Peptide Mapping - methods; Peptides; Sequence Analysis, Protein - methods
Online Access	Get full text
ISSN	1367-4803 1367-4811 1460-2059
DOI	10.1093/bioinformatics/btn189

Summary:	Motivation: Tandem mass spectrometry (MS/MS) is an indispensable technology for identification of proteins from complex mixtures. Proteins are digested to peptides that are then identified by their fragmentation patterns in the mass spectrometer. Thus, at its core, MS/MS protein identification relies on the relative predictability of peptide fragmentation. Unfortunately, peptide fragmentation is complex and not fully understood, and what is understood is not always exploited by peptide identification algorithms. Results: We use a hybrid dynamic Bayesian network (DBN)/support vector machine (SVM) approach to address these two problems. We train a set of DBNs on high-confidence peptide-spectrum matches. These DBNs, known collectively as Riptide, comprise a probabilistic model of peptide fragmentation chemistry. Examination of the distributions learned by Riptide allows identification of new trends, such as prevalent a-ion fragmentation at peptide cleavage sites C-terminal to hydrophobic residues. In addition, Riptide can be used to produce likelihood scores that indicate whether a given peptide-spectrum match is correct. A vector of such scores is evaluated by an SVM, which produces a final score to be used in peptide identification. Using Riptide in this way yields improved discrimination when compared to other state-of-the-art MS/MS identification algorithms, increasing the number of positive identifications by as much as 12% at a 1% false discovery rate. Availability: Python and C source code are available upon request from the authors. The curated training sets are available at http://noble.gs.washington.edu/proj/intense/. The Graphical Model Tool Kit (GMTK) is freely available at http://ssli.ee.washington.edu/bilmes/gmtk. Contact: noble@gs.washington.edu
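The summary describes a two-stage scoring scheme: a vector of likelihood scores per peptide-spectrum match is reduced to one final score, and identifications are then counted at a 1% false discovery rate. The sketch below illustrates that flow only in outline and is not the authors' implementation. It assumes a linear decision function with already-trained weights and a target-decoy estimate of the false discovery rate (decoys divided by targets at or above a score threshold); neither assumption is stated in the record. The names `linear_svm_score` and `count_accepted_at_fdr` are hypothetical.

```python
from typing import List, Sequence, Tuple


def linear_svm_score(features: Sequence[float],
                     weights: Sequence[float],
                     bias: float) -> float:
    """Decision value of a linear SVM, w . x + b.

    Assumption: `weights` and `bias` come from a previously trained model;
    `features` stands in for the vector of likelihood scores for one match.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features)) + bias


def count_accepted_at_fdr(scored: List[Tuple[float, bool]],
                          fdr: float = 0.01) -> int:
    """Count target matches accepted at the given estimated FDR.

    `scored` holds (final_score, is_decoy) pairs. Assumption: the FDR at a
    rank is estimated as decoys / targets among all matches ranked at or
    above it. Tied scores are not given special treatment.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    targets = 0
    decoys = 0
    best = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the largest target count whose estimated FDR is acceptable.
        if targets > 0 and decoys / targets <= fdr:
            best = targets
    return best
```

Comparing the value returned by `count_accepted_at_fdr` for two scoring methods on the same set of matches is one way to express a relative gain in identifications at a fixed false discovery rate.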