Strategies for Metagenomic-Guided Whole-Community Proteomics of Complex Microbial Environments

Bibliographic Details
Published in	PloS one Vol. 6; no. 11; p. e27173
Main Authors	Cantarel, Brandi L., Erickson, Alison R., VerBerkmoes, Nathan C., Erickson, Brian K., Carey, Patricia A., Pan, Chongle, Shah, Manesh, Mongodin, Emmanuel F., Jansson, Janet K., Fraser-Liggett, Claire M., Hettich, Robert L.
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 23.11.2011
Subjects	Accuracy; Algorithms; Amino Acid Sequence; Bacteria - metabolism; BASIC BIOLOGICAL SCIENCES; Biology; Bioremediation; Chromatography; Computer applications; database and informatics methods; database searching; Databases, Protein; Datasets; Deoxyribonucleic acid; Dictionaries; DNA; DNA sequencing; Ecosystem; Enzymes; Female; gene prediction; Gene sequencing; Genomes; Genomics; Geobacter; Humans; Laboratories; Mass spectrometry; Mass spectroscopy; Metabolism; metagenomics; Metagenomics - methods; Microorganisms; Nucleotide sequence; Open reading frames; Peptides; Peptides - metabolism; Physics; Physiology; protein sequencing; Proteins; proteomic databases; proteomic sequencing; Proteomics; Proteomics - methods; Residence Characteristics; Scientific imaging; Sequence Analysis, Protein; sequence databases; Sequence Homology, Amino Acid; Spectroscopy; Studies; United States > US Maryland Tennessee Baltimore Maryland
ISSN	1932-6203
DOI	10.1371/journal.pone.0027173

Summary:	Accurate protein identification in large-scale proteomics experiments relies upon a detailed, accurate protein catalogue, which is derived from predictions of open reading frames based on genome sequence data. Integration of mass spectrometry-based proteomics data with computational proteome predictions from environmental metagenomic sequences has been challenging because of the variable overlap between proteomic datasets and corresponding short-read nucleotide sequence data. In this study, we have benchmarked several strategies for increasing microbial peptide spectral matching in metaproteomic datasets using protein predictions generated from matched metagenomic sequences from the same human fecal samples. Additionally, we investigated the impact of mass spectrometry-based filters (high mass accuracy, delta correlation), and de novo peptide sequencing on the number and robustness of peptide-spectrum assignments in these complex datasets. In summary, we find that high mass accuracy peptide measurements searched against non-assembled reads from DNA sequencing of the same samples significantly increased identifiable proteins without sacrificing accuracy.
Bibliography:	AC05-00OR22725; UH2DK83991; National Institutes of Health (NIH); Crohn's and Colitis Foundation of America; USDOE Office of Science (SC), Biological and Environmental Research (BER). Wrote the paper: BLC ARE EFM JKJ CMF-L RLH. Designed the overall approach and integration plan for metagenomics-metaproteomics: BLC ARE NCV EFM CMF-L RLH. Developed and tested the integrated approach and did the majority of data analysis: BLC ARE. Developed and performed the spectral analysis database comparisons: BKE ARE. Designed and performed the de novo peptide sequencing data and comparisons: ARE PAC NCV CP. Performed all protein sequence database searches: MS. Provided the samples: JKJ.