Comparing Top-Down Proteoform Identification: Deconvolution, PrSM Overlap, and PTM Detection

Bibliographic Details
Published in Journal of Proteome Research Vol. 22; no. 7; pp. 2199-2217
Main Authors Tabb, David L., Jeong, Kyowon, Druart, Karen, Gant, Megan S., Brown, Kyle A., Nicora, Carrie, Zhou, Mowei, Couvillion, Sneha, Nakayasu, Ernesto, Williams, Janet E., Peterson, Haley K., McGuire, Michelle K., McGuire, Mark A., Metz, Thomas O., Chamot-Rooke, Julia
Format Journal Article
Language English
Published United States American Chemical Society 07.07.2023
American Chemical Society (ACS)
ISSN 1535-3893
1535-3907
DOI 10.1021/acs.jproteome.2c00673

Summary: Generating top-down tandem mass spectra (MS/MS) from complex mixtures of proteoforms benefits from improvements in fractionation, separation, fragmentation, and mass analysis. The algorithms to match MS/MS to sequences have undergone a parallel evolution, with both spectral alignment and match-counting approaches producing high-quality proteoform-spectrum matches (PrSMs). This study assesses state-of-the-art algorithms for top-down identification (ProSight PD, TopPIC, MSPathFinderT, and pTop) in their yield of PrSMs while controlling false discovery rate. We evaluated deconvolution engines (ThermoFisher Xtract, Bruker AutoMSn, Matrix Science Mascot Distiller, TopFD, and FLASHDeconv) in both ThermoFisher Orbitrap-class and Bruker maXis Q-TOF data (PXD033208) to produce consistent precursor charges and mass determinations. Finally, we sought post-translational modifications (PTMs) in proteoforms from bovine milk (PXD031744) and human ovarian tissue. Contemporary identification workflows produce excellent PrSM yields, although approximately half of all identified proteoforms from these four pipelines were specific to only one workflow. Deconvolution algorithms disagree on precursor masses and charges, contributing to identification variability. Detection of PTMs is inconsistent among algorithms. In bovine milk, 18% of PrSMs produced by pTop and TopMG were singly phosphorylated, but this percentage fell to 1% for one algorithm. Applying multiple search engines produces more comprehensive assessments of experiments. Top-down algorithms would benefit from greater interoperability.
Bibliography:
USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF)
European Union (EU)
PNNL-SA-178802
National Institutes of Health (NIH)
AC05-76RL01830; 829157; 1R01HD092297-01A1
USDOE Office of Science (SC), Biological and Environmental Research (BER)