epiTCR: a highly sensitive predictor for TCR–peptide binding


Bibliographic Details
Published in	Bioinformatics (Oxford, England) Vol. 39; no. 5
Main Authors	Pham, My-Diem Nguyen; Nguyen, Thanh-Nhan; Tran, Le Son; Nguyen, Que-Tran Bui; Nguyen, Thien-Phuc Hoang; Pham, Thi Mong Quynh; Nguyen, Hoai-Nghia; Giang, Hoa; Phan, Minh-Duy; Nguyen, Vy
Format	Journal Article
Language	English
Published	England: Oxford University Press, 04.05.2023
Subjects	Antigens - chemistry; Epitopes - chemistry; Humans; Original Paper; Peptides - metabolism; Receptors, Antigen, T-Cell - chemistry; Systems Biology; T-Lymphocytes - metabolism
ISSN	1367-4811, 1367-4803
DOI	10.1093/bioinformatics/btad284

Summary:	Motivation: Predicting the binding between a T-cell receptor (TCR) and a peptide presented by a human leucocyte antigen molecule is a highly challenging task and a key bottleneck in the development of immunotherapy. Existing prediction tools, despite performing well on the datasets they were built with, suffer from low true positive rates when used to predict epitopes capable of eliciting T-cell responses in patients. An improved tool for TCR–peptide binding prediction, built upon a large dataset combining existing publicly available data, is therefore still needed.

Results: We collected data from five public databases (IEDB, TBAdb, VDJdb, McPAS-TCR, and 10X) to form a dataset of >3 million TCR–peptide pairs, 3.27% of which were binding interactions. We proposed epiTCR, a Random Forest-based method dedicated to predicting TCR–peptide interactions. epiTCR takes simple input, TCR CDR3β sequences and antigen sequences, both encoded by flattened BLOSUM62. epiTCR achieved an area under the curve of 0.98 and higher sensitivity (0.94) than other existing tools (NetTCR, Imrex, ATM-TCR, and pMTnet), while maintaining comparable prediction specificity (0.9). We identified seven epitopes that accounted for 98.67% of the false positives predicted by epiTCR and that had similar effects on other tools. We also demonstrated a considerable influence of peptide sequences on prediction, highlighting the need for more diverse peptides in a more balanced dataset. In conclusion, epiTCR is among the best-performing tools, thanks to its use of combined data from public sources, and it will contribute to the effort to identify neoantigens for precision cancer immunotherapy.

Availability and implementation: epiTCR is available on GitHub (https://github.com/ddiem-ri-4D/epiTCR).
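The summary states that CDR3β and antigen sequences are "encoded by flattened BLOSUM62". A minimal sketch of what such an encoding could look like is given below. It is an illustration under stated assumptions, not the epiTCR implementation: the function names, the right-hand zero padding, the maximum lengths (19 and 11), and the example sequences are all hypothetical choices that do not come from this record. The BLOSUM62 values are the standard published matrix, transcribed here so the sketch needs no third-party packages.

```python
# Standard 20 amino acids, in the row/column order used by the table below.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Standard BLOSUM62 scores, one row per residue in AMINO_ACIDS order.
_BLOSUM62_ROWS = """
 4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0
-1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3
-2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3
-2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3
 0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1
-1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2
-1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2
 0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3
-2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3
-1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3
-1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1
-1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2
-1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1
-2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1
-1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2
 1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2
 0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0
-3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3
-2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1
 0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4
"""

# Map each residue to its 20-value BLOSUM62 row.
BLOSUM62 = {
    aa: [int(score) for score in row.split()]
    for aa, row in zip(AMINO_ACIDS, _BLOSUM62_ROWS.strip().splitlines())
}


def encode_sequence(seq, max_len):
    """Flatten a sequence into max_len * 20 BLOSUM62 scores.

    Assumption: shorter sequences are zero-padded on the right.
    Residues outside the 20 standard amino acids raise KeyError.
    """
    if len(seq) > max_len:
        raise ValueError(f"sequence longer than max_len={max_len}: {seq}")
    flat = []
    for aa in seq:
        flat.extend(BLOSUM62[aa])
    flat.extend([0] * (20 * (max_len - len(seq))))
    return flat


def encode_pair(cdr3b, peptide, cdr3_max_len=19, peptide_max_len=11):
    """Concatenate the flattened encodings of a CDR3β sequence and a peptide.

    The default maximum lengths are illustrative assumptions.
    """
    return encode_sequence(cdr3b, cdr3_max_len) + encode_sequence(
        peptide, peptide_max_len
    )


# Illustrative input pair (not taken from this record).
features = encode_pair("CASSIRSSYEQYF", "GILGFVFTL")
print(len(features))  # (19 + 11) * 20 values per TCR-peptide pair
```

Fixed-length vectors of this kind could then be stacked into a feature matrix and passed to any Random Forest implementation, with binding or non-binding as the label; that training step is not shown here.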