MEGDTA: multi-modal drug-target affinity prediction based on protein three-dimensional structure and ensemble graph neural network


Bibliographic Details
Published in BMC Genomics Vol. 26; no. 1; pp. 738 - 14
Main Authors Hou, Zhanwei; Li, Yijun; Zhai, Haixia; Luo, Junwei; Ding, Yulian; Pan, Yi
Format Journal Article
Language English
Published London BioMed Central 11.08.2025
BioMed Central Ltd
Springer Nature B.V
BMC
ISSN 1471-2164
DOI 10.1186/s12864-025-11943-w

Summary: Background: Drug development is time-consuming and costly, and computer-aided prediction of drug-target affinity (DTA) can significantly accelerate it. Accurate DTA prediction depends on choosing computational models that effectively extract features from drug molecular structures and target protein structures. Existing methods usually ignore the features of the protein three-dimensional structure. Results: This paper proposes MEGDTA, a multi-modal drug-target affinity prediction model based on protein three-dimensional structure and ensemble graph neural networks. The model aims to capture diverse features from drug and target structures using neural network architectures, with particular attention to the protein three-dimensional structure. First, each drug is represented in two forms, a molecular graph and a Morgan fingerprint, whose features are extracted by constructing a graph feature space and by a fully connected network, respectively. Second, for each protein, a residue graph is constructed from its three-dimensional structure; the protein sequence and the residue graph are then processed using a long short-term memory (LSTM) network and multiple parallel graph neural networks (GNNs) with variant modules, respectively, to learn the latent features of the protein. Third, a cross-attention mechanism fuses the extracted drug and protein features, and fully connected layers produce the final prediction. The source code of MEGDTA is available from https://github.com/liyijuncode/MEGDTA. Conclusions: MEGDTA is validated on three publicly available benchmark datasets (Davis, KIBA, and Metz) and compared with existing models. The results show that MEGDTA performs strongly in terms of mean squared error (MSE), concordance index (CI), and $r_m^2$, demonstrating its effectiveness.
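Note: Two of the evaluation metrics named in the summary, MSE and CI, can be illustrated with a minimal sketch. It is not taken from the MEGDTA repository. The function names and the toy affinity values are hypothetical, and the CI variant shown is an assumption: it uses the common pairwise definition in which pairs with equal measured affinity are skipped and tied predictions count as 0.5.

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Assumed convention: pairs with equal true affinity are skipped,
    and pairs with tied predictions contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # not comparable: no true ordering to recover
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no pairs with distinct true affinities")
    return concordant / comparable


# Toy values invented for illustration only (not results from the paper).
measured = [5.0, 6.2, 7.1, 8.4]
predicted = [5.3, 6.0, 7.5, 7.4]

print("MSE:", mean_squared_error(measured, predicted))
print("CI: ", concordance_index(measured, predicted))
```

Lower MSE and higher CI indicate better predictions. This pairwise loop is quadratic in the number of samples, so it suits small illustrations rather than full benchmark evaluation.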