Multi-Modal Graph-Aware Transformer with Contrastive Fusion for Brain Tumor Segmentation
| Published in | Journal of Electronics, Electromedical Engineering, and Medical Informatics, Vol. 7, No. 4, pp. 1226-1239 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | 15.10.2025 |
| ISSN | 2656-8632 |
| DOI | 10.35882/jeeemi.v7i4.993 |
| Summary: | Accurate segmentation of brain tumors in MRI images is critical for early diagnosis, surgical planning, and effective treatment strategies. Traditional deep learning models such as U-Net, Attention U-Net, and Swin-UNet have demonstrated commendable success in tumor segmentation by leveraging Convolutional Neural Networks (CNNs) and transformer-based encoders. However, these models often fall short in capturing complex inter-modality interactions and long-range spatial dependencies, particularly in tumor regions with diffuse or poorly defined boundaries. They also generalize poorly across datasets and demand substantial computational resources. To overcome these limitations, a novel approach named Graph-Aware Transformer with Contrastive Fusion (GAT-CF) is introduced. This model enhances segmentation performance by integrating the spatial attention mechanisms of transformers with graph-based relational reasoning across multiple MRI modalities, namely T1, T2, FLAIR, and T1CE. The graph-aware structure models inter-slice and intra-slice relationships more effectively, promoting better structural understanding of tumor regions. Furthermore, a multi-modal contrastive learning strategy is employed to align semantic features and distinguish complementary modality-specific information, thereby improving the model's discriminative power. The fusion of these techniques facilitates improved contextual understanding and more accurate boundary delineation in complex tumor regions. When evaluated on the BraTS2021 dataset, the proposed GAT-CF model achieved a Dice score of 99.1% and an IoU of 98.4%, surpassing state-of-the-art architectures such as Swin-UNet and SegResNet. It also demonstrated superior accuracy in detecting enhancing tumor (ET) voxels and tumor core regions, highlighting its robustness, precision, and potential for clinical adoption in neuroimaging applications. |
|---|---|
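The abstract names two core mechanisms, graph-based relational reasoning over slices and multi-modal contrastive alignment, but this record contains no code or equations. The sketches below are illustrative reconstructions under stated assumptions, not the authors' implementation. First, a minimal single-head graph attention module over per-slice features, assuming nodes are MRI slices connected by inter-slice adjacency; the paper's exact graph construction is not specified here, and the class name `SliceGraphAttention` is ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SliceGraphAttention(nn.Module):
    """Minimal single-head graph attention over per-slice embeddings.

    Nodes are MRI slices; `adj` is a (S, S) {0,1} adjacency mask, e.g.
    connecting each slice to its spatial neighbours (inter-slice edges).
    Illustrative stand-in for the paper's graph-aware module, whose
    exact formulation is not given in the abstract.
    """
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.attn = nn.Linear(2 * dim, 1)

    def forward(self, x, adj):
        # x: (S, D) slice features; adj: (S, S) adjacency mask.
        h = self.proj(x)
        S = h.size(0)
        # Attention logits from concatenated (source, target) node pairs.
        pairs = torch.cat(
            [h.unsqueeze(1).expand(S, S, -1),
             h.unsqueeze(0).expand(S, S, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1))   # (S, S)
        e = e.masked_fill(adj == 0, float('-inf'))       # keep graph edges only
        alpha = torch.softmax(e, dim=-1)                 # normalize over neighbours
        return alpha @ h                                 # aggregated node features

# Example: 10 slices, 64-dim features, chain adjacency plus self-loops.
x = torch.randn(10, 64)
adj = torch.eye(10)
idx = torch.arange(9)
adj[idx, idx + 1] = 1
adj[idx + 1, idx] = 1
out = SliceGraphAttention(64)(x, adj)
print(out.shape)  # torch.Size([10, 64])
```

Second, a minimal sketch of the multi-modal contrastive objective, assuming an InfoNCE-style loss that treats features of the same case from two modality branches (e.g. T1CE and FLAIR) as positive pairs and other cases in the batch as negatives; the actual loss, temperature, and pairing scheme used by GAT-CF may differ.

```python
import torch
import torch.nn.functional as F

def multimodal_contrastive_loss(feat_a, feat_b, temperature=0.1):
    """InfoNCE-style alignment between two modality feature batches.

    feat_a, feat_b: (N, D) pooled feature vectors for the same N cases,
    e.g. from the T1CE and FLAIR encoder branches. Matching rows are
    positive pairs; all other rows in the batch serve as negatives.
    """
    a = F.normalize(feat_a, dim=1)           # unit-length embeddings
    b = F.normalize(feat_b, dim=1)
    logits = a @ b.t() / temperature         # (N, N) cosine-similarity logits
    targets = torch.arange(a.size(0), device=a.device)
    # Symmetric cross-entropy: align a -> b and b -> a.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# Example: batch of 8 cases, 256-dim pooled features per modality branch.
t1ce = torch.randn(8, 256)
flair = torch.randn(8, 256)
print(multimodal_contrastive_loss(t1ce, flair).item())
```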
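For reference, the reported metrics have standard definitions: Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B| for predicted mask A and ground-truth mask B. A minimal PyTorch computation follows (the function name `dice_and_iou` is ours, not from the paper):

```python
import torch

def dice_and_iou(pred, target, eps=1e-6):
    """Standard Dice coefficient and IoU for binary segmentation masks.

    pred, target: boolean or {0, 1} tensors of identical shape
    (e.g. a 3D tumor mask per case). Returns (dice, iou) as floats.
    """
    pred = pred.bool()
    target = target.bool()
    inter = (pred & target).sum().float()    # |A ∩ B|
    p, t = pred.sum().float(), target.sum().float()
    union = p + t - inter                    # |A ∪ B|
    dice = (2 * inter + eps) / (p + t + eps)
    iou = (inter + eps) / (union + eps)
    return dice.item(), iou.item()

# Example on a toy 3D volume with heavy overlap.
pred = torch.zeros(16, 16, 16, dtype=torch.bool)
target = torch.zeros_like(pred)
pred[4:12, 4:12, 4:12] = True
target[5:12, 4:12, 4:12] = True
print(dice_and_iou(pred, target))  # high overlap -> Dice and IoU near 1
```

For a single mask pair the two metrics satisfy IoU = Dice / (2 - Dice); when averaged over many cases the identity holds only approximately, which is consistent with the reported 99.1% Dice and 98.4% IoU.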