Multi-Modal Graph-Aware Transformer with Contrastive Fusion for Brain Tumor Segmentation
| Published in | Journal of Electronics, Electromedical Engineering, and Medical Informatics, Vol. 7, No. 4, pp. 1226-1239 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | 15.10.2025 |
| ISSN | 2656-8632 |
| DOI | 10.35882/jeeemi.v7i4.993 |
| Summary: | Accurate segmentation of brain tumors in MRI images is critical for early diagnosis, surgical planning, and effective treatment strategies. Traditional deep learning models such as U-Net, Attention U-Net, and Swin-UNet have demonstrated commendable success in tumor segmentation by leveraging Convolutional Neural Networks (CNNs) and transformer-based encoders. However, these models often fall short in capturing complex inter-modality interactions and long-range spatial dependencies, particularly in tumor regions with diffuse or poorly defined boundaries. They also generalize poorly across datasets and demand substantial computational resources. To overcome these limitations, a novel approach named Graph-Aware Transformer with Contrastive Fusion (GAT-CF) is introduced. This model enhances segmentation performance by integrating the spatial attention mechanisms of transformers with graph-based relational reasoning across multiple MRI modalities, namely T1, T2, FLAIR, and T1CE. The graph-aware structure models inter-slice and intra-slice relationships more effectively, promoting better structural understanding of tumor regions. Furthermore, a multi-modal contrastive learning strategy is employed to align semantic features and distinguish complementary modality-specific information, thereby improving the model's discriminative power. The fusion of these techniques facilitates improved contextual understanding and more accurate boundary delineation in complex tumor regions. When evaluated on the BraTS2021 dataset, the proposed GAT-CF model achieved a Dice score of 99.1% and an IoU of 98.4%, surpassing state-of-the-art architectures such as Swin-UNet and SegResNet. It also demonstrated superior accuracy in detecting enhancing tumor (ET) voxels and tumor core regions, highlighting its robustness, precision, and potential for clinical adoption in neuroimaging applications. |
|---|---|
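The abstract names two core mechanisms, graph-based relational reasoning over slices and multi-modal contrastive alignment, but this record contains no code or equations. The sketches below are illustrative reconstructions under stated assumptions, not the authors' implementation. First, a minimal single-head graph attention module over per-slice features, assuming nodes are MRI slices connected by inter-slice adjacency; the paper's exact graph construction is not specified here, and the class name `SliceGraphAttention` is ours.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SliceGraphAttention(nn.Module):
    """Minimal single-head graph attention over per-slice embeddings.

    Nodes are MRI slices; `adj` is a (S, S) {0,1} adjacency mask, e.g.
    connecting each slice to its spatial neighbours (inter-slice edges).
    Illustrative stand-in for the paper's graph-aware module, whose
    exact formulation is not given in the abstract.
    """
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.attn = nn.Linear(2 * dim, 1)

    def forward(self, x, adj):
        # x: (S, D) slice features; adj: (S, S) adjacency mask.
        h = self.proj(x)
        S = h.size(0)
        # Attention logits from concatenated (source, target) node pairs.
        pairs = torch.cat(
            [h.unsqueeze(1).expand(S, S, -1),
             h.unsqueeze(0).expand(S, S, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1))   # (S, S)
        e = e.masked_fill(adj == 0, float('-inf'))       # keep graph edges only
        alpha = torch.softmax(e, dim=-1)                 # normalize over neighbours
        return alpha @ h                                 # aggregated node features

# Example: 10 slices, 64-dim features, chain adjacency plus self-loops.
x = torch.randn(10, 64)
adj = torch.eye(10)
idx = torch.arange(9)
adj[idx, idx + 1] = 1
adj[idx + 1, idx] = 1
out = SliceGraphAttention(64)(x, adj)
print(out.shape)  # torch.Size([10, 64])
```

Second, a minimal sketch of the multi-modal contrastive objective, assuming an InfoNCE-style loss that treats features of the same case from two modality branches (e.g. T1CE and FLAIR) as positive pairs and other cases in the batch as negatives; the actual loss, temperature, and pairing scheme used by GAT-CF may differ.

```python
import torch
import torch.nn.functional as F

def multimodal_contrastive_loss(feat_a, feat_b, temperature=0.1):
    """InfoNCE-style alignment between two modality feature batches.

    feat_a, feat_b: (N, D) pooled feature vectors for the same N cases,
    e.g. from the T1CE and FLAIR encoder branches. Matching rows are
    positive pairs; all other rows in the batch serve as negatives.
    """
    a = F.normalize(feat_a, dim=1)           # unit-length embeddings
    b = F.normalize(feat_b, dim=1)
    logits = a @ b.t() / temperature         # (N, N) cosine-similarity logits
    targets = torch.arange(a.size(0), device=a.device)
    # Symmetric cross-entropy: align a -> b and b -> a.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# Example: batch of 8 cases, 256-dim pooled features per modality branch.
t1ce = torch.randn(8, 256)
flair = torch.randn(8, 256)
print(multimodal_contrastive_loss(t1ce, flair).item())
```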
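For reference, the reported metrics have standard definitions: Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B| for predicted mask A and ground-truth mask B. A minimal PyTorch computation follows (the function name `dice_and_iou` is ours, not from the paper):

```python
import torch

def dice_and_iou(pred, target, eps=1e-6):
    """Standard Dice coefficient and IoU for binary segmentation masks.

    pred, target: boolean or {0, 1} tensors of identical shape
    (e.g. a 3D tumor mask per case). Returns (dice, iou) as floats.
    """
    pred = pred.bool()
    target = target.bool()
    inter = (pred & target).sum().float()    # |A ∩ B|
    p, t = pred.sum().float(), target.sum().float()
    union = p + t - inter                    # |A ∪ B|
    dice = (2 * inter + eps) / (p + t + eps)
    iou = (inter + eps) / (union + eps)
    return dice.item(), iou.item()

# Example on a toy 3D volume with heavy overlap.
pred = torch.zeros(16, 16, 16, dtype=torch.bool)
target = torch.zeros_like(pred)
pred[4:12, 4:12, 4:12] = True
target[5:12, 4:12, 4:12] = True
print(dice_and_iou(pred, target))  # high overlap -> Dice and IoU near 1
```

For a single mask pair the two metrics satisfy IoU = Dice / (2 - Dice); when averaged over many cases the identity holds only approximately, which is consistent with the reported 99.1% Dice and 98.4% IoU.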