MCBTNet: Multi-Feature Fusion CNN and Bi-Level Routing Attention Transformer-Based Medical Image Segmentation Network
| Published in | IEEE journal of biomedical and health informatics Vol. 29; no. 7; pp. 5069 - 5082 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | United States: IEEE, 01.07.2025 |
| ISSN | 2168-2194, 2168-2208 |
| DOI | 10.1109/JBHI.2025.3545398 |
| Summary: | Accurate medical image segmentation is crucial for precise diagnosis and treatment in clinical pathology analysis and surgical navigation. While Convolutional Neural Network (CNN)-based approaches excel in capturing and analyzing local features, they often lose key global context. Transformers, utilizing self-attention mechanisms, address this issue but often overlook localized and multi-scale features while also requiring significant computational resources. To integrate the advantages of CNNs and Transformers to achieve efficient and precise medical image segmentation, we propose a segmentation framework based on multi-feature fusion CNN and Bi-level Routing Attention Transformer (MCBTNet). MCBTNet integrates CNNs and Transformers within a U-shaped encoder-decoder architecture. This configuration not only extracts multi-scale features via the U-shaped structure but also efficiently captures global contextual information through the dynamic sparsity of the Bi-Level Routing Attention Transformer. Our novel Frequency-Channel-Spatial multi-dimensional attention mechanism is implemented on skip connections, enhancing segmentation accuracy and speed by maximizing multi-scale feature utilization. Finally, MCBTNet obtains the segmentation result by fusing the predictions of different scales. Experimental results on five public datasets demonstrate that MCBTNet outperforms state-of-the-art methods in Dice and HD metrics, with lower computational and memory requirements. |
|---|---|
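
The summary reports results in terms of the Dice metric. As a reading aid only, the following is a minimal sketch of how a Dice score is commonly computed for two binary masks. It is not the authors' evaluation code; the function name `dice_coefficient`, the flat-list mask representation, and the empty-mask convention are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flat binary masks (values 0 or 1).

    Hypothetical helper for illustration; not taken from the MCBTNet paper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(predicted, reference))
```

A score of 1.0 means the predicted and reference masks overlap exactly, and 0.0 means they share no foreground pixels.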