The auto segmentation for cardiac structures using a dual‐input deep learning network based on vision saliency and transformer

Bibliographic Details
Published in	Journal of applied clinical medical physics Vol. 23; no. 5; pp. e13597 - n/a
Main Authors	Wang, Jing, Wang, Shuyu, Liang, Wei, Zhang, Nan, Zhang, Yan
Format	Journal Article
Language	English
Published	United States John Wiley & Sons, Inc 01.05.2022
Subjects	Accuracy Algorithms coronary CT angiography Deep learning Electrocardiography Medical Imaging Neural networks self‐attention transformers visual attention mechanism
ISSN	1526-9914
DOI	10.1002/acm2.13597

Summary:	Purpose Accurate segmentation of cardiac structures on coronary CT angiography (CCTA) images is crucial for morphological analysis, measurement, and functional evaluation. In this study, we achieve accurate automatic segmentation of cardiac structures on CCTA images by adopting an innovative deep learning method based on a visual attention mechanism and a transformer network, and we discuss its practical application value. Methods We developed a dual‐input deep learning network based on visual saliency and transformer (VST), which incorporates a self‐attention mechanism, for cardiac structure segmentation. CCTA scans from 60 randomly selected patients formed the development set and were manually annotated by an experienced technician. The proposed visual attention and transformer model was trained on the patients' CCTA images, with a binary mask derived from the manual contours as the learning target. We also used a deep supervision strategy by adding auxiliary losses. The loss function of our model was the sum of the Dice loss and the cross‐entropy loss. To evaluate the segmentation results quantitatively, we calculated the Dice similarity coefficient (DSC) and the Hausdorff distance (HD). We also compared the volumes obtained from automatic and manual segmentation to test whether they differed statistically. Results Fivefold cross‐validation was used to benchmark the segmentation method. The DSC was 0.87 for the left ventricular myocardium (LVM), 0.94 for the left ventricle (LV), 0.90 for the left atrium (LA), 0.92 for the right ventricle (RV), 0.91 for the right atrium (RA), and 0.96 for the aorta (AO). The average DSC was 0.92, and the HD was 7.2 ± 2.1 mm. In the volume comparison, only the LVM and LA differed significantly between automatic and manual segmentation (p < 0.05); no statistically significant difference was found for the other structures. The proposed segmentation fit the true profiles of the cardiac substructures well, and the model predictions were close to the manual annotations. Conclusions The dual‐input transformer architecture based on visual saliency shows high sensitivity and specificity for cardiac structure segmentation and can clearly improve the accuracy of automatic substructure segmentation. This is of gr
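
The summary names the evaluation metric (DSC) and a loss formed as the sum of a Dice loss and a cross‐entropy loss, but it gives no formulas. The following is a minimal Python sketch of one common way to compute these quantities, not the authors' implementation. It assumes a single binary structure, masks flattened to one‐dimensional sequences, a soft Dice loss with a small smoothing constant, and binary cross‐entropy; the function names and the `eps` parameters are hypothetical. The per‐structure DSC values are copied from the Results above.

```python
import math
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def soft_dice_loss(probs: Sequence[float], target: Sequence[int],
                   eps: float = 1e-6) -> float:
    """1 - soft Dice, computed on foreground probabilities (assumed form)."""
    intersection = sum(p * t for p, t in zip(probs, target))
    denominator = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def binary_cross_entropy(probs: Sequence[float], target: Sequence[int],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs: Sequence[float], target: Sequence[int]) -> float:
    """Sum of Dice loss and cross-entropy loss, as stated in the summary."""
    return soft_dice_loss(probs, target) + binary_cross_entropy(probs, target)


# Toy masks (illustrative only, not data from the study).
manual = [1, 1, 1, 0, 0, 0]
automatic = [1, 1, 0, 1, 0, 0]
toy_dsc = dice_coefficient(automatic, manual)  # 2 * 2 / (3 + 3)

# Per-structure DSC values reported in the Results.
reported_dsc = {"LVM": 0.87, "LV": 0.94, "LA": 0.90,
                "RV": 0.92, "RA": 0.91, "AO": 0.96}
mean_dsc = round(sum(reported_dsc.values()) / len(reported_dsc), 2)
```

Averaging the six reported per‐structure values reproduces the stated mean DSC of 0.92. The deep supervision strategy mentioned in the summary would add further auxiliary loss terms, whose weighting the summary does not specify, so it is left out of this sketch.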