IFT-Net: Interactive Fusion Transformer Network for Quantitative Analysis of Pediatric Echocardiography
| Published in | Medical image analysis Vol. 82; p. 102648 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V., 01.11.2022 |
| ISSN | 1361-8415, 1361-8423 |
| DOI | 10.1016/j.media.2022.102648 |
| Summary: | • An IFT-Net algorithm for pediatric echocardiography quantitative analysis is proposed.
• A parallel network of DPT and CNN is presented to fuse the local and global features.
• A new key point positioning method for pediatric echocardiographic analysis.
• We construct a PSAX views dataset, which provides a new quantitative analysis perspective.
Automatic segmentation and measurement of key anatomical structures in echocardiography is critical for the subsequent extraction of clinical parameters. However, boundary blur, speckle noise, and other factors make fully automatic segmentation of 2D ultrasound images difficult. Previous research has addressed this challenge with convolutional neural networks (CNNs), which fail to capture global contextual information and long-range dependencies. To improve the quantitative analysis of pediatric echocardiography, this paper proposes an interactive fusion transformer network (IFT-Net), which achieves bidirectional fusion between local features and global context information through interactive learning between a convolution branch and a transformer branch. First, we construct a dual-attention pyramid transformer (DPT) branch that models long-range dependencies across the spatial and channel dimensions and enhances the learning of global context information. Second, we design a bidirectional interactive fusion (BIF) unit that fuses the local and global features interactively, maximizes their preservation, and refines the segmentation. Finally, we measure the clinical anatomical parameters through key point positioning. On the parasternal short-axis (PSAX) view of the heart base in pediatric echocardiography, we segment and quantify the right ventricular outflow tract (RVOT) and aorta (AO) with promising results, indicating potential for clinical application. The code is publicly available at: https://github.com/Zhaocheng1/IFT-Net.
Figure 2: IFT-Net architecture diagram. The first row is the DPT branch, the second row is the BIF, and the third row is the CNN branch. Xi is the output feature of the i-th layer processed by the Transformer block. Yi is the output feature of the i-th layer processed by the CNN block. DPT: Dual-attention pyramid Transformer. BIF: Bidirectional interactive fusion. GFL: Group feature learning. |
|---|---|
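
The summary says that clinical parameters are measured through key point positioning on the segmented structures, but this record does not describe the procedure. The sketch below is an illustration under stated assumptions, not the authors' method: it takes a binary segmentation mask, picks the two boundary pixels that lie farthest apart as hypothetical key points, and converts their distance to millimetres. The function names `boundary_pixels` and `max_diameter` and the `spacing_mm` parameter are invented for this example.

```python
import math
from itertools import combinations


def boundary_pixels(mask):
    """Return (row, col) of foreground pixels that touch background or the image edge."""
    height, width = len(mask), len(mask[0])
    points = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c]:
                continue
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            on_boundary = any(
                not (0 <= rr < height and 0 <= cc < width) or not mask[rr][cc]
                for rr, cc in neighbours
            )
            if on_boundary:
                points.append((r, c))
    return points


def max_diameter(mask, spacing_mm=(1.0, 1.0)):
    """Return two hypothetical key points and the distance between them in mm.

    Assumption: the key points are the two boundary pixels that are farthest
    apart; spacing_mm is the (row, column) pixel size in millimetres.
    Distances are measured between pixel centres.
    """
    points = boundary_pixels(mask)
    if len(points) < 2:
        raise ValueError("mask needs at least two boundary pixels")
    row_mm, col_mm = spacing_mm

    def distance(p, q):
        return math.hypot((p[0] - q[0]) * row_mm, (p[1] - q[1]) * col_mm)

    # Brute-force search over all boundary pairs; fine for small masks.
    p, q = max(combinations(points, 2), key=lambda pair: distance(*pair))
    return p, q, distance(p, q)


# Toy mask: a 3 x 5 block of foreground pixels inside a 5 x 7 image.
toy_mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
point_a, point_b, diameter_mm = max_diameter(toy_mask, spacing_mm=(0.5, 0.5))
```

The pairwise search is quadratic in the number of boundary pixels, which is acceptable for a sketch. A clinical measurement would also need anatomically defined key points rather than a purely geometric extreme.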