U-Net vs Transformer: Is U-Net Outdated in Medical Image Registration?

Bibliographic Details
Published in: Machine Learning in Medical Imaging, Vol. 13583, pp. 151-160
Main Authors: Jia, Xi; Bartlett, Joseph; Zhang, Tianyang; Lu, Wenqi; Qiu, Zhaowen; Duan, Jinming
Format: Book Chapter
Language: English
Published: Springer Nature Switzerland, 2022
Series: Lecture Notes in Computer Science
ISBN: 9783031210136; 3031210131
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-031-21014-3_16

More Information
Summary: Due to their extreme long-range modeling capability, vision transformer-based networks have become increasingly popular in deformable image registration. We believe, however, that the receptive field of a 5-layer convolutional U-Net is sufficient to capture accurate deformations without needing long-range dependencies. The purpose of this study is therefore to investigate whether U-Net-based methods are outdated compared to modern transformer-based approaches when applied to medical image registration. For this, we propose a large kernel U-Net (LKU-Net) by embedding a parallel convolutional block into a vanilla U-Net in order to enhance the effective receptive field. On the public 3D IXI brain dataset for atlas-based registration, we show that the performance of the vanilla U-Net is already comparable with that of state-of-the-art transformer-based networks (such as TransMorph), and that the proposed LKU-Net outperforms TransMorph while using only 1.12% of its parameters and 10.8% of its mult-adds operations. We further evaluate LKU-Net on a MICCAI Learn2Reg 2021 challenge dataset for inter-subject registration, where LKU-Net also outperforms TransMorph and ranks first on the public leaderboard as of the submission of this work. With only modest modifications to the vanilla U-Net, we show that U-Net can outperform transformer-based architectures on inter-subject and atlas-based 3D medical image registration. Code is available at https://github.com/xi-jia/LKU-Net.