Self Pre-Training with Adaptive Mask Autoencoders for Variable-Contrast 3D Medical Imaging

Bibliographic Details
Published in	Proceedings (International Symposium on Biomedical Imaging) pp. 1 - 5
Main Authors	Das, Badhan Kumar, Zhao, Gengyan, Liu, Han, Re, Thomas J., Comaniciu, Dorin, Gibson, Eli, Maier, Andreas
Format	Conference Proceeding
Language	English
Published	IEEE 14.04.2025
Subjects	Adaptation models Autoencoders Biomedical imaging Computer vision Image segmentation Magnetic resonance imaging Masked Autoencoders Self Pre-training Solid modeling Three-dimensional displays Training Transformers Variable Inputs Vision Transformer
Online Access	Get full text
ISSN	1945-8452
DOI	10.1109/ISBI60581.2025.10981097

Abstract	The Masked Autoencoder (MAE) has recently proven effective for pre-training Vision Transformers (ViT) on natural images. By reconstructing complete images from partially masked inputs, the ViT encoder learns to aggregate contextual information to predict the missing regions. This ability is especially important in medical imaging, where anatomical structures are functionally and mechanically linked to surrounding regions. However, current methods do not account for variation in the number of input images per subject, which is typical of real-world Magnetic Resonance (MR) studies. To address this limitation, we propose a 3D Adaptive Masked Autoencoders (AMAE) architecture that accommodates a variable number of 3D input contrasts per subject. A magnetic resonance imaging (MRI) dataset of 45,364 subjects was used for pre-training, and a subset of 1,648 training, 193 validation, and 215 test subjects was used for fine-tuning. The results show that self pre-training of the adaptive masked autoencoder improves infarct segmentation performance by 2.8%-3.7% for ViT-based segmentation models.
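The masking idea summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the record gives no details on patching, mask ratio, or how contrasts are combined, so the function name `random_mask_tokens`, the (contrast, patch) token indexing, and the 75% mask ratio are all assumptions made for illustration only. The sketch shows how one random mask can be drawn over the pooled patch tokens of however many contrasts a subject happens to have.

```python
import random


def random_mask_tokens(num_contrasts, patches_per_contrast, mask_ratio, seed=None):
    """Split patch tokens from a variable number of contrasts into visible and masked sets.

    Hypothetical helper: each token is a (contrast_index, patch_index) pair,
    and the mask is drawn uniformly over the pooled tokens of all contrasts.
    """
    if num_contrasts < 1:
        raise ValueError("need at least one contrast")
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")

    rng = random.Random(seed)
    # Pool the tokens of every available contrast into one sequence.
    tokens = [(c, p) for c in range(num_contrasts) for p in range(patches_per_contrast)]
    rng.shuffle(tokens)

    # The first part of the shuffled sequence is hidden from the encoder;
    # the remainder is what the encoder would see.
    num_masked = int(len(tokens) * mask_ratio)
    masked = sorted(tokens[:num_masked])
    visible = sorted(tokens[num_masked:])
    return visible, masked


# Subjects with different numbers of available contrasts go through the same code path.
for n in (2, 4):
    visible, masked = random_mask_tokens(n, patches_per_contrast=8, mask_ratio=0.75, seed=0)
    print(n, len(visible), len(masked))
```

In a real MAE setup the visible tokens would be embedded and passed to the ViT encoder, and a decoder would reconstruct the masked patches; those parts are omitted here because the record does not describe them.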
Author Affiliations	Das, Badhan Kumar (Siemens Healthineers AG); Zhao, Gengyan (Siemens Medical Solutions USA, Inc); Liu, Han (Siemens Medical Solutions USA, Inc); Re, Thomas J. (Siemens Medical Solutions USA, Inc); Comaniciu, Dorin (Siemens Medical Solutions USA, Inc); Gibson, Eli (Siemens Medical Solutions USA, Inc); Maier, Andreas (FAU Erlangen-Nuremberg)
EISBN	9798331520526
Funding	Siemens Healthineers (funder ID 10.13039/501100011699)
URI	https://ieeexplore.ieee.org/document/10981097