CASPFuse: An Infrared and Visible Image Fusion Method Based on Dual-Cycle Crosswise Awareness and Global Structure-Tensor Preservation
| Published in | IEEE Transactions on Instrumentation and Measurement, Vol. 74, pp. 1–15 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025 |
| ISSN | 0018-9456, 1557-9662 |
| DOI | 10.1109/TIM.2024.3509580 |
| Summary: | The inconsistency between the feature spaces of cross-modality images makes it challenging to adaptively fuse images of different modalities: fusion rules inevitably show a global or local bias toward one modality. To address this challenge, this article proposes a novel fusion method based on dual-cycle crosswise awareness and global structure-tensor preservation (CASPFuse), a modality-transition network that drives the images taking part in the fusion operation toward similar feature spaces. First, a dual-cycle modality transition (DMT) network is proposed. Guided by cross-aware adversarial learning and constrained by an edge prior, it allows the generated images to better inherit the characteristics of the original images while pursuing the modality transition. Second, a set of global structure-tensor preserving (STP) models and an STP-aware loss are designed to enhance the network's capabilities in structural preservation and modal-consistency perception. The STP-aware loss, in collaboration with the cycle-consistency loss and the cross-aware loss, enables the network to effectively supervise the generation of high-quality pseudo-images and to eliminate adverse artifacts and structural degradation. Third, we devise a progressively adaptive fusion (PAF) network, which sequentially generates pairwise images with unified modalities and fine-tunes their structures, overcoming the challenge of effectively aggregating different modal attributes. Extensive comparative experiments demonstrate that CASPFuse outperforms state-of-the-art fusion methods in adequately expressing the advantageous complementary information of different modalities. The source code is available at https://github.com/xbsj-cool/CASPFuse. |
|---|---|
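
The abstract names the DMT network's supervision terms without spelling them out. As a rough illustration only, the following PyTorch sketch shows how a dual-cycle step with a cycle-consistency loss and a least-squares cross-adversarial term is commonly assembled; the generator and discriminator names (`G_ir2vis`, `G_vis2ir`, `D_vis`) and the LSGAN objective are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

def dual_cycle_step(G_ir2vis, G_vis2ir, D_vis, ir, vis):
    """One generator-side step of a dual-cycle modality transition.

    G_ir2vis / G_vis2ir: the two transition generators (hypothetical names).
    D_vis: discriminator for the visible modality (hypothetical name).
    ir, vis: aligned infrared / visible batches, shape (B, 1, H, W).
    """
    l1, mse = nn.L1Loss(), nn.MSELoss()

    fake_vis = G_ir2vis(ir)        # IR -> pseudo-visible
    fake_ir = G_vis2ir(vis)        # visible -> pseudo-IR
    rec_ir = G_vis2ir(fake_vis)    # close the IR cycle
    rec_vis = G_ir2vis(fake_ir)    # close the visible cycle

    # Cycle-consistency: each image should survive a round trip
    # through both generators.
    loss_cyc = l1(rec_ir, ir) + l1(rec_vis, vis)

    # Least-squares adversarial term: the pseudo-visible image should
    # fool the visible-modality discriminator.
    pred = D_vis(fake_vis)
    loss_adv = mse(pred, torch.ones_like(pred))

    return loss_cyc + loss_adv
```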
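
Similarly, the global STP model is only named in the abstract. Below is a minimal, assumption-laden sketch of a smoothed 2x2 structure tensor and a preservation loss built on it: the Sobel gradients, the box-filter integration window, and the trace-based choice of the structurally stronger source are illustrative stand-ins for whatever formulation the paper actually uses.

```python
import torch
import torch.nn.functional as F

# Sobel kernels for image gradients, shape (1, 1, 3, 3).
SOBEL_X = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3) / 8.0
SOBEL_Y = SOBEL_X.transpose(2, 3)

def structure_tensor(img):
    """Smoothed 2x2 structure tensor of a grayscale batch (B, 1, H, W).

    Returns the three distinct components (Jxx, Jxy, Jyy) stacked on the
    channel axis; a 3x3 box filter stands in for the usual Gaussian window.
    """
    ix = F.conv2d(img, SOBEL_X.to(img), padding=1)
    iy = F.conv2d(img, SOBEL_Y.to(img), padding=1)
    comps = torch.cat([ix * ix, ix * iy, iy * iy], dim=1)
    return F.avg_pool2d(comps, kernel_size=3, stride=1, padding=1)

def stp_style_loss(fused, ir, vis):
    """Make the fused tensor follow, per pixel, whichever source has the
    stronger local structure (larger trace Jxx + Jyy)."""
    st_f, st_ir, st_vis = map(structure_tensor, (fused, ir, vis))
    ir_stronger = (st_ir[:, 0:1] + st_ir[:, 2:3]) >= (st_vis[:, 0:1] + st_vis[:, 2:3])
    target = torch.where(ir_stronger, st_ir, st_vis)
    return F.l1_loss(st_f, target)
```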