GFTT: Geographical Feature Tokenization Transformer for SAR-to-Optical Image Translation


Bibliographic Details
Published in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 18, pp. 2975-2989
Main Authors Liang, Hongbo; Yang, Xuezhi; Yang, Xiangyu; Luo, Jinjin; Zhu, Jiajia
Format Journal Article
Language English
Published Piscataway IEEE 01.01.2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1939-1404, 2151-1535
DOI 10.1109/JSTARS.2024.3523274

Summary: Synthetic aperture radar (SAR)-to-optical image translation not only aids information interpretation but also fills the gaps left in optical applications by weather and lighting limitations. However, several studies have pointed out that specialized methods struggle to deliver images with widely varying optical imaging styles, resulting in poor translations with disharmonious and repetitive artifacts. Another critical issue is the scarcity of geographical prior knowledge: the generator always attempts to produce images within a narrow scope of the data space, which severely restricts the semantic correspondence between SAR content and optical styles. In this article, we introduce a novel tokenizer, the geographical imaging tokenizer (GIT), which captures the imaging style of ground materials in the optical image. Based on the GIT, we propose a geographical feature tokenization transformer (GFTT) framework that discovers the consensus between SAR and optical images. In addition, we use a self-supervised task to encourage the transformer to learn meaningful semantic correspondence from local and global style patterns. Finally, we use a noise-contrastive estimation loss to maximize the mutual information between the input and translated images. Through qualitative and quantitative experimental evaluations, we verify that the proposed GIT aligns with authentic expressions of the optical observation scenario, and we demonstrate the superiority of GFTT over state-of-the-art algorithms.
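The summary names a noise-contrastive estimation loss but gives no formula, so the following is only a minimal sketch of the common InfoNCE form of such a loss, not the authors' implementation. The function name `info_nce_loss`, the tensor shapes, the cosine-similarity scoring, and the temperature value are all assumptions made for illustration. Each feature from the translated image is scored against its matching input feature (the positive) and against K non-matching features (the negatives); minimizing the resulting cross-entropy is a standard way to raise a lower bound on mutual information.

```python
import numpy as np


def info_nce_loss(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss (illustrative; shapes and names are assumptions).

    query:     (N, D) features taken from the translated image
    positive:  (N, D) features from the corresponding input locations
    negatives: (N, K, D) features from K non-corresponding locations
    """
    def normalize(x):
        # L2-normalize along the feature axis so dot products are cosines.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q, p, n = normalize(query), normalize(positive), normalize(negatives)

    pos_logit = np.sum(q * p, axis=-1, keepdims=True)   # (N, 1)
    neg_logits = np.einsum("nd,nkd->nk", q, n)          # (N, K)

    # The positive sits at column 0, so the target class is always 0.
    logits = np.concatenate([pos_logit, neg_logits], axis=1) / temperature

    # Numerically stable log-softmax evaluated at the positive column.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob_pos = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob_pos.mean())
```

In this sketch the loss falls toward zero as each query becomes more similar to its positive than to any negative, and it equals log(K + 1) when the positive and all negatives score identically.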