Multiscanning-Based RNN-Transformer for Hyperspectral Image Classification

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, Vol. 61, p. 1
Main Authors: Zhou, Weilian; Kamata, Sei-ichiro; Wang, Haipeng; Xue, Xi
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2023.3277014

More Information
Summary: The goal of hyperspectral image (HSI) classification is to assign land-cover labels to each HSI pixel in a patch-wise manner. Recently, sequential models such as recurrent neural networks (RNNs) have been developed as HSI classifiers; these must first scan the HSI patch into a pixel-sequence according to a chosen scanning order. However, an RNN imposes an ordering bias and cannot effectively allocate attention to each pixel in the sequence, and previous methods that average RNN features over multiple scanning orders are limited by the validity of those orders. To address this issue, the Transformer and its self-attention naturally suggest distributing attention discriminatively over each pixel of the pixel-sequence and over each scanning order. Hence, this study advances sequential HSI classifiers with a specially designed RNN-Transformer (RT) model that captures the multiple sequential characteristics of the pixels in an HSI patch. Specifically, a multiscanning-controlled positional embedding strategy is introduced for the RT model to complement multi-feature fusion. Furthermore, an RT encoder is proposed to integrate ordering bias and attention re-allocation for sequence-level feature generation, and a spectral-spatial-based soft masked self-attention is proposed for suitable feature enhancement. Finally, an additional Fusion Transformer is deployed for scanning order-level attention allocation. As a result, the whole network achieves competitive classification performance on four publicly accessible datasets compared with other state-of-the-art methods, further extending research on sequential HSI classifiers.
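The abstract outlines a pipeline of multiple scanning orders, per-order RNN-plus-Transformer encoding, and a Fusion Transformer over the order-level features. Below is a minimal PyTorch sketch of that idea under stated assumptions: all module names, dimensions, and the two example scanning orders ("row" and "col") are illustrative, not the authors' implementation, and the paper's soft masked self-attention and multiscanning-controlled positional embedding are omitted for brevity.

```python
# Illustrative sketch of a multiscanning RNN-Transformer classifier.
# Module names, sizes, and scanning orders are assumptions, not the paper's code.
import torch
import torch.nn as nn


def scan_patch(patch, order):
    """Flatten a (B, C, H, W) HSI patch into a (B, H*W, C) pixel-sequence.

    order: 'row' scans row by row; 'col' scans column by column.
    (The paper uses several scanning orders; two suffice for this sketch.)
    """
    if order == "col":
        patch = patch.transpose(2, 3)           # swap H and W before flattening
    b, c, h, w = patch.shape
    return patch.reshape(b, c, h * w).transpose(1, 2)


class RTBranch(nn.Module):
    """One scanning branch: an RNN supplies ordering bias, then a Transformer
    encoder re-allocates attention over the pixel-sequence (sequence level)."""

    def __init__(self, bands, dim=64, heads=4):
        super().__init__()
        self.rnn = nn.GRU(bands, dim, batch_first=True)
        layer = nn.TransformerEncoderLayer(dim, heads, dim_feedforward=2 * dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, seq):                      # seq: (B, L, bands)
        feats, _ = self.rnn(seq)                 # (B, L, dim), order-biased
        feats = self.encoder(feats)              # self-attention over pixels
        return feats.mean(dim=1)                 # (B, dim) per-order feature


class MultiscanRT(nn.Module):
    """Stacks the per-order features as tokens and fuses them with a small
    Transformer (a stand-in for the Fusion Transformer at the order level)."""

    def __init__(self, bands, n_classes, dim=64, orders=("row", "col")):
        super().__init__()
        self.orders = orders
        self.branches = nn.ModuleList(RTBranch(bands, dim) for _ in orders)
        layer = nn.TransformerEncoderLayer(dim, 4, dim_feedforward=2 * dim,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, patch):                    # patch: (B, C, H, W)
        feats = [branch(scan_patch(patch, order))
                 for branch, order in zip(self.branches, self.orders)]
        tokens = torch.stack(feats, dim=1)       # (B, n_orders, dim)
        fused = self.fusion(tokens).mean(dim=1)  # attention across orders
        return self.head(fused)                  # per-patch class logits


if __name__ == "__main__":
    model = MultiscanRT(bands=30, n_classes=9)
    logits = model(torch.randn(2, 30, 9, 9))     # two 9x9 patches, 30 bands
    print(logits.shape)                          # torch.Size([2, 9])
```

Treating each scanning order's feature as a token lets self-attention weight the orders themselves, mirroring the abstract's "scanning order-level attention allocation" rather than simply averaging RNN features across orders.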