Multiscanning-Based RNN-Transformer for Hyperspectral Image Classification

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, Vol. 61, p. 1
Main Authors: Zhou, Weilian; Kamata, Sei-ichiro; Wang, Haipeng; Xue, Xi
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2023.3277014

More Information
Summary: The goal of hyperspectral image (HSI) classification is to assign land-cover labels to each HSI pixel in a patch-wise manner. Recently, sequential models such as recurrent neural networks (RNNs) have been developed as HSI classifiers; these must first scan the HSI patch into a pixel-sequence according to a chosen scanning order. However, an RNN imposes an ordering bias and cannot effectively allocate attention to each pixel in the sequence, and previous methods that average RNN features over multiple scanning orders are limited by the validity of those orders. To address this issue, the Transformer and its self-attention naturally suggest distributing attention discriminatively over each pixel of the pixel-sequence and over each scanning order. Hence, this study advances sequential HSI classifiers with a specially designed RNN-Transformer (RT) model that captures the multiple sequential characteristics of the pixels in an HSI patch. Specifically, a multiscanning-controlled positional embedding strategy is introduced for the RT model to complement multi-feature fusion. Furthermore, an RT encoder is proposed to integrate ordering bias and attention re-allocation for sequence-level feature generation, and a spectral-spatial-based soft masked self-attention is proposed for suitable feature enhancement. Finally, an additional Fusion Transformer is deployed for scanning order-level attention allocation. As a result, the whole network achieves competitive classification performance on four publicly accessible datasets compared with other state-of-the-art methods, further extending research on sequential HSI classifiers.
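The abstract outlines a pipeline of multiple scanning orders, per-order RNN-plus-Transformer encoding, and a Fusion Transformer over the order-level features. Below is a minimal PyTorch sketch of that idea under stated assumptions: all module names, dimensions, and the two example scanning orders ("row" and "col") are illustrative, not the authors' implementation, and the paper's soft masked self-attention and multiscanning-controlled positional embedding are omitted for brevity.

```python
# Illustrative sketch of a multiscanning RNN-Transformer classifier.
# Module names, sizes, and scanning orders are assumptions, not the paper's code.
import torch
import torch.nn as nn


def scan_patch(patch, order):
    """Flatten a (B, C, H, W) HSI patch into a (B, H*W, C) pixel-sequence.

    order: 'row' scans row by row; 'col' scans column by column.
    (The paper uses several scanning orders; two suffice for this sketch.)
    """
    if order == "col":
        patch = patch.transpose(2, 3)           # swap H and W before flattening
    b, c, h, w = patch.shape
    return patch.reshape(b, c, h * w).transpose(1, 2)


class RTBranch(nn.Module):
    """One scanning branch: an RNN supplies ordering bias, then a Transformer
    encoder re-allocates attention over the pixel-sequence (sequence level)."""

    def __init__(self, bands, dim=64, heads=4):
        super().__init__()
        self.rnn = nn.GRU(bands, dim, batch_first=True)
        layer = nn.TransformerEncoderLayer(dim, heads, dim_feedforward=2 * dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, seq):                      # seq: (B, L, bands)
        feats, _ = self.rnn(seq)                 # (B, L, dim), order-biased
        feats = self.encoder(feats)              # self-attention over pixels
        return feats.mean(dim=1)                 # (B, dim) per-order feature


class MultiscanRT(nn.Module):
    """Stacks the per-order features as tokens and fuses them with a small
    Transformer (a stand-in for the Fusion Transformer at the order level)."""

    def __init__(self, bands, n_classes, dim=64, orders=("row", "col")):
        super().__init__()
        self.orders = orders
        self.branches = nn.ModuleList(RTBranch(bands, dim) for _ in orders)
        layer = nn.TransformerEncoderLayer(dim, 4, dim_feedforward=2 * dim,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, patch):                    # patch: (B, C, H, W)
        feats = [branch(scan_patch(patch, order))
                 for branch, order in zip(self.branches, self.orders)]
        tokens = torch.stack(feats, dim=1)       # (B, n_orders, dim)
        fused = self.fusion(tokens).mean(dim=1)  # attention across orders
        return self.head(fused)                  # per-patch class logits


if __name__ == "__main__":
    model = MultiscanRT(bands=30, n_classes=9)
    logits = model(torch.randn(2, 30, 9, 9))     # two 9x9 patches, 30 bands
    print(logits.shape)                          # torch.Size([2, 9])
```

Treating each scanning order's feature as a token lets self-attention weight the orders themselves, mirroring the abstract's "scanning order-level attention allocation" rather than simply averaging RNN features across orders.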