A pipeline for processing hyperspectral images, with a case of melanin-containing barley grains as an example

Bibliographic Details
Published in	Vavilovskiĭ zhurnal genetiki i selekt͡s︡ii Vol. 28; no. 4; pp. 443 - 455
Main Authors	Busov, I. D., Genaev, M. A., Komyshev, E. G., Koval, V. S., Zykova, T. E., Glagoleva, A. Y., Afonnikov, D. A.
Format	Journal Article
Language	English
Published	Russia (Federation): The Federal Research Center Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences; The Vavilov Society of Geneticists and Breeders, 01.07.2024
Subjects	barley grains; hyperspectral images; machine learning; pigment composition; statistical analysis
ISSN	2500-3259, 2500-0462
DOI	10.18699/vjgb-24-50

Summary:	Analysis of hyperspectral images is of great interest in plant studies. As its use becomes more widespread, developing methods for processing hyperspectral images has become a pressing task. This paper presents a hyperspectral image processing pipeline that includes preprocessing, basic statistical analysis, visualization of multichannel hyperspectral images, and solving classification and clustering problems with machine learning methods. The current version of the package implements the following methods: construction of a confidence interval of an arbitrary level for the difference of sample means; testing the similarity of spectral-line intensity distributions between two sets of hyperspectral images using the Mann–Whitney U test and Pearson's chi-squared test; visualization in two-dimensional space using the dimensionality reduction methods PCA, ISOMAP and UMAP; classification using linear or ridge regression, random forest and CatBoost; and clustering of samples using the EM algorithm. The software pipeline is implemented in Python using the Pandas, NumPy, OpenCV, SciPy, Sklearn, Umap, CatBoost and Plotly libraries. The source code is available at https://github.com/igor2704/Hyperspectral_images. The pipeline was applied to identify melanin pigment in the shell of barley grains from hyperspectral data. Visualization with PCA, UMAP and ISOMAP, together with clustering, showed that grain samples with and without pigmentation can be separated linearly with high accuracy. The analysis revealed statistically significant differences in the distribution of median intensities between images of pigmented and unpigmented grains. Thus, hyperspectral images can be used to determine the presence or absence of melanin in barley grains with high accuracy. The flexible and convenient tool created in this work is intended to increase the efficiency of hyperspectral image analysis.
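To make the two basic statistical steps named in the summary more concrete (a confidence interval for the difference of sample means, and a Mann-Whitney U comparison of two intensity samples), here is a minimal sketch. It is not taken from the Hyperspectral_images package, which according to the summary relies on SciPy. The function names, the normal approximations, and the synthetic "median intensity" values are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random
from statistics import NormalDist, mean, variance


def mean_diff_confidence_interval(a, b, level=0.95):
    """Normal-approximation confidence interval for mean(a) - mean(b).

    Hypothetical helper: the two samples may have unequal variances.
    """
    diff = mean(a) - mean(b)
    std_err = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * std_err, diff + z * std_err


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Hypothetical helper: tied values receive average ranks, and the
    variance is not corrected for ties (acceptable for continuous data).
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p_value = 2 * (1 - NormalDist().cdf(abs((u - mu) / sigma)))
    return u, p_value


# Synthetic stand-ins for per-image median intensities in one spectral band
# (assumed values, not data from the paper).
random.seed(0)
pigmented = [random.gauss(0.30, 0.05) for _ in range(40)]
unpigmented = [random.gauss(0.45, 0.05) for _ in range(40)]

ci_low, ci_high = mean_diff_confidence_interval(pigmented, unpigmented, level=0.95)
u_stat, p_value = mann_whitney_u(pigmented, unpigmented)
print(f"95% CI for the difference of means: ({ci_low:.3f}, {ci_high:.3f})")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.2e}")
```

A confidence interval that excludes zero together with a small p-value would point to differing intensity distributions in that band. For small samples, exact or t-based methods would be more appropriate than the normal approximations used here.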
Correspondence to:	I.D. Busov, i.busov@g.nsu.ru