Moisture content detection of Tibetan tea based on hyperspectral technology, machine vision and machine learning
| Published in | Journal of Food Measurement & Characterization, Vol. 19, no. 2, pp. 1167-1185 |
|---|---|
| Main Authors | , , , , , |
| Format | Journal Article |
| Language | English |
| Published | New York: Springer US (Springer Nature B.V), 01.02.2025 |
| ISSN | 2193-4126 2193-4134 |
| DOI | 10.1007/s11694-024-03032-5 |
| Summary: | The moisture content of tea leaves plays a dominant role in their processing and storage, and directly affects their color, flavor and value. This study uses hyperspectral imaging combined with machine learning to detect tea moisture content nondestructively. Hyperspectral images of the tea samples were collected over the wavelength range of 387 to 1035 nm; the region of interest (ROI) was selected in ENVI, the spectral information was extracted with Python, and the texture information was extracted using the gray-level co-occurrence matrix (GLCM). Models based on spectral, texture and fused spectral-texture features were then built to detect the moisture content of Tibetan tea. The original Tibetan tea spectral data (RAW) and the fused spectral-texture features were preprocessed using six algorithms: standard normal variate transformation (SNVT), multiplicative scatter correction (MSC), first-order derivative (FD), second-order derivative (SD), Savitzky-Golay (SG) filtering and Z-score standardization (ZSS). The GB, AdaBoost, RF, XGBoost, LightGBM and CatBoost algorithms were each used to rank the spectral, texture and spectral-texture fusion features by importance, and the top 30 features were used as inputs to the RFR, CatBoostR, LightGBMR and XGBoostR models. The XGBoost + CatBoostR model performed best, with $R_c^2$ = 0.9814, $R_p^2$ = 0.9788, RMSEC = 0.2064 and RMSEP = 0.2506. Based on the modeling results, the features extracted by the GB algorithm were then selected as inputs, and a Stacking model was built with XGBoostR and CatBoostR as base learners and CatBoostR as meta-learner. This model predicted better still, with $R_c^2$ = 0.9947, $R_p^2$ = 0.9817, RMSEC = 0.1101 and RMSEP = 0.2326. |
|---|---|
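
The summary names a standard normal variate step and reports results as coefficients of determination and root-mean-square errors. As a rough illustration of what those quantities compute, the sketch below uses only the Python standard library. It is not the authors' code: the function names `snv`, `r_squared` and `rmse` are hypothetical, the choice of the sample standard deviation in `snv` is an assumption (the record does not say which variant was used), and the numbers are made up for the example rather than taken from the study. RMSEC and RMSEP would usually be the same RMSE formula applied to the calibration and prediction sets respectively.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale ONE spectrum by its own
    mean and standard deviation (sample std, n - 1; an assumption here)."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)
    if sigma == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(x - mu) / sigma for x in spectrum]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical moisture values (percent), NOT data from the study.
reference = [8.0, 9.0, 10.0, 11.0]
predicted = [8.5, 9.0, 9.5, 11.0]

scaled = snv([1.0, 2.0, 3.0])          # a toy three-band "spectrum"
r2 = r_squared(reference, predicted)   # SS_res = 0.5, SS_tot = 5.0
err = rmse(reference, predicted)       # sqrt(0.5 / 4)
print(scaled, round(r2, 4), round(err, 4))
```

Applied to a real data set, `snv` would be run on each sample's spectrum separately before modeling, and the two metrics would be computed once on the calibration split and once on the prediction split.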