A comparison of decision tree-based algorithms for food discrimination using vibrational spectroscopy

Bibliographic Details
Published in Food chemistry Vol. 488; p. 144909
Main Authors da Silva, Leandro P.; Oliveira, Micael D.L.; Villa, Javier E.L.
Format Journal Article
Language English
Published England: Elsevier Ltd, 01.10.2025
ISSN 0308-8146
1873-7072
DOI 10.1016/j.foodchem.2025.144909

Summary: In this work, we systematically studied decision tree-based algorithms (decision tree, random forest, and XGBoost) for binary food discrimination. Using near-infrared (NIR) and Raman spectroscopy, accuracy values of up to 99 % were achieved for discriminating (1) gluten-containing and gluten-free bread and (2) pure and sucrose-adulterated coconut water, respectively. Moreover, NIR bands of water (OH bonds), protein content (CH and NH bonds), and Raman bands attributed to C–O–C bonds in the glycosidic structure were identified as the most important. In addition to traditional feature importance estimates, a strategy based on impurity reduction was proposed to improve chemical interpretability. The figure of merit and splitting method selected for optimizing the algorithms were also evaluated. Although random forest required a higher computational cost than decision tree and partial least squares discriminant analysis, it outperformed both in accuracy. It also provided more robust and chemically meaningful results than XGBoost.

Highlights:
• Direct bread and coconut water analyses by vibrational spectroscopy.
• XGBoost and random forest outperform partial least squares discriminant analysis.
• Consistent variable importance estimation approach by using random forest.
• Improved chemical interpretability for binary food discrimination.
• Systematic model validation by estimating figures of merit.
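
The summary mentions impurity reduction as a basis for ranking spectral variables. As a rough illustration only, and not the authors' procedure, the sketch below computes the Gini impurity decrease of a single threshold split, which is the quantity that conventional tree-based importance estimates accumulate. The band values, thresholds, and the names `gini`, `impurity_reduction`, `band_a`, and `band_b` are hypothetical and chosen purely for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_reduction(values, labels, threshold):
    """Gini decrease from splitting one variable at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the fraction of samples in each child.
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Hypothetical toy data: signal at two made-up spectral variables, six samples.
labels = ["gluten", "gluten", "gluten",
          "gluten-free", "gluten-free", "gluten-free"]
band_a = [0.82, 0.79, 0.85, 0.41, 0.38, 0.45]  # separates the two classes
band_b = [0.50, 0.61, 0.48, 0.52, 0.60, 0.49]  # carries little class information

gain_a = impurity_reduction(band_a, labels, threshold=0.6)
gain_b = impurity_reduction(band_b, labels, threshold=0.55)
print(f"band_a gain: {gain_a:.3f}, band_b gain: {gain_b:.3f}")
```

In this toy case the split on `band_a` yields two pure child nodes, so its impurity reduction equals the parent impurity, while the split on `band_b` leaves both children as mixed as the parent. A tree or forest sums such reductions over all splits that use a variable to rank it.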