Evaluating machine learning pipelines for multimodal neuroimaging in small cohorts: an ALS case study
Published in | Frontiers in neuroinformatics Vol. 19; p. 1568116 |
---|---|
Format | Journal Article |
Language | English |
Published | Switzerland: Frontiers Media S.A, 13.06.2025 |
ISSN | 1662-5196 |
DOI | 10.3389/fninf.2025.1568116 |
Summary | Advances in machine learning hold great promise for the analysis of multimodal neuroimaging data: they can help identify biomarkers and improve diagnosis for various neurological disorders. However, applying such techniques to rare and heterogeneous diseases remains challenging because only small cohorts are available for data acquisition. Efforts are therefore commonly directed toward improving the classification models to optimize outcomes given the limited data. In this study, we systematically evaluated the impact of various machine learning pipeline configurations, including scaling methods, feature selection, dimensionality reduction, and hyperparameter optimization. The effect of each component on classification performance was evaluated using multimodal MRI data from a cohort of 16 ALS patients and 14 healthy controls. Our findings reveal that, while certain pipeline components, such as subject-wise feature normalization, help improve classification outcomes, the overall influence of pipeline refinements on performance is modest. Feature selection and dimensionality reduction were found to have limited utility, and the choice of hyperparameter optimization strategy produced only marginal gains. Our results suggest that, for small-cohort studies, the emphasis should shift from extensive tuning of these pipelines to addressing data-related limitations, for example by progressively expanding cohort size, integrating additional modalities, and maximizing the information extracted from existing datasets. This study provides a methodological framework to guide future research and emphasizes the need for dataset enrichment to improve clinical utility. |
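The summary singles out subject-wise feature normalization as one of the few pipeline components that helped. The record does not define the term, so the following is only a minimal sketch under an assumption: "subject-wise" is taken to mean that each subject's feature vector is standardized with that subject's own mean and standard deviation, in contrast to the more common feature-wise scaling across subjects. The function names and the toy matrix are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a sequence to zero mean and unit (population) SD."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant vector carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def normalize_subject_wise(X):
    """Assumed meaning of 'subject-wise': standardize each row
    (one subject's feature vector) independently of other subjects."""
    return [zscore(row) for row in X]


def normalize_feature_wise(X):
    """Conventional scaling: standardize each column
    (one feature measured across all subjects)."""
    columns = [zscore(list(col)) for col in zip(*X)]
    return [list(row) for row in zip(*columns)]


# Toy data: 3 hypothetical subjects x 4 hypothetical features
# (illustrative values only, not data from the study).
X = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [2.0, 2.0, 4.0, 4.0],
]

X_subject = normalize_subject_wise(X)
X_feature = normalize_feature_wise(X)
```

Under this reading, the first two toy subjects end up with identical normalized profiles, because they differ only by a global scale factor. Row-wise standardization also uses no statistics from other subjects, so it can be applied before cross-validation splitting without leaking information between training and test subjects, whereas feature-wise scaling would need to be fitted on the training subjects only.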