Multi-class classification of gait cycle phases using machine learning: a comprehensive study using two training methods


Bibliographic Details
Published in	Network modeling and analysis in health informatics and bioinformatics (Wien) Vol. 14; no. 1; p. 30
Main Authors	Mekni, Amal; Narayan, Jyotindra; Gritli, Hassène
Format	Journal Article
Language	English
Published	Vienna: Springer Vienna, 01.12.2025; Springer Nature B.V.
Subjects	Abnormalities; Accuracy; Algorithms; Applications of Graph Theory and Complex Networks; Bioinformatics; Classification; Computational Biology/Bioinformatics; Computer Science; Datasets; Decision trees; Discriminant analysis; Gait; Gender; Health Informatics; Learning algorithms; Machine learning; Males; Musculoskeletal diseases; Neural networks; Original Article; Principal components analysis; Random sampling; Regression analysis; Root-mean-square errors; Scaling; Statistical sampling; Support vector machines; Training methods; Cross validation; Gait phases; Classification algorithms; Root mean squared error; Mean squared error
ISSN	2192-6670, 2192-6662
DOI	10.1007/s13721-025-00522-4

Summary:	Walking is a fundamental human activity, and a deep understanding of its complexities is essential for accurately diagnosing and treating gait abnormalities and musculoskeletal disorders. This study investigates the application of machine learning (ML) methods for classifying gait phases into their individual subphases, using two training methodologies and data from 100 individuals obtained from an open-source platform. The first method uses stratified random sampling: 80% of the data in each subphase is allocated for training, and the remaining 20% is reserved for testing. The second method trains the models on data from 80% of the participants and tests them on data from the remaining 20%. Before the ML algorithms were applied, the dataset underwent two scaling techniques, Min-Max Scaling (MMS) and Standard Scaling (SS), and one dimensionality reduction approach, Principal Component Analysis (PCA). Once the dataset was scaled or dimensionally reduced, the performance of several ML models was assessed, namely k-Nearest Neighbors (k-NN), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB), Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA). Each model was evaluated on multiple metrics, including Cross-Validation (CV) score, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), accuracy, and R² score. With MMS, LDA achieved the best performance in the first training method, with the highest CV score (0.9671), the lowest MSE (0.0286), and the highest accuracy (97.14%), while SVM gave the best results in the second training method, with a CV score of 0.8615 and the lowest MSE (0.7674).
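The two training methods described in the summary differ only in how the 80/20 split is drawn. The following is a minimal sketch of that difference, not the authors' implementation: the function names `stratified_split` and `participant_split`, the toy data, and the use of plain Python instead of an ML library are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_frac=0.2, seed=0):
    """Method 1 (sketch): hold out test_frac of the samples in EACH subphase."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_frac)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def participant_split(participants, test_frac=0.2, seed=0):
    """Method 2 (sketch): hold out ALL samples of test_frac of the participants."""
    rng = random.Random(seed)
    unique = sorted(set(participants))
    rng.shuffle(unique)
    n_test = round(len(unique) * test_frac)
    held_out = set(unique[:n_test])
    train_idx = [i for i, p in enumerate(participants) if p not in held_out]
    test_idx = [i for i, p in enumerate(participants) if p in held_out]
    return train_idx, test_idx


# Hypothetical toy data: 10 participants, 4 subphase labels, 5 samples of each
# subphase per participant (the study itself uses 100 individuals).
participants = [p for p in range(10) for _ in range(20)]
labels = [phase for _ in range(10) for phase in range(4) for _ in range(5)]

train_1, test_1 = stratified_split(labels)
train_2, test_2 = participant_split(participants)

# Method 1 mixes every participant into both sets; Method 2 keeps them apart.
test_people_1 = {participants[i] for i in test_1}
test_people_2 = {participants[i] for i in test_2}
train_people_2 = {participants[i] for i in train_2}
print(len(test_1), len(test_2), len(test_people_1), len(test_people_2))
```

In the participant-wise split no individual contributes to both sets, so the test score reflects generalization to unseen walkers. That is one plausible reading of why the scores reported for the second method are lower than for the first.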