Multi-class classification of gait cycle phases using machine learning: a comprehensive study using two training methods

Bibliographic Details
Published in: Network modeling and analysis in health informatics and bioinformatics (Wien) Vol. 14; no. 1; p. 30
Main Authors: Mekni, Amal; Narayan, Jyotindra; Gritli, Hassène
Format: Journal Article
Language: English
Published: Vienna, Springer Vienna, 01.12.2025
Springer Nature B.V
ISSN: 2192-6670, 2192-6662
DOI: 10.1007/s13721-025-00522-4

Summary: Walking is a fundamental human activity, and a deep understanding of its complexities is essential for accurately diagnosing and treating gait abnormalities and musculoskeletal disorders. This study investigates the application of machine learning (ML) methods for categorizing gait phases into their individual subphases, using two training methodologies and data from 100 individuals obtained from an open-source platform. The first method employs stratified random sampling, in which 80% of the data in each subphase is allocated for training and the remaining 20% is reserved for testing. The second method trains the models on data from 80% of the participants and tests them on data from the remaining 20%. Before the ML algorithms are applied, the dataset undergoes two scaling techniques, Min-Max Scaling (MMS) and Standard Scaling (SS), and one dimensionality reduction approach, Principal Component Analysis (PCA). After the dataset is scaled or dimensionally reduced, we implement and assess several ML models: k-Nearest Neighbors (k-NN), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB), Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA). Each model is evaluated on multiple metrics, including Cross-Validation (CV) score, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), accuracy, and R² score. With MMS, LDA achieved the best performance in the first training method, with the highest CV score (0.9671), lowest MSE (0.0286), and highest accuracy (97.14%), while SVM showed the best results in the second training method, with a CV score of 0.8615 and the lowest MSE (0.7674).
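
The difference between the two training methods in the summary can be illustrated with a minimal sketch. This is not the authors' code: the record does not say which tooling they used, and the function names, the synthetic data (10 stand-in participants rather than the study's 100), and the choice of 5 phase labels are all assumptions made for illustration. The first function keeps roughly 20% of each class for testing; the second holds out every sample from roughly 20% of the participants, so no individual appears on both sides of the split.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_frac=0.2, seed=0):
    """Method 1 (sketch): per-class random split of sample indices."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_frac)  # about test_frac of each class
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)


def subject_wise_split(subjects, test_frac=0.2, seed=0):
    """Method 2 (sketch): hold out all samples of some participants."""
    rng = random.Random(seed)
    unique = sorted(set(subjects))
    rng.shuffle(unique)
    n_test = round(len(unique) * test_frac)  # about test_frac of participants
    held_out = set(unique[:n_test])
    train = [i for i, s in enumerate(subjects) if s not in held_out]
    test = [i for i, s in enumerate(subjects) if s in held_out]
    return train, test


# Synthetic stand-in data (hypothetical): 10 participants, 20 samples each,
# with 5 phase labels repeated evenly within every participant.
subjects = [p for p in range(10) for _ in range(20)]
labels = [k % 5 for _ in range(10) for k in range(20)]

train1, test1 = stratified_split(labels)
train2, test2 = subject_wise_split(subjects)

# In the subject-wise split, train and test participants never overlap.
train_people = {subjects[i] for i in train2}
test_people = {subjects[i] for i in test2}
print(len(train1), len(test1), len(train2), len(test2))
print(train_people.isdisjoint(test_people))
```

Under the first split a model can see other samples from the same person during training, whereas the second split tests on unseen individuals. That is one plausible reading of why the reported scores are lower for the second training method, although the record itself does not state the cause.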