Multi-class classification of gait cycle phases using machine learning: a comprehensive study using two training methods
| Published in | Network modeling and analysis in health informatics and bioinformatics (Wien) Vol. 14; no. 1; p. 30 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Vienna: Springer Vienna (01.12.2025); Springer Nature B.V |
| ISSN | 2192-6670, 2192-6662 |
| DOI | 10.1007/s13721-025-00522-4 |
| Summary: | Walking is a fundamental human activity, and a deep understanding of its complexities is essential for accurately diagnosing and treating gait abnormalities and musculoskeletal disorders. This study investigates the application of machine learning (ML) methods for categorizing gait phases into their individual subphases, using two training methodologies and data from 100 individuals obtained from an open-source platform. The first method employs stratified random sampling, in which 80% of the data in each subphase is allocated for training and the remaining 20% is reserved for testing. The second method trains the models on data from 80% of the participants and tests them on data from the remaining 20%. Before the ML algorithms are applied, the dataset undergoes two scaling techniques, Min-Max Scaling (MMS) and Standard Scaling (SS), and one dimensionality reduction approach, Principal Component Analysis (PCA). After ensuring the dataset is appropriately scaled or dimensionally reduced, we implement and assess the performance of several ML models, namely k-Nearest Neighbors (k-NN), Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB), Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA). Each model is evaluated on multiple metrics, including Cross-Validation (CV) score, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), accuracy, and R² score. For the MMS technique, LDA achieved the best performance in the first training method, with the highest CV score (0.9671), lowest MSE (0.0286), and highest accuracy (97.14%), while SVM showed the best results in the second training method, with a CV score of 0.8615 and the lowest MSE (0.7674). |
|---|---|
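
The two training methods described in the summary differ only in what gets held out: a fraction of samples within each subphase, or all samples from a fraction of participants. The following is a minimal sketch of that distinction using only the Python standard library. It is not the authors' code; the function names `stratified_split` and `subject_split`, the toy data, and the choice of four subphase labels are all assumptions made for illustration. In scikit-learn, comparable behavior is available through `train_test_split(..., stratify=y)` and `GroupShuffleSplit`.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_frac=0.2, seed=0):
    """Method 1 (sketch): hold out test_frac of the samples within each label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    train, test = [], []
    for label in sorted(by_label):
        idxs = by_label[label][:]
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_frac)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)


def subject_split(subjects, test_frac=0.2, seed=0):
    """Method 2 (sketch): hold out every sample from test_frac of the participants."""
    rng = random.Random(seed)
    unique = sorted(set(subjects))
    rng.shuffle(unique)
    n_test = round(len(unique) * test_frac)
    held_out = set(unique[:n_test])
    train = [i for i, s in enumerate(subjects) if s not in held_out]
    test = [i for i, s in enumerate(subjects) if s in held_out]
    return train, test


# Toy data (hypothetical): 10 participants, 4 subphase labels,
# 5 samples per (participant, label) pair, 200 samples in total.
subjects = [p for p in range(10) for _ in range(20)]
labels = [k % 4 for _ in range(10) for k in range(20)]

train1, test1 = stratified_split(labels)
train2, test2 = subject_split(subjects)

# Method 1 may place the same participant in both train and test;
# Method 2 never does, so it measures generalization to unseen people.
shared = {subjects[i] for i in train2} & {subjects[i] for i in test2}
print(len(train1), len(test1), len(train2), len(test2), len(shared))
```

Because the first method can place samples from the same person on both sides of the split, it tends to produce more optimistic scores than the second, which is consistent with the gap between the CV scores reported for the two methods in the summary.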