Yoga pose classification: a CNN and MediaPipe inspired deep learning approach for real-world application

Bibliographic Details
Published in Journal of Ambient Intelligence and Humanized Computing Vol. 14; no. 12; pp. 16551-16562
Main Authors Garg, Shubham; Saxena, Aman; Gupta, Richa
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.12.2023
Springer Nature B.V.
ISSN 1868-5137
1868-5145
DOI 10.1007/s12652-022-03910-0

Summary: Yoga is a centuries-old form of exercise that sports personnel, patients, and physiotherapists follow as part of their regimen. Correct posture and technique are key to reaping its maximum benefits, so developing a model that classifies yoga postures correctly is an emerging research topic. The paper presents a novel architecture that estimates yoga poses and classifies them into five broad categories with low latency. In the proposed architecture, images are skeletonized before being fed to the model, using the MediaPipe library for body keypoint detection. The paper compares the performance of various deep learning models with and without skeletonization to establish the positive impact of skeletonization on their results; the models gave their best results when trained on skeletonized images. On non-skeletonized images, VGG16 achieves the highest validation accuracy (95.6%), followed by InceptionV3, NASNetMobile, YogaConvo2d (the proposed model, 89.9%), and lastly InceptionResNetV2. In contrast, on skeletonized images the proposed YogaConvo2d reports a validation accuracy of 99.62%, followed by VGG16, InceptionResNetV2, NASNetMobile, and InceptionV3.
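
The skeletonization step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes body keypoints have already been detected (in the paper, by MediaPipe) and are given as normalized (x, y) pairs, and it only shows how such keypoints might be rendered into a skeleton image for a classifier. The five-point topology, the function names, and the 64-pixel canvas are hypothetical choices made for illustration; MediaPipe Pose reports 33 landmarks with its own connection set.

```python
# Hypothetical, simplified topology: index pairs into a 5-point keypoint list
# (head, torso centre, left hand, right hand, feet). Not MediaPipe's topology.
CONNECTIONS = [(0, 1), (1, 2), (1, 3), (1, 4)]


def draw_line(canvas, x0, y0, x1, y1):
    """Mark the pixels between two points using Bresenham's line algorithm."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        canvas[y0][x0] = 1
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def skeletonize(keypoints, size=64, connections=CONNECTIONS):
    """Render normalized (x, y) keypoints as a binary skeleton image."""
    canvas = [[0] * size for _ in range(size)]
    # Scale normalized coordinates to pixel indices, clamped to the canvas.
    pixels = [
        (min(size - 1, max(0, int(x * size))),
         min(size - 1, max(0, int(y * size))))
        for x, y in keypoints
    ]
    for a, b in connections:
        draw_line(canvas, *pixels[a], *pixels[b])
    return canvas


# Made-up keypoints for a standing figure with arms spread.
example_keypoints = [(0.5, 0.1), (0.5, 0.4), (0.2, 0.5), (0.8, 0.5), (0.5, 0.9)]
skeleton = skeletonize(example_keypoints)
```

The resulting grid keeps only the limb geometry and discards background, clothing, and lighting, which is one plausible reading of why the summary reports higher accuracy on skeletonized inputs.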