Real time word gesture detection and performance analysis using RCNN and RNN algorithms along with speech generation

This study compares the effectiveness of two techniques for real-time word gesture detection: the RCNN algorithm and the RNN with Mediapipe algorithm. Furthermore, the research looks at the incorporation of speech generation to improve the user experience and enable seamless connection between human...

Bibliographic Details
Published in	AIP conference proceedings Vol. 3216; no. 1
Main Authors	Jawre, Bhushan; Vineetha, K. V.
Format	Journal Article; Conference Proceeding
Language	English
Published	Melville: American Institute of Physics, 29.07.2024
Subjects	Algorithms; Artificial neural networks; Character recognition; Human performance; Object recognition; Real time; Recurrent neural networks; Sequences; Sign language; Speech recognition; Technology assessment; User experience; User interfaces; Words (language)
ISSN	0094-243X (print); 1551-7616 (electronic)
DOI	10.1063/5.0226657

Abstract	This study compares the effectiveness of two techniques for real-time word gesture detection: an RCNN algorithm and an RNN combined with MediaPipe. It also examines how adding speech generation can improve the user experience and enable seamless interaction between humans and robots. The goal is to evaluate the performance of both techniques and assess their practical applicability in areas such as human-computer interaction and assistive technology. The experimental results show the benefits and limits of each approach and offer insight into their distinct strengths and shortcomings. The RNN design supports sequential learning and memory retention, which makes it well suited to modelling the sequential character of sign language. The RCNN method, in contrast, detects and recognizes sign language gestures using a region-based convolutional neural network architecture. RCNN models can localize and classify objects within images, which makes them useful for recognizing hand shapes and gestures in sign language videos or individual frames. The RCNN architecture extracts visual features from predetermined regions of interest, enabling precise and robust sign language recognition. Both techniques have benefits and drawbacks. The RNN with MediaPipe has strong temporal modelling capabilities, allowing reliable recognition of sign language sequences and tolerance of variations in timing and motion. The paper also describes a speech generation system that runs after a gesture is recognized; its addition is intended to help improve accessible communication tools for people with disabilities. Speech is generated with the PYTT3 Python module.
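The pipeline the abstract describes (per-frame hand landmarks, a recurrent model over the frame sequence, then speech output) can be illustrated with a minimal sketch. This is not the authors' implementation. Several things here are assumptions: the simple Elman-style RNN (the paper may use a different recurrent cell), the untrained random weights, the hypothetical three-word vocabulary, and the reading of "PYTT3" as the pyttsx3 text-to-speech package. Landmark extraction with MediaPipe is not shown; each frame is assumed to be a flat vector of 21 hand landmarks with x, y, z coordinates.

```python
import math
import random

NUM_LANDMARKS = 21            # MediaPipe Hands reports 21 landmarks per hand
FEATURES = NUM_LANDMARKS * 3  # x, y, z per landmark gives 63 values per frame


def init_rnn(n_in, n_hidden, n_out, seed=0):
    """Create small random weights for an Elman RNN (untrained, illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"Wxh": mat(n_hidden, n_in),
            "Whh": mat(n_hidden, n_hidden),
            "Why": mat(n_out, n_hidden)}


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_classify(params, frames):
    """Run the RNN over a sequence of frame vectors and return class probabilities."""
    h = [0.0] * len(params["Whh"])
    for x in frames:
        # The hidden state carries information from earlier frames forward.
        a = matvec(params["Wxh"], x)
        b = matvec(params["Whh"], h)
        h = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
    logits = matvec(params["Why"], h)
    peak = max(logits)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def speak(text, engine=None):
    """Speak text with pyttsx3 if available; return True if it was spoken."""
    if engine is None:
        try:
            import pyttsx3  # assumption: this is the module the record calls "PYTT3"
            engine = pyttsx3.init()
        except Exception:
            return False  # package or speech driver missing; skip audio output
    engine.say(text)
    engine.runAndWait()
    return True


def recognize_and_speak(params, frames, vocabulary, engine=None):
    """Classify a gesture sequence, speak the predicted word, and return it."""
    probs = rnn_classify(params, frames)
    word = vocabulary[probs.index(max(probs))]
    speak(word, engine)
    return word, probs
```

In practice the weights would come from training on labelled gesture sequences, and `frames` would be filled from a live camera feed. Passing an `engine` object explicitly keeps the speech step easy to replace or to disable on machines without audio.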
Author	Jawre, Bhushan; Vineetha, K. V.
Author_xml	1. Jawre, Bhushan (Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Bangalore, India); 2. Vineetha, K. V. (jain_vineetha@blr.amrita.edu; Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Bangalore, India)
CODEN	APCPCS
ContentType	Journal Article Conference Proceeding
Copyright	2024 Author(s). Published under an exclusive license by AIP Publishing.
Copyright_xml	– notice: Author(s) – notice: 2024 Author(s). Published under an exclusive license by AIP Publishing.
DBID	8FD H8D L7M
DOI	10.1063/5.0226657
DatabaseName	Technology Research Database Aerospace Database Advanced Technologies Database with Aerospace
DatabaseTitle	Technology Research Database Aerospace Database Advanced Technologies Database with Aerospace
DatabaseTitleList	Technology Research Database
DeliveryMethod	fulltext_linktorsrc
Discipline	Physics
EISSN	1551-7616
Editor	Doss, Arockia Selvakumar Arockia; Short, Michael; Schilberg, Daniel
Editor_xml	1. Doss, Arockia Selvakumar Arockia (Vellore Institute of Technology); 2. Short, Michael (Teesside University); 3. Schilberg, Daniel (Bochum University of Applied Sciences)
ExternalDocumentID	acp
Genre	Conference Proceeding
ISSN	0094-243X
IngestDate	Mon Jun 30 17:39:51 EDT 2025 Tue Jul 30 04:08:02 EDT 2024
IsPeerReviewed	true
IsScholarly	true
Issue	1
Language	English
License	Published under an exclusive license by AIP Publishing.
LinkModel	OpenURL
MeetingName	4TH INTERNATIONAL CONFERENCE ON ROBOTICS, INTELLIGENT AUTOMATION AND CONTROL TECHNOLOGIES (RIACT2023)
Notes	ObjectType-Conference Proceeding-1 SourceType-Conference Papers & Proceedings-1 content type line 21
PQID	3085724096
PQPubID	2050672
PageCount	10
ParticipantIDs	proquest_journals_3085724096 scitation_primary_10_1063_5_0226657
PublicationCentury	2000
PublicationDate	20240729
PublicationDateYYYYMMDD	2024-07-29
PublicationDate_xml	– month: 07 year: 2024 text: 20240729 day: 29
PublicationDecade	2020
PublicationPlace	Melville
PublicationPlace_xml	– name: Melville
PublicationTitle	AIP conference proceedings
PublicationYear	2024
Publisher	American Institute of Physics
Publisher_xml	– name: American Institute of Physics
Title	Real time word gesture detection and performance analysis using RCNN and RNN algorithms along with speech generation
URI	http://dx.doi.org/10.1063/5.0226657 https://www.proquest.com/docview/3085724096
Volume	3216