Real time word gesture detection and performance analysis using RCNN and RNN algorithms along with speech generation

Bibliographic Details
Published in AIP conference proceedings Vol. 3216; no. 1
Main Authors Jawre, Bhushan; Vineetha, K. V.
Format Journal Article; Conference Proceeding
Language English
Published Melville, American Institute of Physics, 29.07.2024
ISSN 0094-243X
EISSN 1551-7616
DOI 10.1063/5.0226657

Abstract This study compares the effectiveness of two techniques for real-time word gesture detection: an RCNN algorithm and an RNN combined with MediaPipe. It also examines the incorporation of speech generation to improve the user experience and enable seamless interaction between humans and robots. The goal is to evaluate the performance of the two techniques and assess their practical applicability in areas such as human-computer interaction and assistive technology. The experimental results show the benefits and limits of each approach, offering insight into their distinct strengths and shortcomings. The RNN design supports sequential learning and memory retention, making it well suited to modelling the sequential nature of sign language. The RCNN method, on the other hand, detects and recognizes sign language gestures using a region-based convolutional neural network architecture. RCNN models can localize and classify objects within images, making them useful for recognizing hand shapes and motions in sign language videos or individual frames. The RCNN architecture extracts visual information from predetermined regions of interest, enabling precise and robust sign language identification. Both techniques have benefits and drawbacks. RNN with MediaPipe has strong temporal modelling capabilities, allowing reliable identification of sign language sequences and tolerance of variations in timing and motion. The paper also describes a speech generation system that runs after a gesture has been recognized, added to help improve accessible communication tools for people with disabilities. Speech is generated with the PYTT3 Python module (presumably the pyttsx3 text-to-speech library).
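The RNN-with-MediaPipe path summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each video frame has already been reduced to the 21 hand landmarks (x, y, z) that MediaPipe Hands reports, uses synthetic landmark values and random, untrained weights, and the vocabulary, hidden size, and function names are hypothetical. The speech step assumes the "PYTT3" module named in the abstract refers to pyttsx3.

```python
import math
import random

NUM_LANDMARKS = 21              # MediaPipe Hands reports 21 landmarks per hand
FEATURES = NUM_LANDMARKS * 3    # x, y, z per landmark
HIDDEN = 8                      # hypothetical hidden-state size
WORDS = ["hello", "thanks", "yes"]  # hypothetical vocabulary


def init_matrix(rows, cols, rng):
    """Small random weights; a real system would load trained weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_sequence(frames, w_in, w_rec, w_out):
    """Run an Elman-style RNN over the frames; return (word, probabilities)."""
    hidden = [0.0] * HIDDEN
    for frame in frames:
        from_input = matvec(w_in, frame)
        from_memory = matvec(w_rec, hidden)
        # The hidden state carries information from earlier frames forward.
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_memory)]
    probs = softmax(matvec(w_out, hidden))
    best = max(range(len(WORDS)), key=probs.__getitem__)
    return WORDS[best], probs


def speak(text):
    """Voice the text with pyttsx3 if it is available; return success."""
    try:
        import pyttsx3  # assumption: this is the module the abstract means
        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
        return True
    except Exception:
        return False


rng = random.Random(0)
w_in = init_matrix(HIDDEN, FEATURES, rng)
w_rec = init_matrix(HIDDEN, HIDDEN, rng)
w_out = init_matrix(len(WORDS), HIDDEN, rng)

# Synthetic stand-in for 30 frames of landmark coordinates.
frames = [[rng.random() for _ in range(FEATURES)] for _ in range(30)]
word, probs = classify_sequence(frames, w_in, w_rec, w_out)
```

Calling `speak(word)` afterwards would voice the predicted label. Because the weights here are random, the prediction is meaningless; the sketch only shows how a recurrent hidden state accumulates evidence across frames before a single word label is produced and passed to speech output.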
Author Jawre, Bhushan
Vineetha, K. V.
Author_xml – sequence: 1
  givenname: Bhushan
  surname: Jawre
  fullname: Jawre, Bhushan
  organization: Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vidhya Vidyapeetham, Bangalore, India
– sequence: 2
  givenname: K. V.
  surname: Vineetha
  fullname: Vineetha, K. V.
  email: jain_vineetha@blr.amrita.edu
  organization: Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vidhya Vidyapeetham, Bangalore, India
CODEN APCPCS
ContentType Journal Article
Conference Proceeding
Copyright Author(s)
2024 Author(s). Published under an exclusive license by AIP Publishing.
Copyright_xml – notice: Author(s)
– notice: 2024 Author(s). Published under an exclusive license by AIP Publishing.
DBID 8FD
H8D
L7M
DOI 10.1063/5.0226657
DatabaseName Technology Research Database
Aerospace Database
Advanced Technologies Database with Aerospace
DatabaseTitle Technology Research Database
Aerospace Database
Advanced Technologies Database with Aerospace
DatabaseTitleList Technology Research Database

DeliveryMethod fulltext_linktorsrc
Discipline Physics
EISSN 1551-7616
Editor Doss, Arockia Selvakumar Arockia
Short, Michael
Schilberg, Daniel
Editor_xml – sequence: 1
  givenname: Arockia Selvakumar Arockia
  surname: Doss
  fullname: Doss, Arockia Selvakumar Arockia
  organization: Vellore Institute of Technology
– sequence: 2
  givenname: Michael
  surname: Short
  fullname: Short, Michael
  organization: Teesside University
– sequence: 3
  givenname: Daniel
  surname: Schilberg
  fullname: Schilberg, Daniel
  organization: Bochum University of Applied Sciences
ExternalDocumentID acp
Genre Conference Proceeding
ISSN 0094-243X
IngestDate Mon Jun 30 17:39:51 EDT 2025
Tue Jul 30 04:08:02 EDT 2024
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License Published under an exclusive license by AIP Publishing.
LinkModel OpenURL
MeetingName 4TH INTERNATIONAL CONFERENCE ON ROBOTICS, INTELLIGENT AUTOMATION AND CONTROL TECHNOLOGIES (RIACT2023)
MergedId FETCHMERGED-LOGICAL-p133t-370fb198d446f9f0c411810d14222d0684010e524f7f18a105e11cf6a6d3ea9f3
Notes ObjectType-Conference Proceeding-1
SourceType-Conference Papers & Proceedings-1
content type line 21
PQID 3085724096
PQPubID 2050672
PageCount 10
ParticipantIDs proquest_journals_3085724096
scitation_primary_10_1063_5_0226657
PublicationCentury 2000
PublicationDate 20240729
PublicationDateYYYYMMDD 2024-07-29
PublicationDate_xml – month: 07
  year: 2024
  text: 20240729
  day: 29
PublicationDecade 2020
PublicationPlace Melville
PublicationPlace_xml – name: Melville
PublicationTitle AIP conference proceedings
PublicationYear 2024
Publisher American Institute of Physics
Publisher_xml – name: American Institute of Physics
SSID ssj0029778
Score 2.363965
SourceID proquest
scitation
SourceType Aggregation Database
Publisher
SubjectTerms Algorithms
Artificial neural networks
Character recognition
Human performance
Object recognition
Real time
Recurrent neural networks
Sequences
Sign language
Speech recognition
Technology assessment
User experience
User interfaces
Words (language)
Title Real time word gesture detection and performance analysis using RCNN and RNN algorithms along with speech generation
URI http://dx.doi.org/10.1063/5.0226657
https://www.proquest.com/docview/3085724096
Volume 3216
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVEBS
  databaseName: Inspec with Full Text
  customDbUrl:
  eissn: 1551-7616
  dateEnd: 20241102
  omitProxy: false
  ssIdentifier: ssj0029778
  issn: 0094-243X
  databaseCode: ADMLS
  dateStart: 20000101
  isFulltext: true
  titleUrlDefault: https://www.ebsco.com/products/research-databases/inspec-full-text
  providerName: EBSCOhost
linkProvider EBSCOhost