Real time word gesture detection and performance analysis using RCNN and RNN algorithms along with speech generation
This study compares the effectiveness of two techniques for real-time word gesture detection: the RCNN algorithm and the RNN with Mediapipe algorithm. Furthermore, the research looks at the incorporation of speech generation to improve the user experience and enable seamless connection between human...
          | Published in | AIP conference proceedings Vol. 3216; no. 1 | 
|---|---|
| Main Authors | Jawre, Bhushan; Vineetha, K. V. | 
| Format | Journal Article Conference Proceeding | 
| Language | English | 
| Published | Melville: American Institute of Physics, 29.07.2024 | 
| Subjects | |
| Online Access | Get full text | 
| ISSN | 0094-243X 1551-7616  | 
| DOI | 10.1063/5.0226657 | 
| Abstract | This study compares two techniques for real-time word gesture detection: an RCNN algorithm and an RNN combined with MediaPipe. It also examines adding speech generation to improve the user experience and enable seamless interaction between humans and robots. The goal is to evaluate the performance of the two techniques and assess their practical applicability in areas such as human-computer interaction and assistive technology. The experimental results show the benefits and limits of each approach, offering insight into their distinct strengths and shortcomings. The RNN design supports sequential learning and memory retention, which suits the sequential character of sign language. The RCNN method, by contrast, detects and recognizes sign language gestures using a region-based convolutional neural network architecture. RCNN models can localize and classify objects within images, which makes them useful for recognizing hand shapes and motions in sign language videos or frames. The RCNN architecture extracts visual features from predetermined regions of interest, enabling precise and robust sign language identification. Both techniques have benefits and drawbacks. The RNN with MediaPipe has strong temporal modelling capability, allowing reliable identification of sign language sequences and tolerance of variations in timing and motion. The paper also describes a speech generation stage that runs after a gesture is recognized, added with the aim of improving accessible communication tools for people with disabilities. Speech is generated with the pyttsx3 Python module. | 
    
|---|---|
| Author | Jawre, Bhushan Vineetha, K. V.  | 
    
| Author_xml | – sequence: 1 givenname: Bhushan surname: Jawre fullname: Jawre, Bhushan organization: Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vidhya Vidyapeetham, Bangalore, India – sequence: 2 givenname: K. V. surname: Vineetha fullname: Vineetha, K. V. email: jain_vineetha@blr.amrita.edu organization: Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vidhya Vidyapeetham, Bangalore, India  | 
    
| CODEN | APCPCS | 
    
| ContentType | Journal Article Conference Proceeding  | 
    
| Copyright | Author(s) 2024 Author(s). Published under an exclusive license by AIP Publishing.  | 
    
| Copyright_xml | – notice: Author(s) – notice: 2024 Author(s). Published under an exclusive license by AIP Publishing.  | 
    
| DOI | 10.1063/5.0226657 | 
    
| DatabaseName | Technology Research Database Aerospace Database Advanced Technologies Database with Aerospace  | 
    
| DatabaseTitle | Technology Research Database Aerospace Database Advanced Technologies Database with Aerospace  | 
    
| DatabaseTitleList | Technology Research Database | 
    
| DeliveryMethod | fulltext_linktorsrc | 
    
| Discipline | Physics | 
    
| EISSN | 1551-7616 | 
    
| Editor | Doss, Arockia Selvakumar Arockia Short, Michael Schilberg, Daniel  | 
    
| Editor_xml | – sequence: 1 givenname: Arockia Selvakumar Arockia surname: Doss fullname: Doss, Arockia Selvakumar Arockia organization: Vellore Institute of Technology – sequence: 2 givenname: Michael surname: Short fullname: Short, Michael organization: Teesside University – sequence: 3 givenname: Daniel surname: Schilberg fullname: Schilberg, Daniel organization: Bochum University of Applied Sciences  | 
    
| ExternalDocumentID | acp | 
    
| Genre | Conference Proceeding | 
    
| ISSN | 0094-243X | 
    
| IngestDate | Mon Jun 30 17:39:51 EDT 2025 Tue Jul 30 04:08:02 EDT 2024  | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 1 | 
    
| Language | English | 
    
| License | Published under an exclusive license by AIP Publishing. | 
    
| LinkModel | OpenURL | 
    
| MeetingName | 4TH INTERNATIONAL CONFERENCE ON ROBOTICS, INTELLIGENT AUTOMATION AND CONTROL TECHNOLOGIES (RIACT2023) | 
    
| PQID | 3085724096 | 
    
| PQPubID | 2050672 | 
    
| PageCount | 10 | 
    
| ParticipantIDs | proquest_journals_3085724096 scitation_primary_10_1063_5_0226657  | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 20240729 | 
    
| PublicationDateYYYYMMDD | 2024-07-29 | 
    
| PublicationDate_xml | – month: 07 year: 2024 text: 20240729 day: 29  | 
    
| PublicationDecade | 2020 | 
    
| PublicationPlace | Melville | 
    
| PublicationPlace_xml | – name: Melville | 
    
| PublicationTitle | AIP conference proceedings | 
    
| PublicationYear | 2024 | 
    
| Publisher | American Institute of Physics | 
    
| Publisher_xml | – name: American Institute of Physics | 
    
| SSID | ssj0029778 | 
    
| Score | 2.363965 | 
    
| SourceID | proquest scitation  | 
    
| SourceType | Aggregation Database Publisher  | 
    
| SubjectTerms | Algorithms Artificial neural networks Character recognition Human performance Object recognition Real time Recurrent neural networks Sequences Sign language Speech recognition Technology assessment User experience User interfaces Words (language)  | 
    
| Title | Real time word gesture detection and performance analysis using RCNN and RNN algorithms along with speech generation | 
    
| URI | http://dx.doi.org/10.1063/5.0226657 https://www.proquest.com/docview/3085724096  | 
    
| Volume | 3216 | 
    
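
The abstract describes a speech generation stage that runs after a gesture is recognized, but it does not say how per-frame predictions are turned into spoken words. The following is a minimal sketch of one plausible hand-off, not the authors' method: it assumes a recognizer (RCNN or RNN with MediaPipe) emits one word label per frame, and it voices a word only once it dominates a sliding window of recent frames. The class `GestureToSpeech`, its parameters `window` and `min_share`, and the helper `make_pyttsx3_speaker` are hypothetical names introduced for illustration. The speech backend is assumed to be the pyttsx3 module, which the record appears to refer to as "PYTT3".

```python
from collections import Counter, deque
from typing import Callable, Deque, Optional


class GestureToSpeech:
    """Debounce per-frame gesture labels and voice each stable word once.

    Hypothetical helper: the source does not describe this logic.
    """

    def __init__(
        self,
        speak: Callable[[str], None],
        window: int = 15,        # assumed number of recent frames to consider
        min_share: float = 0.8,  # assumed share of frames that must agree
    ) -> None:
        self.speak = speak
        self.window = window
        self.min_share = min_share
        self.recent: Deque[str] = deque(maxlen=window)
        self.last_spoken: Optional[str] = None

    def update(self, label: str) -> Optional[str]:
        """Add one frame's predicted label; return the word if it was spoken."""
        self.recent.append(label)
        if len(self.recent) < self.window:
            return None  # not enough frames yet to judge stability
        word, count = Counter(self.recent).most_common(1)[0]
        stable = count / self.window >= self.min_share
        if stable and word != self.last_spoken:
            self.last_spoken = word
            self.speak(word)
            return word
        return None


def make_pyttsx3_speaker() -> Callable[[str], None]:
    """Build a speak() callable backed by pyttsx3 (assumed to be installed)."""
    import pyttsx3  # third-party; imported lazily so the class works without it

    engine = pyttsx3.init()

    def speak(text: str) -> None:
        engine.say(text)
        engine.runAndWait()  # blocks until the utterance finishes

    return speak
```

In use, one would construct `GestureToSpeech(make_pyttsx3_speaker())` and call `update(label)` once per video frame. Passing the speaker in as a callable keeps the debouncing logic independent of the audio backend, so it can be exercised with a plain list's `append` method in place of real speech.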