CoAtNet for Chest X-Ray Report Generation with Bi-LSTM and Multi-Head Attention
In clinical settings, the chest X-ray (CXR) is the most common diagnostic tool, supporting diagnosis through the accompanying medical report. However, manual report preparation is time-consuming, highly dependent on the expertise of radiologists, and carries the risk of er...
| Published in | Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 7; no. 4; pp. 654 - 672 |
|---|---|
| Main Authors | Akbar, Rafy Aulia; Putra, Ricky Eka; Yustanti, Wiyli |
| Format | Journal Article |
| Language | English |
| Published | 20.10.2025 |
| ISSN | 2656-8624 |
| DOI | 10.35882/ijeeemi.v7i4.271 |
| Abstract | In clinical settings, the chest X-ray (CXR) is the most common diagnostic tool, supporting diagnosis through the accompanying medical report. However, preparing reports manually is time-consuming, depends heavily on radiologist expertise, and is prone to errors caused by high workloads and a shortage of expert staff. An automated system based on artificial intelligence is therefore needed to ease radiologists' workload while improving consistency. This study aims to develop an automated medical report generation system with a balanced data distribution, a reliable encoder, and bidirectional contextual understanding. Its main contributions are: an undersampling strategy based on majority captions, followed by oversampling of minority labels while maintaining a proportion of labels with higher frequencies; the use of Bi-LSTM with Multi-Head Attention (MHA) to strengthen understanding of text context; and the use of CoAtNet as a visual encoder that combines the strengths of CNNs and Transformers. The methodology comprises image preprocessing via gamma correction for contrast improvement, data selection, balancing through combined undersampling and oversampling, and CoAtNet as the encoder paired with Bi-LSTM and MHA as the decoder. Experiments used the IU X-ray dataset and were evaluated with the BLEU and ROUGE-L metrics. The CoAtNet configuration with Bi-LSTM and MHA, coupled with the undersampling-oversampling strategy, performed best, with a cumulative score of 1.642 (which equals the sum of the five reported scores); BLEU-1 to BLEU-4 and ROUGE-L reached 0.480, 0.329, 0.245, 0.183, and 0.405, respectively. These findings indicate that combining data balancing strategies with CoAtNet and Bi-LSTM can produce more accurate automated medical reports and reduce bias toward the majority label. |
    
|---|---|
| Author | Yustanti, Wiyli; Akbar, Rafy Aulia; Putra, Ricky Eka |
    
| Author_xml | 1. Akbar, Rafy Aulia (ORCID: 0009-0003-6991-0694); 2. Putra, Ricky Eka (ORCID: 0000-0002-5515-7967); 3. Yustanti, Wiyli (ORCID: 0000-0002-9574-7072) |
    
| IsDoiOpenAccess | true | 
    
| IsOpenAccess | true | 
    
| IsPeerReviewed | false | 
    
| IsScholarly | true | 
    
| License | https://creativecommons.org/licenses/by-sa/4.0 cc-by-sa  | 
    
| PageCount | 19 | 
    
| URI | https://www.ijeeemi.org/index.php/ijeeemi/article/download/271/241 | 
    
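
The abstract names gamma correction as the image preprocessing step used for contrast improvement. The record gives neither the gamma value nor the implementation, so what follows is only a minimal sketch of the standard power-law transform, out = 255 * (in / 255) ** gamma, on an 8-bit grayscale image. The function name `gamma_correct`, the list-of-rows image representation, and the example gamma values are assumptions made for illustration, not details from the article.

```python
def gamma_correct(image, gamma):
    """Apply power-law (gamma) correction to an 8-bit grayscale image.

    `image` is a list of rows, each a list of integer intensities in [0, 255].
    With this convention, gamma < 1 brightens mid-tones and gamma > 1 darkens them.
    Both the representation and the convention are illustrative assumptions.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Precompute a 256-entry lookup table so each pixel is a single index operation.
    lut = [round(255 * (v / 255) ** gamma) for v in range(256)]
    return [[lut[pixel] for pixel in row] for row in image]


# Hypothetical 2x3 image; the gamma values here are arbitrary examples.
sample = [[0, 64, 128], [192, 255, 32]]
brightened = gamma_correct(sample, 0.5)
darkened = gamma_correct(sample, 2.0)
```

Pure black (0) and pure white (255) are unchanged for any positive gamma; only the intermediate intensities shift, which is why the transform is commonly used to adjust contrast in under- or over-exposed radiographs.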