Performance of ChatGPT-3.5 and ChatGPT-4 in the field of specialist medical knowledge on National Specialization Exam in neurosurgery
| Published in | Annales Academiae Medicae Silesiensis, Vol. 78, pp. 253-258 |
|---|---|
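| Main Authors | Maciej Laskowski, Marcin Ciekalski, Marcin Laskowski, Bartłomiej Błaszczyk, Marcin Setlak, Piotr Paździora, Adam Rudnik |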
| Format | Journal Article | 
| Language | English | 
| Published | Śląski Uniwersytet Medyczny w Katowicach, 15.10.2024 |
| ISSN | 0208-5607; 1734-025X (electronic) |
| DOI | 10.18794/aams/186827 | 
| Abstract | Introduction: The number of publications on artificial intelligence (AI) has grown recently, both in medicine in general and in neurosurgery in particular. Studies integrating AI into neurosurgical practice suggest an ongoing shift towards greater dependence on AI-assisted tools for diagnostics, image analysis, and decision-making. Material and methods: The study evaluated the performance of ChatGPT-3.5 and ChatGPT-4 on the neurosurgery exam from Autumn 2017, the latest exam with officially provided answers on the website of the Medical Examinations Center in Łódź, Poland (Centrum Egzaminów Medycznych – CEM). The passing score for the National Specialization Exam (Państwowy Egzamin Specjalizacyjny – PES) in Poland, as administered by CEM, is 56% of the valid questions. After four outdated questions were eliminated, the exam comprised 116 single-choice questions, which were categorized into ten thematic groups by subject. For data collection, both ChatGPT versions were briefed on the exam rules and asked to rate their confidence in each answer on a scale from 1 (definitely not sure) to 5 (definitely sure). All interactions were conducted in Polish and recorded. Results: ChatGPT-4 significantly outperformed ChatGPT-3.5, by a margin of 29.4% (p < 0.001). Unlike ChatGPT-3.5, ChatGPT-4 reached the passing threshold for the PES. The two models gave the same answer to 61 questions (52.58%): both were correct on 28 questions (24.14%) and both were incorrect on 33 questions (28.45%). Conclusions: ChatGPT-4 shows improved accuracy over ChatGPT-3.5, likely due to advanced algorithms and a broader training dataset, highlighting its better grasp of complex neurosurgical concepts. |
| Author affiliations | Maciej Laskowski (ORCID 0009-0005-5809-0875) and Marcin Ciekalski (ORCID 0000-0003-1392-2007): Students’ Scientific Club, Department of Neurosurgery, Faculty of Medical Sciences in Katowice, Medical University of Silesia, Katowice, Poland. Marcin Laskowski: Unhyped, AI Growth Partner, Kraków, Poland. Bartłomiej Błaszczyk, Marcin Setlak, Piotr Paździora, and Adam Rudnik: Department of Neurosurgery, Faculty of Medical Sciences in Katowice, Medical University of Silesia, Katowice, Poland. |
| Access and license | Open access; peer reviewed; CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0) |
| Subjects | artificial intelligence (AI); ChatGPT; neurosurgery |
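
The figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names `min_correct_to_pass` and `share` are hypothetical, and it assumes that the 56% threshold means "at least 56% of the 116 valid questions" and that all reported percentages are taken over those 116 questions.

```python
# Values taken from the abstract.
VALID_QUESTIONS = 116   # questions remaining after four were eliminated
PASS_PERCENT = 56       # PES passing score, as a percentage of valid questions


def min_correct_to_pass(n_questions: int, pass_percent: int) -> int:
    """Smallest whole number of correct answers that reaches the threshold.

    Integer ceiling division avoids floating-point rounding issues.
    Assumes the threshold is inclusive ("at least pass_percent").
    """
    return (n_questions * pass_percent + 99) // 100


def share(count: int, total: int) -> float:
    """Return `count` as a percentage of `total`, unrounded."""
    return 100.0 * count / total


if __name__ == "__main__":
    print(min_correct_to_pass(VALID_QUESTIONS, PASS_PERCENT))  # 65
    print(f"{share(61, VALID_QUESTIONS):.2f}")  # 52.59 (same answer)
    print(f"{share(28, VALID_QUESTIONS):.2f}")  # 24.14 (both correct)
    print(f"{share(33, VALID_QUESTIONS):.2f}")  # 28.45 (both incorrect)
```

Under these assumptions a model needs 65 correct answers to pass, and the 28 jointly correct plus 33 jointly incorrect answers add up to the 61 identical answers. The 61-question share rounds to 52.59%, so the 52.58% in the abstract appears to be truncated rather than rounded.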