Evaluating ChatGPT-4's performance on oral and maxillofacial queries: Chain of Thought and standard method

Bibliographic Details
Published in Frontiers in Oral Health Vol. 6; p. 1541976
Main Authors Ji, Kaiyuan, Wu, Zhihan, Han, Jing, Zhai, Guangtao, Liu, Jiannan
Format Journal Article
Language English
Published Switzerland Frontiers Media SA 2025
ISSN 2673-4842
DOI 10.3389/froh.2025.1541976

Summary: Oral and maxillofacial diseases affect approximately 3.5 billion people worldwide. With the continuing advancement of artificial intelligence technologies, particularly generative pre-trained transformers such as ChatGPT-4, there is potential to raise public awareness of the prevention and early detection of these diseases. This study evaluated ChatGPT-4's performance on oral and maxillofacial disease questions using a standard approach and the Chain of Thought (CoT) method, aiming to better understand its capabilities, potential, and limitations. Three experts, drawing on their extensive experience and the most common questions in clinical settings, selected 130 open-ended questions and 1,805 multiple-choice questions from the national dental licensing examination. These questions cover 12 areas of oral and maxillofacial surgery, including Prosthodontics, Pediatric Dentistry, Maxillofacial Tumors and Salivary Gland Diseases, and Maxillofacial Infections. With the CoT approach, ChatGPT-4 showed marked improvements in accuracy, structure, completeness, professionalism, and overall impression on open-ended questions, with statistically significant differences compared to its performance on general oral and maxillofacial inquiries. For multiple-choice questions, the CoT method improved ChatGPT-4's accuracy across all major subjects, for an overall accuracy increase of 3.1%. When ChatGPT-4 is used to answer questions in oral and maxillofacial surgery, querying with CoT can enhance its performance and help the public improve their understanding and awareness of such issues. However, it should not be considered a substitute for doctors.
Zhi-Cheng Li, Chinese Academy of Sciences (CAS), China
Fa-yu Liu, China Medical University, China
These authors have contributed equally to this work
Edited by: Leandro Napier Souza, Federal University of Minas Gerais, Brazil
Reviewed by: Khalid Almas, Imam Abdulrahman Bin Faisal University, Saudi Arabia