Testing the knowledge of artificial intelligence chatbots in pharmacology: examples of two groups of drugs

Bibliographic Details
Published in	PeerJ. Computer science Vol. 11; p. e2954
Main Authors	Granat, Marcin Mateusz, Paź, Aleksandra, Mirowska-Guzel, Dagmara
Format	Journal Article
Language	English
Published	PeerJ. Ltd 15.07.2025 PeerJ Inc
Subjects	AI chatbots; Antifungal agents; Antifungal drugs; Artificial intelligence; Hypolipidemic drugs; Medical students; Pharmacology; United States; California
ISSN	2376-5992
DOI	10.7717/peerj-cs.2954

Summary:	The study aimed to evaluate eight artificial intelligence (AI) chatbots (ChatGPT-3.5, Microsoft Copilot, Gemini, You.com, Perplexity, Character.ai, Claude 3.5, and ChatRTX) in answering questions on two topics taught in the basic pharmacology curriculum for medical students: antifungal drugs and hypolipidemic drugs. Chatbot performance was assessed with 60 single-choice questions on antifungal and hypolipidemic drugs. Each question had four answer options (a, b, c, and d), and the chatbot's task was to choose the correct one. The assessment was performed twice, one year apart, to determine whether the chatbots' effectiveness changed over time. All answers were marked right or wrong according to up-to-date pharmacology knowledge. To make the results clearer, each score was assigned a mark based on the grading system applied in our unit. Statistica software version 13.3 and Microsoft Excel 2010 were used for statistical analysis. In 2023, the best results on antifungal drugs were obtained by Gemini (formerly Bard) and on hypolipidemic drugs by You.com (formerly YouChat). In 2024, Microsoft Copilot answered the highest number of questions correctly in both topics. The total results of all AI chatbots in 2023 and 2024 were compared using a t-test for dependent samples. Statistical analysis revealed that the chatbots improved over time in both pharmacological topics, but the change was not statistically significant (p = 0.784 for antifungal drugs and p = 0.056 for hypolipidemic drugs). The accuracy of AI chatbots' responses regarding antifungal and hypolipidemic drugs improved over one year, though not significantly. None of the tested AI systems answered all questions in these pharmacological fields correctly.
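
The summary reports that 2023 and 2024 results were compared with a t-test for dependent samples. The sketch below illustrates how such a paired statistic is computed from per-chatbot scores. It is a minimal illustration only, not the authors' analysis (which used Statistica and Excel): the function name `paired_t_statistic` and the score lists are hypothetical placeholders, not data from the study.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired (dependent-samples) t-test.

    `before` and `after` hold one score per subject, in the same order,
    e.g. each chatbot's number of correct answers in two test rounds.
    """
    if len(before) != len(after):
        raise ValueError("both rounds must contain the same subjects")
    if len(before) < 2:
        raise ValueError("at least two paired observations are required")

    # The test works on the per-subject differences, not on the raw scores.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    # t = mean difference divided by its standard error.
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical placeholder scores (NOT the study's data).
scores_first_round = [20, 22, 25, 18, 21]
scores_second_round = [23, 21, 27, 20, 25]

t, df = paired_t_statistic(scores_first_round, scores_second_round)
print(f"t = {t:.3f}, df = {df}")
```

The p-value then follows from the t distribution with n - 1 degrees of freedom, which a statistics package would normally supply. Pairing matters here because the same chatbots are measured in both rounds, so the test asks whether the average per-chatbot change differs from zero.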