A systematic review and meta-analysis of artificial intelligence versus clinicians for skin cancer diagnosis

Bibliographic Details
Published in NPJ digital medicine Vol. 7; no. 1; pp. 125 - 23
Main Authors Salinas, Maria Paz, Sepúlveda, Javiera, Hidalgo, Leonel, Peirano, Dominga, Morel, Macarena, Uribe, Pablo, Rotemberg, Veronica, Briones, Juan, Mery, Domingo, Navarrete-Dechent, Cristian
Format Journal Article
Language English
Published London Nature Publishing Group UK 14.05.2024
Nature Publishing Group
Nature Portfolio
ISSN 2398-6352
DOI 10.1038/s41746-024-01103-x

Summary: Scientific research on artificial intelligence (AI) in dermatology has increased exponentially. The objective of this study was to perform a systematic review and meta-analysis to evaluate the performance of AI algorithms for skin cancer classification in comparison to clinicians with different levels of expertise. Following PRISMA guidelines, 3 electronic databases (PubMed, Embase, and Cochrane Library) were screened for relevant articles up to August 2022. The quality of the studies was assessed using QUADAS-2. A meta-analysis of sensitivity and specificity was performed to compare the accuracy of AI and clinicians. Fifty-three studies were included in the systematic review, and 19 met the inclusion criteria for the meta-analysis. Considering all studies and all subgroups of clinicians, we found a sensitivity (Sn) and specificity (Sp) of 87.0% and 77.1% for AI algorithms, respectively, and a Sn of 79.78% and Sp of 73.6% for all clinicians (overall); differences were statistically significant for both Sn and Sp. The difference between AI (Sn 92.5%, Sp 66.5%) and generalists (Sn 64.6%, Sp 72.8%) was greater than the difference between AI and expert clinicians. The performance of AI algorithms (Sn 86.3%, Sp 78.4%) and expert dermatologists (Sn 84.2%, Sp 74.4%) was clinically comparable. Limitations of AI algorithms in clinical practice should be considered, and future studies should focus on real-world settings and on AI assistance.
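For readers less familiar with the two metrics, the following minimal Python sketch recalls the standard definitions of sensitivity and specificity and tabulates the AI-minus-clinician gaps using only the pooled percentages quoted in the summary. The function names and the `REPORTED` table are illustrative and not taken from the article, and the simple subtraction is not the study's meta-analytic method: it says nothing about statistical significance.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly malignant lesions that are flagged: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of truly benign lesions that are cleared: TN / (TN + FP)."""
    return tn / (tn + fp)


# Pooled (Sn, Sp) percentages copied from the summary, one entry per comparison.
# The dictionary layout is an assumption made for illustration only.
REPORTED = {
    "all clinicians": {"ai": (87.0, 77.1), "clinicians": (79.78, 73.6)},
    "generalists": {"ai": (92.5, 66.5), "clinicians": (64.6, 72.8)},
    "expert dermatologists": {"ai": (86.3, 78.4), "clinicians": (84.2, 74.4)},
}


def differences(reported: dict) -> dict:
    """Return AI-minus-clinician gaps in percentage points as (dSn, dSp)."""
    gaps = {}
    for group, values in reported.items():
        ai_sn, ai_sp = values["ai"]
        cl_sn, cl_sp = values["clinicians"]
        gaps[group] = (round(ai_sn - cl_sn, 2), round(ai_sp - cl_sp, 2))
    return gaps


if __name__ == "__main__":
    for group, (d_sn, d_sp) in differences(REPORTED).items():
        print(f"AI vs {group}: dSn = {d_sn:+.2f} pp, dSp = {d_sp:+.2f} pp")
```

On these figures the sensitivity gap is largest against generalists and smallest against expert dermatologists, which matches the summary's wording. Against generalists the specificity gap is negative, meaning the pooled AI specificity was lower in that comparison.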