Performance of a deep neural network in teledermatology: a single‐centre prospective diagnostic study
| Published in | Journal of the European Academy of Dermatology and Venereology Vol. 35; no. 2; pp. 546 - 553 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | England 01.02.2021 |
| ISSN | 0926-9959 1468-3083 |
| DOI | 10.1111/jdv.16979 |
| Summary: | Background
The use of artificial intelligence (AI) algorithms for the diagnosis of skin diseases has shown promise in experimental settings but has not yet been tested in real‐life conditions.
Objective
To assess the diagnostic performance and potential clinical utility of a 174‐multiclass AI algorithm in a real‐life telemedicine setting.
Methods
Prospective, diagnostic accuracy study including consecutive patients who submitted images for teledermatology evaluation. The treating dermatologist chose a single image to upload to a web application during teleconsultation. A follow‐up reader study including nine healthcare providers (3 dermatologists, 3 dermatology residents and 3 general practitioners) was performed.
Results
A total of 340 cases from 281 patients met study inclusion criteria. The mean (SD) age of patients was 33.7 (17.5) years; 63% (n = 177) were female. Exposure to the AI algorithm results was considered useful in 11.8% of visits (n = 40) and the teledermatologist correctly modified the real‐time diagnosis in 0.6% (n = 2) of cases. The overall top‐1 accuracy of the algorithm (41.2%) was lower than that of the dermatologists (60.1%), residents (57.8%) and general practitioners (49.3%) (all comparisons P < 0.05, in the reader study). When the analysis was limited to the diagnoses on which the algorithm had been explicitly trained, the balanced top‐1 accuracy of the algorithm (47.6%) was comparable to the dermatologists (49.7%) and residents (47.7%) but superior to the general practitioners (39.7%; P = 0.049). Algorithm performance was associated with patient skin type and image quality.
Conclusions
A 174‐disease class AI algorithm appears to be a promising tool in the triage and evaluation of lesions with patient‐taken photographs via telemedicine. |
|---|---|
| Bibliography: | Co-senior authors. |
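
The summary reports both an overall top‐1 accuracy and a balanced top‐1 accuracy. The sketch below illustrates how the two can differ on the same predictions. It assumes that "balanced" means the unweighted mean of per‐class recall, which is a common definition but is not confirmed by the summary. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def top1_accuracy(y_true, y_pred):
    """Fraction of cases whose single top-ranked prediction matches the reference diagnosis."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_top1_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition of 'balanced')."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = defaultdict(int)    # correct top-1 predictions per reference class
    totals = defaultdict(int)  # number of cases per reference class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    # Every reference class contributes equally, however many cases it has.
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical toy data: one common class and two rare ones.
reference = ["acne", "acne", "acne", "acne", "eczema", "psoriasis"]
predicted = ["acne", "acne", "acne", "acne", "acne", "psoriasis"]

overall = top1_accuracy(reference, predicted)            # 5 of 6 cases correct
balanced = balanced_top1_accuracy(reference, predicted)  # mean of recalls 1, 0 and 1
print(f"top-1: {overall:.3f}, balanced top-1: {balanced:.3f}")
```

In this toy case the overall figure is dominated by the common class, while the balanced figure gives each diagnosis equal weight, so a single missed rare diagnosis lowers it more.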