Evaluating a clinically available artificial intelligence model for intracranial aneurysm detection: a multi-reader study and algorithmic audit
| Published in | Neuroradiology Vol. 67; no. 4; pp. 855 - 864 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Berlin/Heidelberg: Springer Berlin Heidelberg, 01.04.2025; Springer Nature B.V |
| ISSN | 0028-3940, 1432-1920 |
| DOI | 10.1007/s00234-024-03536-3 |
| Summary: | Purpose
We aimed to validate a clinically available artificial intelligence (AI) model to assist general radiologists in the detection of intracranial aneurysm (IA) in a multi-reader multi-case (MRMC) study, and to explore its performance in routine clinical settings.
Methods
Two distinct cohorts of head CT angiography (CTA) data were assembled to validate an AI model. Cohort 1, comprising gold-standard consecutive CTA cases, was used in an MRMC study involving six board-certified general radiologists. Cohort 2, representing clinical CTA cases, was used to simulate a routine clinical setting. Following these evaluations, an algorithmic audit was conducted to identify any unusual or unexpected behaviors exhibited by the model.
Results
Cohort 1 consisted of 131 CTA cases, while Cohort 2 included 515 CTA cases. In the MRMC study, the AI-assisted strategy demonstrated a significant improvement in aneurysm diagnostic performance, with the area under the receiver operating characteristic curve increasing from 0.815 (95%CI: 0.754–0.875) to 0.875 (95%CI: 0.831–0.921; p = 0.008). In the AI-based first-reader study, 60.4% of the CTA cases were identified as negative by the AI, with a high negative predictive value of 0.994 (95%CI: 0.977–0.999). The algorithmic audit highlighted two issues for improvement: the accurate detection of tiny aneurysms and the effective exclusion of false-positive lesions.
Conclusion
This study highlights the clinical utility of a high-performance AI model in detecting IAs, significantly improving general radiologists’ diagnostic performance with the potential to reduce their workload in routine clinical practice. The algorithmic audit offers insights to guide the development and validation of future AI models. |
|---|---|
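For readers unfamiliar with the negative predictive value (NPV) quoted in the Results, the following minimal Python sketch shows how such a figure and a 95% confidence interval could be computed. It makes several assumptions: the counts (309 true negatives, 2 false negatives) are hypothetical, chosen only to be consistent with the reported proportions and not taken from the paper; the Wilson score interval is used for illustration, since the abstract does not state which interval method the authors applied; and the function name `npv_with_wilson_ci` is invented here.

```python
import math


def npv_with_wilson_ci(true_neg, false_neg, z=1.959964):
    """Return (NPV, lower, upper) using the Wilson score interval.

    NPV = TN / (TN + FN): the share of AI-negative cases that are
    truly negative. z = 1.959964 corresponds to a 95% interval.
    """
    n = true_neg + false_neg
    if n == 0:
        raise ValueError("at least one AI-negative case is required")
    p = true_neg / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, center - half_width, center + half_width


# Hypothetical counts (NOT reported in the abstract): 311 AI-negative
# cases, of which 309 are assumed to be truly negative.
npv, lower, upper = npv_with_wilson_ci(true_neg=309, false_neg=2)
print(f"NPV = {npv:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

With these illustrative counts the point estimate rounds to the reported 0.994. The Wilson bounds can differ slightly from an exact (Clopper-Pearson) interval, so small discrepancies from the published interval are expected.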