Multi-Institutional Evaluation and Training of Breast Density Classification AI Algorithm Using ACR Connect and AI-LAB

Bibliographic Details
Published in: Journal of the American College of Radiology, Vol. 22; no. 2; pp. 211-219
Main Authors: Brink, Laura; Romero, Ricardo Amaya; Coombs, Laura; Tilkin, Mike; Mazaheri, Sina; Gichoya, Judy; Zaiman, Zachary; Trivedi, Hari; Medina, Adam; Bizzo, Bernardo C.; Chang, Ken; Kalpathy-Cramer, Jayashree; Kalra, Mannudeep K.; Astuto, Bruno; Ramirez, Carolina; Majumdar, Sharmila; Lee, Amie Y.; Lee, Christoph I.; Cross, Nathan M.; Chen, Po-Hao; Ciancibello, Michael; Chiunda, Allan; Nachand, Douglas; Shah, Chintan; Wald, Christoph
Format: Journal Article
Language: English
Published: United States, Elsevier Inc, 01.02.2025
ISSN: 1546-1440, 1558-349X
DOI: 10.1016/j.jacr.2024.11.003

Summary: To demonstrate and test the capabilities of the ACR Connect and AI-LAB software platform by implementing multi-institutional artificial intelligence (AI) training and validation for breast density classification. In this proof-of-concept study, six US-based hospitals installed Connect and AI-LAB. A breast density algorithm was trained and tested on retrospective mammograms. We recorded time to receive institutional review board approval, to install software locally, and to complete the testing and training. We calculated the performance of the breast density algorithm at each participating hospital and compared it to the performance of a holdout multi-institutional clinical trial testing dataset and a retrospective multi-institutional dataset. We calculated the performance of the locally fine-tuned models on the holdout test datasets. The median time to receive institutional review board approval was 66 days, and the median time to successfully install Connect and AI-LAB locally was 157 days. The median time to complete breast density algorithm testing and training was 216 days. The breast density algorithm performed worse at each hospital than on the holdout test dataset, suggesting poor generalizability of the base model. The fine-tuned models had mixed performance locally and performed poorly on the test dataset. In this study, we demonstrate the successful installation and implementation of Connect and AI-LAB software platforms at six facilities using a breast density algorithm. Our results suggest poor generalizability of an algorithm trained on a single dataset and algorithms fine-tuned at individual institutions, emphasizing the hypothetical importance of multi-institutional testing and training.