Retrospective Clinical Trial to Evaluate the Effectiveness of a New Tanner–Whitehouse-Based Bone Age Assessment Algorithm Trained with a Deep Neural Network System

Bibliographic Details
Published in: Diagnostics (Basel), Vol. 15, no. 8, p. 993
Main Authors: Lee, Meesun; Choi, Young-Hun; Lee, Seul-Bi; Choi, Jae-Won; Lee, Seunghyun; Hwang, Jae-Yeon; Cheon, Jung-Eun; Hong, SungHyuk; Kim, Jeonghoon; Cho, Yeon-Jin
Format: Journal Article
Language: English
Published: Switzerland, MDPI AG, 14.04.2025
ISSN: 2075-4418
DOI: 10.3390/diagnostics15080993

Summary: Background/Objectives: To develop an automated deep learning-based bone age prediction model using the Tanner–Whitehouse (TW3) method and evaluate its feasibility by comparing its performance with that of pediatric radiologists. Methods: The hand and wrist radiographs of 560 Korean children and adolescents (280 female, 280 male, mean age 9.43 ± 2.92 years) were evaluated by the TW3-based model and by three pediatric radiologists. Images with bony destruction, congenital anomalies, or non-diagnostic quality were excluded. A commercialized AI solution built upon the Rotated Single Shot MultiBox Detector (SSD) and EfficientNet-B0 was used. Bone age measurements from the model and the radiologists were compared using paired t-tests. Linear regression analysis was performed, and the coefficient of determination (r²), mean absolute error (MAE), and root mean square error (RMSE) were measured. A Bland–Altman analysis was conducted, and the proportion of bone age predictions within 0.6 years of the radiologists’ assessments was calculated. Results: The bone age measurements of the TW3-based model did not differ significantly from those of the radiologists, except for participants aged <6 or >13 years (overall, p = 0.874; 6–8 years, p = 0.737; 8–9 years, p = 0.093; 9–10 years, p = 0.301; 10–11 years, p = 0.584; 11–13 years, p = 0.976; <6 or >13 years, p < 0.001). There was a strong linear correlation between the model predictions and the radiologists’ assessments (r² = 0.977). The RMSE and MAE of the model were 0.529 years (95% CI, 0.482–0.575) and 0.388 years (95% CI, 0.361–0.417), respectively. Overall, 82.3% of the model’s bone age predictions were within 0.6 years of the radiologists’ interpretation. Conclusions: Automated deep learning-based bone age assessment has the potential to reduce radiologists’ workload and provide standardized measurements for clinical decision making.
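
The agreement statistics named in the summary (MAE, RMSE, r², Bland–Altman bias and limits of agreement, and the share of predictions within 0.6 years) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `agreement_metrics`, its parameters, and the sample values are hypothetical, the 0.6-year tolerance is assumed to be inclusive, and the limits of agreement use the conventional bias ± 1.96 SD form, which the record does not confirm. The paired t-test and the confidence intervals are not covered.

```python
import math
from statistics import mean, stdev


def agreement_metrics(model_ages, reader_ages, tolerance=0.6):
    """Summarise agreement between two paired lists of bone ages (years).

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not taken from the published study.
    """
    if len(model_ages) != len(reader_ages):
        raise ValueError("inputs must have the same length")
    if len(model_ages) < 2:
        raise ValueError("need at least two paired measurements")

    # Signed difference per case: model minus reader.
    diffs = [m - r for m, r in zip(model_ages, reader_ages)]

    mae = mean(abs(d) for d in diffs)
    rmse = math.sqrt(mean(d * d for d in diffs))

    # Bland-Altman: mean difference (bias) and assumed 95% limits of agreement.
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)

    # Fraction of cases whose absolute difference is within the tolerance
    # (inclusive comparison is an assumption).
    within = sum(abs(d) <= tolerance for d in diffs) / len(diffs)

    # For simple linear regression, r^2 equals the squared Pearson correlation.
    mx, my = mean(model_ages), mean(reader_ages)
    sxy = sum((x - mx) * (y - my) for x, y in zip(model_ages, reader_ages))
    sxx = sum((x - mx) ** 2 for x in model_ages)
    syy = sum((y - my) ** 2 for y in reader_ages)
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined when either input is constant")
    r2 = sxy ** 2 / (sxx * syy)

    return {
        "mae": mae,
        "rmse": rmse,
        "bias": bias,
        "limits_of_agreement": limits,
        "within_tolerance": within,
        "r2": r2,
    }


if __name__ == "__main__":
    # Made-up values, not study data.
    model = [5.0, 7.5, 10.0, 12.0]
    readers = [5.5, 7.5, 9.0, 12.5]
    for name, value in agreement_metrics(model, readers).items():
        print(name, value)
```

MAE and RMSE are reported together because RMSE weights large errors more heavily, so a gap between the two hints at a few cases with large disagreement.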