Automated detection of spinal bone marrow oedema in axial spondyloarthritis: training and validation using two large phase 3 trial datasets
| Published in | Rheumatology (Oxford, England) Vol. 64; no. 10; pp. 5446 - 5454 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | England: Oxford University Press, 01.10.2025; Oxford Publishing Limited (England) |
| ISSN | 1462-0324 1462-0332 |
| DOI | 10.1093/rheumatology/keaf323 |
| Summary: | Abstract
Objective
To evaluate the performance of machine learning (ML) models for the automated scoring of spinal MRI bone marrow oedema (BMO) in patients with axial spondyloarthritis (axSpA) and compare them with expert scoring.
Methods
ML algorithms using SpineNet software were trained and validated on 3483 spinal MRIs from 686 axSpA patients across two clinical trial datasets. The scoring pipeline involved (i) detection and labelling of vertebral bodies and (ii) classification of vertebral units for the presence or absence of BMO. Two models were tested: Model 1, without manual segmentation, and Model 2, incorporating an intermediate manual segmentation step. Model outputs were compared with those of human experts using kappa statistics, balanced accuracy, sensitivity, specificity and AUC.
Results
Both models performed comparably to expert readers in classifying the presence vs absence of BMO. Model 1 outperformed Model 2, with an AUC of 0.94 (vs 0.88), accuracy of 75.8% (vs 70.5%) and kappa of 0.50 (vs 0.31) using absolute reader consensus scoring as the external reference; this performance was similar to the expert inter-reader accuracy of 76.8% and kappa of 0.47 in a radiographic axSpA dataset. In a non-radiographic axSpA dataset, Model 1 achieved an AUC of 0.97 (vs 0.91 for Model 2), accuracy of 74.6% (vs 70%) and kappa of 0.52 (vs 0.27), comparable to the expert inter-reader accuracy of 74.2% and kappa of 0.46.
Conclusion
ML software shows potential for automated MRI BMO assessment in axSpA, offering benefits such as improved consistency, reduced labour costs and minimized inter- and intra-reader variability.
Trial registration
Clinicaltrials.gov, http://clinicaltrials.gov, MEASURE 1 study (NCT01358175); PREVENT study (NCT02696031). |
|---|---|
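
The agreement statistics named in the Methods (kappa, balanced accuracy, sensitivity, specificity and AUC) can be illustrated with a minimal Python sketch. This is not the authors' code and does not use SpineNet. The function names `agreement_metrics` and `auc`, the `AgreementMetrics` container and the toy labels are all hypothetical, and it assumes one binary call per vertebral unit (1 = BMO present, 0 = absent) plus, for the AUC, a continuous model score.

```python
from dataclasses import dataclass


@dataclass
class AgreementMetrics:
    sensitivity: float
    specificity: float
    balanced_accuracy: float
    kappa: float


def agreement_metrics(reference, predicted):
    """Compare binary BMO calls (1 = present, 0 = absent) against a reference.

    Hypothetical helper for illustration; inputs are equal-length sequences.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both classes")
    n = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Balanced accuracy is the mean of sensitivity and specificity.
    balanced_accuracy = (sensitivity + specificity) / 2
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return AgreementMetrics(sensitivity, specificity, balanced_accuracy, kappa)


def auc(reference, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores (ties count as half a win)."""
    pos = [s for r, s in zip(reference, scores) if r == 1]
    neg = [s for r, s in zip(reference, scores) if r == 0]
    if not pos or not neg:
        raise ValueError("reference must contain both classes")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
reference = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = agreement_metrics(reference, predicted)
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise AUC is quadratic in the number of units, which is acceptable for a sketch; a rank-based formulation or an established statistics library would be the usual choice on datasets the size of those described in the abstract.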