A Novel Fundus Image Reading Tool for Efficient Generation of a Multi-dimensional Categorical Image Database for Machine Learning Algorithm Training

Bibliographic Details
Published in Journal of Korean medical science Vol. 33; no. 43; pp. e239 - 12
Main Authors Park, Sang Jun, Shin, Joo Young, Kim, Sangkeun, Son, Jaemin, Jung, Kyu-Hwan, Park, Kyu Hyung
Format Journal Article
Language English
Published Korea (South) The Korean Academy of Medical Sciences 22.10.2018
대한의학회
ISSN 1011-8934, 1598-6357
DOI 10.3346/jkms.2018.33.e239

Summary: We describe a novel multi-step retinal fundus image reading system designed to provide large, high-quality datasets for machine learning algorithms, and we assess grader variability in the large-scale dataset generated with this system. A 5-step retinal fundus image reading tool was developed that rates image quality, presence of abnormality, findings with location information, diagnoses, and clinical significance. Each image was evaluated by 3 different graders, and agreement among graders was evaluated for each decision. A total of 234,242 readings of 79,458 images were collected from 55 licensed ophthalmologists over 6 months. Of these images, 34,364 were graded as abnormal by at least one rater. Among those, all three raters agreed on abnormality in 46.6%, and 69.9% were rated as abnormal by two or more raters. Agreement of at least two raters on a given finding ranged from 26.7% to 65.2%, and complete agreement among all three raters ranged from 5.7% to 43.3%. For diagnoses, agreement of at least two raters ranged from 35.6% to 65.6%, and complete agreement ranged from 11.0% to 40.0%. Agreement on findings and diagnoses was higher when restricted to images with prior complete agreement on abnormality. Retinal and glaucoma specialists showed higher agreement on findings and diagnoses within their respective subspecialties. This novel reading tool for retinal fundus images generated a large-scale dataset with a high level of information, which can be used in the future development of machine learning-based algorithms for automated identification of abnormal conditions and of clinical decision support systems. These results emphasize the importance of addressing grader variability in algorithm development.
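The two agreement measures reported in the summary (agreement of at least two raters, and complete agreement of all three) can be illustrated with a minimal sketch. This is not the authors' code: the function name `agreement_rates`, the tuple-of-booleans data layout, and the toy data are assumptions made only for illustration, and the rates are taken over images flagged by at least one grader, as in the abnormality figures above.

```python
from typing import Iterable, Tuple


def agreement_rates(flags: Iterable[Tuple[bool, bool, bool]]) -> Tuple[float, float]:
    """Return (at_least_two, all_three) agreement as fractions of the
    images that at least one of the three graders flagged.

    Each item in `flags` holds the three graders' yes/no decisions for
    one image (hypothetical layout, not the study's data format).
    """
    # Keep only images flagged by one or more graders (the denominator).
    flagged = [f for f in flags if any(f)]
    if not flagged:
        raise ValueError("no image was flagged by any grader")
    # Booleans sum as 0/1, so sum(f) counts the graders who flagged the image.
    at_least_two = sum(1 for f in flagged if sum(f) >= 2) / len(flagged)
    all_three = sum(1 for f in flagged if all(f)) / len(flagged)
    return at_least_two, all_three


# Toy decisions for five images (made up, not taken from the study).
toy = [
    (True, True, True),
    (True, True, False),
    (True, False, False),
    (False, False, False),
    (False, True, True),
]

two_plus, complete = agreement_rates(toy)
print(f"at least two: {two_plus:.2f}, all three: {complete:.2f}")
# prints "at least two: 0.75, all three: 0.25"
```

The same function could be applied per finding or per diagnosis by passing the three graders' decisions for that single label.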
Sang Jun Park and Joo Young Shin contributed equally to this work.