A Novel Fundus Image Reading Tool for Efficient Generation of a Multi-dimensional Categorical Image Database for Machine Learning Algorithm Training

Bibliographic Details
Published in	Journal of Korean medical science Vol. 33; no. 43; pp. e239 - 12
Main Authors	Park, Sang Jun, Shin, Joo Young, Kim, Sangkeun, Son, Jaemin, Jung, Kyu-Hwan, Park, Kyu Hyung
Format	Journal Article
Language	English
Published	Korea (South): The Korean Academy of Medical Sciences (대한의학회), 22.10.2018
Subjects	Databases, Factual; Fundus Oculi; Humans; Machine Learning; Original; Republic of Korea; Retina - diagnostic imaging; General Medicine (의학일반); Retina; Fundus Image; Reading Tool; Deep Learning; Grader
ISSN	1011-8934, 1598-6357
DOI	10.3346/jkms.2018.33.e239

Summary:	We describe a novel multi-step retinal fundus image reading system for providing large volumes of high-quality data for machine learning algorithms, and assess the grader variability in the large-scale dataset generated with this system. A 5-step retinal fundus image reading tool was developed that rates image quality, presence of abnormality, findings with location information, diagnoses, and clinical significance. Each image was evaluated by 3 different graders, and agreement among graders was evaluated for each decision. A total of 234,242 readings of 79,458 images were collected from 55 licensed ophthalmologists over 6 months. Of the images, 34,364 were graded as abnormal by at least one rater. Of these, all three raters agreed on abnormality in 46.6%, while 69.9% were rated as abnormal by two or more raters. The agreement rate of at least two raters on a given finding ranged from 26.7% to 65.2%, and the complete agreement rate of all three raters ranged from 5.7% to 43.3%. For diagnoses, agreement of at least two raters was 35.6%-65.6%, and complete agreement was 11.0%-40.0%. Agreement on findings and diagnoses was higher when restricted to images with prior complete agreement on abnormality. Retinal and glaucoma specialists showed higher agreement on findings and diagnoses of their corresponding subspecialties. This novel reading tool for retinal fundus images generated a large-scale dataset with a high level of information, which can be used in the future development of machine learning-based algorithms for automated identification of abnormal conditions and of clinical decision support systems. These results emphasize the importance of addressing grader variability in algorithm development.
Bibliography:	Sang Jun Park and Joo Young Shin contributed equally to this work.
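
The agreement figures in the summary (the share of flagged images on which at least two raters, or all three raters, agree) can be illustrated with a small calculation. The following is a minimal sketch, not the authors' code: the function name `agreement_rates`, the `AgreementRates` type, and the toy ratings are hypothetical. It assumes, as the summary's "Of these" suggests, that the denominator is the set of images marked abnormal by at least one of the three raters.

```python
from typing import Dict, List, NamedTuple


class AgreementRates(NamedTuple):
    """Agreement among three raters, over images flagged by at least one."""
    n_flagged: int        # images marked abnormal by at least one rater
    at_least_two: float   # fraction of flagged images with >= 2 raters agreeing
    all_three: float      # fraction of flagged images with all 3 raters agreeing


def agreement_rates(ratings: Dict[str, List[bool]]) -> AgreementRates:
    """Compute agreement rates for a binary label (True = rated abnormal).

    `ratings` maps a hypothetical image id to the three raters' decisions.
    """
    # Denominator: images that at least one rater marked abnormal (assumption).
    flagged = [votes for votes in ratings.values() if any(votes)]
    n = len(flagged)
    if n == 0:
        return AgreementRates(0, 0.0, 0.0)

    two_or_more = sum(1 for votes in flagged if sum(votes) >= 2)
    unanimous = sum(1 for votes in flagged if all(votes))
    return AgreementRates(n, two_or_more / n, unanimous / n)


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    toy = {
        "img1": [True, True, True],
        "img2": [True, True, False],
        "img3": [True, False, False],
        "img4": [False, False, False],
        "img5": [True, False, True],
    }
    print(agreement_rates(toy))
```

The same function could be applied per finding or per diagnosis by passing the raters' decisions for that label; restricting the input to images with complete agreement on abnormality would correspond to the subset analysis mentioned in the summary.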