Deep Learning–Based Precision Cropping of Eye Regions in Strabismus Photographs: Algorithm Development and Validation Study for Workflow Optimization

Bibliographic Details
Published in	Journal of Medical Internet Research Vol. 27; no. 10; p. e74402
Main Authors	Wu, Dawen, Li, Yanfei, Yang, Zeyi, Yin, Teng, Chen, Xiaohang, Liu, Jingyu, Shang, Wenyi, Xie, Bin, Yang, Guoyuan, Zhang, Haixian, Liu, Longqian
Format	Journal Article
Language	English
Published	Canada Journal of Medical Internet Research 17.07.2025 JMIR Publications
Subjects	Adult; Algorithms; Artificial Intelligence; Assistive Technology for Vision Loss/Impairment; Care and treatment; Cross-Sectional Studies; Deep Learning; Diagnosis; Eye - diagnostic imaging; Female; Health aspects; Humans; Image Processing, Computer-Assisted - methods; Machine Learning; Male; Medical screening; Methods; Middle Aged; Ophthalmology; Original Paper; Photography; Prospective Studies; Retrospective Studies; Strabismus; Strabismus - diagnostic imaging; Tools, Programs and Algorithms; Workflow; Canada; AI management system; image preprocessing; ocular alignment; artificial intelligence; clinical workflow
ISSN	1438-8871 1439-4456
DOI	10.2196/74402

Summary:	Traditional ocular gaze photograph preprocessing, relying on manual cropping and head tilt correction, is time-consuming and inconsistent, limiting artificial intelligence (AI) model development and clinical application. This study aimed to address these challenges using an advanced preprocessing algorithm to enhance the accuracy, efficiency, and standardization of eye region cropping for clinical workflows and AI data preprocessing. This retrospective and prospective cross-sectional study utilized 5832 images from 648 inpatients and outpatients, capturing 3 gaze positions under diverse conditions, including obstructions and varying distances. The preprocessing algorithm, based on a rotating bounding box detection framework, was trained and evaluated using precision, recall, and mean average precision (mAP) at various intersection over union (IoU) thresholds. A 5-fold cross-validation was performed on an inpatient dataset, with additional testing on an independent outpatient dataset and an external cross-population dataset of 500 images from the IMDB-WIKI collection, representing diverse ethnicities and ages. Expert validation confirmed alignment with clinical standards across 96 images (48 images from a Chinese dataset of patients with strabismus and 48 images from IMDB-WIKI). Gradient-weighted class activation mapping heatmaps were used to assess model interpretability. A control experiment with 5 optometry specialists compared manual and automated cropping efficiency. Downstream task validation involved preprocessing 1000 primary gaze photographs using the Dlib toolkit, faster region-based convolutional neural network (R-CNN; both without head tilt correction), and our model (with correction), evaluating the impact of head tilt correction via the vision transformer strabismus screening network through 5-fold cross-validation.
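The IoU-thresholded evaluation named in the summary can be illustrated with a minimal sketch. It is not the study's evaluation code, which this record does not include. It assumes axis-aligned boxes given as (x1, y1, x2, y2) for simplicity, whereas the study uses rotated boxes, and the function names `iou` and `is_true_positive` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    Simplifying assumption: the study detects rotated boxes, which need a
    polygon intersection instead of the interval overlap used here.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap along each axis, clamped at zero when the boxes do not meet.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold):
    """A detection counts as correct when its IoU reaches the threshold."""
    return iou(pred, truth) >= threshold


# Hypothetical eye-region boxes: ground truth and a slightly shifted prediction.
truth = (100.0, 200.0, 500.0, 300.0)
pred = (110.0, 205.0, 510.0, 305.0)
print(round(iou(pred, truth), 3))
print(is_true_positive(pred, truth, 0.50), is_true_positive(pred, truth, 0.95))
```

The same prediction can pass a loose threshold and fail a strict one, which is why scores at stricter IoU thresholds are typically lower than mAP50.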
The model achieved high detection performance across datasets: on the 5-fold cross-validation set, it recorded a mean precision of 1.000 (95% CI 1.000-1.000), recall of 1.000 (95% CI 1.000-1.000), mAP50 of 0.995 (95% CI 0.995-0.995), and mAP95 of 0.893 (95% CI 0.870-0.918); on the internal independent test set, precision and recall were 1.000, with mAP50 of 0.995 and mAP95 of 0.801; and on the external cross-population test set, precision and recall were 1.000, with mAP50 of 0.937 and mAP95 of 0.792. In the control experiment, image preparation time for 900 photos fell from 10 hours with manual cropping to 30 seconds with the automated model. Downstream strabismus screening task validation showed our model (with head tilt correction) achieving an area under the curve of 0.917 (95% CI 0.901-0.933), surpassing the Dlib toolkit and faster R-CNN (both without head tilt correction), which achieved areas under the curve of 0.856 (P=.02) and 0.884 (P=.05), respectively. Heatmaps highlighted core ocular focus, aligning with head tilt directions. This study delivers an AI-driven platform featuring a preprocessing algorithm that automates eye region cropping, correcting head tilt variations to improve image quality for AI development and clinical use. Integrated with electronic archives and patient-physician interaction, it enhances workflow efficiency, ensures telemedicine privacy, and supports ophthalmological research and strabismus care.
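How a rotated bounding box can yield a tilt-corrected crop is sketched below. The sketch shows only the geometry and is not the authors' implementation. It assumes a box parameterization (center, width, height, angle in degrees) and an angle convention that the record does not specify, and the function names are hypothetical. Actual pixel resampling would be done by an image library.

```python
import math


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Image-space corners of a box centered at (cx, cy) and rotated by
    angle_deg about its center (assumed convention, not from the source)."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    offsets = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in offsets]


def to_crop_coords(x, y, cx, cy, w, h, angle_deg):
    """Map an image point into the upright crop frame, whose origin is the
    crop's top-left corner. Applies the inverse of the box rotation."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    dx, dy = x - cx, y - cy
    u = dx * cos_a + dy * sin_a    # inverse rotation, horizontal component
    v = -dx * sin_a + dy * cos_a   # inverse rotation, vertical component
    return (u + w / 2, v + h / 2)


# Hypothetical detection: a 400x100 eye region tilted by 10 degrees.
box = (320.0, 240.0, 400.0, 100.0, 10.0)
for x, y in rotated_box_corners(*box):
    u, v = to_crop_coords(x, y, *box)
    print(round(u, 6), round(v, 6))
```

Each tilted corner lands on a corner of an upright 400 by 100 rectangle, so the eye region is level in the crop regardless of the head tilt in the original photograph.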