Cleaning Highly Unbalanced Multisource Image Dataset for Quality Control in Cervical Precancer Screening

Bibliographic Details
Published in	Recent Trends in Image Processing and Pattern Recognition Vol. 1576; pp. 3 - 13
Main Authors	Xue, Zhiyun; Guo, Peng; Angara, Sandeep; Pal, Anabik; Jeronimo, Jose; Desai, Kanan T.; Ajenifuja, Olusegun K.; Adepiti, Clement A.; Sanjose, Silvia D.; Schiffman, Mark; Antani, Sameer
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2022
Series	Communications in Computer and Information Science
Subjects	Acetowhitening; Anomaly detection; Cervical cancer; Cross-dataset evaluation; Deep learning; Green-filter; Highly unbalanced dataset; Lugol’s iodine
ISBN	3031070046 9783031070044
ISSN	1865-0929 1865-0937
DOI	10.1007/978-3-031-07005-1_1

Summary:	Automated visual evaluation (AVE) of uterine cervix images is a deep learning algorithm that aims to improve cervical pre-cancer screening in low or medium resource regions (LMRR). Image quality control is an important pre-step in the development and use of AVE. In our work, we use data retrospectively collected from different sources/providers for analysis. In addition to good images, the datasets include low-quality images, green-filter images, and post Lugol’s iodine images. The latter two are uncommon in VIA (visual inspection with acetic acid) and should be removed along with low-quality images. In this paper, we apply and compare two state-of-the-art deep learning networks to filter out those two types of cervix images after cervix detection. One of the deep learning networks is DeepSAD, a semi-supervised anomaly detection network, while the other is ResNeSt, an improved variant of the ResNet classification network. Specifically, we study and evaluate the algorithms on a highly unbalanced large dataset consisting of four subsets from different geographic regions acquired with different imaging device types. We also examine the cross-dataset performance of the algorithms. Both networks can achieve high performance (accuracy above 97% and F1 score above 94%) on the test set.