Cleaning Highly Unbalanced Multisource Image Dataset for Quality Control in Cervical Precancer Screening

Bibliographic Details
Published in: Recent Trends in Image Processing and Pattern Recognition, Vol. 1576, pp. 3-13
Main Authors: Xue, Zhiyun; Guo, Peng; Angara, Sandeep; Pal, Anabik; Jeronimo, Jose; Desai, Kanan T.; Ajenifuja, Olusegun K.; Adepiti, Clement A.; Sanjose, Silvia D.; Schiffman, Mark; Antani, Sameer
Format: Book Chapter
Language: English
Published: Switzerland, Springer International Publishing AG, 2022
Springer International Publishing
Series: Communications in Computer and Information Science
ISBN: 3031070046, 9783031070044
ISSN: 1865-0929, 1865-0937
DOI: 10.1007/978-3-031-07005-1_1

Summary: Automated visual evaluation (AVE) of uterine cervix images is a deep learning algorithm that aims to improve cervical pre-cancer screening in low or medium resource regions (LMRR). Image quality control is an important pre-step in the development and use of AVE. In our work, we use data retrospectively collected from different sources/providers for analysis. In addition to good images, the datasets include low-quality images, green-filter images, and post Lugol’s iodine images. The latter two are uncommon in VIA (visual inspection with acetic acid) and should be removed along with low-quality images. In this paper, we apply and compare two state-of-the-art deep learning networks to filter out those two types of cervix images after cervix detection. One of the deep learning networks is DeepSAD, a semi-supervised anomaly detection network, while the other is ResNeSt, an improved variant of the ResNet classification network. Specifically, we study and evaluate the algorithms on a highly unbalanced large dataset consisting of four subsets from different geographic regions acquired with different imaging device types. We also examine the cross-dataset performance of the algorithms. Both networks can achieve high performance (accuracy above 97% and F1 score above 94%) on the test set.