Cleaning Highly Unbalanced Multisource Image Dataset for Quality Control in Cervical Precancer Screening
Published in | Recent Trends in Image Processing and Pattern Recognition Vol. 1576; pp. 3 - 13 |
---|---|
Main Authors | , , , , , , , , , , |
Format | Book Chapter |
Language | English |
Published | Switzerland; Springer International Publishing AG; 2022; Springer International Publishing |
Series | Communications in Computer and Information Science |
ISBN | 3031070046 9783031070044 |
ISSN | 1865-0929 1865-0937 |
DOI | 10.1007/978-3-031-07005-1_1 |
Summary: | Automated visual evaluation (AVE) of uterine cervix images is a deep learning algorithm that aims to improve cervical pre-cancer screening in low- or medium-resource regions (LMRR). Image quality control is an important preliminary step in the development and use of AVE. In our work, we use data retrospectively collected from different sources/providers for analysis. In addition to good images, the datasets include low-quality images, green-filter images, and post Lugol’s iodine images. The latter two are uncommon in VIA (visual inspection with acetic acid) and should be removed along with low-quality images. In this paper, we apply and compare two state-of-the-art deep learning networks to filter out those two types of cervix images after cervix detection. One of the networks is DeepSAD, a semi-supervised anomaly detection network; the other is ResNeSt, an improved variant of the ResNet classification network. Specifically, we study and evaluate the algorithms on a large, highly unbalanced dataset consisting of four subsets from different geographic regions, acquired with different imaging device types. We also examine the cross-dataset performance of the algorithms. Both networks achieve high performance (accuracy above 97% and F1 score above 94%) on the test set. |
---|---|