CAISHI: A benchmark histopathological H&E image dataset for cervical adenocarcinoma in situ identification, retrieval and few-shot learning evaluation

Bibliographic Details
Published in: Data in brief Vol. 53; p. 110141
Main Authors: Yang, Xinyi; Li, Chen; He, Ruilin; Yang, Jinzhu; Sun, Hongzan; Jiang, Tao; Grzegorzek, Marcin; Li, Xiaohan; Liu, Chang
Format: Journal Article
Language: English
Published: Netherlands, Elsevier Inc, 01.04.2024
ISSN: 2352-3409
DOI: 10.1016/j.dib.2024.110141

Summary: A benchmark histopathological Hematoxylin and Eosin (H&E) image dataset for Cervical Adenocarcinoma in Situ (CAISHI) is established to fill the current data gap. It contains 2240 histopathological images, of which 1010 show normal cervical glands and 1230 show cervical adenocarcinoma in situ (AIS). The samples are obtained by endoscopic biopsy, and the pathological sections are H&E-stained sections from Shengjing Hospital, China Medical University. The images are captured at a magnification of 100× with an Axio Scope.A1 microscope. Each image is 3840 × 2160 pixels and stored in ".png" format. The collection of CAISHI was subject to ethical review by China Medical University, with approval number 2022PS841K. The images are analyzed at multiple levels, including classification and image retrieval tasks, and a variety of computer vision and machine learning methods are used to evaluate the dataset. For classification, classical machine learning methods such as k-means, support vector machines (SVM), and random forests (RF) are used, as well as deep learning classifiers such as Residual Network 50 (ResNet50), Vision Transformer (ViT), Inception version 3 (Inception-V3), and Visual Geometry Group Network 16 (VGG-16). In addition, a Siamese network is used to evaluate few-shot learning. For image retrieval, color features, texture features, and deep learning features are extracted and their performance is tested. CAISHI can support the early diagnosis and screening of cervical cancer: researchers can use the dataset to develop computer-aided diagnostic tools that could improve the accuracy and efficiency of cervical cancer screening and advance automated diagnostic algorithms.
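Illustrative note: the summary mentions color features for image retrieval without describing how they are computed. The following is only a minimal sketch of one common approach, a joint RGB color histogram compared by cosine similarity, and is not taken from the article. The function names, the bin count, and the synthetic "images" (plain lists of RGB tuples standing in for decoded ".png" files) are all assumptions made for illustration.

```python
import math
import random


def color_histogram(pixels, bins=4):
    """Return a normalized joint RGB histogram with bins**3 entries.

    pixels: iterable of (r, g, b) tuples with 8-bit channel values.
    The bin count is an illustrative assumption, not a value from the article.
    """
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    count = 0
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_pixels, gallery, top_k=3):
    """Rank gallery image names by histogram similarity to the query, best first."""
    q = color_histogram(query_pixels)
    scored = [
        (cosine_similarity(q, color_histogram(pixels)), name)
        for name, pixels in gallery.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


def synthetic_image(base, n=500, jitter=20, seed=0):
    """Hypothetical stand-in for a real image: n pixels scattered around a base color."""
    rng = random.Random(seed)
    return [
        tuple(max(0, min(255, c + rng.randint(-jitter, jitter))) for c in base)
        for _ in range(n)
    ]


# Synthetic gallery: two pinkish images and one purplish image.
gallery = {
    "pinkish_a": synthetic_image((220, 150, 190), seed=1),
    "pinkish_b": synthetic_image((215, 145, 185), seed=2),
    "purplish_a": synthetic_image((90, 40, 130), seed=3),
}
query = synthetic_image((218, 148, 188), seed=4)
ranking = retrieve(query, gallery, top_k=3)
print(ranking)
```

With real data, the pixel lists would come from decoding the dataset's image files, and texture or deep learning features could replace the histogram while keeping the same ranking step.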