AI-Based Glioma Grading for a Trustworthy Diagnosis: An Analytical Pipeline for Improved Reliability

Bibliographic Details
Published in Cancers Vol. 15; no. 13; p. 3369
Main Authors Pitarch, Carla; Ribas, Vicent; Vellido, Alfredo
Format Journal Article
Language English
Published Switzerland MDPI AG 27.06.2023
ISSN 2072-6694
DOI 10.3390/cancers15133369

Summary: Glioma is the most common type of tumor in humans originating in the brain. According to the World Health Organization, gliomas can be graded on a four-stage scale, ranging from the most benign to the most malignant. The grading of these tumors from image information is a far from trivial task for radiologists and one in which they could be assisted by machine-learning-based decision support. However, the machine learning analytical pipeline is also fraught with perils stemming from different sources, such as inadvertent data leakage, adequacy of 2D image sampling, or classifier assessment biases. In this paper, we analyze a glioma database sourced from multiple datasets using a simple classifier, aiming to obtain a reliable tumor grading and, on the way, we provide a few guidelines to ensure such reliability. Our results reveal that by focusing on the tumor region of interest and using data augmentation techniques we significantly enhanced the accuracy and confidence in tumor classifications. Evaluation on an independent test set resulted in an AUC-ROC of 0.932 in the discrimination of low-grade gliomas from high-grade gliomas, and an AUC-ROC of 0.893 in the classification of grades 2, 3, and 4. The study also highlights the importance of providing, beyond generic classification performance, measures of how reliable and trustworthy the model’s output is, thus assessing the model’s certainty and robustness.