Determining the most accurate machine learning algorithms for medical diagnosis using the monk' problems database and statistical measurements

Bibliographic Details
Published in: Journal of Experimental & Theoretical Artificial Intelligence, Vol. 37, no. 2, pp. 357-376
Main Author: Avuçlu, Emre
Format: Journal Article
Language: English
Published: Abingdon, Taylor & Francis, 17.02.2025
Taylor & Francis Ltd
ISSN: 0952-813X, 1362-3079
DOI: 10.1080/0952813X.2023.2196984

More Information
Summary: Computer-aided diagnosis is of vital importance in health care, especially in cancer diagnosis. It helps specialist physicians reach the most accurate diagnosis. Research studies report that the number of wrong or late diagnoses increases every year and ultimately causes deaths in many parts of the world. For this reason, the candidate algorithms must be evaluated quantitatively to determine which one is most accurate for making the correct diagnosis. In this study, the three MONK's problems datasets were used to determine the most accurate algorithm for medical diagnosis. The MONK's problems are one of several sets of classification problems used as a basis for comparative studies. Training and testing were performed with five machine learning algorithms (MLAs): k-Nearest Neighbor (k-NN), Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), and Support Vector Machine (SVM). The performance of these algorithms was compared statistically. Two medical datasets were used to test the results: the Breast Cancer Coimbra Data Set and the Diabetic Retinopathy Debrecen Data Set. In the test phase of the experiments, the highest accuracy rates obtained with the k-NN, DT, RF, NB, and SVM algorithms were 0.9758, 1, 1, 0.9180, and 0.9344, respectively. The best performance was obtained with RF on the first dataset and DT on the second dataset; on the third dataset, k-NN and RF achieved the highest accuracy rates.
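The train, test, and accuracy comparison described in the summary can be illustrated with a minimal Python sketch. It is not the author's code or experimental setup. It assumes the standard definition of the first MONK's problem (six discrete attributes, target concept `(a1 == a2) or (a5 == 1)`), builds the data synthetically instead of loading the published files, and uses a plain k-NN with Hamming distance. The function names, the random split, the training size, and `k=3` are all illustrative assumptions.

```python
from itertools import product
import random

# Attribute domains of the MONK's problems: six discrete attributes, 432 combinations.
DOMAINS = [(1, 2, 3), (1, 2, 3), (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2)]


def monk1_label(x):
    """Target concept of the first MONK's problem: (a1 == a2) or (a5 == 1)."""
    a1, a2, _, _, a5, _ = x
    return int(a1 == a2 or a5 == 1)


def hamming(u, v):
    """Number of attribute positions in which two instances differ."""
    return sum(a != b for a, b in zip(u, v))


def knn_predict(train, x, k=3):
    """Majority vote among the k training instances nearest to x (k assumed odd)."""
    neighbours = sorted(train, key=lambda item: hamming(item[0], x))[:k]
    votes = sum(label for _, label in neighbours)
    return int(2 * votes > k)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def run_experiment(k=3, n_train=124, seed=0):
    """Hypothetical split: shuffle all 432 instances, train on the first n_train."""
    data = [(x, monk1_label(x)) for x in product(*DOMAINS)]
    rng = random.Random(seed)
    rng.shuffle(data)
    train, test = data[:n_train], data[n_train:]
    y_true = [label for _, label in test]
    y_pred = [knn_predict(train, x, k) for x, _ in test]
    return accuracy(y_true, y_pred)


if __name__ == "__main__":
    print(f"k-NN accuracy on a synthetic MONK-1 split: {run_experiment():.4f}")
```

The other four algorithms named in the summary would take the place of `knn_predict` in the same loop, and `accuracy` would then be compared across them. The figures reported in the summary come from the author's own experiments and are not reproduced by this sketch.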