Examining the Relationship of Breast Cancer Data With Survival Chance and Comparison of Algorithms on Breast Cancer Prediction
| Published in | International Journal of Applied Methods in Electronics and Computers |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | 31.03.2025 |
| ISSN | 3023-4409 |
| DOI | 10.58190/ijamec.2025.117 |
| Summary: | This article compares the performance of machine learning algorithms on breast cancer data. The aim is to predict the survival status of breast cancer patients and to contribute to the development of clinical decision support systems. Using a dataset obtained from the National Cancer Institute, the XGBoost, Random Forest, Support Vector Machine (SVM), and Logistic Regression algorithms were compared. Data preprocessing steps were applied, correlation analysis was performed, and the XGBoost algorithm, after hyperparameter optimization, was found to give the best performance. The optimized XGBoost model reached an overall accuracy of 92%, with high performance for class 0 (precision 92%, recall 98%) but a recall of only 54% for class 1. The article discusses the effect of data imbalance on these results and offers suggestions for future studies. |
|---|---|
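
As a rough illustration of the workflow the abstract describes, the sketch below compares the four named classifiers and runs a grid search for XGBoost. The authors' National Cancer Institute dataset, preprocessing steps, and tuned hyperparameters are not reproduced in this record, so scikit-learn's built-in breast cancer dataset stands in purely as a placeholder, and the parameter grid and scoring choices are assumptions rather than the article's settings.

```python
# Minimal sketch, assuming placeholder data: the authors' NCI dataset is not
# available here, so sklearn's built-in breast cancer dataset is used instead.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The four algorithms compared in the article; configurations are illustrative.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Random Forest": RandomForestClassifier(random_state=42),
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"--- {name} ---")
    print(classification_report(y_test, model.predict(X_test)))

# Hypothetical hyperparameter search for XGBoost; this grid is not the one
# used in the article, only an example of the tuning step it mentions.
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [3, 5],
    "learning_rate": [0.05, 0.1],
}
search = GridSearchCV(
    XGBClassifier(eval_metric="logloss", random_state=42),
    param_grid, scoring="f1", cv=5,
)
search.fit(X_train, y_train)
print("Best XGBoost params:", search.best_params_)
print(classification_report(y_test, search.best_estimator_.predict(X_test)))
```

The low class-1 recall reported in the abstract points to class imbalance; standard mitigations such as `scale_pos_weight` in XGBoost or `class_weight="balanced"` in the scikit-learn models could be slotted into the sketch above, though the article does not confirm which, if any, of these it applied.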