Predicting cardiovascular disease by combining optimal feature selection methods with machine learning


Bibliographic Details
Published in: 2020 39th International Conference of the Chilean Computer Science Society (SCCC), pp. 1-8
Main Authors: Segura, Mauricio Rodriguez; Nicolis, Orietta; Marquez, Billy Peralta; Carrillo Azocar, Juan
Format: Conference Proceeding
Language: English
Published: IEEE, 16.11.2020
DOI: 10.1109/SCCC51225.2020.9281168

Summary: Cardiovascular Disease (CVD) is one of the main causes of death in the world, and early detection could prevent deaths associated with cardiac problems. In this work, we propose a methodology based on data pre-processing and Machine Learning (ML) techniques for predicting cardiovascular disease using the Sleep Heart Health Study (SHHS) dataset. First, principal component analysis and lowest p-value logistic regression are applied to select optimal features that could be related to CVD. Then, the selected features are used to train four ML algorithms: Naïve Bayes (NB), Feed Forward Neural Networks (NN), Support Vector Machine (SVM) and Random Forest (RF). A binary feature was used as the output of the proposed models, and SMOTE sampling was used to balance the training set. Among the proposed methods, NN provided the best accuracy (0.81) and AUC (0.76), outperforming the results obtained in other studies.
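The summary states that SMOTE was used to balance the training set but gives no configuration. As a rough illustration only, the following Python sketch shows the basic SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The function names `smote` and `balance_training_set`, the neighbour count `k=5`, the seed, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance_training_set(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_points = smote(minority, n_new, k=k, seed=seed) if n_new else []
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (made up): six majority rows (label 0), two minority rows (label 1).
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.1],
           [1.0, 1.0], [1.2, 0.8]]
y_train = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance_training_set(X_train, y_train)
print(len(X_bal), sum(y_bal))
```

In this sketch only the training rows are oversampled. Held-out evaluation data would normally be left untouched so that accuracy and AUC reflect the original class balance.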