A classification framework using filter–wrapper based feature selection approach for the diagnosis of congenital heart failure

Premature mortality from cardiovascular disease can be reduced with early detection of heart failure by analysing the patients’ risk factors and assuring accurate diagnosis. This work proposes a clinical decision support system for the diagnosis of congenital heart failure by utilizing a data pre-pr...

Full description

Saved in:
Bibliographic Details
Published inJournal of intelligent & fuzzy systems Vol. 44; no. 4; pp. 6183 - 6218
Main Authors Navin, K.S., Nehemiah, H. Khanna, Nancy Jane, Y., Veena Saroji, H.
Format Journal Article
LanguageEnglish
Published London, England SAGE Publications 01.01.2023
Sage Publications Ltd
Subjects
Online AccessGet full text
ISSN1064-1246
1875-8967
DOI10.3233/JIFS-221348

Cover

More Information
Summary:Premature mortality from cardiovascular disease can be reduced with early detection of heart failure by analysing the patients’ risk factors and assuring accurate diagnosis. This work proposes a clinical decision support system for the diagnosis of congenital heart failure by utilizing a data pre-processing approach for dealing missing values and a filter-wrapper based method for selecting the most relevant features. Missing values are imputed using a missForest method in four out of eight heart disease datasets collected from the Machine Learning Repository maintained by University of California, Irvine. The Fast Correlation Based Filter is used as the filter approach, while the union of the Atom Search Optimization Algorithm and the Henry Gas Solubility Optimization represent the wrapper-based algorithms, with the fitness function as the combination of accuracy, G-mean, and Matthew’s correlation coefficient measured by the Support Vector Machine. A total of four boosted classifiers namely, XGBoost, AdaBoost, CatBoost, and LightGBM are trained using the selected features. The proposed work achieves an accuracy of 89%, 84%, 83%, 80% for Heart Failure Clinical Records, 81%, 80%, 83%, 82% for Single Proton Emission Computed Tomography, 90%, 82%, 93%, 80% for Single Proton Emission Computed Tomography F, 80%, 80%, 81%, 80% for Statlog Heart Disease, 80%, 85%, 83%, 86% for Cleveland Heart Disease, 82%, 85%, 85%, 82% for Hungarian Heart Disease, 80%, 81%, 79%, 82% for VA Long Beach, 97%, 89%, 98%, 97%, for Switzerland Heart Disease for four classifiers respectively. The suggested technique outperformed the other classifiers when evaluated against Random Forest, Classification and Regression Trees, Support Vector Machine, and K-Nearest Neighbor.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1064-1246
1875-8967
DOI:10.3233/JIFS-221348