Mean based relief: An improved feature selection method based on ReliefF


Bibliographic Details
Published in Applied intelligence (Dordrecht, Netherlands) Vol. 53; no. 19; pp. 23004-23028
Main Authors Aggarwal, Nitisha; Shukla, Unmesh; Saxena, Geetika Jain; Rawat, Mukesh; Bafila, Anil Singh; Singh, Sanjeev; Pundir, Amit
Format Journal Article
Language English
Published New York Springer US 01.10.2023
Springer Nature B.V
ISSN 0924-669X
1573-7497
DOI 10.1007/s10489-023-04662-w

Summary: Selection of relevant features is vitally important in machine learning tasks involving large datasets with numerous features. It helps reduce the dimensionality of a dataset and improve model performance. This study introduces a feature selection technique named μ-Relief, which is based on ReliefF, one of the most extensively used Relief-based algorithms. μ-Relief effectively determines the most relevant feature subset and significantly outperforms the ReliefF algorithm. ReliefF estimates feature quality considering only the nearest neighbors, resulting in low classification accuracy on non-uniformly distributed or noisy datasets. The proposed μ-Relief technique considers neighbors with more effective information on the basis of mean distance. It utilizes neighbors far from the mean distance to obtain feature weight estimates, which improves the algorithm's performance. The algorithm was tested on thirteen real-world datasets and validated on three synthetic datasets. Its effectiveness in selecting relevant features was evaluated by comparing it to other well-known feature selection algorithms, namely Chi-Square, ANOVA, MI, CMIM, MRMR, SURF*, MultiSURF, MultiSURF*, and ReliefF. When multiple classifiers were trained on the features selected by the different feature selection techniques, classification accuracy, weighted F1-score, and ROC-AUC showed that μ-Relief effectively determined relevant features and outperformed the other techniques.
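
For orientation, the nearest-neighbor weighting that the summary attributes to ReliefF can be sketched roughly as below. This is a simplified, two-class illustration of the Relief family in general, not the authors' μ-Relief: the summary does not spell out the mean-distance neighbor rule, so it is not reproduced here. The function name `relief_weights`, the parameter `k`, the min-max scaling, and the Manhattan distance are all assumptions made for the sketch.

```python
import numpy as np


def relief_weights(X, y, k=3):
    """Simplified ReliefF-style feature weights for a two-class problem.

    Hypothetical helper for illustration; not the published mu-Relief.
    Assumes every class has at least two samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    # Scale each feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span

    w = np.zeros(d)
    for i in range(n):
        diff = np.abs(Xs - Xs[i])   # per-feature differences to sample i
        dist = diff.sum(axis=1)     # Manhattan distance to sample i

        same = np.flatnonzero(y == y[i])
        same = same[same != i]      # exclude the sample itself
        other = np.flatnonzero(y != y[i])

        # k nearest same-class samples (hits) and other-class samples (misses).
        hits = same[np.argsort(dist[same])[:k]]
        misses = other[np.argsort(dist[other])[:k]]

        # Reward features that differ on misses, penalize those that differ on hits.
        w += diff[misses].mean(axis=0) - diff[hits].mean(axis=0)
    return w / n
```

Features would then be ranked by descending weight, for example with `np.argsort(-relief_weights(X, y))`. Because only the `k` closest hits and misses contribute to each update, the estimate depends entirely on the local neighborhood, which is the limitation on noisy or non-uniformly distributed data that the summary describes.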