Feature selection algorithm based on optimized genetic algorithm and the application in high-dimensional data processing

High-dimensional data is widely used in many fields, but selecting key features from it is challenging. Feature selection can reduce data dimensionality and weaken noise interference, thereby improving model efficiency and enhancing model interpretability. In order to improve the efficiency and accu...

Full description

Saved in:
Bibliographic Details
Published inPloS one Vol. 19; no. 5; p. e0303088
Main Author Feng, Guilian
Format Journal Article
LanguageEnglish
Published United States Public Library of Science 09.05.2024
Public Library of Science (PLoS)
Subjects
Online AccessGet full text
ISSN1932-6203
1932-6203
DOI10.1371/journal.pone.0303088

Cover

More Information
Summary:High-dimensional data is widely used in many fields, but selecting key features from it is challenging. Feature selection can reduce data dimensionality and weaken noise interference, thereby improving model efficiency and enhancing model interpretability. In order to improve the efficiency and accuracy of high-dimensional data processing, a feature selection method based on optimized genetic algorithm is proposed in this study. The algorithm simulates the process of natural selection, searches for possible subsets of feature, and finds the subsets of feature that optimizes the performance of the model. The results show that when the value of K is less than 4 or more than 8, the recognition rate is very low. After adaptive bias filtering, 724 features are filtered to 372, and the accuracy is improved from 0.9352 to 0.9815. From 714 features to 406 Gaussian codes, the accuracy is improved from 0.9625 to 0.9754. Among all tests, the colon has the highest average accuracy, followed by small round blue cell tumor(SRBCT), lymphoma, central nervous system(CNS) and ovaries. The green curve is the best, with stable performance and a time range of 0–300. While maintaining the efficiency, it can reach 4.48 as soon as possible. The feature selection method has practical significance for high-dimensional data processing, improves the efficiency and accuracy of data processing, and provides an effective new method for high-dimensional data processing.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
Competing Interests: The authors have declared that no competing interests exist.
ISSN:1932-6203
1932-6203
DOI:10.1371/journal.pone.0303088