Research on the Variable Selection Methods Based on Random Forests

Bibliographic Details
Published in: 2022 7th International Conference on Computational Intelligence and Applications (ICCIA), pp. 59-64
Main Author: Lu, Tianchi
Format: Conference Proceeding
Language: English
Published: IEEE, 24.06.2022
DOI: 10.1109/ICCIA55271.2022.9828423

Summary: Variable selection has received much attention in modern statistical modeling because it can improve a model's explanatory power and reduce computation. This paper studies variable selection based on random forests and compares it with regularization penalty methods. First, the theory behind random forests and regularization penalty methods is explained. Next, the applicability and validity of each method are compared by simulation in linear and logistic classification models. Finally, the advantages and disadvantages of each method on different datasets are identified through simulation and data analysis. The main findings are that random forests perform better when the sample size is large and that they are more stable than regularization penalty methods.
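
The kind of comparison the summary describes can be illustrated with a small simulation. The sketch below is not the paper's procedure: it assumes scikit-learn and NumPy, uses the Lasso as one example of a regularization penalty method, and ranks variables by the forest's impurity-based importance. The function names, sample size, effect size, and penalty strength are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso


def simulate_linear(n_samples=500, n_features=20, n_informative=5, noise=1.0, seed=0):
    """Simulate y = X @ beta + noise; only the first n_informative coefficients are non-zero."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))
    beta = np.zeros(n_features)
    beta[:n_informative] = 3.0  # hypothetical effect size, not taken from the paper
    y = X @ beta + noise * rng.standard_normal(n_samples)
    return X, y, beta


def select_with_forest(X, y, k, seed=0):
    """Keep the k variables with the largest impurity-based importance."""
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    top = np.argsort(forest.feature_importances_)[::-1][:k]
    return set(top.tolist())


def select_with_lasso(X, y, alpha=0.1):
    """Keep the variables whose Lasso coefficient is non-zero (alpha is an assumed value)."""
    model = Lasso(alpha=alpha)
    model.fit(X, y)
    return set(np.flatnonzero(model.coef_).tolist())


X, y, beta = simulate_linear()
truth = set(np.flatnonzero(beta).tolist())
rf_selected = select_with_forest(X, y, k=len(truth))
lasso_selected = select_with_lasso(X, y)

print("true variables:        ", sorted(truth))
print("random forest selected:", sorted(rf_selected))
print("lasso selected:        ", sorted(lasso_selected))
```

Repeating the run over several seeds and sample sizes would give a rough picture of how stable each selection is, which is the property the summary contrasts between the two approaches.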