Prediction of stock price movement using an improved NSGA-II-RF algorithm with a three-stage feature engineering process

Prediction of stock price has been a hot topic in artificial intelligence field. Computational intelligent methods such as machine learning or deep learning are explored in the prediction system in recent years. However, making accurate predictions of stock price direction is still a big challenge b...

Full description

Saved in:
Bibliographic Details
Published inPloS one Vol. 18; no. 6; p. e0287754
Main Authors Zeng, Xiaohua, Cai, Jieping, Liang, Changzhou, Yuan, Chiping
Format Journal Article
LanguageEnglish
Published United States Public Library of Science 28.06.2023
Public Library of Science (PLoS)
Subjects
Online AccessGet full text
ISSN1932-6203
1932-6203
DOI10.1371/journal.pone.0287754

Cover

More Information
Summary:Prediction of stock price has been a hot topic in artificial intelligence field. Computational intelligent methods such as machine learning or deep learning are explored in the prediction system in recent years. However, making accurate predictions of stock price direction is still a big challenge because stock prices are affected by nonlinear, nonstationary, and high dimensional features. In previous works, feature engineering was overlooked. How to select the optimal feature sets that affect stock price is a prominent solution. Hence, our motivation for this article is to propose an improved many-objective optimization algorithm integrating random forest (I-NSGA-II-RF) algorithm with a three-stage feature engineering process in order to decrease the computational complexity and improve the accuracy of prediction system. Maximizing accuracy and minimizing the optimal solution set are the optimization directions of the model in this study. The integrated information initialization population of two filtered feature selection methods is used to optimize the I-NSGA-II algorithm, using multiple chromosome hybrid coding to synchronously select features and optimize model parameters. Finally, the selected feature subset and parameters are input to the RF for training, prediction, and iterative optimization. Experimental results show that the I-NSGA-II-RF algorithm has the highest average accuracy, the smallest optimal solution set, and the shortest running time compared to the unmodified multi-objective feature selection algorithm and the single target feature selection algorithm. Compared to the deep learning model, this model has interpretability, higher accuracy, and less running time.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
Competing Interests: The authors have declared that no competing interests exist.
ISSN:1932-6203
1932-6203
DOI:10.1371/journal.pone.0287754