A novel wrapper feature selection algorithm based on iterated greedy metaheuristic for sentiment classification

Bibliographic Details
Published in: Expert systems with applications, Vol. 146, p. 113176
Main Authors: Gokalp, Osman; Tasci, Erdal; Ugur, Aybars
Format: Journal Article
Language: English
Published: New York, Elsevier Ltd, 15.05.2020
Elsevier BV
ISSN: 0957-4174
1873-6793
DOI: 10.1016/j.eswa.2020.113176

More Information
Summary:
• A novel iterated greedy based feature selection algorithm for sentiment analysis.
• A greedy selection procedure that benefits from pre-calculated filter-based scores.
• Outperforms state-of-the-art results on the 9 public sentiment classification datasets used.

In recent years, sentiment analysis has become increasingly important as the number of digital text resources grows in parallel with the development of information technology. Feature selection is a crucial sub-stage of sentiment analysis because it can improve the overall predictive performance of a classifier while reducing the dimensionality of the problem. In this study, we propose a novel wrapper feature selection algorithm for sentiment classification based on the Iterated Greedy (IG) metaheuristic. We also develop a selection procedure, based on pre-calculated filter scores, for the greedy construction part of the IG algorithm. A comprehensive experimental study on commonly used sentiment analysis datasets assesses the performance of the proposed method. The computational results show that, with a Multinomial Naïve Bayes classifier, the proposed algorithm achieves average accuracy rates of 96.45% on 9 public sentiment datasets and 90.74% on 4 Amazon product review datasets. The results also show that the algorithm outperforms state-of-the-art results on the 9 public sentiment datasets and is highly competitive with state-of-the-art feature selection algorithms on the 4 Amazon datasets.
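
The summary gives no pseudocode, so the following is only a minimal sketch of the general iterated greedy pattern it describes (greedy construction guided by pre-calculated filter scores, followed by repeated destruction and reconstruction), not the authors' implementation. The names `iterated_greedy_fs`, `scores`, `evaluate`, `d` and `iters` are hypothetical. The acceptance rule and the choice to skip just-removed features during reconstruction are assumptions made for illustration.

```python
import random


def iterated_greedy_fs(scores, evaluate, d=2, iters=50, seed=0):
    """Illustrative iterated greedy wrapper feature selection.

    scores   -- pre-calculated filter score per feature (higher = tried earlier)
    evaluate -- callable(frozenset of feature indices) -> quality to maximise,
                e.g. a classifier's validation accuracy (assumed, user-supplied)
    d        -- number of features dropped in each destruction step
    iters    -- number of destruction/reconstruction rounds
    """
    rng = random.Random(seed)
    # Candidate order for the greedy construction: best filter score first.
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

    def construct(partial, skip=frozenset()):
        # Greedy construction: keep a feature only if it improves the
        # wrapper evaluation of the current subset.
        current = set(partial)
        quality = evaluate(frozenset(current)) if current else float("-inf")
        for j in ranked:
            if j in current or j in skip:
                continue
            candidate_quality = evaluate(frozenset(current | {j}))
            if candidate_quality > quality:
                current.add(j)
                quality = candidate_quality
        return current, quality

    current, current_q = construct(set())
    best, best_q = set(current), current_q

    for _ in range(iters):
        # Destruction: remove d randomly chosen selected features.
        k = min(d, len(current))
        removed = frozenset(rng.sample(sorted(current), k))
        # Reconstruction: rebuild greedily, skipping the features just
        # removed (an assumption, so the search can leave the old subset).
        candidate, candidate_q = construct(current - removed, skip=removed)
        # Acceptance (assumed rule): move only if not worse.
        if candidate_q >= current_q:
            current, current_q = candidate, candidate_q
        if current_q > best_q:
            best, best_q = set(current), current_q

    return sorted(best), best_q
```

In a setting like the one the summary describes, `evaluate` would typically wrap a classifier such as Multinomial Naïve Bayes and return its accuracy on held-out data, while `scores` would come from a filter method computed once beforehand. The summary does not say which filter is used, so that choice is left open here.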