A feature selection model for document classification using Tom and Jerry Optimization algorithm
| Published in | Multimedia Tools and Applications, Vol. 83, no. 4, pp. 10273-10295 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | New York: Springer US, 01.01.2024 |
| ISSN | 1380-7501 1573-7721 |
| DOI | 10.1007/s11042-023-15828-6 |
| Summary | Over the last decade, high-dimensional data has become increasingly common in document mining fields such as text summarization, text clustering, and text classification. The curse of dimensionality degrades the performance of classification models, and feature selection is an effective way to mitigate it. In this work, we present the Tom and Jerry Optimization (TJO) technique for feature subset selection. The proposed scheme uses the classifier error rate and the feature chosen rate to measure each candidate’s fitness. Its performance is examined on two popular benchmark text corpora and compared with five metaheuristic approaches. The best success rate obtained by the proposed scheme is 95.77%, with a best precision of 0.9509, recall of 0.9577, and F1-score of 0.9541. According to the comparison results, the proposed feature subset selection scheme outperforms the standard strategy. |
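
The summary states that candidate fitness combines the classifier error rate with the feature chosen rate, but this record does not give the exact formula. The sketch below is one common way such a wrapper fitness is written, a weighted sum to be minimized; it is not taken from the paper. The weight `alpha`, the reading of "feature chosen rate" as the fraction of features selected, and the names `feature_chosen_rate`, `fitness`, and `toy_error_rate` are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def feature_chosen_rate(mask: Sequence[int]) -> float:
    """Fraction of features kept by a binary mask (1 = selected).

    Assumption: this is what "feature chosen rate" means in the summary.
    """
    if not mask:
        raise ValueError("mask must not be empty")
    return sum(1 for bit in mask if bit) / len(mask)


def fitness(
    mask: Sequence[int],
    error_rate_fn: Callable[[Sequence[int]], float],
    alpha: float = 0.9,
) -> float:
    """Hypothetical weighted-sum fitness; lower is better.

    alpha weights the classifier error rate, (1 - alpha) weights the
    feature chosen rate. The value 0.9 is an assumed default, not a
    setting reported in the source.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not any(mask):
        # An empty subset cannot train a classifier: return the worst score.
        return 1.0
    return alpha * error_rate_fn(mask) + (1.0 - alpha) * feature_chosen_rate(mask)


def toy_error_rate(mask: Sequence[int]) -> float:
    """Stand-in for training and evaluating a real classifier.

    Pretends that only the first two features are informative.
    """
    informative = sum(1 for bit in mask[:2] if bit)
    return 0.5 - 0.2 * informative


if __name__ == "__main__":
    for candidate in ([1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1]):
        print(candidate, round(fitness(candidate, toy_error_rate), 4))
```

In a real setting `error_rate_fn` would train a classifier on the selected columns of a document-term matrix and return its validation error; the optimizer would then search over masks for the lowest fitness.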