Comparison of SVM and Some Older Classification Algorithms in Text Classification Tasks

Bibliographic Details
Published in	Artificial Intelligence in Theory and Practice, Vol. 217, pp. 169-178
Main Authors	Colas, Fabrice; Brazdil, Pavel
Format	Book Chapter
Language	English
Published	The Netherlands Springer 2006 Springer US
Series	IFIP International Federation for Information Processing
Subjects	Artificial intelligence; Classification Task; Classifier Comparison; Feature Space; Support Vector Machine; Text Categorization
ISBN	0387346546 9780387346540
ISSN	1571-5736
DOI	10.1007/978-0-387-34747-9_18

Summary:	Document classification has already been widely studied. Some studies compared feature selection techniques or feature space transformations, whereas others compared the performance of different algorithms. Recently, following the rising interest in the Support Vector Machine (SVM), various studies showed that SVM outperforms other classification algorithms. So should we disregard other classification algorithms and always opt for SVM? We decided to investigate this issue and compared SVM to kNN and naive Bayes on binary classification tasks. An important point is to compare optimized versions of these algorithms, which is what we have done. Our results show that all the classifiers achieved comparable performance on most problems. One surprising result is that SVM was not a clear winner, despite quite good overall performance. If suitable preprocessing is used with kNN, this algorithm continues to achieve very good results and scales up well with the number of documents, which is not the case for SVM. Naive Bayes also achieved good performance.