Single pass text classification by direct feature weighting

Bibliographic Details
Published in: Knowledge and Information Systems, Vol. 28, no. 1, pp. 79-98
Main Authors: Malik, Hassan H.; Fradkin, Dmitriy; Moerchen, Fabian
Format: Journal Article
Language: English
Published: London, Springer-Verlag, 01.07.2011
Springer
Springer Nature B.V
ISSN: 0219-1377
0219-3116
DOI: 10.1007/s10115-010-0317-9

Summary: The Feature Weighting Classifier (FWC) is an efficient multi-class classification algorithm for text data that uses Information Gain to directly estimate per-class feature weights in the classifier. This classifier requires only a single pass over the dataset to compute the feature frequencies per class, is easy to implement, and has memory usage that is linear in the number of features. Results of experiments performed on 128 binary and multi-class text and web datasets show that FWC's performance is at least comparable to, and often better than, that of Naive Bayes, TWCNB, Winnow, Balanced Winnow and linear SVM. On a large-scale web dataset with 12,294 classes and 135,973 training instances, FWC trained in 13 s and yielded classification performance comparable to that of a state-of-the-art multi-class SVM implementation, which took over 15 min to train.
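
The summary names the ingredients (one pass to collect per-class feature frequencies, Information Gain used as the per-class feature weight) but does not give the exact weighting formula. The following is therefore only an illustrative sketch under stated assumptions, not the authors' FWC: it assumes binary feature presence per document, a one-vs-rest Information Gain for each (feature, class) pair, and a sign that is positive when the feature is over-represented in the class. The names `train`, `predict` and `entropy` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(pos, neg):
    """Binary entropy (in bits) of a pos/neg count split."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for k in (pos, neg):
        if k:
            p = k / total
            h -= p * math.log2(p)
    return h


def train(docs):
    """docs: iterable of (tokens, label) pairs. Returns weights[label][feature].

    Assumption: weights are signed one-vs-rest Information Gain values.
    """
    n_docs = 0
    class_docs = Counter()                   # label -> number of documents
    feat_class_docs = defaultdict(Counter)   # feature -> label -> document count

    # Single pass over the data: only counts are collected here.
    for tokens, label in docs:
        n_docs += 1
        class_docs[label] += 1
        for f in set(tokens):
            feat_class_docs[f][label] += 1

    # Weights are derived from the counts alone, without revisiting the data.
    weights = defaultdict(dict)
    for f, per_class in feat_class_docs.items():
        n_f = sum(per_class.values())        # documents containing f
        for label, n_c in class_docs.items():
            in_with = per_class.get(label, 0)      # in class, has f
            out_with = n_f - in_with               # not in class, has f
            in_without = n_c - in_with             # in class, lacks f
            out_without = n_docs - n_c - out_with  # not in class, lacks f
            h_prior = entropy(n_c, n_docs - n_c)
            h_cond = (n_f / n_docs) * entropy(in_with, out_with) \
                + ((n_docs - n_f) / n_docs) * entropy(in_without, out_without)
            gain = h_prior - h_cond
            # Assumed sign convention: positive if f is over-represented in the class.
            sign = 1.0 if in_with * n_docs > n_f * n_c else -1.0
            weights[label][f] = sign * gain
    return weights


def predict(weights, tokens):
    """Return the label whose summed feature weights are largest."""
    feats = set(tokens)
    return max(weights, key=lambda c: sum(weights[c].get(f, 0.0) for f in feats))
```

A toy call such as `predict(train(docs), ["goal"])` on a handful of labelled token lists is enough to exercise the sketch. Note that this version stores one count per (feature, class) pair that actually occurs, so it makes no claim about matching the memory bound stated in the summary.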