Single pass text classification by direct feature weighting
| Published in | Knowledge and information systems Vol. 28; no. 1; pp. 79 - 98 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | London, Springer-Verlag, 01.07.2011; Springer; Springer Nature B.V |
| Subjects | |
| ISSN | 0219-1377 0219-3116 |
| DOI | 10.1007/s10115-010-0317-9 |
| Summary: | The Feature Weighting Classifier (FWC) is an efficient multi-class classification algorithm for text data that uses Information Gain to directly estimate per-class feature weights in the classifier. This classifier requires only a single pass over the dataset to compute the feature frequencies per class, is easy to implement, and has memory usage that is linear in the number of features. Results of experiments performed on 128 binary and multi-class text and web datasets show that FWC’s performance is at least comparable to, and often better than, that of Naive Bayes, TWCNB, Winnow, Balanced Winnow and linear SVM. On a large-scale web dataset with 12,294 classes and 135,973 training instances, FWC trained in 13 s and yielded classification performance comparable to that of a state-of-the-art multi-class SVM implementation, which took over 15 min to train. |
|---|---|
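
The summary names two ingredients: a single pass that collects feature frequencies per class, and Information Gain computed from those frequencies. The following is a minimal sketch of just those two steps, not the authors' implementation. The function names (`count_features`, `entropy`, `information_gain`), the use of document frequencies, and the nested-counter layout are assumptions made for illustration. The record does not say how FWC turns Information Gain into per-class weights, so the sketch stops at the Information Gain of a feature.

```python
import math
from collections import Counter, defaultdict


def count_features(docs):
    """Single pass over (tokens, label) pairs.

    Returns the number of documents per class and, for each feature,
    the number of documents of each class that contain it.
    (Hypothetical helper; the counting scheme is an assumption.)
    """
    class_docs = Counter()
    feat_class_docs = defaultdict(Counter)
    for tokens, label in docs:
        class_docs[label] += 1
        for tok in set(tokens):  # count each feature once per document
            feat_class_docs[tok][label] += 1
    return class_docs, feat_class_docs


def entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(feature, class_docs, feat_class_docs):
    """Standard Information Gain of a binary feature: H(C) - H(C | feature)."""
    per_class = feat_class_docs.get(feature, Counter())
    classes = list(class_docs)
    present = [per_class[c] for c in classes]
    absent = [class_docs[c] - per_class[c] for c in classes]
    n = sum(class_docs.values())
    n_present = sum(present)
    n_absent = n - n_present
    h_class = entropy([class_docs[c] for c in classes])
    h_cond = (n_present / n) * entropy(present) + (n_absent / n) * entropy(absent)
    return h_class - h_cond
```

A feature that appears in every document of one class and in none of the other has the maximum gain for a balanced two-class problem (1 bit), while a feature spread evenly across classes has a gain of zero.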