A Fast KNN Algorithm for Text Categorization
| Published in | 2007 International Conference on Machine Learning and Cybernetics Vol. 6; pp. 3436 - 3441 |
|---|---|
| Main Authors | , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.08.2007 |
| ISBN | 1424409721 9781424409723 |
| ISSN | 2160-133X |
| DOI | 10.1109/ICMLC.2007.4370742 |
| Summary | The KNN algorithm is a simple, effective, non-parametric method for text categorization. Traditional KNN has a serious drawback: computing similarities takes a very long time, which makes the algorithm impractical for text categorization with high-dimensional data and very large sample sets. This paper presents a method called TFKNN (Tree-Fast-K-Nearest-Neighbor), which can find the exact k nearest neighbors quickly. The method builds an SSR tree for searching the k nearest neighbors, in which all child nodes of each non-leaf node are ranked by the distance between their central point and the central point of their parent. The search scope is then reduced using the tree, and consequently the time spent on similarity computation is greatly decreased. |
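
The summary describes an exact k-nearest-neighbor search that prunes work with a tree of bounding regions. The sketch below illustrates that general idea only; it is not the paper's SSR tree or the TFKNN procedure, whose details the record does not give. It assumes Euclidean distance, a simple median split, and a sphere lower bound (distance to a node's center minus its radius) as the pruning rule. The names `Node`, `knn`, `dist`, and `leaf_size` are hypothetical. Children are sorted by the distance between their center and the parent's center to mirror the ranking mentioned in the summary, although this sketch does not exploit that order beyond visiting children in it.

```python
import heapq
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Node:
    """A bounding-sphere tree node (assumed structure, not the paper's SSR tree)."""

    def __init__(self, points, leaf_size=4):
        dim = len(points[0])
        self.center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        self.radius = max(dist(self.center, p) for p in points)
        self.points, self.children = points, []
        if len(points) > leaf_size and self.radius > 0:
            # Split at the median of the widest coordinate (an assumption).
            axis = max(range(dim),
                       key=lambda i: max(p[i] for p in points) - min(p[i] for p in points))
            ordered = sorted(points, key=lambda p: p[axis])
            mid = len(ordered) // 2
            self.children = [Node(ordered[:mid], leaf_size),
                             Node(ordered[mid:], leaf_size)]
            # Rank children by distance from their center to the parent's center.
            self.children.sort(key=lambda c: dist(c.center, self.center))
            self.points = []  # only leaves store points


def knn(root, query, k):
    """Return the k nearest (distance, point) pairs, closest first."""
    best = []  # max-heap emulated with negated distances

    def visit(node):
        # No point inside the sphere can be closer than this lower bound.
        lower = dist(query, node.center) - node.radius
        if len(best) == k and lower > -best[0][0]:
            return  # the whole subtree cannot improve the current result
        for p in node.points:
            d = dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
        for child in node.children:
            visit(child)

    visit(root)
    return sorted((-neg_d, p) for neg_d, p in best)


if __name__ == "__main__":
    random.seed(1)
    data = [tuple(random.random() for _ in range(3)) for _ in range(50)]
    for d, p in knn(Node(data), (0.5, 0.5, 0.5), 3):
        print(round(d, 4), p)
```

Because a subtree is skipped only when its lower bound exceeds the current k-th best distance, the result matches a brute-force scan up to floating-point rounding while avoiding distance computations for points in pruned subtrees. In very high-dimensional text vectors such bounds prune less often, which is presumably the problem the paper's ranking scheme targets.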