Document–document similarity approaches and science mapping: Experimental comparison of five approaches
Published in | Journal of informetrics Vol. 3; no. 1; pp. 49 - 63 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 2009 |
Subjects | |
ISSN | 1751-1577, 1875-5879 |
DOI | 10.1016/j.joi.2008.11.003 |
Summary: | This paper treats document–document similarity approaches in the context of science mapping. Five approaches, involving nine methods, are compared experimentally. We compare text-based approaches, the citation-based bibliographic coupling approach, and approaches that combine text-based approaches and bibliographic coupling. Forty-three articles, published in the journal Information Retrieval, are used as test documents. We investigate how well the approaches agree with a ground truth subject classification of the test documents, when the complete linkage method is used, and under two types of similarities, first-order and second-order. The results show that it is possible to achieve a very good approximation of the classification by means of automatic grouping of articles. One text-only method and one combination method, under second-order similarities in both cases, give rise to cluster solutions that to a large extent agree with the classification. |
---|---|
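The summary names three ingredients: bibliographic coupling, first-order versus second-order similarities, and complete linkage clustering. The following is a minimal Python sketch of how these could fit together on toy data. It is not the authors' implementation. The cosine normalization of coupling strength, the reading of second-order similarity as the cosine between rows of the first-order matrix, and all function names and reference labels are assumptions made for illustration.

```python
from math import sqrt


def first_order(refs):
    """First-order similarity from bibliographic coupling.

    refs: list of sets, one set of cited references per document.
    Assumption: shared references are normalized by the geometric mean
    of the two reference-list lengths (cosine on binary vectors).
    """
    n = len(refs)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if refs[i] and refs[j]:
                shared = len(refs[i] & refs[j])
                sim[i][j] = shared / sqrt(len(refs[i]) * len(refs[j]))
    return sim


def second_order(sim):
    """Second-order similarity: compare the similarity profiles (rows)
    of two documents. Assumption: cosine between rows, diagonal kept."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    n = len(sim)
    return [[cosine(sim[i], sim[j]) for j in range(n)] for i in range(n)]


def complete_linkage(sim, k):
    """Agglomerative clustering with complete linkage on similarities.

    The similarity of two clusters is the smallest pairwise similarity
    between their members; the most similar pair is merged until k
    clusters remain.
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                link = min(sim[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or link > best[0]:
                    best = (link, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy data (hypothetical): four documents and the references they cite.
toy_refs = [{"a", "b", "c"}, {"b", "c", "d"}, {"x", "y"}, {"y", "z"}]
s1 = first_order(toy_refs)
s2 = second_order(s1)
print(complete_linkage(s1, 2))
print(complete_linkage(s2, 2))
```

In this sketch a text-based method would only change how the first-order matrix is built (for example, from term vectors instead of reference sets). The second-order step and the clustering step would stay the same.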