Methodology for refining subject terms and supporting subject indexing with taxonomy: A case study of the APO digital repository

Bibliographic Details
Published in Decision Support Systems Vol. 146; p. 113542
Main Authors Kang, Yong-Bin, Woo, Jihoon, Kneebone, Les, Sellis, Timos
Format Journal Article
Language English
Published Amsterdam Elsevier B.V 01.07.2021
Elsevier Sequoia S.A
ISSN 0167-9236
1873-5797
DOI 10.1016/j.dss.2021.113542

Summary: In digital repositories, it is crucial to refine existing subject terms and to exploit a taxonomy of those terms in order to support information retrieval tasks such as indexing, cataloging and searching of digital documents. In this paper, we address how to refine an existing set of subject terms used to index digital documents, which often contains irrelevant or noisy terms. Further, we present how to automatically induce a subject term taxonomy to capture and utilise the semantic relations among subject terms. Most related work has paid little attention to these problems, focusing instead on creating subject terms or building a taxonomy of key terms from text documents. We propose a methodology (source code available at https://github.com/Yongbinkang/SubjectTracker) for refining an existing set of subject terms in a digital repository by identifying their semantics, and for inducing a taxonomy of subject terms by analysing their mutual usages and maximising their semantic relatedness. We then present a case study using the APO (Analysis & Policy Observatory) digital repository to analyse the proposed methodology and demonstrate its applicability. Further, to validate the generalisability of the proposed taxonomy induction method, we evaluate it against a gold-standard taxonomy in the life sciences, Medical Subject Headings (MeSH), in comparison with the state-of-the-art taxonomy induction method, TaxoFinder. Our evaluation shows that the methodology has high potential for refining an existing set of subject terms and capturing their semantic relationships by inducing a subject term taxonomy.
• Propose a methodology for refining existing subject terms by estimating their frequencies and semantics, and for inducing a taxonomy from the refined subject terms by integrating their mutual usages.
• Provide a thorough analysis of the proposed methodology using the APO (Analysis & Policy Observatory) digital repository to show the applicability of the methodology.
• Measure the generalisability of the proposed taxonomy inducing method, in comparison with the state-of-the-art taxonomy inducing method, TaxoFinder.