Automated taxonomy alignment via large language models: bridging the gap between knowledge domains

Bibliographic Details
Published in Scientometrics Vol. 129; no. 9; pp. 5287-5312
Main Authors Cui, Wentao; Xiao, Meng; Wang, Ludi; Wang, Xuezhi; Du, Yi; Zhou, Yuanchun
Format Journal Article
Language English
Published Cham: Springer International Publishing, 01.09.2024
Springer Nature B.V.
ISSN 0138-9130, 1588-2861
DOI 10.1007/s11192-024-05111-2

Summary: Taxonomy alignment is essential for integrating knowledge across diverse domains and languages, facilitating information retrieval and data integration. Traditional methods heavily reliant on domain experts are time-consuming and resource-intensive. To address this challenge, this paper proposes an automated taxonomy alignment approach leveraging large language models (LLMs). We introduce a method that embeds taxonomy nodes into a continuous low-dimensional vector space, utilizing hierarchical relationships within category concepts to enhance alignment accuracy. Our approach capitalizes on the contextual understanding and semantic information capabilities of LLMs, offering a promising solution to the challenges of taxonomy alignment. We conducted experiments on two pairs of real-world taxonomies and demonstrated that our method is comparable in accuracy to manual alignment, while significantly reducing time, operational, and maintenance costs associated with taxonomy alignment. Our case study showcases the effectiveness of our approach by visualizing the taxonomy alignment results. This automated alignment framework addresses the increasing demand for accurate and efficient alignment processes across diverse knowledge domains.
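
The summary describes embedding taxonomy nodes together with their hierarchical context and aligning them in vector space. The sketch below illustrates that general idea only; it is not the authors' implementation. This record does not specify the LLM, the embedding procedure, or the matching rule, so everything here is an assumption: character-trigram counts stand in for LLM embeddings, each node is represented by its root-to-node path, matching is nearest neighbor by cosine similarity, and the names `path_text`, `embed`, `cosine`, and `align` and the two toy taxonomies are hypothetical.

```python
import math
from collections import Counter


def path_text(node, parent):
    """Join a node with its ancestors, root first, so the hierarchy enters the text."""
    parts = []
    while node is not None:
        parts.append(node)
        node = parent.get(node)
    return " > ".join(reversed(parts))


def embed(text):
    """Toy stand-in for an LLM embedding (assumption): lowercase character-trigram counts."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(src_parent, tgt_parent):
    """Map each source node to the target node whose path embedding is most similar."""
    tgt_vecs = {t: embed(path_text(t, tgt_parent)) for t in tgt_parent}
    mapping = {}
    for s in src_parent:
        vec = embed(path_text(s, src_parent))
        mapping[s] = max(tgt_vecs, key=lambda t: cosine(vec, tgt_vecs[t]))
    return mapping


# Hypothetical toy taxonomies, given as child -> parent maps (None marks a root).
src = {
    "Science": None,
    "Computer Science": "Science",
    "Machine Learning": "Computer Science",
}
tgt = {
    "Sciences": None,
    "Computing": "Sciences",
    "Machine learning": "Computing",
    "Physics": "Sciences",
}

mapping = align(src, tgt)
print(mapping)
```

Embedding the full path rather than the bare label is one simple way to let parent categories influence the match. In a real setting, `embed` would be replaced by a call to an actual embedding model.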