Automated taxonomy alignment via large language models: bridging the gap between knowledge domains

Bibliographic Details
Published in	Scientometrics Vol. 129; no. 9; pp. 5287-5312
Main Authors	Cui, Wentao; Xiao, Meng; Wang, Ludi; Wang, Xuezhi; Du, Yi; Zhou, Yuanchun
Format	Journal Article
Language	English
Published	Cham: Springer International Publishing, 01.09.2024; Springer Nature B.V.
Subjects	Alignment; Astronomy; Automation; Computer Science; Data integration; Dewey Decimal Classification; Economics; Information processing; Information retrieval; Information science; Information Storage and Retrieval; Interoperability; Knowledge; Language; Large language models; Library collections; Library management; Library of Congress Classification; Library Science; Maintenance costs; Medical Subject Headings-MeSH; Methods; Natural language processing; Ontology; Semantics; Taxonomy; Vector spaces; Vocabularies & taxonomies; Taxonomy alignment; Large language model; Word embedding
ISSN	0138-9130 1588-2861
DOI	10.1007/s11192-024-05111-2

Summary:	Taxonomy alignment is essential for integrating knowledge across diverse domains and languages, facilitating information retrieval and data integration. Traditional methods heavily reliant on domain experts are time-consuming and resource-intensive. To address this challenge, this paper proposes an automated taxonomy alignment approach leveraging large language models (LLMs). We introduce a method that embeds taxonomy nodes into a continuous low-dimensional vector space, utilizing hierarchical relationships within category concepts to enhance alignment accuracy. Our approach capitalizes on the contextual understanding and semantic information capabilities of LLMs, offering a promising solution to the challenges of taxonomy alignment. We conducted experiments on two pairs of real-world taxonomies and demonstrated that our method is comparable in accuracy to manual alignment, while significantly reducing time, operational, and maintenance costs associated with taxonomy alignment. Our case study showcases the effectiveness of our approach by visualizing the taxonomy alignment results. This automated alignment framework addresses the increasing demand for accurate and efficient alignment processes across diverse knowledge domains.
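
Illustration:	The summary describes embedding taxonomy nodes in a vector space and using hierarchical relationships to improve alignment. The record does not give the authors' actual procedure, so the following is only a minimal sketch of the general idea under stated assumptions: the three-dimensional vectors are hand-made stand-ins for LLM embeddings, the category names are invented toy data, and the parent-blending rule with its `weight` parameter, as well as the helper names `with_hierarchy` and `align`, are hypothetical choices rather than the paper's method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def with_hierarchy(vectors, parents, weight=0.3):
    """Blend each node's vector with its parent's vector.

    Assumption: a simple linear mix stands in for whatever
    hierarchy-aware embedding the paper actually uses.
    """
    blended = {}
    for node, vec in vectors.items():
        parent = parents.get(node)
        if parent is None:
            blended[node] = list(vec)  # root nodes are left unchanged
        else:
            pvec = vectors[parent]
            blended[node] = [(1 - weight) * a + weight * b
                             for a, b in zip(vec, pvec)]
    return blended


def align(source, target):
    """Map each source node to its most similar target node."""
    mapping = {}
    for s_node, s_vec in source.items():
        best = max(target, key=lambda t: cosine(s_vec, target[t]))
        mapping[s_node] = (best, cosine(s_vec, target[best]))
    return mapping


# Toy data: placeholder vectors, not real LLM embeddings.
tax_a = {"Science": [1.0, 0.0, 0.0],
         "Astronomy": [0.9, 0.4, 0.0],
         "Economics": [0.1, 0.0, 1.0]}
parents_a = {"Astronomy": "Science"}

tax_b = {"Natural sciences": [1.0, 0.1, 0.0],
         "Astrophysics": [0.8, 0.5, 0.1],
         "Finance": [0.0, 0.1, 0.9]}
parents_b = {"Astrophysics": "Natural sciences"}

mapping = align(with_hierarchy(tax_a, parents_a),
                with_hierarchy(tax_b, parents_b))

for node, (match, score) in mapping.items():
    print(f"{node} -> {match} ({score:.3f})")
```

In a real setting the placeholder vectors would come from an embedding model, and a similarity threshold would likely be needed so that source nodes with no good counterpart are left unaligned instead of being forced onto the nearest target.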