A Hybrid Semantic Similarity Calculation Method Based on WordNet
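Computing semantic similarity is an important research topic in natural language processing. Over the past few decades, many semantic similarity measures have been proposed and widely applied in areas such as word sense disambiguation and text clustering. Building on the WordNet ontology, this work improves the information content (IC) computation model and then proposes two hybrid semantic similarity measures. Experimental results show that, because they consider both the shortest path distance between concept nodes in WordNet and the IC-based semantic distance, the proposed methods outperform existing ones and produce results closer to human subjective judgment.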
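This record does not give the paper's formulas, so the following is only a rough sketch of what combining a shortest path distance with an IC-based distance can look like. Everything in it is an assumption made for illustration: the toy `PARENT` taxonomy stands in for WordNet, `intrinsic_ic` uses the well-known Seco-style intrinsic IC rather than the paper's improved IC model, `ic_distance` is the Jiang-Conrath distance, and the weighted combination in `hybrid_similarity` (with its `alpha` parameter) is hypothetical, not either of the two measures the authors propose.

```python
import math

# Toy is-a taxonomy (child -> parent). A stand-in for WordNet, not real data.
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}
NODES = set(PARENT) | set(PARENT.values())


def ancestors(concept):
    """Return the chain [concept, parent, ..., root]."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def lowest_common_subsumer(a, b):
    """Return the deepest concept that subsumes both a and b."""
    up_b = set(ancestors(b))
    for node in ancestors(a):
        if node in up_b:
            return node
    raise ValueError("concepts share no ancestor")


def path_length(a, b):
    """Shortest path between a and b, counted in is-a edges."""
    lcs = lowest_common_subsumer(a, b)
    return ancestors(a).index(lcs) + ancestors(b).index(lcs)


def intrinsic_ic(concept):
    """Seco-style intrinsic IC: 1 - log(hypo(c) + 1) / log(N).

    Assumption: this is NOT the improved IC model from the paper.
    """
    hyponyms = sum(1 for n in NODES if n != concept and concept in ancestors(n))
    return 1.0 - math.log(hyponyms + 1) / math.log(len(NODES))


def ic_distance(a, b):
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(lcs(a, b))."""
    lcs = lowest_common_subsumer(a, b)
    return intrinsic_ic(a) + intrinsic_ic(b) - 2.0 * intrinsic_ic(lcs)


def hybrid_similarity(a, b, alpha=0.5):
    """Hypothetical hybrid: weight both distances, then map to (0, 1]."""
    distance = alpha * path_length(a, b) + (1.0 - alpha) * ic_distance(a, b)
    return 1.0 / (1.0 + distance)
```

With this toy taxonomy, `hybrid_similarity("dog", "cat")` comes out higher than `hybrid_similarity("dog", "tree")`, since the first pair is closer in both path length and IC distance. A real experiment would replace the toy taxonomy with WordNet and compare scores against human similarity ratings.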
| Published in | 计算机工程与科学 (Computer Engineering & Science) Vol. 39; no. 5; pp. 971-977 |
|---|---|
| Authors | ZHANG Si-qi, XING Wei-wei, CAI Yuan-yuan |
| Format | Journal Article |
| Language | Chinese |
| Published | School of Software Engineering, Beijing Jiaotong University, Beijing 100044, 2017 |
| ISSN | 1007-130X |
| DOI | 10.3969/j.issn.1007-130X.2017.05.023 |
| Summary: | Calculation of semantic similarity is an important research topic in natural language processing (NLP), and many measures have been proposed over the past few decades. These measures have been widely used in word sense disambiguation, text clustering, and other research fields. We propose a new way to calculate information content (IC) with the WordNet ontology, and then propose two new hybrid measures of semantic similarity. Experimental results show that, because they consider both the shortest path distance and the IC semantic distance, the proposed methods outperform existing methods and produce results closer to human judgment. |
|---|---|
| Authors: | ZHANG Si-qi, XING Wei-wei, CAI Yuan-yuan (School of Software Engineering, Beijing Jiaotong University, Beijing 100044, China) |
| Keywords: | WordNet; semantic similarity; information content; ontology |
| CN: | 43-1258/TP |