Word Sense Disambiguation with a Similarity-Smoothed Case Library

Bibliographic Details
Published in	Computers and the humanities Vol. 34; no. 1/2; pp. 147 - 152
Main Author	Lin, Dekang
Format	Journal Article
Language	English
Published	New York: Kluwer Academic Publishers; Pergamon; Springer Nature B.V, 01.04.2000
Subjects	Algorithms Ambiguity Cities Computational linguistics Computer Applications Computer Generated Language Analysis Computer programming Corpus Linguistics Dictionaries English Systems Ethnic conflict Libraries Natural Language Processing Nouns Polysemy Semantics Verbs Word Meaning Word sense disambiguation Words
ISSN	0010-4817 1574-020X 1572-8412 1574-0218
DOI	10.1023/a:1002633105432

Summary:	A case-based algorithm for word sense disambiguation, evaluated in the SENSEVAL workshop competition, constructs a case library from the training corpus. Dependency trees define the local context of each test word as a feature vector, in which each feature is a path in the dependency tree of the containing sentence. Data sparseness is addressed by applying a similarity function to a thesaurus extracted from a 125-million-word corpus, thereby recognizing commonalities between local contexts; each target word is tagged with the sense of the example whose local context is most similar. Training on the entire training corpus yielded SENSEVAL evaluation results of 0.701 recall and 0.706 precision; running the system without the thesaurus produced a 4%-6% drop in both values, and a 7% drop resulted when local contexts were defined as surrounding words instead of dependency trees. 2 Tables, 6 References. J. Hitchcock
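
The tagging step described in the summary can be illustrated with a minimal sketch. It is not the article's implementation: the record does not give the similarity function, so the scoring rule, the toy thesaurus values, the path labels, and the names `word_sim`, `context_sim`, and `disambiguate` are all assumptions made for illustration. Each local context is a list of (dependency path, word) features, and a test context receives the sense of the most similar library case.

```python
# Hypothetical toy thesaurus: symmetric word-to-word similarity scores.
# In the article the thesaurus is extracted from a large corpus; these
# values are invented for illustration only.
TOY_THESAURUS = {
    frozenset(("river", "stream")): 0.8,
    frozenset(("loan", "mortgage")): 0.7,
}

# Hypothetical case library: (sense tag, local context) pairs, where a
# local context is a list of (dependency path, word) features.
CASE_LIBRARY = [
    ("bank/shore", [("prep-along", "river")]),
    ("bank/finance", [("obj-of", "approve"), ("nn", "loan")]),
]


def word_sim(a, b):
    """Return 1.0 for identical words, else the thesaurus score (0.0 if absent)."""
    if a == b:
        return 1.0
    return TOY_THESAURUS.get(frozenset((a, b)), 0.0)


def context_sim(ctx_a, ctx_b):
    """Assumed scoring rule: for each feature in ctx_a, take the best
    word similarity among features in ctx_b that share the same path,
    and sum these best scores."""
    total = 0.0
    for path_a, word_a in ctx_a:
        total += max(
            (word_sim(word_a, word_b) for path_b, word_b in ctx_b if path_b == path_a),
            default=0.0,
        )
    return total


def disambiguate(context, case_library):
    """Tag the context with the sense of the most similar library case."""
    sense, _ = max(case_library, key=lambda case: context_sim(context, case[1]))
    return sense


print(disambiguate([("prep-along", "stream")], CASE_LIBRARY))
print(disambiguate([("nn", "mortgage")], CASE_LIBRARY))
```

Because "stream" and "mortgage" never occur in the toy library, exact feature matching would score every case at zero; the thesaurus lookup is what lets the unseen words match "river" and "loan", which is the smoothing role the summary attributes to the thesaurus.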