Word Sense Disambiguation with a Similarity-Smoothed Case Library

Bibliographic Details
Published in Computers and the humanities Vol. 34; no. 1/2; pp. 147-152
Main Author Lin, Dekang
Format Journal Article
Language English
Published New York Kluwer Academic Publishers 01.04.2000
Pergamon
Springer Nature B.V
ISSN 0010-4817
1574-020X
1572-8412
1574-0218
DOI 10.1023/a:1002633105432


Abstract A case-based algorithm for word sense disambiguation, tested in the SENSEVAL workshop competition, constructs a case library from the training corpus. Dependency trees define the local context of each test word as a feature vector in which each feature is a path in the dependency tree of the containing sentence. Data sparseness is addressed by applying a similarity function to a thesaurus extracted from a 125-million-word corpus, which allows commonalities between local contexts to be recognized; each target word is tagged with the sense of the example whose local context is most similar. Training with the entire training corpus yielded SENSEVAL evaluation results of 0.701 recall and 0.706 precision; running the system without the thesaurus produced a 4%-6% drop in both values, and a 7% drop resulted when local contexts were defined as surrounding words instead of dependency trees. 2 Tables, 6 References. J. Hitchcock
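The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the feature encoding as (path, word) pairs, the toy similarity table, the scoring rule (sum of best per-feature matches), and every name below are assumptions chosen only to show how a similarity-smoothed nearest-case lookup might work.

```python
# Hypothetical toy thesaurus: symmetric word-to-word similarity scores.
# The real system derives these from a large corpus; values here are made up.
TOY_THESAURUS = {
    frozenset(["river", "stream"]): 0.7,
    frozenset(["rob", "steal"]): 0.6,
}


def word_similarity(a, b, smooth=True):
    """Return 1.0 for identical words, else a thesaurus score (0.0 if unsmoothed)."""
    if a == b:
        return 1.0
    if not smooth:
        return 0.0
    return TOY_THESAURUS.get(frozenset([a, b]), 0.0)


def context_similarity(ctx_a, ctx_b, smooth=True):
    """Score two local contexts, each a list of (dependency_path, word) features.

    Assumed scoring rule: for every feature in ctx_a, take the best word
    similarity among features in ctx_b that share the same path, then sum.
    """
    total = 0.0
    for path_a, word_a in ctx_a:
        best = 0.0
        for path_b, word_b in ctx_b:
            if path_a == path_b:
                best = max(best, word_similarity(word_a, word_b, smooth))
        total += best
    return total


def disambiguate(test_context, case_library, smooth=True):
    """Return the sense of the stored case whose context is most similar.

    Ties are resolved in favour of the earliest case in the library.
    """
    best_sense, best_score = None, float("-inf")
    for sense, context in case_library:
        score = context_similarity(test_context, context, smooth)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical case library for the target word "bank".
CASE_LIBRARY = [
    ("bank/finance", [("obj-of", "rob"), ("mod", "national")]),
    ("bank/river", [("pp-of", "river"), ("mod", "muddy")]),
]

# Unseen context: "stream" never occurs in the library, only the similar "river".
TEST_CONTEXT = [("pp-of", "stream")]

smoothed_sense = disambiguate(TEST_CONTEXT, CASE_LIBRARY, smooth=True)
unsmoothed_sense = disambiguate(TEST_CONTEXT, CASE_LIBRARY, smooth=False)
```

In this toy setting the smoothed lookup can match "stream" against "river", whereas the unsmoothed lookup finds no overlap and falls back to the first case, which loosely mirrors why removing the thesaurus would be expected to hurt when training contexts are sparse.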
AbstractList Lin presents a case-based algorithm for word sense disambiguation. The case library consists of local contexts of sense-tagged examples in the training corpus.
Copyright Copyright 2000 Kluwer Academic Publishers
Subject Terms Algorithms
Ambiguity
Cities
Computational linguistics
Computer Applications
Computer Generated Language Analysis
Computer programming
Corpus Linguistics
Dictionaries
English Systems
Ethnic conflict
Libraries
Natural Language Processing
Nouns
Polysemy
Semantics
Verbs
Word Meaning
Word sense disambiguation
Words