Word Sense Disambiguation with a Similarity-Smoothed Case Library
A case-based algorithm for word sense disambiguation, tested in the SENSEVAL workshop competition, constructs a case library from the training corpus using dependency trees to define local contexts of test words as feature vectors in which each feature of a word is a path in the dependency tree of i...
| Published in | Computers and the humanities Vol. 34; no. 1/2; pp. 147 - 152 |
|---|---|
| Main Author | Lin, Dekang |
| Format | Journal Article |
| Language | English |
| Published | New York: Kluwer Academic Publishers; Pergamon; Springer Nature B.V, 01.04.2000 |
| Subjects | |
| ISSN | 0010-4817, 1574-020X, 1572-8412, 1574-0218 |
| DOI | 10.1023/a:1002633105432 |
| Abstract | A case-based algorithm for word sense disambiguation, tested in the SENSEVAL workshop competition, constructs a case library from the training corpus. Dependency trees define the local contexts of test words as feature vectors, in which each feature of a word is a path in the dependency tree of its containing sentence. Data sparseness is addressed by applying a similarity function to a thesaurus extracted from a 125-million-word corpus, which allows commonalities between local contexts to be recognized; each target word is tagged with the sense of the example whose local context is maximally similar. Training with the entire training corpus yielded robust SENSEVAL evaluation results of 0.701 recall and 0.706 precision; running the system without the thesaurus produced a 4% to 6% drop in both values, and a 7% drop resulted when local contexts were formalized as surrounding words instead of dependency trees. 2 Tables, 6 References. J. Hitchcock |
|---|---|
| AbstractList | Lin presents a case-based algorithm for word sense disambiguation. The case library consists of local contexts of sense-tagged examples in the training corpus. |
| Author | Lin, Dekang |
| CODEN | COHUAD |
| Copyright | Copyright 2000 Kluwer Academic Publishers; Copyright Kluwer Academic Publishers Apr 2000 |
| Discipline | Library & Information Science; Computer Science |
| EISSN | 1572-8412, 1574-0218 |
| IsScholarly | true |
| License | https://www.springernature.com/gp/researchers/text-and-data-mining |
| SubjectTerms | Algorithms; Ambiguity; Cities; Computational linguistics; Computer Applications; Computer Generated Language Analysis; Computer programming; Corpus Linguistics; Dictionaries; English Systems; Ethnic conflict; Libraries; Natural Language Processing; Nouns; Polysemy; Semantics; Verbs; Word Meaning; Word sense disambiguation; Words |
| URI | https://www.jstor.org/stable/30204801 https://www.proquest.com/docview/1750861386 https://www.proquest.com/docview/214799169 https://www.proquest.com/docview/85529043 |
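
The procedure summarized in the abstract (a case library of local contexts, a thesaurus-based similarity function, and tagging with the sense of the most similar case) can be illustrated with a minimal Python sketch. This is not the article's implementation: the feature encoding as `(path label, word)` pairs, the toy `THESAURUS` scores, the averaging in `context_sim`, and every function name are assumptions chosen only to make the idea concrete.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed encoding: a feature is (dependency path label, word reached by the path).
Feature = Tuple[str, str]
# A case is (sense tag, local context of a sense-tagged training example).
Case = Tuple[str, FrozenSet[Feature]]

# Toy thesaurus with invented word-to-word similarity scores in [0, 1].
THESAURUS: Dict[FrozenSet[str], float] = {
    frozenset({"river", "stream"}): 0.8,
    frozenset({"loan", "mortgage"}): 0.7,
}

# Hypothetical two-case library for the word "bank".
LIBRARY: List[Case] = [
    ("bank/river", frozenset({("nn", "river")})),
    ("bank/finance", frozenset({("nn", "loan")})),
]


def word_sim(a: str, b: str) -> float:
    """Identical words score 1.0; otherwise look the pair up in the thesaurus."""
    if a == b:
        return 1.0
    return THESAURUS.get(frozenset({a, b}), 0.0)


def feature_sim(f: Feature, g: Feature) -> float:
    """Features are comparable only when their path labels match."""
    return word_sim(f[1], g[1]) if f[0] == g[0] else 0.0


def context_sim(test: FrozenSet[Feature], case: FrozenSet[Feature]) -> float:
    """Average, over test features, of the best match found in the case."""
    if not test or not case:
        return 0.0
    total = sum(max(feature_sim(f, g) for g in case) for f in test)
    return total / len(test)


def disambiguate(test: FrozenSet[Feature], library: List[Case]) -> str:
    """Return the sense tag of the case with the most similar local context."""
    best_sense, _ = max(library, key=lambda case: context_sim(test, case[1]))
    return best_sense
```

In this sketch the test context `frozenset({("nn", "stream")})` shares no word with either case, so exact matching alone could not separate them; the thesaurus entry linking "stream" to "river" is what lets `disambiguate` choose `"bank/river"`. That is the role the abstract attributes to similarity smoothing in addressing data sparseness.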