Naïve Bayes classifier for Kashmiri word sense disambiguation

Bibliographic Details
Published in: Sadhana (Bangalore), Vol. 49, no. 3, p. 226
Main Authors: Mir, Tawseef Ahmad; Lawaye, Aadil Ahmad
Format: Journal Article
Language: English
Published: New Delhi, Springer India, 29.07.2024; Springer Nature B.V
ISSN: 0973-7677, 0256-2499
DOI: 10.1007/s12046-024-02551-7

Summary: Many Natural Language Processing (NLP) applications, such as machine translation, document clustering, and information retrieval, make use of Word Sense Disambiguation (WSD). WSD automatically predicts the sense of an ambiguous word that best fits the given context. While interpreting the meaning of natural language may seem very easy for humans, machines require the processing of huge amounts of data for similar tasks. In this paper, we propose an automatic WSD system for the Kashmiri language based on the Naïve Bayes classifier. To the best of our knowledge, this work is the first attempt at developing a WSD system for the Kashmiri language. The system uses Bag-of-Words (BoW) and Part-of-Speech (PoS) based features. Experiments are carried out on a manually crafted sense-tagged dataset for 60 ambiguous Kashmiri words, selected based on their frequency in the raw corpus collected. The senses used to annotate these ambiguous words are extracted from Kashmiri WordNet. The performance of the proposed system is measured using accuracy, precision, recall, and F-1 measure. The proposed WSD model reported its best performance (accuracy = 89.92, precision = 0.84, recall = 0.89, F-1 measure = 0.86) when PoS and BoW features were used together.
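
To make the approach in the summary concrete, the following is a minimal sketch of a Naïve Bayes sense classifier that combines BoW and PoS features from the context of an ambiguous word. It is not the authors' implementation. The English toy sentences, the tag names, the context window of two tokens, the add-one (Laplace) smoothing, and all function and class names are assumptions made purely for illustration; the paper's dataset is Kashmiri and its exact feature design is not given in this record. A small helper also shows the F-1 measure as the harmonic mean of precision and recall.

```python
import math
from collections import Counter, defaultdict


def extract_features(tokens, pos_tags, target_index, window=2):
    """Collect BoW and PoS features from the words around the target.

    The window size is an illustrative assumption, not taken from the paper.
    """
    feats = []
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    for i in range(lo, hi):
        if i == target_index:
            continue  # the ambiguous word itself is not a context feature
        feats.append("bow=" + tokens[i])
        feats.append("pos=" + pos_tags[i])
    return feats


class NaiveBayesWSD:
    """Multinomial Naive Bayes over string features, with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.sense_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        for feats, sense in examples:
            self.sense_counts[sense] += 1
            for f in feats:
                self.feature_counts[sense][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, feats):
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / total)
            denom = (sum(self.feature_counts[sense].values())
                     + self.alpha * len(self.vocab))
            for f in feats:
                if f not in self.vocab:
                    continue  # ignore features never seen in training
                score += math.log(
                    (self.feature_counts[sense][f] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


def f1_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy English stand-in for a sense-tagged corpus (hypothetical data):
# (tokens, PoS tags, index of the ambiguous word, sense label).
train = [
    (["river", "bank", "was", "muddy"], ["NN", "NN", "VB", "JJ"], 1, "shore"),
    (["money", "bank", "gave", "loan"], ["NN", "NN", "VB", "NN"], 1, "finance"),
    (["muddy", "river", "bank", "collapsed"], ["JJ", "NN", "NN", "VB"], 2, "shore"),
    (["loan", "from", "bank", "approved"], ["NN", "IN", "NN", "VB"], 2, "finance"),
]
model = NaiveBayesWSD().fit(
    [(extract_features(t, p, i), s) for t, p, i, s in train])

pred_finance = model.predict(
    extract_features(["bank", "approved", "loan"], ["NN", "VB", "NN"], 0))
pred_shore = model.predict(
    extract_features(["muddy", "bank"], ["JJ", "NN"], 1))
print(pred_finance, pred_shore)
print(round(f1_measure(0.84, 0.89), 2))
```

On the toy data, the first test context shares "loan" and "approved" with the financial examples and the second shares "muddy" with the riverbank examples, so the two predictions differ by sense. The last line applies the harmonic-mean formula to the reported precision and recall; the result rounds to 0.86, which is consistent with the reported F-1 measure, although the record does not state how the paper averaged its scores across words.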