Named Entity Recognition System for Postpositional Languages: Urdu as a Case Study

Bibliographic Details
Published in International journal of advanced computer science & applications Vol. 7; no. 10
Main Authors Kamran, Muhammad; Mansoor, Syed
Format Journal Article
Language English
Published West Yorkshire Science and Information (SAI) Organization Limited 01.01.2016
ISSN 2158-107X, 2156-5570
DOI 10.14569/IJACSA.2016.071019

Summary: Named Entity Recognition and Classification is the process of identifying named entities and classifying them into classes such as person name, organization name, and location name. In this paper, we propose a tagging scheme, Begin Inside Last -2 (BIL2), for Subject Object Verb (SOV) languages that contain postpositions. We use the Urdu language as a case study. We compare the F-measure values obtained for the tagging schemes IO, BIO2, BILOU, and BIL2 using a Hidden Markov Model (HMM) and a Conditional Random Field (CRF). With the same parameters, including bigram and context window, the BIL2 tagging scheme gives better results than the other three tagging schemes. With HMM, the F-measure values for IO, BIO2, BILOU, and BIL2 are 44.87%, 44.88%, 45.14%, and 45.88%, respectively. With CRF, the F-measure values for IO, BIO2, BILOU, and BIL2 are 35.13%, 35.90%, 37.85%, and 38.39%, respectively. The F-measure values for BIL2 are better than those of previously reported techniques.
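To make the comparison of tagging schemes concrete, the following minimal sketch shows how the same entity spans are encoded under the three standard baseline schemes named in the summary (IO, BIO2, BILOU), together with the usual balanced F-measure. It is an illustration under stated assumptions, not the authors' implementation: the function names `encode` and `f_measure`, the span format, and the English toy sentence are all hypothetical, and BIL2 is deliberately left out because the summary does not define its tag set.

```python
from typing import List, Tuple

# Hypothetical span format: (start index, end index exclusive, entity label).
Span = Tuple[int, int, str]


def encode(n_tokens: int, spans: List[Span], scheme: str) -> List[str]:
    """Encode entity spans as per-token tags under IO, BIO2, or BILOU."""
    tags = ["O"] * n_tokens  # every token starts outside any entity
    for start, end, label in spans:
        for i in range(start, end):
            if scheme == "IO":
                prefix = "I"  # no boundary information at all
            elif scheme == "BIO2":
                prefix = "B" if i == start else "I"  # mark entity start
            elif scheme == "BILOU":
                if end - start == 1:
                    prefix = "U"  # single-token (unit) entity
                elif i == start:
                    prefix = "B"
                elif i == end - 1:
                    prefix = "L"  # mark entity end
                else:
                    prefix = "I"
            else:
                # BIL2 is not implemented: its tag set is not given in the summary.
                raise ValueError(f"unsupported scheme: {scheme}")
            tags[i] = f"{prefix}-{label}"
    return tags


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy English example (an assumption for illustration, not data from the paper).
tokens = ["Allama", "Muhammad", "Iqbal", "visited", "Lahore"]
spans = [(0, 3, "PER"), (4, 5, "LOC")]

for scheme in ("IO", "BIO2", "BILOU"):
    print(scheme, encode(len(tokens), spans, scheme))
```

The schemes differ only in how much boundary information each tag carries: IO marks membership, BIO2 adds the entity start, and BILOU adds the entity end and single-token entities.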