An Unconstrained Benchmark Urdu Handwritten Sentence Database with Automatic Line Segmentation

Bibliographic Details
Published in 2012 International Conference on Frontiers in Handwriting Recognition, pp. 491-496
Main Authors Raza, A., Siddiqi, I., Abidi, A., Arif, F.
Format Conference Proceeding
Language English
Published IEEE 01.09.2012
ISBN 9781467322621
1467322628
DOI 10.1109/ICFHR.2012.177

Abstract In this paper we present a novel off-line sentence database of Urdu handwritten documents, along with preprocessing and text line segmentation procedures. Despite increased research interest in Urdu handwritten document analysis in recent years, a standard benchmark dataset for Urdu handwriting recognition tasks has been missing. We have developed an Urdu handwritten database based on our own updated corpus, CENIP-UCCP (Center for Image Processing-Urdu Corpus Construction Project). The corpus is a collection of varied Urdu texts that were used to generate forms, which native writers then filled in using their natural handwriting. Six categories of text were used, with approximately 66 forms per category. To date, the database comprises 400 digitized forms produced by 200 different writers. The database is completely labeled for content information as well as content detection, and supports the evaluation of systems for Urdu handwriting recognition, line segmentation, and writer identification. We also evaluated the proposed Urdu text line segmentation scheme on the database, with promising segmentation results.
Author Raza, A. (ashen.raza@mcs.edu.pk), Nat. Univ. of Sci. & Technol., Islamabad, Pakistan
Siddiqi, I. (imran.siddiqi@bahria.edu.pk), Bahria Univ., Islamabad, Pakistan
Abidi, A. (abidi@mcs.edu.pk), Nat. Univ. of Sci. & Technol., Islamabad, Pakistan
Arif, F. (fahim@mcs.edu.pk), Nat. Univ. of Sci. & Technol., Islamabad, Pakistan
CODEN IEEPAD
Genre orig-research
IsPeerReviewed false
IsScholarly true
LCCN 2012939084
PageCount 6
SubjectTerms Benchmark testing
corpus
Handwriting recognition
Image segmentation
Labeling
Text analysis
Urdu
Writing
URI https://ieeexplore.ieee.org/document/6424443