An Unconstrained Benchmark Urdu Handwritten Sentence Database with Automatic Line Segmentation

Bibliographic Details
Published in	2012 International Conference on Frontiers in Handwriting Recognition, pp. 491-496
Main Authors	Raza, A., Siddiqi, I., Abidi, A., Arif, F.
Format	Conference Proceeding
Language	English
Published	IEEE 01.09.2012
Subjects	Benchmark testing; corpus; Handwriting recognition; Image segmentation; Labeling; Text analysis; Urdu; Writing
ISBN	9781467322621; 1467322628
DOI	10.1109/ICFHR.2012.177

Abstract	In this paper we present a novel off-line sentence database of Urdu handwritten documents, along with a few preprocessing and text line segmentation procedures. Despite increased research interest in Urdu handwritten document analysis in recent years, a standard benchmark dataset for Urdu handwriting recognition tasks has been missing. Building on our own corpus, CENIP-UCCP (Center for Image Processing-Urdu Corpus Construction Project), which we developed and updated, we have constructed an Urdu handwritten database. The corpus is a collection of varied Urdu texts that were used to generate forms. These forms were then filled in by native writers in their natural handwriting. Six categories of text were used to generate the forms, with approximately 66 forms per category. To date, the database comprises 400 digitized forms produced by 200 different writers. The database is completely labeled for content information as well as content detection, and it supports the evaluation of systems for Urdu handwriting recognition, line segmentation and writer identification. The proposed Urdu text line segmentation scheme was also evaluated on the database, with promising segmentation results.
Author affiliations	Raza, A. (ashen.raza@mcs.edu.pk), Nat. Univ. of Sci. & Technol., Islamabad, Pakistan; Siddiqi, I. (imran.siddiqi@bahria.edu.pk), Bahria Univ., Islamabad, Pakistan; Abidi, A. (abidi@mcs.edu.pk), Nat. Univ. of Sci. & Technol., Islamabad, Pakistan; Arif, F. (fahim@mcs.edu.pk), Nat. Univ. of Sci. & Technol., Islamabad, Pakistan
CODEN	IEEPAD
LCCN	2012939084
URI	https://ieeexplore.ieee.org/document/6424443