Optical Character Recognition (OCR) for Telugu: Database, Algorithm and Application

Bibliographic Details
Published in Proceedings - International Conference on Image Processing pp. 3963 - 3967
Main Authors Chandra Prakash, Konkimalla; Srikar, Y. M.; Trishal, Gayam; Mandal, Souraj; Channappayya, Sumohana S.
Format Conference Proceeding
Language English
Published IEEE 01.10.2018
Subjects
ISSN 2381-8549
DOI 10.1109/ICIP.2018.8451438

Abstract Telugu is a Dravidian language spoken by more than 80 million people worldwide. Optical character recognition (OCR) of the Telugu script has wide-ranging applications, including education, health care, and administration. The beautiful Telugu script, however, is very different from Germanic scripts like English and German. This makes applying transfer learning from Germanic OCR solutions to Telugu a non-trivial task. To address the challenge of OCR for Telugu, we make three contributions in this work: (i) a database of Telugu characters, (ii) a deep-learning-based OCR algorithm, and (iii) a client-server solution for the online deployment of the algorithm. For the benefit of the Telugu people and the research community, our code has been made freely available at this link.
Author Chandra Prakash, Konkimalla
Trishal, Gayam
Mandal, Souraj
Srikar, Y. M.
Channappayya, Sumohana S.
Author_xml – sequence: 1
  givenname: Konkimalla
  surname: Chandra Prakash
  fullname: Chandra Prakash, Konkimalla
  organization: Indian Inst. of Technol. Hyderabad, Kandi, India
– sequence: 2
  givenname: Y. M.
  surname: Srikar
  fullname: Srikar, Y. M.
  organization: Indian Inst. of Technol. Hyderabad, Kandi, India
– sequence: 3
  givenname: Gayam
  surname: Trishal
  fullname: Trishal, Gayam
  organization: Indian Inst. of Technol. Hyderabad, Kandi, India
– sequence: 4
  givenname: Souraj
  surname: Mandal
  fullname: Mandal, Souraj
  organization: Indian Inst. of Technol. Hyderabad, Kandi, India
– sequence: 5
  givenname: Sumohana S.
  surname: Channappayya
  fullname: Channappayya, Sumohana S.
  organization: Indian Inst. of Technol. Hyderabad, Kandi, India
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/ICIP.2018.8451438
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Xplore
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
EISBN 9781479970612
1479970611
EISSN 2381-8549
EndPage 3967
ExternalDocumentID 8451438
Genre orig-research
IEDL.DBID RIE
IngestDate Wed Aug 27 02:45:44 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 5
ParticipantIDs ieee_primary_8451438
PublicationCentury 2000
PublicationDate 2018-Oct.
PublicationDateYYYYMMDD 2018-10-01
PublicationDate_xml – month: 10
  year: 2018
  text: 2018-Oct.
PublicationDecade 2010
PublicationTitle Proceedings - International Conference on Image Processing
PublicationTitleAbbrev ICIP
PublicationYear 2018
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0020131
Score 2.1346543
SourceID ieee
SourceType Publisher
StartPage 3963
SubjectTerms Character recognition
Convolutional neural network
Deep learning
Document Recognition
Image segmentation
Machine learning
Neural networks
OCR
Optical character recognition software
Optical imaging
Telugu
Training
Title Optical Character Recognition (OCR) for Telugu: Database, Algorithm and Application
URI https://ieeexplore.ieee.org/document/8451438
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE