Pixel-level end-to-end dual-channel bill text detection based algorithm

Bibliographic Details
Published in: 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP), pp. 405-409
Main Authors: Chen, Xi; Chang, Yongjuan; Zhang, Pengfei; Xin, Rui; Lu, Yanyan; Zhang, Ze'en; Cao, Jingang; Wang, Xinying
Format: Conference Proceeding
Language: English
Published: IEEE, 15.04.2022
DOI: 10.1109/ICSP54964.2022.9778469

Summary: Existing scene text detection algorithms do not fully exploit high-level semantic and spatial information. This limits a model's ability to classify pixels in complex backgrounds and to detect and localize text instances at different scales; the upsampling process is also time-consuming. To address these problems, this paper proposes a double-channel bill text detection (DBTD) model. The model consists of two parts: a spatial information path (SP) and a semantic information path (SE). SP retains rich spatial details, and SE provides a large receptive field. Building on these two paths, a new feature fusion module (FFM) is introduced to fuse their features effectively. In a comparison with multiple models on the ICDAR2015 dataset, the proposed model achieves the best detection performance, raising F1 to 79.8%.
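The F1 figure quoted in the summary is, assuming the usual definition, the harmonic mean of detection precision and recall. A minimal sketch of that calculation follows; the function name `f1_score` and the sample inputs are illustrative assumptions and are not taken from the paper, which does not state its precision or recall in this record.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Both inputs are fractions in [0, 1]. When both are zero the
    harmonic mean is undefined, so 0.0 is returned by convention.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision/recall values, chosen only for illustration.
example_f1 = f1_score(0.5, 1.0)
print(f"F1 = {example_f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F1 by maximizing recall alone or precision alone.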