Pixel-level end-to-end dual-channel bill text detection based algorithm
| Published in | 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP) pp. 405 - 409 |
|---|---|
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 15.04.2022 |
| DOI | 10.1109/ICSP54964.2022.9778469 |
| Summary: | Existing scene text detection algorithms do not fully utilize high-level semantic and spatial information, which limits the model's ability to classify pixels in complex backgrounds and to detect and localize text instances at different scales; in addition, the upsampling process is time-consuming. To address these problems, this paper proposes a double-channel bill text detection (DBTD) model. The model consists of two parts: a spatial information path (SP) and a semantic information path (SE). SP retains rich spatial details, and SE provides a large receptive field. Based on these two paths, a new feature fusion module (FFM) is introduced to fuse their features effectively. A comparison with multiple models on the ICDAR2015 dataset shows that the proposed model achieves the best detection performance among them, with F1 improved to 79.8%. |
|---|---|
| DOI: | 10.1109/ICSP54964.2022.9778469 |
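
The summary reports detection quality as an F1 score. As a reminder of how that metric is usually formed, here is a minimal sketch of F1 as the harmonic mean of precision and recall. It assumes each detection has already been matched or rejected against the ground truth (for example by an overlap threshold). The function name `detection_f1` and the counts in the usage line are hypothetical illustrations, not values or code from the paper.

```python
def detection_f1(num_matched: int, num_detected: int, num_ground_truth: int) -> float:
    """Return F1 for a text detector from match counts.

    num_matched:      detections accepted as correct (assumed precomputed)
    num_detected:     total boxes the detector produced
    num_ground_truth: total annotated text instances
    """
    # With no detections or no annotations, precision or recall is undefined;
    # this sketch reports 0.0 in that case.
    if num_detected == 0 or num_ground_truth == 0:
        return 0.0

    precision = num_matched / num_detected
    recall = num_matched / num_ground_truth

    # Avoid dividing by zero when nothing was matched.
    if precision + recall == 0:
        return 0.0

    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not from the paper).
example_f1 = detection_f1(num_matched=80, num_detected=100, num_ground_truth=100)
```

Because F1 is a harmonic mean, it stays low unless precision and recall are both high, which is why it is commonly used as a single summary figure for detectors.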