Innovative Binarization Solutions for Historical Document Clarity
| Published in | 2024 4th International Conference on Pervasive Computing and Social Networking (ICPCSN) pp. 210 - 217 |
|---|---|
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 03.05.2024 |
| DOI | 10.1109/ICPCSN62568.2024.00043 |
| Summary: | Images of historical documents often show degradations such as wrinkles, faint writing, stains, bleed-through ink, and other defects. These factors reduce text visibility and degrade binarization performance. Preserving these document images helps future generations learn about a variety of subjects. This article presents a new binarization approach for historical documents. The approach is a multi-step image enhancement process that uses bilateral and unsharp filtering, Otsu thresholding, histogram analysis, and k-means clustering for dark spot removal. Applying an intensity-based mask raises the quality of pixels above a set threshold. The method also includes two further refinements: a final sharpening phase and an enhancement of color contrast. The performance of the proposed binarization approach is assessed using the Flesch Reading Ease formula. The findings show that the algorithm achieved its highest readability score when applied to degraded documents, with an average readability score of 96.44. This suggests that the algorithm is effective at improving the readability of noisy images, particularly degraded documents. |
|---|---|
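
The summary names Otsu thresholding as one step of the pipeline, but this record gives no implementation details. The following is a minimal sketch of that single step only, not the authors' code. It assumes an 8-bit grayscale image flattened into a list of integers in the range 0 to 255; the function names `otsu_threshold` and `binarize` are hypothetical, and the filtering, histogram analysis, and k-means stages are not covered.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Assumption: `pixels` is a flat list of 8-bit grayscale values.
    Values <= t form the dark class (ink), the rest the bright class (paper).
    """
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of intensities in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break  # bright class empty, no further split possible
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map dark pixels (<= t) to 0 and bright pixels to 255."""
    return [0 if p <= t else 255 for p in pixels]


# Toy "image": 50 dark ink pixels and 50 bright paper pixels.
sample = [10] * 50 + [200] * 50
threshold = otsu_threshold(sample)
binary = binarize(sample, threshold)
```

On a real scan the histogram is rarely this cleanly bimodal, which is presumably why the summary describes filtering and clustering steps around the thresholding.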