Automatic image captioning in Thai for house defect using a deep learning-based approach


Bibliographic Details
Published in: Advances in Computational Intelligence, Vol. 4, No. 1, p. 1
Main Authors: Jaruschaimongkol, Manadda; Satirapiwong, Krittin; Pipatsattayanuwong, Kittipan; Temviriyakul, Suwant; Sangprasert, Ratchanat; Siriborvornratanakul, Thitirat
Format: Journal Article
Language: English
Published: Cham: Springer International Publishing (Springer Nature B.V.), 01.03.2024
ISSN: 2730-7794; 2730-7808
DOI: 10.1007/s43674-023-00068-w

More Information
Summary: This study aims to automate the reporting process of house inspections, enabling prospective buyers to make informed decisions. Currently, an inspector produces a report by inserting all defect images into spreadsheet software and manually captioning each image with the identified defects. To the best of our knowledge, no previous work or dataset has automated this process. This paper therefore proposes a new image-captioning dataset for house defect inspection, benchmarked with three deep learning-based models. The models follow the encoder–decoder architecture: three image encoders (VGG16, MobileNet, and InceptionV3) are evaluated with one GRU-based decoder using Bahdanau's additive attention mechanism. The experimental results indicate that, despite similar training losses across all models, VGG16 takes the least time to train, while MobileNet achieves the highest BLEU-1 to BLEU-4 scores of 0.866, 0.850, 0.823, and 0.728, respectively. However, InceptionV3 is suggested as the optimal model, since it outperforms the others in producing accurate attention plots and its BLEU scores are comparable to the best scores obtained by MobileNet.
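At each decoding step, the GRU decoder described in the abstract uses Bahdanau's additive attention to weight the encoder's image features against its current hidden state. A minimal NumPy sketch of that scoring step follows; all names, dimensions, and weight initializations here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def bahdanau_attention(enc_features, dec_state, W1, W2, v):
    """Additive (Bahdanau) attention: score_i = v^T tanh(W1 h_i + W2 s).

    enc_features: (num_regions, enc_dim) image features from the CNN encoder
    dec_state:    (dec_dim,) current GRU hidden state
    Returns the context vector and the attention weights over image regions.
    """
    # Project encoder features and decoder state into a shared space,
    # then score each region with the learned vector v
    scores = np.tanh(enc_features @ W1 + dec_state @ W2) @ v  # (num_regions,)
    # Softmax over image regions (shift by max for numerical stability)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of encoder features
    context = weights @ enc_features  # (enc_dim,)
    return context, weights

# Illustrative dimensions only (e.g., 64 spatial regions of 128-d features)
rng = np.random.default_rng(0)
enc = rng.normal(size=(64, 128))       # hypothetical encoder output
state = rng.normal(size=(256,))        # hypothetical GRU state
W1 = rng.normal(size=(128, 32)) * 0.1  # learned projections (random here)
W2 = rng.normal(size=(256, 32)) * 0.1
v = rng.normal(size=(32,))
context, weights = bahdanau_attention(enc, state, W1, W2, v)
```

The attention weights form a probability distribution over image regions, which is what makes the attention plots mentioned in the abstract possible: plotting the weights over the image shows where the decoder "looked" while emitting each Thai caption word.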