Remote Sensing Image Captioning Using Deep Learning

Bibliographic Details
Published in: 2024 International Conference on Automation and Computation (AUTOCOM), pp. 295-302
Main Authors: Yamani, Bhavitha; Medavarapu, Nikhil; Rakesh, S.
Format: Conference Proceeding
Language: English
Published: IEEE, 14.03.2024
DOI: 10.1109/AUTOCOM60220.2024.10486178

Summary: Image captioning is a complex task that uses deep learning techniques to automatically generate sentences describing an image. Remote sensing image captioning extends this idea to images obtained from high altitudes, such as those captured by satellites, aircraft, or drones. The prevailing approach in remote sensing image captioning employs an encoder-decoder framework: a convolutional neural network (CNN) encodes the input image into a feature representation, and a recurrent neural network (RNN) decodes that representation into a coherent sentence description. More recently, advanced models that leverage deep learning techniques have emerged to improve performance. To build a comprehensive understanding of these solutions, this study surveys reputable journal publications, examines the proposed methodologies, assesses their strengths and weaknesses, and compares them against one another. The objective is to provide a thorough review of remote sensing image captioning, highlighting the latest developments and their potential implications.
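
To make the encoder-decoder framework described above concrete, the following is a minimal PyTorch sketch of a CNN encoder paired with an LSTM decoder. The backbone choice (ResNet-18), the embedding and hidden dimensions, and the vocabulary size are illustrative assumptions, not details taken from the reviewed paper.

import torch
import torch.nn as nn
from torchvision import models

class CNNEncoder(nn.Module):
    """Encode an image into a fixed-length feature vector with a CNN backbone."""
    def __init__(self, embed_dim: int = 256):
        super().__init__()
        backbone = models.resnet18(weights=None)  # pretrained weights are optional
        # Drop the classification head; keep the convolutional feature extractor.
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.project = nn.Linear(backbone.fc.in_features, embed_dim)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        feats = self.features(images).flatten(1)  # (batch, 512)
        return self.project(feats)                # (batch, embed_dim)

class RNNDecoder(nn.Module):
    """Decode the image embedding into per-step word logits with an LSTM."""
    def __init__(self, vocab_size: int, embed_dim: int = 256, hidden_dim: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, img_embed: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # Prepend the image embedding as the first "token" of the input sequence,
        # so the LSTM conditions the generated sentence on the image content.
        word_embeds = self.embed(captions)                       # (batch, T, embed_dim)
        inputs = torch.cat([img_embed.unsqueeze(1), word_embeds], dim=1)
        hidden, _ = self.lstm(inputs)                            # (batch, T+1, hidden_dim)
        return self.fc(hidden)                                   # (batch, T+1, vocab_size)

# Usage sketch: one forward pass with dummy data standing in for remote sensing tiles.
encoder, decoder = CNNEncoder(), RNNDecoder(vocab_size=5000)
images = torch.randn(4, 3, 224, 224)           # batch of RGB image tiles
captions = torch.randint(0, 5000, (4, 12))     # token ids of ground-truth captions
logits = decoder(encoder(images), captions)    # (4, 13, 5000)

In training, the logits would be compared against the shifted ground-truth caption with a cross-entropy loss; at inference, words are generated one step at a time by feeding each predicted token back into the decoder.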