Image Captioning using Google's Inception-resnet-v2 and Recurrent Neural Network
| Published in | International Conference on Contemporary Computing, pp. 1 - 6 |
|---|---|
| Main Authors | , , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.08.2019 |
| ISSN | 2572-6129 |
| DOI | 10.1109/IC3.2019.8844921 |
| Summary | Given a photograph as input, this paper addresses the problem of generating a plausible caption for it. The model learns correlations between language and images from a provided data-set of labeled images. It proposes a fully automatic approach that combines a Convolutional Neural Network with a Recurrent Neural Network. The encoder is responsible for extracting the features of the input image that are useful for eventually producing a description. The model attempts to produce captions for both the objects and the regions present in the image. Treating language as a large label space, the project generates predictions for the various regions of the image and then stitches them together. |
|---|---|