Generation of Image Captions using Deep Learning and Natural Language Processing: A Review


M Balakrishna Mallapu, Deepthi Godavarthi


Deep learning, Natural Language Processing, Computer Vision


Deep learning methods hold significant promise for applications that automatically generate image captions or image descriptions. Image captioning is among the most challenging problems in image research. It is an important area of study that aims to automatically generate descriptive text from an image's visual content. It is a multidisciplinary task that combines Artificial Intelligence (AI), Natural Language Processing (NLP), and Computer Vision (CV). Captioning requires recognizing the primary elements of an image, their characteristics, and the interactions among them. It must also produce sentences that are syntactically and semantically correct. We review the existing literature on using language models to improve various applications, including image captioning, report generation, report categorization, extraction of findings, and visual question answering. In this article, we present a comprehensive overview of the available deep learning approaches to image captioning. We also describe the datasets and evaluation metrics commonly used in deep learning for automatic image captioning.


