Image Captioning

Posted on January 24, 2022 • 1 minutes • 106 words

Automatic generation of captions for photos, describing the objects, features, and activity in each photo, to help visually impaired people.

Project Architecture Summary

An encoder-decoder architecture: VGG-16 extracts the image features, and a GRU with an attention mechanism decodes those features and generates the caption text.
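
To make the attention step concrete, here is a minimal, dependency-free sketch of what happens at one decoding step: each image region is scored against the decoder's hidden state, the scores are turned into weights with a softmax, and the weighted sum becomes the context vector fed to the GRU. This is an illustration under assumptions, not the project's code: the dot-product scoring, the function names `softmax` and `attend`, and the toy 2-dimensional vectors are all made up for the example (the project may well use an additive, learned scoring function instead).

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, hidden_state):
    """One attention step (hypothetical helper, dot-product scoring assumed).

    region_features: list of feature vectors, one per image region
    hidden_state:    current decoder hidden state, same length as a region
    Returns (weights, context), where context is the weighted sum of regions.
    """
    scores = [
        sum(f * h for f, h in zip(region, hidden_state))
        for region in region_features
    ]
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * region[i] for w, region in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy data: three "regions" with 2-dimensional features (assumed values).
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = attend(regions, hidden)
print(weights)
print(context)
```

In this toy run the first and third regions match the hidden state equally well, so they receive equal weight and the second region receives less.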

Evaluation and Testing

The evaluation metric was the BLEU score, computed for 1-, 2-, 3-, and 4-grams.
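
As a rough illustration of what these scores measure, the sketch below computes a sentence-level cumulative BLEU-n from scratch: clipped n-gram precision for each order up to n, combined by a geometric mean and multiplied by a brevity penalty. It is a simplified sketch with no smoothing, and the function names (`ngrams`, `modified_precision`, `bleu`) and the example captions are assumptions made for this post; the project may have used a library implementation or a corpus-level variant instead.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(references, candidate, n):
    """Clipped n-gram precision of the candidate against the references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the most times it appears in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(references, candidate, max_n):
    """Cumulative BLEU with uniform weights over orders 1..max_n, no smoothing."""
    precisions = [modified_precision(references, candidate, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    c = len(candidate)
    # Reference length closest to the candidate length (shorter one on ties).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Made-up example captions, already tokenized.
references = [["a", "dog", "runs", "on", "the", "grass"]]
candidate = ["a", "dog", "runs", "on", "grass"]
for n in range(1, 5):
    print(f"BLEU-{n}: {bleu(references, candidate, n):.3f}")
```

Because the candidate drops one word, higher-order n-grams match less often, so the score falls as n grows from 1 to 4.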

Deployment

The trained model is packaged with Cog, an open-source command-line tool for packaging ML models in a standard, production-ready container, and then pushed to Replicate.
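
For readers unfamiliar with Cog, packaging is driven by a `cog.yaml` file at the root of the project. The fragment below is only a sketch of what such a file might look like here: the Python version, the package placeholder, and the `predict.py:Predictor` entry point are assumptions, not this project's actual configuration.

```yaml
# cog.yaml (illustrative sketch; every value below is an assumption)
build:
  python_version: "3.8"
  python_packages:
    # List the pinned libraries the trained model needs, for example:
    - "pillow==9.0.0"
# File and class that load the model and run a prediction (hypothetical names).
predict: "predict.py:Predictor"
```

With a file like this in place, `cog login` followed by `cog push r8.im/<username>/<model-name>` builds the container and uploads it to Replicate; the username and model name are placeholders for your own.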

Links to the project: the base model and the attention model.

An example image