This demo uses the [CLIP-mBART50 model checkpoint](https://huggingface.co/flax-community/multilingual-image-captioning-5M/) to predict a caption for a given image in four languages (English, French, German, Spanish). The model pairs a CLIP-ViT image encoder with an mBART50 text decoder and was trained on approximately 5 million image-text pairs taken from the [Conceptual 12M dataset](https://github.com/google-research-datasets/conceptual-12m), with the captions translated using [MBart50](https://huggingface.co/transformers/model_doc/mbart50.html).
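
As a rough illustration of the multilingual side, the sketch below maps the four supported languages to their standard mBART-50 language codes, which mBART-50 uses to mark the target language of the generated text. The helper name `lang_code` and the dictionary are hypothetical and not taken from this demo's code. How this particular checkpoint consumes the code is an assumption.

```python
# Standard mBART-50 language codes for the four languages the demo lists.
# The mapping structure and helper below are illustrative, not the demo's own code.
LANG_CODES = {
    "english": "en_XX",
    "french": "fr_XX",
    "german": "de_DE",
    "spanish": "es_XX",
}


def lang_code(language: str) -> str:
    """Return the mBART-50 language code for a supported language name."""
    key = language.strip().lower()
    if key not in LANG_CODES:
        supported = ", ".join(sorted(LANG_CODES))
        raise ValueError(f"Unsupported language {language!r}; expected one of: {supported}")
    return LANG_CODES[key]
```

For example, `lang_code("German")` returns `"de_DE"`, and an unsupported language such as `"Italian"` raises a `ValueError` instead of silently falling back to English.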
New demo coming soon 🤗
For more details, click on `Usage` or `Article` 🤗 below.