In this project, we presented a proof of concept with our CLIP Vision + mBART-50 baseline, which pairs a multilingual checkpoint with pre-trained image encoders and covers four languages: English, French, German, and Spanish. We intend to extend this project to more languages with better translations, and to improve our work based on the observations made.