gchhablani commited on
Commit
a03955b
1 Parent(s): 7f68d58

Update conclusion

Browse files
sections/conclusion_future_work/conclusion.md CHANGED
@@ -1 +1 @@
1
- In this project, we presented Proof-of-Concept with our CLIP Vision + BERT model baseline which leverages a multilingual checkpoint with pre-trained image encoders in four languages - **English, French, German, and Spanish**. We hope to improve this in the future by using better translators (for e.g. Google Translate API) to get more multilingual data, especially in low-resource languages.
 
1
+ In this project, we presented Proof-of-Concept with our CLIP Vision + BERT model baseline which leverages a multilingual checkpoint with pre-trained image encoders in four languages - **English, French, German, and Spanish**. Our model performs very well considering the amount of training time we were able to get and achieves 0.49 eval accuracy on our multilingual VQAv2 dataset.