
metadata

tags:
  - clip
library_name: open_clip
pipeline_tag: zero-shot-image-classification
license: cc-by-nc-4.0
datasets:
  - visheratin/laion-coco-nllb

Model Summary

NLLB-CLIP combines the text encoder from the NLLB model with the image encoder from the standard CLIP model. This extends the model's capabilities to the 201 languages of Flores-200. NLLB-CLIP sets a new state of the art on the Crossmodal-3600 dataset, largely because it performs very well on low-resource languages. You can find more details about the model in the paper.
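
To illustrate how a two-encoder model of this kind is typically used for zero-shot image classification, here is a minimal sketch of the scoring step only. It assumes you already have an image embedding and one text embedding per candidate caption. The vectors below are toy placeholders, not outputs of NLLB-CLIP, and the helper names (`l2_normalize`, `zero_shot_probs`) and the `logit_scale` value of 100 are illustrative assumptions rather than part of this model's API.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Return a softmax distribution over captions for one image.

    logit_scale=100.0 is an assumed temperature, not a value taken from
    this model's configuration.
    """
    img = l2_normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = l2_normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(logit_scale * cosine)

    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy placeholder embeddings (hypothetical, 3-dimensional for readability).
image_emb = [0.9, 0.1, 0.0]
captions = {
    "a photo of a cat": [1.0, 0.0, 0.0],
    "una foto de un perro": [0.0, 1.0, 0.0],
}

labels = list(captions)
probs = zero_shot_probs(image_emb, [captions[label] for label in labels])
best = labels[probs.index(max(probs))]
```

Because the text encoder is multilingual, the candidate captions can in principle be written in different languages, as in the toy example; the caption with the highest probability is taken as the predicted label.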

Acknowledgements

I thank ML Collective for providing Google Cloud compute resources to train the OpenCLIP-compatible version of NLLB-CLIP.