## Checkpoints - Pre-trained checkpoint at 60k steps: [clip-vision-bert-cc12m-60k](https://huggingface.co/flax-community/clip-vision-bert-cc12m-60k) - Pre-trained checkpoint at 70k steps: [clip-vision-bert-cc12m-70k](https://huggingface.co/flax-community/clip-vision-bert-cc12m-70k) - Fine-tuned checkpoint at 6k steps on 60k pre-trained checkpoint: [clip-vision-bert-vqa-ft-6k](https://huggingface.co/flax-community/clip-vision-bert-vqa-ft-6k) ### Other checkpoints: - All pre-trained checkpoints: [multilingual-vqa](https://huggingface.co/flax-community/multilingual-vqa) - Fine-tuned checkpoints on 45k pre-trained checkpoint: [multilingual-vqa-pt-45k-ft](https://huggingface.co/flax-community/multilingual-vqa-pt-45k-ft) - Fine-tuned checkpoints on 45k pre-trained checkpoint with AdaFactor (others use AdamW): [multilingual-vqa-pt-45k-ft-adf](https://huggingface.co/flax-community/multilingual-vqa-pt-45k-ft-adf) - Fine-tuned checkpoints on 60k pre-trained checkpoint: [multilingual-vqa-pt-60k-ft](https://huggingface.co/flax-community/multilingual-vqa-pt-60k-ft) - Fine-tuned checkpoints on 70k pre-trained checkpoint: [multilingual-vqa-pt-60k-ft](https://huggingface.co/flax-community/multilingual-vqa-pt-70k-ft) - From scratch (without pre-training) model: [multilingual-vqa-ft](https://huggingface.co/flax-community/multilingual-vqa-ft)