- [Conceptual 12M Dataset](https://github.com/google-research-datasets/conceptual-12m)
- [VQA v2 Dataset](https://visualqa.org/challenge.html)
- [Hybrid CLIP Example](https://github.com/huggingface/transformers/blob/master/src/transformers/models/clip/modeling_flax_clip.py)
- [VisualBERT Modeling File](https://github.com/huggingface/transformers/blob/master/src/transformers/models/visual_bert/modeling_visual_bert.py)
- [BERT Modeling File](https://github.com/huggingface/transformers/blob/master/src/transformers/models/bert/modeling_flax_bert.py)
- [CLIP Modeling File](https://github.com/huggingface/transformers/blob/master/src/transformers/models/clip/modeling_flax_clip.py)
- [Summarization Training Script](https://github.com/huggingface/transformers/blob/master/examples/flax/summarization/run_summarization_flax.py)
- [MLM Training Script](https://github.com/huggingface/transformers/blob/2df63282e010ac518683252d8ddba21e58d2faf3/examples/flax/language-modeling/run_mlm_flax.py)