Update README.md
README.md
CHANGED
@@ -2,5 +2,6 @@ An image caption model [ViT-GPT2](https://huggingface.co/flax-community/vit-gpt2
Part of the [Huggingface JAX/Flax event](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/).
+ The GPT2 model source code is modified so it can accept an encoder's output.
The pretrained weights of both models are loaded, with a set of randomly initialized cross-attention weights.
The model is trained on 65000 images from the COCO dataset for about 1500 steps, with the original English captions translated to French for training purposes.
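The change above describes wiring a vision encoder into GPT2 through cross-attention: the decoder's hidden states query the encoder's output, and the new projection weights start out randomly initialized. The following is only an illustrative sketch of single-head cross-attention in NumPy, not the repository's actual modified GPT2 code; all names (`cross_attention`, the weight matrices) are hypothetical.

```python
import numpy as np

def cross_attention(decoder_hidden, encoder_hidden, Wq, Wk, Wv):
    """Single-head cross-attention: decoder positions attend to encoder outputs.

    decoder_hidden: (T_dec, d) hidden states from the GPT2 decoder
    encoder_hidden: (T_enc, d) output of the image encoder (e.g. ViT patches)
    Wq, Wk, Wv:     (d, d) projection weights (randomly initialized here,
                    standing in for the new cross-attention parameters)
    """
    q = decoder_hidden @ Wq          # queries come from the decoder
    k = encoder_hidden @ Wk          # keys come from the encoder
    v = encoder_hidden @ Wv          # values come from the encoder
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # numerically stable softmax over encoder positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v               # (T_dec, d) context for each decoder step

# Toy usage: 5 decoder tokens attending over 197 ViT patch embeddings.
rng = np.random.default_rng(0)
dec = rng.normal(size=(5, 8))
enc = rng.normal(size=(197, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = cross_attention(dec, enc, Wq, Wk, Wv)
```

In the real model these projections live inside each GPT2 block and are trained jointly with the rest of the network during the captioning fine-tuning described above.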