gsarti committed
Commit c16db81
1 Parent(s): a124153

Update README.md

Files changed (1)
  1. README.md +13 -4
README.md CHANGED
@@ -15,9 +15,9 @@ thumbnail: https://gsarti.com/publication/it5/featured.png
 
 The [IT5](https://huggingface.co/models?search=it5) model family represents the first effort in pretraining large-scale sequence-to-sequence transformer models for the Italian language, following the approach adopted by the original [T5 model](https://github.com/google-research/text-to-text-transfer-transformer).
 
- This model is released as part of the project ["IT5: Large-Scale Text-to-Text Pretraining for Italian Language Understanding and Generation"](https://gsarti.com) (to be released), by [Gabriele Sarti](https://gsarti.com/) with the support of [Huggingface](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104) and with TPU usage sponsored by Google's [TPU Research Cloud](https://sites.research.google/trc/). All training was conducted on a single TPU v3-8 VM on Google Cloud. Refer to the Tensorboard tab of the repository for an overview of the training process.
+ This model is released as part of the project ["IT5: Large-Scale Text-to-Text Pretraining for Italian Language Understanding and Generation"](https://arxiv.org/abs/2203.03759), by [Gabriele Sarti](https://gsarti.com/) and [Malvina Nissim](https://malvinanissim.github.io/), with the support of [Huggingface](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104) and with TPU usage sponsored by Google's [TPU Research Cloud](https://sites.research.google/trc/). All training was conducted on a single TPU v3-8 VM on Google Cloud. Refer to the Tensorboard tab of the repository for an overview of the training process.
 
- *The inference widget is deactivated because the model needs task-specific seq2seq fine-tuning to be useful in practice. The model [`gsarti/it5-base-nli`](https://huggingface.co/gsarti/it5-base-nli) provides an example of this model fine-tuned on a downstream NLI task.*
+ *The inference widget is deactivated because the model needs task-specific seq2seq fine-tuning to be useful in practice. The models in the [`it5`](https://huggingface.co/it5) organization provide some examples of this model fine-tuned on various downstream tasks.*
 
 ## Model variants
 
@@ -54,7 +54,7 @@ tokenizer = T5Tokenizer.from_pretrained("gsarti/it5-small")
 model = T5ForConditionalGeneration.from_pretrained("gsarti/it5-small")
 ```
 
- *Note: You will need to fine-tune the model on your downstream seq2seq task to use it. See an example [here](https://huggingface.co/gsarti/it5-base-nli).*
+ *Note: You will need to fine-tune the model on your downstream seq2seq task to use it. See an example [here](https://huggingface.co/it5/it5-base-question-answering).*
 
 Flax and Tensorflow versions of the model are also available:
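
The note in this hunk says the checkpoint only becomes useful after task-specific seq2seq fine-tuning. As a concrete starting point, here is a minimal PyTorch fine-tuning sketch; it is not taken from the README, and the toy sentence pair, the `riassumi:` task prefix, and the hyperparameters are illustrative assumptions:

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("gsarti/it5-small")
model = T5ForConditionalGeneration.from_pretrained("gsarti/it5-small")

# Hypothetical toy pair; a real run needs a proper task-specific dataset.
# The "riassumi:" prefix follows the T5 convention and is an assumption,
# not a prefix prescribed for IT5.
batch = tokenizer(
    "riassumi: Il modello IT5 è stato preaddestrato su un grande corpus di testo italiano.",
    return_tensors="pt",
)
labels = tokenizer(
    "IT5 è preaddestrato su testo italiano.", return_tensors="pt"
).input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for step in range(3):  # a handful of steps, purely for illustration
    loss = model(**batch, labels=labels).loss  # seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice you would iterate over a real dataset with a `DataLoader`, or use the `Seq2SeqTrainer` utilities from `transformers`, rather than repeating a single pair.
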
@@ -75,4 +75,13 @@ For problems or updates on this model, please contact [gabriele.sarti996@gmail.c
 
 ## Citation Information
 
- *Coming soon!*
+ ```bibtex
+ @article{sarti-nissim-2022-it5,
+     title={IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
+     author={Sarti, Gabriele and Nissim, Malvina},
+     journal={ArXiv preprint 2203.03759},
+     url={https://arxiv.org/abs/2203.03759},
+     year={2022},
+     month={mar}
+ }
+ ```
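
As a closing note, the second hunk above introduces the Flax and TensorFlow variants but its context window ends before the corresponding loading snippet. A sketch of what loading those weights typically looks like, assuming the standard `transformers` class names rather than quoting the README:

```python
from transformers import FlaxT5ForConditionalGeneration, TFT5ForConditionalGeneration

# Flax (JAX) weights
model_flax = FlaxT5ForConditionalGeneration.from_pretrained("gsarti/it5-small")

# TensorFlow weights
model_tf = TFT5ForConditionalGeneration.from_pretrained("gsarti/it5-small")
```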