gsarti committed
Commit c16db81
1 Parent(s): a124153

Update README.md

Files changed (1)
  1. README.md +13 -4
README.md CHANGED
@@ -15,9 +15,9 @@ thumbnail: https://gsarti.com/publication/it5/featured.png
 
 The [IT5](https://huggingface.co/models?search=it5) model family represents the first effort in pretraining large-scale sequence-to-sequence transformer models for the Italian language, following the approach adopted by the original [T5 model](https://github.com/google-research/text-to-text-transfer-transformer).
 
- This model is released as part of the project ["IT5: Large-Scale Text-to-Text Pretraining for Italian Language Understanding and Generation"](https://gsarti.com) (to be released), by [Gabriele Sarti](https://gsarti.com/) with the support of [Huggingface](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104) and with TPU usage sponsored by Google's [TPU Research Cloud](https://sites.research.google/trc/). All training was conducted on a single TPU v3-8 VM on Google Cloud. Refer to the Tensorboard tab of the repository for an overview of the training process.
+ This model is released as part of the project ["IT5: Large-Scale Text-to-Text Pretraining for Italian Language Understanding and Generation"](https://arxiv.org/abs/2203.03759), by [Gabriele Sarti](https://gsarti.com/) and [Malvina Nissim](https://malvinanissim.github.io/), with the support of [Huggingface](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104) and with TPU usage sponsored by Google's [TPU Research Cloud](https://sites.research.google/trc/). All training was conducted on a single TPU v3-8 VM on Google Cloud. Refer to the Tensorboard tab of the repository for an overview of the training process.
 
- *The inference widget is deactivated because the model needs task-specific seq2seq fine-tuning to be useful in practice. The model [`gsarti/it5-base-nli`](https://huggingface.co/gsarti/it5-base-nli) provides an example of this model fine-tuned on a downstream NLI task.*
+ *The inference widget is deactivated because the model needs task-specific seq2seq fine-tuning to be useful in practice. The models in the [`it5`](https://huggingface.co/it5) organization provide some examples of this model fine-tuned on various downstream tasks.*
 
 ## Model variants
 
@@ -54,7 +54,7 @@ tokenizer = T5Tokenizer.from_pretrained("gsarti/it5-small")
 model = T5ForConditionalGeneration.from_pretrained("gsarti/it5-small")
 ```
 
- *Note: You will need to fine-tune the model on your downstream seq2seq task to use it. See an example [here](https://huggingface.co/gsarti/it5-base-nli).*
+ *Note: You will need to fine-tune the model on your downstream seq2seq task to use it. See an example [here](https://huggingface.co/it5/it5-base-question-answering).*
 
 Flax and Tensorflow versions of the model are also available:
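
The note in this hunk says the checkpoint only becomes useful after task-specific seq2seq fine-tuning. As a concrete starting point, here is a minimal PyTorch fine-tuning sketch; it is not taken from the README, and the toy sentence pair, the `riassumi:` task prefix, and the hyperparameters are illustrative assumptions:

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("gsarti/it5-small")
model = T5ForConditionalGeneration.from_pretrained("gsarti/it5-small")

# Hypothetical toy pair; a real run needs a proper task-specific dataset.
# The "riassumi:" prefix follows the T5 convention and is an assumption,
# not a prefix prescribed for IT5.
batch = tokenizer(
    "riassumi: Il modello IT5 è stato preaddestrato su un grande corpus di testo italiano.",
    return_tensors="pt",
)
labels = tokenizer(
    "IT5 è preaddestrato su testo italiano.", return_tensors="pt"
).input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for step in range(3):  # a handful of steps, purely for illustration
    loss = model(**batch, labels=labels).loss  # seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice you would iterate over a real dataset with a `DataLoader`, or use the `Seq2SeqTrainer` utilities from `transformers`, rather than repeating a single pair.
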
@@ -75,4 +75,13 @@ For problems or updates on this model, please contact [gabriele.sarti996@gmail.c
 
 ## Citation Information
 
- *Coming soon!*
+ ```bibtex
+ @article{sarti-nissim-2022-it5,
+     title={IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
+     author={Sarti, Gabriele and Nissim, Malvina},
+     journal={ArXiv preprint 2203.03759},
+     url={https://arxiv.org/abs/2203.03759},
+     year={2022},
+     month={mar}
+ }
+ ```
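
As a closing note, the second hunk above introduces the Flax and TensorFlow variants but its context window ends before the corresponding loading snippet. A sketch of what loading those weights typically looks like, assuming the standard `transformers` class names rather than quoting the README:

```python
from transformers import FlaxT5ForConditionalGeneration, TFT5ForConditionalGeneration

# Flax (JAX) weights
model_flax = FlaxT5ForConditionalGeneration.from_pretrained("gsarti/it5-small")

# TensorFlow weights
model_tf = TFT5ForConditionalGeneration.from_pretrained("gsarti/it5-small")
```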