Update README.md
README.md
CHANGED
@@ -14,7 +14,7 @@ license: apache-2.0
 # t5-v1.1-base-dutch-uncased

 A [T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) sequence to sequence model
-pre-trained from scratch on [cleaned Dutch 🇳🇱🇧🇪 mC4
+pre-trained from scratch on [cleaned Dutch 🇳🇱🇧🇪 mC4 ](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned).


 * Pre-trained T5 models need to be finetuned before they can be used for downstream tasks, therefore the inference widget on the right has been turned off.
@@ -22,7 +22,7 @@ pre-trained from scratch on [cleaned Dutch 🇳🇱🇧🇪 mC4 ${and_english}](
 the **[Netherformer 📰](https://huggingface.co/spaces/flax-community/netherformer)** example application!

 Please refer to the original T5 papers and Scale Efficiently papers for more information about the T5 architecture
-and configs, though it must be noted that this model (
+and configs, though it must be noted that this model (t5-v1.1-base-dutch-uncased) is unrelated to these projects and not an 'official' checkpoint.
 * **[Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf)** by *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*.
 * **[Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers](https://arxiv.org/abs/2109.10686)** by *Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler*.
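Since the card notes that pre-trained T5 checkpoints must be fine-tuned before they can be used for downstream tasks, a minimal sketch of what that looks like with the `transformers` library is shown below. The Hub repo id `yhavinga/t5-v1.1-base-dutch-uncased` and the `vat samen:` task prefix are illustrative assumptions; the diff itself only names the model and the dataset.

```python
# Minimal sketch: load the checkpoint and run one supervised training step.
# Assumption: the model is published on the Hub as "yhavinga/t5-v1.1-base-dutch-uncased"
# (the card only gives the model name, not the full repo id).
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "yhavinga/t5-v1.1-base-dutch-uncased"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# T5 is text-to-text: fine-tuning pairs an input string with a target string.
# The task prefix "vat samen: " (Dutch for "summarize: ") is for illustration only.
inputs = tokenizer(
    "vat samen: de kat zat op de mat en keek de hele middag naar buiten.",
    return_tensors="pt",
)
labels = tokenizer("de kat zat op de mat.", return_tensors="pt").input_ids

outputs = model(**inputs, labels=labels)
print(outputs.loss)  # loss you would backpropagate in a real fine-tuning loop
```

In an actual fine-tuning run this forward pass would sit inside a training loop (or a `Trainer`) over a Dutch downstream dataset; the snippet only demonstrates how input and target text map onto the model's loss.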