flax-community/arabic-t5-small

salti committed on Jul 25, 2021

Commit 21df10c • 1 Parent(s): d2d9560

create README for previous faulty run

Files changed (1)

README.md +2 -37

@@ -1,38 +1,3 @@
----
-language:
-  - ar
-datasets:
-  - mc4
-  - oscar
-  - arabic_billion_words
----
-# arabic-t5-small
-This is a T5v1.1 (small) trained on the concatenation of the Arabic Billion Words corpus and the Arabic subsets of the mC4 and Oscar datasets. The model could only be trained for about `10%` of the whole dataset due to time limitations.
-## Training parameters
-|                       |               |
-| :-------------------: | :-----------: |
-|         steps         |   `22'000`    |
-|  Training batch size  |     `384`     |
-| Evaluation batch size |     `768`     |
-|     learning rate     |    `1e-2`     |
-|         dtype         | `jnp.float32` |
-## Note for finetuning:
-This model was pretrained with dropout turned off, so the default `dropout_rate` in the model config is `0`.
-To finetune the model dropout should be turned be back on, like this:
-```python
-model = T5ForConditionalGeneration.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
-```
-or,
-```python
-model = AutoModelForSeq2SeqLM.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
-```


+The model and logs in this directory are for a faulty run where `dropout_rate` was mistakenly set to `0.1` instead of `0`.
+
+The model here was trained only for `10'000` steps.