salti committed on
Commit
21df10c
1 Parent(s): d2d9560

create README for previous faulty run

  1. README.md +2 -37
README.md CHANGED
@@ -1,38 +1,3 @@
- ---
- language:
- - ar
- datasets:
- - mc4
- - oscar
- - arabic_billion_words
- ---
 
- # arabic-t5-small
-
- This is a T5v1.1 (small) trained on the concatenation of the Arabic Billion Words corpus and the Arabic subsets of the mC4 and Oscar datasets. The model could only be trained for about `10%` of the whole dataset due to time limitations.
-
- ## Training parameters
-
- | | |
- | :-------------------: | :-----------: |
- | steps | `22'000` |
- | Training batch size | `384` |
- | Evaluation batch size | `768` |
- | learning rate | `1e-2` |
- | dtype | `jnp.float32` |
-
-
- ## Note for finetuning:
-
- This model was pretrained with dropout turned off, so the default `dropout_rate` in the model config is `0`.
- To finetune the model dropout should be turned be back on, like this:
-
- ```python
- model = T5ForConditionalGeneration.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
- ```
-
- or,
-
- ```python
- model = AutoModelForSeq2SeqLM.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
- ```
+ The model and logs in this directory are for a faulty run where `dropout_rate` was mistakenly set to `0.1` instead of `0`.
 
+ The model here was trained only for `10'000` steps.