Update README.md
README.md CHANGED
```diff
@@ -27,14 +27,14 @@ Tokenizer:
 Training details:
 
 * Training started on step 360K (bs 16) ppl 21 of earlier model trained with Adam optimizer.
-* Training at step 1100K of 2082K (
+* Training at step 1100K (53%) of 2082K (bs 32) ppl 15,1
 * Block size: 512
 * Optimizer: adafactor
 * Learning rate: 3.3e-5
 * Batch size: 32
 * Warmup steps: 5000
 
-
+Jan 2022
 
 * Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
 * Thanks to @gsarti for creating the [t5-flax-gcp
```
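As a quick sanity check of the figures in the updated line (a minimal sketch; `ppl 15,1` is read here as perplexity 15.1 written with a decimal comma, which is an interpretation, not stated in the README), the quoted 53% progress and the cross-entropy loss implied by the perplexity can be recomputed:

```python
import math

# Progress quoted in the README diff: step 1100K of 2082K total steps.
step, total_steps = 1_100_000, 2_082_000
progress = step / total_steps
print(f"{progress:.0%}")  # matches the "(53%)" in the updated line

# Perplexity is exp(cross-entropy loss), so ppl 15.1 implies loss ln(15.1).
loss = math.log(15.1)
print(round(loss, 2))
```

This also illustrates why the drop from ppl 21 (the earlier Adam-trained checkpoint) to ppl 15.1 is meaningful: it corresponds to a loss reduction from about ln(21) ≈ 3.04 to about 2.71 nats per token.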