yhavinga committed
Commit df643e2 · 1 Parent(s): 67b1af5

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -27,14 +27,14 @@ Tokenizer:
  Training details:
 
  * Training started on step 360K (bs 16) ppl 21 of earlier model trained with Adam optimizer.
- * Training at step 1100K of 2082K (53%) pp 15,1
+ * Training at step 1100K (53%) of 2082K (bs 32) ppl 15,1
  * Block size: 512
  * Optimizer: adafactor
  * Learning rate: 3.3e-5
  * Batch size: 32
  * Warmup steps: 5000
 
- Work in progress. Dec 2021-Jan2022
+ Jan 2022
 
  * Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
  * Thanks to @gsarti for creating the [t5-flax-gcp
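The hyperparameters listed in the README (Adafactor, learning rate 3.3e-5, 5000 warmup steps) imply a warmup learning-rate schedule. A minimal sketch of that schedule in plain Python is below; the linear-warmup-then-constant shape is an assumption for illustration, not the author's actual training script — only the peak learning rate and the warmup length come from the README.

```python
# Hyperparameters taken from the README; the schedule shape is assumed.
PEAK_LR = 3.3e-5
WARMUP_STEPS = 5_000
TOTAL_STEPS = 2_082_000  # 2082K total steps, per the README


def learning_rate(step: int) -> float:
    """Learning rate at a given training step: linear warmup to the
    peak over WARMUP_STEPS, then constant (assumed shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR


# The README's progress figure checks out: step 1100K of 2082K total.
progress = 1_100_000 / TOTAL_STEPS  # ≈ 0.528, i.e. the "53%" in the diff
```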