yhavinga committed on
Commit 6f7461f
1 Parent(s): 74babbc

Saving weights and logs at step 280000

README.md CHANGED
@@ -17,22 +17,22 @@ datasets:
  Dataset:
 
  * [mC4 NL Cleaned](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned)
- * dataset split: full (33B tokens)
+ * dataset config: full (33B tokens)
 
  Tokenizer:
 
- * New tokenizer trained on mC4 with the scripts from the Huggingface
+ * Tokenizer trained on mC4 with scripts from the Huggingface
  Transformers [Flax examples](https://github.com/huggingface/transformers/tree/master/examples/flax/language-modeling)
 
  Training details:
 
- * Trained for 240k steps (29 dec 2021)
+ * Trained for 280k steps (30 dec 2021)
  * Block size: 512
  * Optimizer: adam, lr 8e-4, beta1 0.9, beta2 0.98
  * Warmup steps: 5000
  * Weight decay: 0.01
 
- Work in progress. Dec 2021.
+ Work in progress. Dec 2021-Jan2022
 
  * Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
  * Thanks to @gsarti for creating the [t5-flax-gcp
flax_model.msgpack CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:65d0a6df749f03c6825305fe0a4ddf10af7019dfcc8da9a4e2777521606137f5
+ oid sha256:d2bc942466bedf81fea88c9bbeaaafa7dfb2fec485a78c89c52705b841a2bf0a
  size 1419302302
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a9e66e988d4ce5517d9a3b59e2eb641274b0fce5815e8118f4a84f71259a09dc
+ oid sha256:f6576b3366a236813f2b96767c8eb783a6d52ed8aa71222557e56edeae404cf0
  size 1444576537
runs/events.out.tfevents.1640332964.t1v-n-f9cfcc28-w-0.384322.0.v2 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a3905eb5358f312aee4fa9af05441f5c7554919cb9e3ff992c6f549447cca17a
- size 35774541
+ oid sha256:c60bb6a82a55ceae1859f8fc81e83b0c19ee72e64de5ecdc95e012746328f4c6
+ size 43681985
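
Note: the three binary files in this commit are stored as Git LFS pointers, so the diff only shows a changed `oid sha256:` (and, for the event log, a changed `size`). As a minimal sketch, a downloaded file could be checked against its pointer roughly as follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers made up for illustration, not part of Git LFS or of this repository; the example pointer text is the new `flax_model.msgpack` pointer from the diff above.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest recorded in the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so that multi-gigabyte weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for flax_model.msgpack, copied from the diff above.
EXAMPLE_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:d2bc942466bedf81fea88c9bbeaaafa7dfb2fec485a78c89c52705b841a2bf0a
size 1419302302
"""

pointer = parse_lfs_pointer(EXAMPLE_POINTER)
```

Calling `matches_pointer("flax_model.msgpack", pointer)` on a local copy (the path here is an assumption) would then report whether that copy corresponds to this commit. The size check runs first because it is cheap and rejects a mismatched file before any hashing.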