Locutusque committed · Commit 4e4bea5 · Parent(s): d09c542

Update README.md

Files changed (1): README.md (+2 -0)
README.md CHANGED
@@ -12,5 +12,7 @@ Like version 1, this model will be trained on a single GPU, with hopes of gettin
 - Train on 1,000,000 examples of Skylion007/openwebtext at a learning rate of 3e-4 and batch size of 32
 - Once perplexity reaches an average of ~100, a cosine scheduler will be applied, and batch size will be increased to 4096
 - After being trained on 3,000,000 - 5,000,000 examples of Skylion007/openwebtext, the model will be trained on graelo/wikipedia and mattymchen/refinedweb-3m, and the batch size will be increased to 49,152.
+
+- I'm open to any suggestions to modify this roadmap if you feel it isn't sufficient!
 # Disclaimer
 This model may be cancelled if no performance improvement is seen over its predecessor.
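
The roadmap in the diff above names concrete hyperparameters: a learning rate of 3e-4, a cosine schedule once average perplexity reaches ~100, and batch sizes stepping from 32 to 4096 to 49,152. The snippet below is a minimal sketch of how that staged schedule could be wired up with PyTorch and the Hugging Face transformers library; the tiny GPT-2 config, the step budget, and the use of gradient accumulation to reach the larger effective batch sizes on a single GPU are illustrative assumptions, not the author's actual training code.

```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR
from transformers import GPT2Config, GPT2LMHeadModel

# Placeholder model standing in for the model under training (assumption).
model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=2, n_embd=128))

# Phase 1: constant learning rate 3e-4, batch size 32, on Skylion007/openwebtext.
optimizer = AdamW(model.parameters(), lr=3e-4)
phase1_batch_size = 32

# Phase 2 (triggered once average perplexity reaches ~100): apply a cosine
# schedule and raise the effective batch size to 4096. On a single GPU this
# would plausibly be done via gradient accumulation (4096 = 32 * 128).
phase2_steps = 10_000  # placeholder step budget (assumption)
scheduler = CosineAnnealingLR(optimizer, T_max=phase2_steps)
grad_accum_steps = 4096 // phase1_batch_size  # 128 micro-batches per update

# Phase 3: after 3M-5M openwebtext examples, mix in graelo/wikipedia and
# mattymchen/refinedweb-3m and raise the effective batch size to 49,152.
```

In a training loop, `scheduler.step()` would be called once per optimizer update after the cosine schedule is switched on.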