Update README.md
README.md
CHANGED
@@ -57,7 +57,7 @@ The format for TinyCot was:
 
 For the initial supervised finetuning step:
 - Adalite optimizer, default hyperparameters of supertrainer2000 unless otherwise specified
-- Lambda (Adalite's analogue to weight decay) of 0.01
+- Lambda (Adalite's analogue to weight decay, see [here](https://arxiv.org/abs/2103.06583) for details) of 0.01
 - LR of 1e-5
 - MixCE ratio of 0.75
 - Sequence length of 4096