Update README.md
README.md CHANGED
@@ -77,5 +77,5 @@ print(tokenizer.decode(outputs[0]))

### Training Procedure

-This was trained with axolotl, using full fine-tuning (no LoRA, etc.). I used a sequence length of 2048, a learning rate of 0.003 with the adamw_bnb_8bit optimizer, and a cosine scheduler.
+This was trained with axolotl, using full fine-tuning (no LoRA, etc.). I used a sequence length of 2048 with an effective batch size of 512, a learning rate of 0.003 with the adamw_bnb_8bit optimizer, and a cosine scheduler.

Due to an error I made in calculating the token count, I accidentally trained for nearly 2 epochs, with the learning rate not reaching its proper minimum.
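
For reference, the settings described above map onto an axolotl config roughly as follows. This is a minimal sketch, not the author's actual file: `base_model`, the dataset entry, and the split of the 512 effective batch across `micro_batch_size`, `gradient_accumulation_steps`, and GPU count are placeholders; only the stated hyperparameters (sequence length, learning rate, optimizer, scheduler) come from the text.

```yaml
# Hypothetical axolotl config sketch reconstructing the stated hyperparameters.
# base_model, the dataset, and the batch-size split are assumptions, not the
# author's values; only their product must reach the stated effective batch of 512.
base_model: <your-base-model>      # placeholder
datasets:
  - path: <your-dataset>           # placeholder
    type: completion

sequence_len: 2048                 # stated sequence length
micro_batch_size: 8                # assumed; 8 per GPU * 8 accum * 8 GPUs = 512 effective
gradient_accumulation_steps: 8     # assumed
learning_rate: 0.003               # stated
optimizer: adamw_bnb_8bit          # stated optimizer
lr_scheduler: cosine               # stated scheduler

# Full fine-tune: no adapter (LoRA/QLoRA) section is configured.
num_epochs: 1                      # intended; the run overshot to nearly 2 epochs
```

With these numbers, each optimizer step covers 512 × 2048 ≈ 1.05M tokens, which is the kind of figure the token-count estimate would have been based on when the epoch overshoot occurred.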