Commit d251b72
Author: miguelcarv
Parent(s): 1d37eee

Update README.md

README.md CHANGED
@@ -44,8 +44,9 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ## Training Details
 
 - Trained for one epoch on SlimOrca-Dedup
-- Learning rate:
+- Learning rate: 2e-5
+- Cosine learning rate decay to 0
 - Optimizer: AdamW
-- Effective batch size:
-- Gradient accumulation steps (mini batch size):
+- Effective batch size: 256
+- Gradient accumulation steps (mini batch size): 64 (4)
 - Trained with FP32
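
The values added in this commit fit together arithmetically: 64 gradient accumulation steps with a mini batch of 4 give the stated effective batch size of 256, provided training ran on a single device (an assumption; the README does not give the device count). The sketch below checks that product and illustrates one common form of cosine decay from 2e-5 to 0. The function names, the absence of warmup, and `TOTAL_STEPS` are hypothetical choices for illustration, not details taken from the author's training script.

```python
import math

PEAK_LR = 2e-5          # learning rate from the README
MINI_BATCH_SIZE = 4     # mini batch size from the README
GRAD_ACCUM_STEPS = 64   # gradient accumulation steps from the README
NUM_DEVICES = 1         # assumption: device count is not stated in the README


def effective_batch_size(mini_batch, accum_steps, num_devices=1):
    """Number of examples contributing to each optimizer step."""
    return mini_batch * accum_steps * num_devices


def cosine_lr(step, total_steps, peak_lr=PEAK_LR):
    """Cosine decay from peak_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; the real number of optimizer steps is not given

print(effective_batch_size(MINI_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))  # 256
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, cosine_lr(step, TOTAL_STEPS))
```

With more than one device, the same effective batch size would need proportionally fewer accumulation steps or a smaller mini batch.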