Update README.md
README.md
CHANGED
@@ -50,10 +50,10 @@ The model was fine-tuned using a private Bitext dataset designed for question an
 
 - **Optimizer**: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
 - **Learning Rate**: 0.0002 with a cosine learning rate scheduler
-- **Epochs**:
-- **Batch Size**:
-- **Gradient Accumulation Steps**:
-- **Maximum Sequence Length**:
+- **Epochs**: 4
+- **Batch Size**: 10
+- **Gradient Accumulation Steps**: 8
+- **Maximum Sequence Length**: 8192 tokens
 
 ### Environment
 
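
The values filled in by this change interact: gradient accumulation multiplies the batch size seen by each optimizer step. The sketch below is illustrative only and is not taken from the model's training code. The `TrainingConfig` class, its field names, and the helper functions are hypothetical, and it assumes the listed batch size is per device and that training ran on a single device, which the README does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed in the README."""
    learning_rate: float = 2e-4          # 0.0002
    lr_scheduler: str = "cosine"
    adam_betas: tuple = (0.9, 0.999)
    adam_epsilon: float = 1e-8
    epochs: int = 4
    batch_size: int = 10                 # assumed to be per device
    gradient_accumulation_steps: int = 8
    max_seq_length: int = 8192           # tokens


def effective_batch_size(cfg: TrainingConfig, num_devices: int = 1) -> int:
    """Sequences contributing to one optimizer step."""
    return cfg.batch_size * cfg.gradient_accumulation_steps * num_devices


def max_tokens_per_update(cfg: TrainingConfig, num_devices: int = 1) -> int:
    """Upper bound on tokens per optimizer step (every sequence at full length)."""
    return effective_batch_size(cfg, num_devices) * cfg.max_seq_length


cfg = TrainingConfig()
print(effective_batch_size(cfg))   # 80
print(max_tokens_per_update(cfg))  # 655360
```

Under those assumptions each optimizer step averages gradients over 80 sequences. With more devices, pass `num_devices` to scale the figure accordingly.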