Update README.md
README.md
CHANGED
@@ -43,3 +43,4 @@ The following hyperparameters were used during DPO training:
 - lr_scheduler_warmup_ratio: 0.1
 - Weight Decay: 0.0
 - num_epochs: 3.0
+- Specifically add above input format over training samples