Update README.md
README.md
CHANGED
@@ -4,7 +4,7 @@ license: "apache-2.0"
 
 *This model was trained as part of a series of experiments testing the performance of pure DPO vs SFT vs ORPO, all supported by Unsloth/Huggingface TRL.*
 
-Note: This model failed to train because the LR was too high (
+Note: This model failed to train because the LR was too high (stopped early at 300 steps). Do not use!
 
 **Benchmarks**
 