G-reen/EXPERIMENT-DPO-m7b2-3-merged



G-reen committed on Mar 31

Commit 904beea • 1 Parent(s): 1194bcf

Update README.md

Files changed (1)

README.md +1 -1


@@ -4,7 +4,7 @@ license: "apache-2.0"
 *This model was trained as part of a series of experiments testing the performance of pure DPO vs SFT vs ORPO, all supported by Unsloth/Huggingface TRL.*
-Note: This model failed to train because the LR was too high (was stopped early, at 300 steps). Do not use!
+Note: This model failed to train because the LR was too high (stopped early at 300 steps). Do not use!
 **Benchmarks**