adamo1139
/

Yi-34B-200K-XLCTX-RAW-1904

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

adamo1139 commited on May 3

Commit

5aadafe

•

1 Parent(s): 63c0b6f

Update README.md

Files changed (1) hide show

README.md +10 -5

README.md CHANGED Viewed

@@ -1,5 +1,10 @@
----
-license: other
-license_name: yi-license
-license_link: LICENSE
----

+---
+license: other
+license_name: yi-license
+license_link: LICENSE
+datasets:
+- adamo1139/rawrr_v2-2_stage1
+---
+## Model description
+This is a base Yi-34B-200K XLCTX model treated with DPO with adamo1139/rawrr_v2-2_stage1 dataset to make outputs be completions instead of answers for a question. DPO was done using chatml format but no previous SFT step was done. If it would do it now, I would have used ORPO instead of DPO for this step to make it stronger, but too late for that. It can be used to maybe slightly decensor a model, but I don't think this idea works too well with DPO before SFT step, as was widely known but I did it anyway.