This model is an SFT checkpoint that was trained specifically so that DPO could then be run on top of the Yi model.
I was unable to resolve an OOM issue while attempting the DPO training, so I am uploading only the SFT model.
If you would like to run DPO on this model, please use the maywell/why_no_one_do_dpo_on_yi dataset.
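To see what the data looks like before wiring up training, the dataset can be inspected with the `datasets` library. The `chosen` column is what the loader below consumes; the `rejected` column and the `train` split name are assumptions here, as is usual for DPO datasets.

```python
from datasets import load_dataset

# Split name assumed; check the dataset card if "train" is not present.
ds = load_dataset("maywell/why_no_one_do_dpo_on_yi", split="train")
print(ds[0]["chosen"])    # list of {"role", "content"} turns (see loader below)
print(ds[0]["rejected"])  # assumed DPO counterpart to "chosen"
```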
The prompt format follows ChatML.
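For reference, ChatML delimits each turn with the `<|im_start|>` and `<|im_end|>` special tokens (the system turn is optional):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hi, how can I help you today?<|im_end|>
```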
The code below was used to load the maywell/why_no_one_do_dpo_on_yi dataset in axolotl.
```python
# NOTE: import path for recent axolotl versions; it may differ in older releases.
from axolotl.prompt_tokenizers import ShareGPTPromptTokenizingStrategy

class SimpleShareGPTPromptTokenizingStrategy(ShareGPTPromptTokenizingStrategy):
    _strict = True

    @property
    def strict(self):
        return self._strict

    @strict.setter
    def strict(self, strict):
        self._strict = strict

    def get_conversation_thread(self, prompt):
        # Remap the OpenAI-style {"role", "content"} turns stored under
        # "chosen" to the ShareGPT {"from", "value"} layout axolotl expects.
        conversations = prompt["chosen"]
        turns = [{"from": "assistant" if t["role"] == "assistant" else t["role"], "value": t["content"]} for t in conversations]
        return turns
```
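For completeness, here is a minimal sketch of the `load()` entry point that axolotl looks for in a prompt-strategy module, assuming the class above is saved as a custom strategy. `ShareGPTPrompterV2`, the `"chatml"` template name, and the constructor arguments reflect recent axolotl versions and may differ in yours.

```python
# A sketch under assumptions: exact signatures vary across axolotl versions.
from axolotl.prompters import ShareGPTPrompterV2

def load(tokenizer, cfg):
    # Build the strategy with a ChatML conversation template, matching the
    # prompt format stated above.
    return SimpleShareGPTPromptTokenizingStrategy(
        ShareGPTPrompterV2(conversation="chatml"),
        tokenizer,
        cfg.train_on_inputs,
        cfg.sequence_len,
    )
```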