
This model is an SFT version that was trained specifically as a starting point for DPO on the Yi model.

I was unable to resolve an OOM issue while running DPO training, so I am only uploading the SFT.

If you would like to run DPO on this model, please use the maywell/why_no_one_do_dpo_on_yi dataset.

The prompt format follows ChatML.
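For reference, a minimal sketch of rendering a conversation in the ChatML format (the helper name is illustrative, not part of this model's code):

```python
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    out = ""
    for m in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|> markers.
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Open an assistant turn so the model continues from here.
    out += "<|im_start|>assistant\n"
    return out

print(to_chatml([{"role": "user", "content": "Hello"}]))
```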

The code below was used to load the maywell/why_no_one_do_dpo_on_yi dataset in axolotl.

```python
from axolotl.prompt_tokenizers import ShareGPTPromptTokenizingStrategy


class SimpleShareGPTPromptTokenizingStrategy(ShareGPTPromptTokenizingStrategy):
    _strict = True

    @property
    def strict(self):
        return self._strict

    @strict.setter
    def strict(self, strict):
        self._strict = strict

    def get_conversation_thread(self, prompt):
        # The dataset stores the preferred conversation under 'chosen';
        # map each message to the ShareGPT {"from", "value"} turn format.
        conversations = prompt["chosen"]
        turns = [
            {
                "from": "assistant" if t["role"] == "assistant" else t["role"],
                "value": t["content"],
            }
            for t in conversations
        ]
        return turns
```
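To illustrate the mapping, here is a hypothetical record (the `chosen` structure is assumed from the code above) and the ShareGPT-style turns it produces:

```python
# Hypothetical sample record; 'chosen' is assumed to hold a list of
# {'role', 'content'} messages, as get_conversation_thread expects.
prompt = {
    "chosen": [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"},
    ]
}

# Same mapping as in get_conversation_thread above.
turns = [
    {"from": "assistant" if t["role"] == "assistant" else t["role"],
     "value": t["content"]}
    for t in prompt["chosen"]
]
print(turns)
```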

ํ•ด๋‹น ๋ชจ๋ธ์€ Yi ๋ชจ๋ธ์„ DPOํ•˜๊ธฐ ์œ„ํ•ด ํ›ˆ๋ จ์‹œ์ผฐ๋˜ SFT ๋ฒ„์ „์ž…๋‹ˆ๋‹ค.

DPO ํ›ˆ๋ จ์„ ํ•˜๋ ค๋˜ ์ค‘ OOM ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜์ง€ ๋ชปํ•˜์—ฌ SFT๋งŒ ์—…๋กœ๋“œํ•ฉ๋‹ˆ๋‹ค.

ํ•ด๋‹น ๋ชจ๋ธ์— DPO๋ฅผ ํ•˜์‹œ๋ ค๋ฉด maywell/why_no_one_do_dpo_on_yi ๋ฐ์ดํ„ฐ์…‹์„ ์ด์šฉํ•ด์ฃผ์„ธ์š”.

ํ”„๋กฌํ”„ํŠธ ํฌ๋งท์€ ChatML์„ ๋”ฐ๋ฆ…๋‹ˆ๋‹ค.
