kyujinpy committed on
Commit
7f45a1e
1 Parent(s): a347647

Upload README.md

Files changed (1): README.md +4 -5
README.md CHANGED
@@ -2,12 +2,12 @@
 language:
 - en
 datasets:
-- Intel/orca_dpo_pairs
+- argilla/distilabel-math-preference-dpo
 pipeline_tag: text-generation
 license: cc-by-nc-sa-4.0
 ---
 
-# **Sakura-SOLRCA-Instruct-DPO**
+# **Sakura-SOLAR-Instruct-DPO-v2**
 <img src='./sakura.png' width=512>
 
 **A model developed by the LLM research consortium of (주)미디어그룹사람과숲 and (주)마커**
@@ -18,7 +18,7 @@ license: cc-by-nc-sa-4.0
 
 **Method**
 Using DPO method.
-With [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs).
+With [argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo).
 
 I shared the information about my model. (training and code)
 Please see: ⭐[Sakura-SOLAR](https://github.com/KyujinHan/Sakura-SOLAR-DPO).
@@ -30,7 +30,6 @@ Please see: ⭐[Sakura-SOLAR](https://github.com/KyujinHan/Sakura-SOLAR-DPO).
 
 | Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
 | --- | --- | --- | --- | --- | --- | --- | --- |
-| Sakura-SOLRCA-Instruct-DPO | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
 | Sakura-SOLAR-Instruct-DPO-v2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
 | Sakura-SOLAR-Instruct-DPO-v1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
 | [kyujinpy/Sakura-SOLAR-Instruct](https://huggingface.co/kyujinpy/Sakura-SOLAR-Instruct) | 74.40 | 70.99 | 88.42 | 66.33 | 71.79 | 83.66 | 65.20
@@ -42,7 +41,7 @@ Please see: ⭐[Sakura-SOLAR](https://github.com/KyujinHan/Sakura-SOLAR-DPO).
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
 
-repo = "kyujinpy/Sakura-SOLRCA-Instruct-DPO"
+repo = "kyujinpy/Sakura-SOLAR-Instruct-DPO-v2"
 OpenOrca = AutoModelForCausalLM.from_pretrained(
 repo,
 return_dict=True,
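The README's Method section says the model was trained "using DPO method." As a hedged illustration only (this is not the author's training code, and the toy log-probabilities are invented for demonstration), the DPO objective for a single preference pair can be sketched in plain Python:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair:
    -log sigmoid(beta * [(log pi(y_w) - log pi_ref(y_w))
                         - (log pi(y_l) - log pi_ref(y_l))])
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp      # how much more the policy likes y_w than the reference does
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))         # -log(sigmoid(logits))

# Toy values: the policy favors the chosen answer more than the
# reference model does, so the loss drops below -log(0.5) ~= 0.693.
loss = dpo_loss(-10.0, -14.0, -11.0, -12.0)
```

In the real setting these log-probabilities are sequence log-likelihoods of the chosen/rejected completions from a preference dataset such as the one named in the diff, summed over tokens under the trained policy and a frozen reference copy of the base model.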