ENERGY-DRINK-LOVE/eeve_dpo-v3
Our Team
- Jingyeom Kim
- Youjin Chung
Model
Base Model
Hardware and Software
- Hardware: 8× NVIDIA A100 GPUs for training
- Software: DeepSpeed library & Hugging Face TRL trainer
Dataset
- DPO_dataset
- In-house DPO dataset (built using AI-Hub datasets)
- Translations of English datasets such as OpenOrca DPO (ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024, translated with our own model)
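A DPO dataset consists of preference pairs. A minimal sketch of one record, using the common `prompt`/`chosen`/`rejected` field convention from TRL (an assumption — not necessarily the exact schema used for this model):

```python
# Hypothetical example record for a DPO preference dataset.
# The prompt/chosen/rejected field names follow the common TRL
# convention; they are not confirmed as this model's actual schema.
record = {
    "prompt": "대한민국의 수도는 어디인가요?",   # "What is the capital of South Korea?"
    "chosen": "대한민국의 수도는 서울입니다.",   # preferred completion
    "rejected": "잘 모르겠습니다.",              # dispreferred completion
}

def is_valid_pair(r: dict) -> bool:
    # A usable pair needs all three non-empty string fields
    # and distinct chosen/rejected completions.
    return (
        all(isinstance(r.get(k), str) and r[k] for k in ("prompt", "chosen", "rejected"))
        and r["chosen"] != r["rejected"]
    )
```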
Training Method
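The model name indicates Direct Preference Optimization. A minimal sketch of the standard DPO objective (Rafailov et al.) for a single preference pair, given per-sequence log-probabilities — the `beta` value here is an illustrative assumption, not the authors' actual hyperparameter:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    Implicit rewards are the policy's log-prob improvements over a
    frozen reference model, scaled by beta.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): minimized when chosen outscores rejected.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At a zero margin the loss is ln 2; it falls as the policy ranks the chosen answer further above the rejected one.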
Benchmark
Ko LM Eval Harness
| Task | 0-shot | 5-shot |
|---|---|---|
| kobest_boolq | 0.950142 | 0.944444 |
| kobest_copa | 0.751 | 0.835 |
| kobest_hellaswag | 0.474 | 0.508 |
| kobest_sentineg | 0.811083 | 0.972292 |
| Average | 0.74655625 | 0.81493399 |
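The Average row is the unweighted mean over the four KoBEST tasks, which can be checked directly from the per-task scores:

```python
# Per-task scores from the Ko LM Eval Harness results above.
zero_shot = [0.950142, 0.751, 0.474, 0.811083]
five_shot = [0.944444, 0.835, 0.508, 0.972292]

avg_0 = sum(zero_shot) / len(zero_shot)  # unweighted mean, 0-shot
avg_5 = sum(five_shot) / len(five_shot)  # unweighted mean, 5-shot
```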
Ko-LLM-Leaderboard
- (7th place as of 2024-03-07)
| Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
|---|---|---|---|---|---|
| 57.97 | 57.51 | 67.01 | 56.3 | 54.86 | 54.19 |