Model
Dataset
Benchmark (KOR)
# alias
A = youjunhyeok/Llama-3-8B-slerp-262k-sft-lora-ko
B = DavidAhn/Llama-3-8B-slerp-262k
C = meta-llama/Meta-Llama-3-8B
D = chihoonlee10/T3Q-ko-solar-dpo-v7.0 (ranked 1st on the Korean leaderboard as of 24.05.24)

| Benchmark (macro_f1) | A | B | C | D |
| --- | --- | --- | --- | --- |
| kobest_boolq (0-shot) | 57.6 | 33.5 | 38.2 | 34.1 |
| kobest_boolq (5-shot) | 77.9 | 68.8 | 83.8 | 93.1 |
| kobest_copa (0-shot) | 59.9 | 58.5 | 63.1 | 81.0 |
| kobest_copa (5-shot) | 61.4 | 61.7 | 69.1 | 91.0 |
| kobest_hellaswag (0-shot) | 40.6 | 43.2 | 42.1 | 55.1 |
| kobest_hellaswag (5-shot) | 41.5 | 45.3 | 44.2 | 55.2 |
| kobest_sentineg (0-shot) | 61.1 | 34.8 | 51.5 | 82.7 |
| kobest_sentineg (5-shot) | 92.4 | 85.8 | 94.7 | 91.4 |

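
The Korean scores above are reported as macro F1. For readers unfamiliar with the metric, here is a minimal sketch, assuming the standard definition (the unweighted mean of per-class F1 scores). The function name `macro_f1` and the toy labels are hypothetical and are not taken from the evaluation code behind this table.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        # Count true positives, false positives, and false negatives for this class.
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy binary example (hypothetical labels, not benchmark data).
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
print(f"{100 * score:.1f}")  # scaled to 0-100, like the table above
```

Because every class counts equally, a model that always predicts one label on a binary task scores poorly on macro F1 even when that label is the majority class.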
Benchmark (ENG)

| Model | openbookqa | hellaswag | boolq | arc_easy | arc_challenge |
| --- | --- | --- | --- | --- | --- |
| youjunhyeok/Llama-3-8B-slerp-262k-sft-lora-ko | 0.334 | 0.575 | 0.778 | 0.763 | 0.471 |
| DavidAhn/Llama-3-8B-slerp-262k | 0.312 | 0.587 | 0.832 | 0.808 | 0.518 |
| meta-llama/Meta-Llama-3-8B-Instruct | 0.338 | 0.576 | 0.831 | 0.815 | 0.529 |