Commit 357fade, committed by kyujinpy
1 Parent(s): 8115a4c

Upload README.md

Files changed (1)
  1. README.md +16 -1
README.md CHANGED
@@ -59,8 +59,8 @@ dtype: float16
  | Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Ko-CommonGenV2 |
  | --- | --- | --- | --- | --- | --- | --- |
  | PracticeLLM/Twice-KoSOLAR-16.1B-test | NaN | NaN | NaN | NaN | NaN | NaN |
+ | [jjourney1125/M-SOLAR-10.7B-v1.0](https://huggingface.co/jjourney1125/M-SOLAR-10.7B-v1.0) | 55.15 | 49.57 | 60.12 | 54.60 | 49.23 | 62.22 |
  | [seungduk/KoSOLAR-10.7B-v0.1](https://huggingface.co/seungduk/KoSOLAR-10.7B-v0.1) | 52.40 | 47.18 | 59.54 | 52.04 | 41.84 | 61.39 |
- | [jjourney1125/M-SOLAR-10.7B-v1.0](https://huggingface.co/jjourney1125/M-SOLAR-10.7B-v1.0) | 55.15 | 49.57 | 60.12 | 54.60 | 49.23 | 62.22 |

  - Follow up as [En-link](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
  | Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
@@ -87,6 +87,19 @@ gpt2 (pretrained=PracticeLLM/Twice-KoSOLAR-16.1B-test), limit: None, provide_des
  |kobest_sentineg | 0|acc |0.7078|± |0.0229|
  | | |macro_f1|0.7071|± |0.0229|

+ gpt2 (pretrained=jjourney1125/M-SOLAR-10.7B-v1.0), limit: None, provide_description: False, num_fewshot: 0, batch_size: None
+ | Task |Version| Metric |Value | |Stderr|
+ |----------------|------:|--------|-----:|---|-----:|
+ |kobest_boolq | 0|acc |0.5228|± |0.0133|
+ | | |macro_f1|0.3788|± |0.0097|
+ |kobest_copa | 0|acc |0.6860|± |0.0147|
+ | | |macro_f1|0.6858|± |0.0147|
+ |kobest_hellaswag| 0|acc |0.4580|± |0.0223|
+ | | |acc_norm|0.5380|± |0.0223|
+ | | |macro_f1|0.4552|± |0.0222|
+ |kobest_sentineg | 0|acc |0.6474|± |0.0240|
+ | | |macro_f1|0.6012|± |0.0257|
+
  gpt2 (pretrained=yanolja/KoSOLAR-10.7B-v0.1), limit: None, provide_description: False, num_fewshot: 0, batch_size: None
  | Task |Version| Metric |Value | |Stderr|
  |----------------|------:|--------|-----:|---|-----:|
@@ -99,6 +112,8 @@ gpt2 (pretrained=yanolja/KoSOLAR-10.7B-v0.1), limit: None, provide_description:
  | | |macro_f1|0.4296|± |0.0221|
  |kobest_sentineg | 0|acc |0.7506|± |0.0217|
  | | |macro_f1|0.7505|± |0.0217|
+
+
  ```

  - Follow up as [Eleuther/LM-Harness](https://github.com/EleutherAI/lm-evaluation-harness)
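
Review note: the `Average` column in the first table (the one with `Ko-CommonGenV2`) appears to be the unweighted mean of the five benchmark scores in each row. That is an assumption, not something the README states, but it reproduces the reported values for both scored rows. A minimal Python sketch of the check follows; the names `rows`, `mean_score`, and `check_averages` are made up for this illustration.

```python
# Scores copied from the two scored rows of the first table in this diff.
# Each entry maps model name -> (reported Average, [ARC, HellaSwag, MMLU,
# TruthfulQA, Ko-CommonGenV2]).
rows = {
    "jjourney1125/M-SOLAR-10.7B-v1.0": (55.15, [49.57, 60.12, 54.60, 49.23, 62.22]),
    "seungduk/KoSOLAR-10.7B-v0.1": (52.40, [47.18, 59.54, 52.04, 41.84, 61.39]),
}


def mean_score(scores):
    """Unweighted mean of the benchmark scores (assumed averaging rule)."""
    return sum(scores) / len(scores)


def check_averages(table, tol=0.01):
    """Return {model: True/False} for whether the reported Average matches
    the recomputed mean within `tol` (rounding slack for two decimals)."""
    return {
        name: abs(mean_score(scores) - reported) <= tol
        for name, (reported, scores) in table.items()
    }


if __name__ == "__main__":
    for name, ok in check_averages(rows).items():
        print(name, "consistent" if ok else "mismatch")
```

The `PracticeLLM/Twice-KoSOLAR-16.1B-test` row is left out because all of its entries are `NaN`.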