beomi committed on
Commit e714105
1 Parent(s): b2d2c6b
Files changed (1)
  1. README.md +33 -26
README.md CHANGED
@@ -103,32 +103,39 @@ Model evaluation metrics and results.
 
  ### Benchmark Results
 
- | Evaluation | Metric | Shots | 7b |
- |-----------------------|------------------------|-------|--------|
- | Default Metric | ACC | | |
- | Knowledge (5-shot) | MMLU | | 61.76 |
- | | KMMLU | | 42.75 |
- | | CMLU | | 50.93 |
- | | JMLU | | - |
- | | C-Eval | | 50.07 |
- | | HAERAE (0-shot) | | 63.89 |
- | KOBest (5-shot) | BoolQ | | 85.47 |
- | | COPA | | 83.5 |
- | | Hellaswag (acc-norm) | | 63.2 |
- | | Sentineg | | 97.98 |
- | | WiC | | 70.95 |
- | JP Eval Harness | JcommonsenseQA | 3-shot| 85.97 |
- | (Prompt ver 0.3) | JNLI | 3-shot| 39.11 |
- | | MARC_JA | 3-shot| 96.48 |
- | | JSQUAD | 2-shot| 70.69 |
- | | JAQKET | 1-shot| 81.53 |
- | | MGSM | 5-shot| 28.8 |
- | XWinograd (5-shot) | EN | | 90.71 |
- | | FR | | 80.72 |
- | | JP | | 84.15 |
- | | PT | | 80.99 |
- | | RU | | 76.51 |
- | | ZH | | 76.98
+ | Category | Metric | Shots | 7b |
+ |----------------------------------|----------------------|------------|--------|
+ | **Default Metric** | **ACC** | | |
+ | **Knowledge (5-shot)** | MMLU | | 61.76 |
+ | | KMMLU | | 42.75 |
+ | | CMLU | | 50.93 |
+ | | JMLU | | |
+ | | C-EVAL | | 50.07 |
+ | | HAERAE (0-shot) | | 63.89 |
+ | **KoBest (5-shot)** | BoolQ | | 85.47 |
+ | | COPA | | 83.5 |
+ | | Hellaswag (acc-norm) | | 63.2 |
+ | | Sentineg | | 97.98 |
+ | | WiC | | 70.95 |
+ | **JP Eval Harness (Prompt ver 0.3)** | JcommonsenseQA | 3-shot | 85.97 |
+ | | JNLI | 3-shot | 39.11 |
+ | | Marc_ja | 3-shot | 96.48 |
+ | | JSquad | 2-shot | 70.69 |
+ | | Jaqket | 1-shot | 81.53 |
+ | | MGSM | 5-shot | 28.8 |
+ | **XWinograd (5-shot)** | EN | | 90.71 |
+ | | FR | | 80.72 |
+ | | JP | | 84.15 |
+ | | PT | | 80.99 |
+ | | RU | | 76.51 |
+ | | ZH | | 76.98 |
+ | **XCOPA (5-shot)** | IT | | 72.8 |
+ | | ID | | 76.4 |
+ | | TH | | 60.2 |
+ | | TR | | 65.6 |
+ | | VI | | 77.2 |
+ | | ZH | | 80.2 |
+
 
 
  ## Usage and Limitations