Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -131,15 +131,15 @@ Model evaluation metrics and results.
 
 ### Benchmark Results
 
-| Category | Metric | Shots | 7b |
+| Category | Metric | Shots | Score |
 |----------------------------------|----------------------|------------|--------|
 | **Default Metric** | **ACC** | | |
 | **Knowledge (5-shot)** | MMLU | | 61.76 |
-| | KMMLU | | 42.75 |
+| | KMMLU (Exact Match) | | 42.75 |
 | | CMLU | | 50.93 |
 | | JMLU | | |
 | | C-EVAL | | 50.07 |
-| | HAERAE (0-shot) | | 63.89 |
+| | HAERAE | 0-shot | 63.89 |
 | **KoBest (5-shot)** | BoolQ | | 85.47 |
 | | COPA | | 83.5 |
 | | Hellaswag (acc-norm) | | 63.2 |
@@ -154,8 +154,8 @@ Model evaluation metrics and results.
 | **JP Eval Harness (Prompt ver 0.3)** | JcommonsenseQA | 3-shot | 85.97 |
 | | JNLI | 3-shot | 39.11 |
 | | Marc_ja | 3-shot | 96.48 |
-| | JSquad | 2-shot | 70.69 |
-| | Jaqket | 1-shot | 81.53 |
+| | JSquad (Exact Match) | 2-shot | 70.69 |
+| | Jaqket (Exact Match) | 1-shot | 81.53 |
 | | MGSM | 5-shot | 28.8 |
 | **XWinograd (0-shot)** | EN | | 89.03 |
 | | FR | | 72.29 |