Update README.md
README.md CHANGED
@@ -55,34 +55,13 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 I encourage you to provide feedback on the model's performance. If you'd like to create your own quantizations, feel free to do so and let me know how it works for you!
 
-
-
-
-
-
-
-
-
-
-- name: "BBH"
-  dataset_name: "BBH"
-  metric_type: "acc_norm"
-  value: 31.81
-- name: "MATH Lvl 5"
-  dataset_name: "MATH Lvl 5"
-  metric_type: "exact_match"
-  value: 9.67
-- name: "GPQA"
-  dataset_name: "GPQA"
-  metric_type: "acc_norm"
-  value: 8.5
-- name: "MUSR"
-  dataset_name: "MuSR"
-  metric_type: "acc_norm"
-  value: 11.38
-- name: "MMLU-PRO"
-  dataset_name: "MMLU-PRO"
-  metric_type: "acc"
-  value: 27.34
-
+| Metric     | Dataset Name | Metric Type                                    | Value |
+|------------|--------------|------------------------------------------------|-------|
+| Average ⬆️ | Average      | average_score                                  | 19.46 |
+| IFEval     | IFEval       | inst_level_strict_acc, prompt_level_strict_acc | 28.06 |
+| BBH        | BBH          | acc_norm                                       | 31.81 |
+| MATH Lvl 5 | MATH Lvl 5   | exact_match                                    | 9.67  |
+| GPQA       | GPQA         | acc_norm                                       | 8.5   |
+| MUSR       | MuSR         | acc_norm                                       | 11.38 |
+| MMLU-PRO   | MMLU-PRO     | acc                                            | 27.34 |
 