Adding Evaluation Results

#4
Files changed (1)
  1. README.md +31 -0
README.md CHANGED
@@ -18,6 +18,9 @@ model-index:
  args:
  num_few_shot: 25
  metrics:
+ - type: acc_norm
+ value: 60.67
+ name: normalized accuracy
  - type: acc_norm
  value: 60.67
  name: normalized accuracy
@@ -34,6 +37,9 @@ model-index:
  args:
  num_few_shot: 10
  metrics:
+ - type: acc_norm
+ value: 81.6
+ name: normalized accuracy
  - type: acc_norm
  value: 81.6
  name: normalized accuracy
@@ -51,6 +57,9 @@ model-index:
  args:
  num_few_shot: 5
  metrics:
+ - type: acc
+ value: 68.12
+ name: accuracy
  - type: acc
  value: 68.12
  name: accuracy
@@ -68,6 +77,8 @@ model-index:
  args:
  num_few_shot: 0
  metrics:
+ - type: mc2
+ value: 51.69
  - type: mc2
  value: 51.69
  source:
@@ -84,6 +95,9 @@ model-index:
  args:
  num_few_shot: 5
  metrics:
+ - type: acc
+ value: 76.56
+ name: accuracy
  - type: acc
  value: 76.56
  name: accuracy
@@ -101,6 +115,9 @@ model-index:
  args:
  num_few_shot: 5
  metrics:
+ - type: acc
+ value: 69.45
+ name: accuracy
  - type: acc
  value: 69.45
  name: accuracy
@@ -200,3 +217,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
  |Winogrande (5-shot) |76.56|
  |GSM8k (5-shot) |69.45|
 
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_xxx777xxxASD__L3-SnowStorm-v1.15-4x8B-B)
+
+ | Metric |Value|
+ |---------------------------------|----:|
+ |Avg. |68.01|
+ |AI2 Reasoning Challenge (25-Shot)|60.67|
+ |HellaSwag (10-Shot) |81.60|
+ |MMLU (5-Shot) |68.12|
+ |TruthfulQA (0-shot) |51.69|
+ |Winogrande (5-shot) |76.56|
+ |GSM8k (5-shot) |69.45|
+
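
The Avg. row in the added table appears to be the unweighted mean of the six benchmark scores. The following is a minimal sketch for checking that, assuming simple averaging (the diff itself does not state how the leaderboard aggregates). The names `scores` and `reported_avg` are illustrative and do not come from the PR.

```python
# Benchmark scores copied from the results table in the diff.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 60.67,
    "HellaSwag (10-Shot)": 81.60,
    "MMLU (5-Shot)": 68.12,
    "TruthfulQA (0-shot)": 51.69,
    "Winogrande (5-shot)": 76.56,
    "GSM8k (5-shot)": 69.45,
}

# Assumption: the leaderboard average is a plain arithmetic mean.
mean = sum(scores.values()) / len(scores)

# Value from the "Avg." row of the table.
reported_avg = 68.01

# The table values are already rounded to two decimals, so compare
# with a small tolerance rather than expecting exact equality.
assert abs(mean - reported_avg) < 0.01, (mean, reported_avg)
print(f"mean of table values: {mean:.3f}, reported: {reported_avg}")
```

The tolerance is needed because the mean of the rounded table values sits on a rounding boundary, so the reported figure was presumably computed from the unrounded scores.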