barthfab committed on
Commit
dc61e7c
1 Parent(s): 6195be9

clean tables

- 5 lang table
- repo_id only as label
- avg. as first column
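
The relabeling and column reordering listed above could be reproduced with a small helper like the one below. This is only a sketch: the function names `short_label` and `move_avg_first` are hypothetical, and the use of `str.capitalize()` is an assumption inferred from the new row labels in the diff (for example, `mistralai/Mistral-7B-Instruct-v0.2` becomes `Mistral-7b-instruct-v0.2`), not something the commit states.

```python
def short_label(repo_id: str) -> str:
    """Drop the organisation prefix and capitalise the repo name.

    Assumption: the labels in the new tables look like the result of
    str.capitalize(), which upper-cases the first character and
    lower-cases the rest.
    """
    return repo_id.split("/")[-1].capitalize()


def move_avg_first(columns: list[str]) -> list[str]:
    """Return the column names with 'avg' moved to the front."""
    return ["avg"] + [c for c in columns if c != "avg"]


old_columns = ["arc_challenge", "belebele", "hellaswag", "mmlu", "truthfulqa", "avg"]
new_columns = move_avg_first(old_columns)
label = short_label("mistralai/Mistral-7B-Instruct-v0.2")
```

Applied to the old English header, `move_avg_first` gives the column order used in the new tables.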

Files changed (1)
  1. README.md +31 -19
README.md CHANGED
@@ -92,30 +92,42 @@ Currently, we are working on more suitable benchmarks for Spanish, French, Germa
 <details>
 <summary>Evaluation results</summary>

- ### English

- | | arc_challenge | belebele | hellaswag | mmlu | truthfulqa | avg |
- |:-------------------------------------|----------------:|-----------:|------------:|---------:|-------------:|---------:|
- | occiglot/occiglot-7b-eu5 | 0.530717 | 0.726667 | 0.789882 | 0.531904 | 0.403678 | 0.59657 |
- | occiglot/occiglot-7b-eu5-instruct | 0.558874 | 0.746667 | 0.799841 | 0.535109 | 0.449034 | 0.617905 |
- | occiglot/occiglot-7b-it-en | 0.580205 | 0.774444 | 0.804222 | 0.578977 | 0.412786 | 0.630127 |
- | occiglot/occiglot-7b-it-en-instruct | 0.609215 | 0.82 | 0.809301 | 0.578835 | 0.479562 | 0.659383 |
- | galatolo/cerbero-7b | 0.613481 | 0.827778 | 0.810396 | 0.600484 | 0.480911 | 0.66661 |
- | mistralai/Mistral-7B-v0.1 | 0.612628 | 0.844444 | 0.834097 | 0.624555 | 0.426201 | 0.668385 |
- | mistralai/Mistral-7B-Instruct-v0.2 | 0.637372 | 0.824444 | 0.846345 | 0.59201 | 0.668116 | 0.713657 |


 ### Italian

- | | arc_challenge_it | belebele_it | hellaswag_it | mmlu_it | truthfulqa_it | avg |
- |:-------------------------------------|-------------------:|--------------:|---------------:|----------:|----------------:|---------:|
- | occiglot/occiglot-7b-eu5 | 0.501283 | 0.652222 | 0.700533 | 0 | 0.252874 | 0.421382 |
- | occiglot/occiglot-7b-eu5-instruct | 0.516681 | 0.661111 | 0.71326 | 0 | 0.295019 | 0.437214 |
- | occiglot/occiglot-7b-it-en | 0.536356 | 0.684444 | 0.694768 | 0 | 0.247765 | 0.432667 |
- | occiglot/occiglot-7b-it-en-instruct | 0.545766 | 0.717778 | 0.713804 | 0 | 0.303959 | 0.456261 |
- | galatolo/cerbero-7b | 0.522669 | 0.717778 | 0.631567 | 0 | 0.302682 | 0.434939 |
- | mistralai/Mistral-7B-v0.1 | 0.502139 | 0.734444 | 0.630371 | 0 | 0.264368 | 0.426264 |
- | mistralai/Mistral-7B-Instruct-v0.2 | 0.519247 | 0.703333 | 0.6394 | 0 | 0.349936 | 0.442383 |


 </details>
 
 <details>
 <summary>Evaluation results</summary>

+ ### All 5 Languages
+
+ | | avg | arc_challenge | belebele | hellaswag | mmlu | truthfulqa |
+ |:---------------------------|---------:|----------------:|-----------:|------------:|---------:|-------------:|
+ | Occiglot-7b-eu5 | 0.516895 | 0.508109 | 0.675556 | 0.718963 | 0.402064 | 0.279782 |
+ | Occiglot-7b-eu5-instruct | 0.537799 | 0.53632 | 0.691111 | 0.731918 | 0.405198 | 0.32445 |
+ | Occiglot-7b-it-en | 0.513221 | 0.500564 | 0.694444 | 0.668099 | 0.413528 | 0.289469 |
+ | Occiglot-7b-it-en-instruct | 0.53721 | 0.523128 | 0.726667 | 0.683414 | 0.414918 | 0.337927 |
+ | Cerbero-7b | 0.532385 | 0.513714 | 0.743111 | 0.654061 | 0.427566 | 0.323475 |
+ | Mistral-7b-v0.1 | 0.547111 | 0.528937 | 0.768444 | 0.682516 | 0.448253 | 0.307403 |
+ | Mistral-7b-instruct-v0.2 | 0.56713 | 0.547228 | 0.741111 | 0.69455 | 0.422501 | 0.430262 |


+ ### English
+
+ | | avg | arc_challenge | belebele | hellaswag | mmlu | truthfulqa |
+ |:---------------------------|---------:|----------------:|-----------:|------------:|---------:|-------------:|
+ | Occiglot-7b-eu5 | 0.59657 | 0.530717 | 0.726667 | 0.789882 | 0.531904 | 0.403678 |
+ | Occiglot-7b-eu5-instruct | 0.617905 | 0.558874 | 0.746667 | 0.799841 | 0.535109 | 0.449 |
+ | Occiglot-7b-it-en | 0.630127 | 0.580205 | 0.774444 | 0.804222 | 0.578977 | 0.412786 |
+ | Occiglot-7b-it-en-instruct | 0.659383 | 0.609215 | 0.82 | 0.809301 | 0.578835 | 0.479562 |
+ | Cerbero-7b | 0.66661 | 0.613481 | 0.827778 | 0.810396 | 0.600484 | 0.480911 |
+ | Mistral-7b-v0.1 | 0.668385 | 0.612628 | 0.844444 | 0.834097 | 0.624555 | 0.426201 |
+ | Mistral-7b-instruct-v0.2 | 0.713657 | 0.637372 | 0.824444 | 0.846345 | 0.59201 | 0.668116 |

 ### Italian

+ | | avg | arc_challenge_it | belebele_it | hellaswag_it | mmlu_it | truthfulqa_it |
+ |:---------------------------|---------:|-------------------:|--------------:|---------------:|----------:|----------------:|
+ | Occiglot-7b-eu5 | 0.421382 | 0.501283 | 0.652222 | 0.700533 | 0 | 0.252874 |
+ | Occiglot-7b-eu5-instruct | 0.437214 | 0.516681 | 0.661111 | 0.71326 | 0 | 0.295019 |
+ | Occiglot-7b-it-en | 0.432667 | 0.536356 | 0.684444 | 0.694768 | 0 | 0.247765 |
+ | Occiglot-7b-it-en-instruct | 0.456261 | 0.545766 | 0.717778 | 0.713804 | 0 | 0.303959 |
+ | Cerbero-7b | 0.434939 | 0.522669 | 0.717778 | 0.631567 | 0 | 0.302682 |
+ | Mistral-7b-v0.1 | 0.426264 | 0.502139 | 0.734444 | 0.630371 | 0 | 0.264368 |
+ | Mistral-7b-instruct-v0.2 | 0.442383 | 0.519247 | 0.703333 | 0.6394 | 0 | 0.349936 |


 </details>
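
The `avg` column in these tables appears to be the unweighted mean of the five benchmark columns, with the `0` reported for `mmlu_it` counted as a score rather than skipped. A minimal check on the Occiglot-7b-eu5 rows, where `mean_score` is a hypothetical helper and the numbers are copied from the English and Italian tables:

```python
def mean_score(scores: list[float]) -> float:
    """Unweighted arithmetic mean of the per-benchmark scores."""
    return sum(scores) / len(scores)


# Occiglot-7b-eu5: arc_challenge, belebele, hellaswag, mmlu, truthfulqa
english_eu5 = [0.530717, 0.726667, 0.789882, 0.531904, 0.403678]
italian_eu5 = [0.501283, 0.652222, 0.700533, 0.0, 0.252874]

english_avg = round(mean_score(english_eu5), 6)  # table reports 0.59657
italian_avg = round(mean_score(italian_eu5), 6)  # table reports 0.421382
```

Because the zero is included, the Italian averages sit well below what the four non-zero benchmarks alone would give, so they are best compared only against other rows of the same table.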