migtissera leaderboard-pr-bot committed on
Commit
fc008ff
1 Parent(s): 41a2e61

Adding Evaluation Results (#5)

- Adding Evaluation Results (e7ba518c950f23bf3e7d5c72da40c7a2eecebb1c)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -190,4 +190,17 @@ Once the rocket has reached the necessary velocity, it must also have sufficient
 
  Overall, launching a rocket into LEO is a complex process that involves careful planning, preparation, and execution. Achieving the necessary velocity and maintaining the rocket's orbit requires a high level of technical expertise and precision.
 
- ```
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_migtissera__Synthia-13B)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 48.56 |
+ | ARC (25-shot) | 59.98 |
+ | HellaSwag (10-shot) | 81.86 |
+ | MMLU (5-shot) | 56.11 |
+ | TruthfulQA (0-shot) | 47.41 |
+ | Winogrande (5-shot) | 76.09 |
+ | GSM8K (5-shot) | 10.99 |
+ | DROP (3-shot) | 7.45 |
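
The `Avg.` row in the added table appears to be the unweighted mean of the seven benchmark scores listed beneath it. The commit does not state how the average is computed, so equal weighting is an assumption here. A minimal Python sketch to check the arithmetic (the names `scores` and `mean_score` are illustrative and are not part of the commit or of any leaderboard tooling):

```python
# Scores copied from the table added in this commit.
scores = {
    "ARC (25-shot)": 59.98,
    "HellaSwag (10-shot)": 81.86,
    "MMLU (5-shot)": 56.11,
    "TruthfulQA (0-shot)": 47.41,
    "Winogrande (5-shot)": 76.09,
    "GSM8K (5-shot)": 10.99,
    "DROP (3-shot)": 7.45,
}


def mean_score(values):
    """Return the unweighted arithmetic mean (equal weighting is assumed)."""
    values = list(values)
    return sum(values) / len(values)


average = mean_score(scores.values())
print(f"{average:.2f}")  # prints 48.56, matching the Avg. row
```

Under that assumption the seven scores sum to 339.89, and 339.89 / 7 rounds to the reported 48.56.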