Weyaxi leaderboard-pr-bot committed on
Commit
26b30a0
1 Parent(s): f8988da

Adding Evaluation Results (#1)

- Adding Evaluation Results (7f9eda7298f5127fded5a3cc0ec40731beb49acf)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -21,4 +21,17 @@ Samantha-Nebula-7B is a merge of [ehartford/samantha-mistral-7b](https://hugging
  | ARC (25-shot) | |
  | HellaSwag (10-shot) | |
  | MMLU (5-shot) | |
- | TruthfulQA (0-shot) | |
+ | TruthfulQA (0-shot) | |
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__Samantha-Nebula-7B)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 52.87 |
+ | ARC (25-shot) | 57.0 |
+ | HellaSwag (10-shot) | 82.25 |
+ | MMLU (5-shot) | 54.21 |
+ | TruthfulQA (0-shot) | 49.58 |
+ | Winogrande (5-shot) | 73.09 |
+ | GSM8K (5-shot) | 11.37 |
+ | DROP (3-shot) | 42.57 |
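
The `Avg.` row is consistent with an unweighted mean of the seven benchmark scores in the added table. The sketch below checks that arithmetic. The aggregation rule is an assumption, since the commit does not state how the average is computed, and `leaderboard_average` is a hypothetical helper name, not part of any leaderboard tooling.

```python
from statistics import mean

# Scores copied from the table added to README.md in this commit.
scores = {
    "ARC (25-shot)": 57.0,
    "HellaSwag (10-shot)": 82.25,
    "MMLU (5-shot)": 54.21,
    "TruthfulQA (0-shot)": 49.58,
    "Winogrande (5-shot)": 73.09,
    "GSM8K (5-shot)": 11.37,
    "DROP (3-shot)": 42.57,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals (assumed aggregation rule)."""
    return round(mean(values), 2)


# Compare the recomputed mean against the reported Avg. row.
average = leaderboard_average(list(scores.values()))
print(f"Recomputed average: {average}")
```

Under that assumption the recomputed value agrees with the reported 52.87.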