leaderboard-pr-bot committed on
Commit 86c03ac
1 Parent(s): c673387

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -55,4 +55,17 @@ What is the best way to train a dolphin to obey me? Please answer step by step.
 
   ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/xnz5M1lYd4oGVATSDRkQ-.png)
 
- [Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+ [Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ehartford__dolphin-2.0-mistral-7b)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 55.85 |
+ | ARC (25-shot) | 59.22 |
+ | HellaSwag (10-shot) | 80.26 |
+ | MMLU (5-shot) | 56.9 |
+ | TruthfulQA (0-shot) | 61.09 |
+ | Winogrande (5-shot) | 75.37 |
+ | GSM8K (5-shot) | 18.65 |
+ | DROP (3-shot) | 39.49 |
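
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores in the table. The PR does not say how the leaderboard aggregates, so the following is only a minimal sanity-check sketch that assumes plain arithmetic averaging. The names `scores` and `mean_score` are hypothetical and are not part of any leaderboard tooling.

```python
# Scores copied from the table added in this PR.
scores = {
    "ARC (25-shot)": 59.22,
    "HellaSwag (10-shot)": 80.26,
    "MMLU (5-shot)": 56.9,
    "TruthfulQA (0-shot)": 61.09,
    "Winogrande (5-shot)": 75.37,
    "GSM8K (5-shot)": 18.65,
    "DROP (3-shot)": 39.49,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals.

    Assumption: the leaderboard's Avg. is a simple mean of all listed
    benchmarks; this is not confirmed by the PR itself.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(average)
```

Under that assumption, the result can be compared directly with the reported `Avg.` value of 55.85.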