leaderboard-pr-bot committed on
Commit a6ff2f0 (1 parent: 16d36cd)

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -90,4 +90,17 @@ ASSISTANT: To help your vehicle start, I will guide you through a step-by-step p
  By following these steps, you should be able to diagnose and potentially fix the issue causing your car to not start. However, if after going through these checks and still having trouble, it is recommended to seek assistance from a qualified mechanic.
  ```
 
- [Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+ [Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ehartford__dolphin-llama2-7b)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 41.88 |
+ | ARC (25-shot) | 46.59 |
+ | HellaSwag (10-shot) | 67.52 |
+ | MMLU (5-shot) | 48.37 |
+ | TruthfulQA (0-shot) | 49.72 |
+ | Winogrande (5-shot) | 63.77 |
+ | GSM8K (5-shot) | 5.69 |
+ | DROP (3-shot) | 11.53 |
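
Reviewer note: the `Avg.` row in the added table is consistent with an unweighted mean of the seven benchmark scores. The Python sketch below is only a sanity check under that assumption; the PR does not state how the leaderboard computes the average, and the `scores` dictionary is a hypothetical name that simply copies the table values.

```python
from statistics import mean

# Scores copied from the table added in this PR.
# Assumption: "Avg." is the plain (unweighted) mean of these seven values.
scores = {
    "ARC (25-shot)": 46.59,
    "HellaSwag (10-shot)": 67.52,
    "MMLU (5-shot)": 48.37,
    "TruthfulQA (0-shot)": 49.72,
    "Winogrande (5-shot)": 63.77,
    "GSM8K (5-shot)": 5.69,
    "DROP (3-shot)": 11.53,
}

# Round to two decimals to match the precision shown in the table.
avg = round(mean(scores.values()), 2)
print(avg)
```

Under that assumption the result matches the reported `Avg.` value of 41.88, so the table is internally consistent.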