ehartford leaderboard-pr-bot committed on
Commit 412a040
1 Parent(s): 16d36cd

Adding Evaluation Results (#6)


- Adding Evaluation Results (a6ff2f09c14e69d58d4711e69e80199b8f377314)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -90,4 +90,17 @@ ASSISTANT: To help your vehicle start, I will guide you through a step-by-step p
  By following these steps, you should be able to diagnose and potentially fix the issue causing your car to not start. However, if after going through these checks and still having trouble, it is recommended to seek assistance from a qualified mechanic.
  ```
 
- [Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+ [Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ehartford__dolphin-llama2-7b)
+
+ | Metric                | Value                     |
+ |-----------------------|---------------------------|
+ | Avg.                  | 41.88                     |
+ | ARC (25-shot)         | 46.59                     |
+ | HellaSwag (10-shot)   | 67.52                     |
+ | MMLU (5-shot)         | 48.37                     |
+ | TruthfulQA (0-shot)   | 49.72                     |
+ | Winogrande (5-shot)   | 63.77                     |
+ | GSM8K (5-shot)        | 5.69                      |
+ | DROP (3-shot)         | 11.53                     |
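
The `Avg.` row in the table above is consistent with an unweighted mean of the seven benchmark scores. The commit does not state how the average is computed, so the following is only a minimal sketch under that assumption. The names `scores`, `mean_score`, and `reported_avg` are illustrative and not part of the leaderboard tooling.

```python
# Scores copied from the table above (benchmark -> value).
scores = {
    "ARC (25-shot)": 46.59,
    "HellaSwag (10-shot)": 67.52,
    "MMLU (5-shot)": 48.37,
    "TruthfulQA (0-shot)": 49.72,
    "Winogrande (5-shot)": 63.77,
    "GSM8K (5-shot)": 5.69,
    "DROP (3-shot)": 11.53,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals (assumed convention)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


# Value reported in the "Avg." row of the table.
reported_avg = 41.88
computed_avg = mean_score(scores.values())

# Compare with a small tolerance rather than exact float equality.
matches = abs(computed_avg - reported_avg) < 0.005
print(f"computed={computed_avg} reported={reported_avg} matches={matches}")
```

Under this assumption the low GSM8K and DROP scores carry the same weight as the other five benchmarks, which is why the average sits below most of the individual values.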