Adding Evaluation Results

#1
Files changed (1): README.md (+14 −0)
@@ -129,3 +129,17 @@ hf-causal-experimental (pretrained=teknium/OpenHermes-7B,dtype=float16), limit:
 ## Training procedure

 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/Vzy7Z4Qcwj4hGJcQ2BT20.png)
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_teknium__OpenHermes-7B)
+
+ | Metric              | Value |
+ |---------------------|-------|
+ | Avg.                | 48.76 |
+ | ARC (25-shot)       | 56.14 |
+ | HellaSwag (10-shot) | 78.32 |
+ | MMLU (5-shot)       | 48.62 |
+ | TruthfulQA (0-shot) | 45.0  |
+ | Winogrande (5-shot) | 74.51 |
+ | GSM8K (5-shot)      | 5.0   |
+ | DROP (3-shot)       | 33.7  |
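As a sanity check on the table above, the "Avg." row is consistent with the unweighted mean of the seven per-task scores. A minimal sketch (the score values are taken from the table; everything else is illustrative):

```python
# Per-task scores from the leaderboard table above
scores = {
    "ARC (25-shot)": 56.14,
    "HellaSwag (10-shot)": 78.32,
    "MMLU (5-shot)": 48.62,
    "TruthfulQA (0-shot)": 45.0,
    "Winogrande (5-shot)": 74.51,
    "GSM8K (5-shot)": 5.0,
    "DROP (3-shot)": 33.7,
}

# Unweighted mean, rounded to two decimals as reported
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 48.76, matching the "Avg." row
```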