add eval results
README.md
CHANGED
@@ -91,4 +91,17 @@ name: Llama-3-8B-Ultra-Instruct
 {input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
 
 {output}<|eot_id|>
-```
+```
+
+### [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_elinas__Llama-3-8B-Ultra-Instruct)
+
+|Metric                           |Value|
+|---------------------------------|----:|
+|Avg.                             |69.11|
+|AI2 Reasoning Challenge (25-Shot)|64.59|
+|HellaSwag (10-Shot)              |81.63|
+|MMLU (5-Shot)                    |68.32|
+|TruthfulQA (0-shot)              |52.80|
+|Winogrande (5-shot)              |76.95|
+|GSM8k (5-shot)                   |70.36|
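The `{input}`/`{output}` slots in the template context above follow the Llama 3 instruct format, where each turn is closed with `<|eot_id|>` and the assistant turn is introduced by a header token pair. A minimal sketch of filling the fragment shown in the diff with plain string formatting (the example input/output strings are illustrative, not from the model card):

```python
# Sketch of filling the Llama 3 instruct template fragment from the diff.
# Only the {input}/{output} portion appears above; the preceding system/user
# header lines of the full template are omitted here.
TEMPLATE = (
    "{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    "{output}<|eot_id|>"
)

prompt = TEMPLATE.format(
    input="What is the capital of France?",  # hypothetical user turn
    output="The capital of France is Paris.",  # hypothetical assistant turn
)
print(prompt)
```

In practice a chat-template-aware tokenizer (e.g. `tokenizer.apply_chat_template` in `transformers`) would render these tokens for you; the sketch only illustrates where the special tokens sit.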
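The `Avg.` row in the table is the unweighted mean of the six benchmark scores. A quick check that the listed values are consistent:

```python
# Verify that the leaderboard Avg. (69.11) is the unweighted mean of the
# six benchmark scores listed in the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 64.59,
    "HellaSwag (10-Shot)": 81.63,
    "MMLU (5-Shot)": 68.32,
    "TruthfulQA (0-shot)": 52.80,
    "Winogrande (5-shot)": 76.95,
    "GSM8k (5-shot)": 70.36,
}
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 69.11
```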