Update README.md
README.md
CHANGED
@@ -112,3 +112,5 @@ The following data has been re-evaluated and calculated as the average for each
 | TruthfulQA | **64.35** | 53.25 | 62.67 | 61.04 | 59.09 | 57.8 | 56.75 |
 | BBH | **49.48** | 44.87 | 48.86 | 48.47 | 48.30 | 48.19 | 47.93 |
 | GPQA | 31.98 | 29.50 | 32.25 | 32.38 | **32.61** | 31.14 | 30.6 |
+
+The script used for evaluation is in this repository at `/eval.sh`; you can also view it [here](https://huggingface.co/huihui-ai/Llama-3.1-8B-Fusion-6040/blob/main/eval.sh).