cstr committed on
Commit
cb0fc74
1 Parent(s): cf0fab9

Update README.md

  1. README.md +15 -0
README.md CHANGED
@@ -40,6 +40,21 @@ It achieves (running quantized) in
  - German EQ Bench: Score (v2_de): 62.59 (Parseable: 171.0).
  - English EQ Bench: Score (v2): 76.43 (Parseable: 171.0).
 
+ [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard):
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_cstr__Spaetzle-v69-7b)
+
+ | Metric |Value|
+ |---------------------------------|----:|
+ |Avg. |72.87|
+ |AI2 Reasoning Challenge (25-Shot)|69.54|
+ |HellaSwag (10-Shot) |86.77|
+ |MMLU (5-Shot) |64.63|
+ |TruthfulQA (0-shot) |65.61|
+ |Winogrande (5-shot) |81.93|
+ |GSM8k (5-shot) |68.76|
+
+ Nous benchmark results:
+
  | Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
  |--------------------------------------------------------------|------:|------:|---------:|-------:|------:|
  |[Spaetzle-v69-7b](https://huggingface.co/cstr/Spaetzle-v69-7b)| 44.48| 75.84| 66.15| 46.59| 58.27|
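
As a quick sanity check on the tables in this change, the sketch below recomputes both averages. It assumes each reported average is the unweighted mean of the scores listed in its table; the README does not say how the averages were computed. The names `open_llm_scores`, `nous_scores`, and `matches_reported` are hypothetical and exist only for this illustration.

```python
from statistics import mean

# Scores copied from the Open LLM Leaderboard table added in this commit.
open_llm_scores = {
    "AI2 Reasoning Challenge (25-Shot)": 69.54,
    "HellaSwag (10-Shot)": 86.77,
    "MMLU (5-Shot)": 64.63,
    "TruthfulQA (0-shot)": 65.61,
    "Winogrande (5-shot)": 81.93,
    "GSM8k (5-shot)": 68.76,
}

# Scores copied from the Nous benchmark table (unchanged context lines).
nous_scores = {
    "AGIEval": 44.48,
    "GPT4All": 75.84,
    "TruthfulQA": 66.15,
    "Bigbench": 46.59,
}


def matches_reported(scores, reported, tol=0.01):
    """Return True if the unweighted mean of `scores` is within `tol` of `reported`.

    Assumption: the reported figure is a plain mean rounded to two decimals,
    so a tolerance of 0.01 absorbs the rounding.
    """
    return abs(mean(list(scores.values())) - reported) < tol


if __name__ == "__main__":
    print("Open LLM Leaderboard Avg. consistent:", matches_reported(open_llm_scores, 72.87))
    print("Nous Average consistent:", matches_reported(nous_scores, 58.27))
```

A tolerance is used instead of an exact comparison because the published values are rounded to two decimals and floating-point rounding of a value such as 58.265 is not reliable.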