ChuckMcSneed committed on
Commit
d1522d8
1 Parent(s): 00a78d5

Upload README.md

Files changed (1)
  1. README.md +11 -1
README.md CHANGED
@@ -68,4 +68,14 @@ Alpaca.
  | D | 1 | 1 |0 |1 |1 |0.5|3|
  | S | 5 | 6.75 |7.25 |7.25 |6.75 |6.5|7.25|
  | P | 6 | 4.75 |4.25 |5.25 |5.25 |5.5|5|
- | Total | 17 | 16.5 |14.5 |18.5 |16.5 |16|18.25|
+ | Total | 17 | 16.5 |14.5 |18.5 |16.5 |16|18.25|
+
+ ## Open LLM leaderboard
+ [Leaderboard on Huggingface](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ |Model                                  |Average|ARC  |HellaSwag|MMLU |TruthfulQA|Winogrande|GSM8K|
+ |---------------------------------------|-------|-----|---------|-----|----------|----------|-----|
+ |PMaxxxer-v1-70b                        |72.41  |71.08|87.88    |70.39|59.77     |82.64     |62.7 |
+ |SMaxxxer-v1-70b                        |72.23  |70.65|88.02    |70.55|60.7      |82.87     |60.58|
+ |Difference                             |0.18   |0.43 |-0.14    |-0.16|-0.93     |-0.23     |2.12 |
+
+ Performance here is decent. It was #5 on the leaderboard among 70b models when I submitted it. This leaderboard is currently quite useless, though: some braindead 7b meme merges have high scores there and claim to be the next GPT4. At least I don't pretend that my models aren't a meme.
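
The Average and Difference figures in the leaderboard table can be checked with a few lines of Python. This is a minimal sketch and not part of the commit: the scores are copied from the table, while the function names and the two assumptions (that Average is the plain mean of the six benchmark scores, and that Difference is PMaxxxer minus SMaxxxer) are inferred from the numbers rather than stated in the README.

```python
# Scores copied from the leaderboard table. Column order:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K.
BENCHMARKS = ["ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]
SCORES = {
    "PMaxxxer-v1-70b": [71.08, 87.88, 70.39, 59.77, 82.64, 62.7],
    "SMaxxxer-v1-70b": [70.65, 88.02, 70.55, 60.7, 82.87, 60.58],
}


def average(scores):
    """Plain mean of the benchmark scores, rounded to two decimals (assumed)."""
    return round(sum(scores) / len(scores), 2)


def difference(a, b):
    """Element-wise a - b, rounded to two decimals."""
    return [round(x - y, 2) for x, y in zip(a, b)]


p_scores = SCORES["PMaxxxer-v1-70b"]
s_scores = SCORES["SMaxxxer-v1-70b"]

p_avg = average(p_scores)
s_avg = average(s_scores)
per_benchmark = difference(p_scores, s_scores)

print("Average:", p_avg, s_avg, "difference:", round(p_avg - s_avg, 2))
for name, delta in zip(BENCHMARKS, per_benchmark):
    print(f"{name}: {delta:+.2f}")
```

Under those assumptions the recomputed values agree with the table: the two models are within a fraction of a point on every benchmark except GSM8K, where PMaxxxer leads by about two points.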