giraffe176 committed on
Commit
400f634
1 Parent(s): d2f7754

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -196,6 +196,8 @@ dtype: bfloat16
 
 
 ### Table of Benchmarks
+
+## Open LLM Leaderboard
 | | MT-Bench | EQ-Bench v2.1 | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
 |---------------------------------------------------------|---------------------------------------------|---------------------------------------------------------------------------------|---------|-------|-----------|-------|------------|------------|-------|
 | giraffe176/WestMaid_HermesMonarchv0.1 | 8.021875 | 77.19 (3 Shot, ooba) | 72.62 | 70.22 | 87.42 | 64.31 | 61.99 | 82.16 | 69.6 |
@@ -205,4 +207,11 @@ dtype: bfloat16
 | NeverSleep/Noromaid-7B-0.4-DPO | | | 59.08 | 62.29 | 84.32 | 63.2 | 42.28 | 76.95 | 25.47 |
 | claude-v1 | 7.900000 | 76.83 | | | | | | | |
 | gpt-3.5-turbo | 7.943750 | 71.74 | | | | | | | |
-| | [(Paper)](https://arxiv.org/abs/2306.05685) | [(Paper)](https://arxiv.org/abs/2312.06281) [Leaderboard](https://eqbench.com/) | | | | | | | |
+| | [(Paper)](https://arxiv.org/abs/2306.05685) | [(Paper)](https://arxiv.org/abs/2312.06281) [Leaderboard](https://eqbench.com/) | | | | | | | |
+
+## Yet Another LLM Leaderboard benchmarks
+
+| Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
+|------------------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
+|[WestMaid_HermesMonarchv0.1](https://huggingface.co/giraffe176/WestMaid_HermesMonarchv0.1)| 45.34| 76.33| 61.99| 46.02| 57.42|
+
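The `Average` columns in both tables can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming each average is the unweighted mean of the benchmark columns in its row, rounded to two decimals; the diff does not state how the averages were computed. The helper name `mean_score` is hypothetical, and the scores are copied from the WestMaid_HermesMonarchv0.1 rows above.

```python
# Scores copied from the WestMaid_HermesMonarchv0.1 rows in the two tables.
open_llm = {
    "ARC": 70.22,
    "HellaSwag": 87.42,
    "MMLU": 64.31,
    "TruthfulQA": 61.99,
    "Winogrande": 82.16,
    "GSM8K": 69.6,
}
yall = {
    "AGIEval": 45.34,
    "GPT4All": 76.33,
    "TruthfulQA": 61.99,
    "Bigbench": 46.02,
}


def mean_score(scores):
    """Unweighted mean rounded to two decimals (an assumed convention)."""
    return round(sum(scores.values()) / len(scores), 2)


# Compare the recomputed means with the Average values reported in the tables.
print("Open LLM Leaderboard:", mean_score(open_llm), "(reported 72.62)")
print("Yet Another LLM Leaderboard:", mean_score(yall), "(reported 57.42)")
```

Under that assumption, both recomputed means match the reported values.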