bjoernp committed
Commit fee642c
1 Parent(s): 3177614

Update README.md

Files changed (1): README.md (+6 -2)
README.md CHANGED
@@ -60,7 +60,7 @@ The model was trained with compute provided by [HessianAI](https://hessian.ai/)
  ### Hugginface Leaderboard

  This models is still an early Alpha and we can't guarantee that there isn't any contamination.
- However, the average of **71.24** would earn the #2 spot on the HF leaderboard at the time of writing and the highest score for a >70b model yet.
+ However, the average of **71.24** would earn the #2 spot on the HF leaderboard at the time of writing.

  | Metric | Value |
  |-----------------------|-------|
@@ -84,6 +84,9 @@ We use [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-eval
  | MMLU | 64.7 |
  | **Avg.** | **48.87** |

+ Screenshot of the current (sadly no longer maintained) FastEval CoT leaderboard:
+ ![FastEval Leaderboard](imgs/cot_leaderboard.png)
+
  ### MTBench

  ```json
@@ -103,7 +106,8 @@ We use [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-eval
  "average": 7.48125
  }
  ```
-
+ Screenshot of the current FastEval MT Bench leaderboard:
+ ![FastEval Leaderboard](imgs/mtbench_leaderboard.png)
  ## Prompt Format

  This model follows the ChatML format:
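
For reference, a minimal sketch of how a ChatML-style prompt can be assembled for this model; the placeholder system and user messages below are assumptions for illustration, not taken from the README:

```python
# Illustrative sketch of a ChatML-formatted prompt: system/user/assistant turns
# delimited by <|im_start|> and <|im_end|>. Message contents are placeholders.
system_msg = "You are a helpful assistant."
user_msg = "Hello, who are you?"

prompt = (
    f"<|im_start|>system\n{system_msg}<|im_end|>\n"
    f"<|im_start|>user\n{user_msg}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)
print(prompt)
```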