sarath-shekkizhar committed
Commit 0ff1c54
Parent(s): 7307326

Update README.md

Files changed (1):
  1. README.md +11 -0
README.md CHANGED
@@ -96,6 +96,17 @@ MT-Bench is a benchmark made up of 80 high-quality multi-turn questions. These q
96    
97    ![hexplot.png](assets/hexplot.png)
98    
99  + ### Comparison with additional Open LLM LeaderBoard models
100 + | Model | First Turn | Second Turn | Average |
101 + | --- | --- | --- | --- |
102 + | TenyxChat-7B-v1 | 8.45000 | 7.756250 | 8.103125 |
103 + | SamirGPT-v1 | 8.05000 | 7.612500 | 7.831250 |
104 + | FernandoGPT-v1 | 8.08125 | 7.256250 | 7.668750 |
105 + | Go-Bruins-v2 | 8.13750 | 7.150000 | 7.643750 |
106 + | mistral_tv-neural-marconroni | 7.76875 | 6.987500 | 7.378125 |
107 + | neuronovo-7B-v0.2 | 7.73750 | 6.662500 | 7.200000 |
108 + | neural-chat-7b-v3-3 | 7.39375 | 5.881250 | 6.637500 |
109 +
110   ## LM Evaluation - Open LLM Leaderboard
111   
112   We assess models on 7 benchmarks using the [Eleuther AI Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness). This setup is based of that used for [Open LLM Leaderboard.](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
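
A note on the table added in this commit: the Average column is the per-model mean of the First Turn and Second Turn MT-Bench scores. A minimal sketch (editorial, not part of the commit) checking two rows against the values above:

```python
# Editorial check: "Average" = mean of the two turn scores in the new table.
scores = {
    "TenyxChat-7B-v1": (8.45000, 7.756250),
    "neural-chat-7b-v3-3": (7.39375, 5.881250),
}
for model, (first_turn, second_turn) in scores.items():
    # Prints 8.103125 and 6.6375, matching the Average column above.
    print(model, (first_turn + second_turn) / 2)
```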
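
Since the README points to the Eleuther AI Language Model Evaluation Harness for the leaderboard-style scores, here is a minimal sketch of launching such an evaluation via the harness's Python API (v0.4+). The model id, task choice, few-shot count, and batch size are illustrative assumptions, not settings taken from this commit:

```python
# Hedged sketch of an lm-evaluation-harness run; not the authors' exact configuration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                     # Hugging Face causal-LM backend
    model_args="pretrained=tenyx/TenyxChat-7B-v1",  # assumed model id (illustrative)
    tasks=["arc_challenge"],                        # one leaderboard-style task as an example
    num_fewshot=25,                                 # ARC is run 25-shot on the Open LLM Leaderboard
    batch_size=8,
)
print(results["results"])                           # per-task metrics dictionary
```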