Update README.md

README.md +7 -9

@@ -198,12 +198,10 @@ Scores 65.56 on [EQ-Bench v2](https://arxiv.org/abs/2312.06281)
 ### [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_giraffe176__WestLake_Noromaid_OpenHermes_neural-chatv0.1)
-|             Metric              |Value|
-|---------------------------------|----:|
-|Avg.                             |68.86|
-|AI2 Reasoning Challenge (25-Shot)|66.72|
-|HellaSwag (10-Shot)              |85.37|
-|MMLU (5-Shot)                    |64.67|
-|TruthfulQA (0-shot)              |51.50|
-|Winogrande (5-shot)              |79.72|
-|GSM8k (5-shot)                   |65.20|

+|                                           | Avg.  | AI2 (25-Shot) | HellaSwag (10-Shot) | MMLU (5-Shot) | TruthfulQA (0-shot) | Winogrande (5-shot) | GSM8k (5-shot) |
+|:-----------------------------------------:|-------|-----------------------------------|---------------------|---------------|---------------------|---------------------|----------------|
+| This model                                | 68.86 | 66.72                             | 85.37               | 64.67         | 51.50               | 79.72               | 65.20          |
+| cognitivecomputations/WestLake-7B-v2-laser| **74.78** | **73.29**                         | **88.66**               | **64.72**         | **67.04**               | **86.74**               | **68.23**          |
+| NeverSleep/Noromaid-7B-0.4-DPO            | 59.08 | 62.29                             | 84.32               | 63.2          | 42.28               | 76.95               | 25.47          |
+| teknium/OpenHermes-2.5-Mistral-7B         | 61.52 | 64.93                             | 84.18               | 63.64         | 52.24               | 78.06               | 26.08          |
+| Intel/neural-chat-7b-v3-3                 | 69.83 | 66.89                             | 85.26               | 63.07         | 63.01               | 79.64               | 61.11          |
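
The `Avg.` column appears to be the unweighted mean of the six benchmark scores. That is an assumption, not something the README states, but the "This model" row is consistent with it. A minimal sketch to check it, using the values copied from that row (the dictionary and variable names are illustrative only):

```python
# Scores for the "This model" row, copied from the table in the diff above.
scores = {
    "AI2 (25-Shot)": 66.72,
    "HellaSwag (10-Shot)": 85.37,
    "MMLU (5-Shot)": 64.67,
    "TruthfulQA (0-shot)": 51.50,
    "Winogrande (5-shot)": 79.72,
    "GSM8k (5-shot)": 65.20,
}

# Assumption: Avg. is the plain arithmetic mean of the six benchmarks.
avg = sum(scores.values()) / len(scores)

print(f"{avg:.2f}")  # prints 68.86, matching the Avg. cell
```

The same check can be repeated for the other rows by swapping in their values.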