chrisliu298
commited on
Commit
•
88e9c1d
1
Parent(s):
d73d05f
Update README.md
Browse files
README.md
CHANGED
@@ -51,13 +51,13 @@ We evaluate our model on [RewardBench](https://huggingface.co/spaces/allenai/rew
|
|
51 |
|
52 |
| Rank | Model | Model Type | Score | Chat | Chat Hard | Safety | Reasoning |
|
53 |
| :---: | -------------------------------------------- | ----------------- | :---: | :---: | :-------: | :----: | :-------: |
|
54 |
-
| 1 | **Skywork/Skywork-Reward-Gemma-2-27B-v0.2** | Seq. Classifier | 94.
|
55 |
| 2 | nvidia/Llama-3.1-Nemotron-70B-Reward | Custom Classifier | 94.1 | 97.5 | 85.7 | 95.1 | 98.1 |
|
56 |
| 3 | Skywork/Skywork-Reward-Gemma-2-27B | Seq. Classifier | 93.8 | 95.8 | 91.4 | 91.9 | 96.1 |
|
57 |
| 4 | SF-Foundation/TextEval-Llama3.1-70B | Generative | 93.5 | 94.1 | 90.1 | 93.2 | 96.4 |
|
58 |
| 5 | meta-metrics/MetaMetrics-RM-v1.0 | Custom Classifier | 93.4 | 98.3 | 86.4 | 90.8 | 98.2 |
|
59 |
| 6 | Skywork/Skywork-Critic-Llama-3.1-70B | Generative | 93.3 | 96.6 | 87.9 | 93.1 | 95.5 |
|
60 |
-
| 7 | **Skywork/Skywork-Reward-Llama-3.1-8B-v0.2** | Seq. Classifier | 93.
|
61 |
| 8 | nicolinho/QRM-Llama3.1-8B | Seq. Classifier | 93.1 | 94.4 | 89.7 | 92.3 | 95.8 |
|
62 |
| 9 | LxzGordon/URM-LLaMa-3.1-8B | Seq. Classifier | 92.9 | 95.5 | 88.2 | 91.1 | 97.0 |
|
63 |
| 10 | Salesforce/SFR-LLaMa-3.1-70B-Judge-r | Generative | 92.7 | 96.9 | 84.8 | 91.6 | 97.6 |
|
|
|
51 |
|
52 |
| Rank | Model | Model Type | Score | Chat | Chat Hard | Safety | Reasoning |
|
53 |
| :---: | -------------------------------------------- | ----------------- | :---: | :---: | :-------: | :----: | :-------: |
|
54 |
+
| 1 | **Skywork/Skywork-Reward-Gemma-2-27B-v0.2** | Seq. Classifier | 94.3 | 96.1 | 89.9 | 93.0 | 98.1 |
|
55 |
| 2 | nvidia/Llama-3.1-Nemotron-70B-Reward | Custom Classifier | 94.1 | 97.5 | 85.7 | 95.1 | 98.1 |
|
56 |
| 3 | Skywork/Skywork-Reward-Gemma-2-27B | Seq. Classifier | 93.8 | 95.8 | 91.4 | 91.9 | 96.1 |
|
57 |
| 4 | SF-Foundation/TextEval-Llama3.1-70B | Generative | 93.5 | 94.1 | 90.1 | 93.2 | 96.4 |
|
58 |
| 5 | meta-metrics/MetaMetrics-RM-v1.0 | Custom Classifier | 93.4 | 98.3 | 86.4 | 90.8 | 98.2 |
|
59 |
| 6 | Skywork/Skywork-Critic-Llama-3.1-70B | Generative | 93.3 | 96.6 | 87.9 | 93.1 | 95.5 |
|
60 |
+
| 7 | **Skywork/Skywork-Reward-Llama-3.1-8B-v0.2** | Seq. Classifier | 93.1 | 94.7 | 88.4 | 92.7 | 96.7 |
|
61 |
| 8 | nicolinho/QRM-Llama3.1-8B | Seq. Classifier | 93.1 | 94.4 | 89.7 | 92.3 | 95.8 |
|
62 |
| 9 | LxzGordon/URM-LLaMa-3.1-8B | Seq. Classifier | 92.9 | 95.5 | 88.2 | 91.1 | 97.0 |
|
63 |
| 10 | Salesforce/SFR-LLaMa-3.1-70B-Judge-r | Generative | 92.7 | 96.9 | 84.8 | 91.6 | 97.6 |
|