chrisliu298 committed on
Commit
88e9c1d
1 Parent(s): d73d05f

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -51,13 +51,13 @@ We evaluate our model on [RewardBench](https://huggingface.co/spaces/allenai/rew
 
  | Rank | Model | Model Type | Score | Chat | Chat Hard | Safety | Reasoning |
  | :---: | -------------------------------------------- | ----------------- | :---: | :---: | :-------: | :----: | :-------: |
- | 1 | **Skywork/Skywork-Reward-Gemma-2-27B-v0.2** | Seq. Classifier | 94.2 | 96.1 | 89.7 | 93.0 | 98.1 |
+ | 1 | **Skywork/Skywork-Reward-Gemma-2-27B-v0.2** | Seq. Classifier | 94.3 | 96.1 | 89.9 | 93.0 | 98.1 |
  | 2 | nvidia/Llama-3.1-Nemotron-70B-Reward | Custom Classifier | 94.1 | 97.5 | 85.7 | 95.1 | 98.1 |
  | 3 | Skywork/Skywork-Reward-Gemma-2-27B | Seq. Classifier | 93.8 | 95.8 | 91.4 | 91.9 | 96.1 |
  | 4 | SF-Foundation/TextEval-Llama3.1-70B | Generative | 93.5 | 94.1 | 90.1 | 93.2 | 96.4 |
  | 5 | meta-metrics/MetaMetrics-RM-v1.0 | Custom Classifier | 93.4 | 98.3 | 86.4 | 90.8 | 98.2 |
  | 6 | Skywork/Skywork-Critic-Llama-3.1-70B | Generative | 93.3 | 96.6 | 87.9 | 93.1 | 95.5 |
- | 7 | **Skywork/Skywork-Reward-Llama-3.1-8B-v0.2** | Seq. Classifier | 93.2 | 94.7 | 88.8 | 92.6 | 96.7 |
+ | 7 | **Skywork/Skywork-Reward-Llama-3.1-8B-v0.2** | Seq. Classifier | 93.1 | 94.7 | 88.4 | 92.7 | 96.7 |
  | 8 | nicolinho/QRM-Llama3.1-8B | Seq. Classifier | 93.1 | 94.4 | 89.7 | 92.3 | 95.8 |
  | 9 | LxzGordon/URM-LLaMa-3.1-8B | Seq. Classifier | 92.9 | 95.5 | 88.2 | 91.1 | 97.0 |
  | 10 | Salesforce/SFR-LLaMa-3.1-70B-Judge-r | Generative | 92.7 | 96.9 | 84.8 | 91.6 | 97.6 |