Update README.md
README.md CHANGED
@@ -17,11 +17,6 @@ We evaluated this reward model on reward-bench (https://huggingface.co/spaces/allenai/reward-bench)
 
 Check our GRM series at 🤗[hugging face](https://huggingface.co/collections/Ray2333/grm-66882bdf7152951779506c7b), our paper at [Arxiv](https://arxiv.org/abs/2406.10216), and github repo at [Github](https://github.com/YangRui2015/Generalizable-Reward-Model).
 
-
-
-## Evaluation
-We evaluate GRM-Llama3.2-3B-rewardmodel-ft on the [reward model benchmark](https://huggingface.co/spaces/allenai/reward-bench), where it achieved strong performance among models smaller than 7B.
-
 **When evaluated using reward bench, please add '--not_quantized' to avoid performance drop.**
 
 | Model | Average | Chat | Chat Hard | Safety | Reasoning |
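For context on the '--not_quantized' note above, here is a minimal, hypothetical sketch of loading the reward model in unquantized (bf16) precision and scoring a single response with the 🤗 transformers API. The repo id `Ray2333/GRM-Llama3.2-3B-rewardmodel-ft` is inferred from the collection link above, and the example messages are illustrative, not taken from the model card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repo id inferred from the GRM collection linked above (assumption).
model_name = "Ray2333/GRM-Llama3.2-3B-rewardmodel-ft"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load unquantized (bf16) weights, in the spirit of the '--not_quantized'
# recommendation for reward-bench evaluation.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
).to(device)
model.eval()

# Illustrative prompt/response pair; the reward head returns a scalar score.
messages = [
    {"role": "user", "content": "Explain what a reward model does."},
    {"role": "assistant", "content": "It assigns a scalar score to a response so better answers rank higher."},
]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, return_tensors="pt"
).to(device)

with torch.no_grad():
    score = model(input_ids).logits[0][0].item()
print(f"Reward score: {score:.4f}")
```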