Update README.md
README.md CHANGED
@@ -17,11 +17,6 @@ We evaluated this reward model on reward-bench (https://huggingface.co/spaces/allenai/reward-bench)
 
 Check our GRM series at 🤗[hugging face](https://huggingface.co/collections/Ray2333/grm-66882bdf7152951779506c7b), our paper at [Arxiv](https://arxiv.org/abs/2406.10216), and github repo at [Github](https://github.com/YangRui2015/Generalizable-Reward-Model).
 
-
-
-## Evaluation
-We evaluate GRM-Llama3.2-3B-rewardmodel-ft on the [reward model benchmark](https://huggingface.co/spaces/allenai/reward-bench), where it achieved strong performance among models smaller than 7B.
-
 **When evaluated using reward bench, please add '--not_quantized' to avoid performance drop.**
 
 | Model | Average | Chat | Chat Hard | Safety | Reasoning |
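For context on the '--not_quantized' note above, here is a minimal, hypothetical sketch of loading the reward model in unquantized (bf16) precision and scoring a single response with the 🤗 transformers API. The repo id `Ray2333/GRM-Llama3.2-3B-rewardmodel-ft` is inferred from the collection link above, and the example messages are illustrative, not taken from the model card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repo id inferred from the GRM collection linked above (assumption).
model_name = "Ray2333/GRM-Llama3.2-3B-rewardmodel-ft"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load unquantized (bf16) weights, in the spirit of the '--not_quantized'
# recommendation for reward-bench evaluation.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, torch_dtype=torch.bfloat16
).to(device)
model.eval()

# Illustrative prompt/response pair; the reward head returns a scalar score.
messages = [
    {"role": "user", "content": "Explain what a reward model does."},
    {"role": "assistant", "content": "It assigns a scalar score to a response so better answers rank higher."},
]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, return_tensors="pt"
).to(device)

with torch.no_grad():
    score = model(input_ids).logits[0][0].item()
print(f"Reward score: {score:.4f}")
```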