weqweasdas/RM-Gemma-2B

weqweasdas committed on Mar 22, 2024

Commit 7746149 · verified · 1 Parent(s): acaa838

Update README.md

Files changed (1)

README.md +2 -0

@@ -10,6 +10,8 @@
 The reward model is trained from the base model [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it).  See the 7B version [RM-Gemma-7B](https://huggingface.co/weqweasdas/RM-Gemma-7B).
+The training script is available at https://github.com/WeiXiongUST/RLHF-Reward-Modeling .
 ## Model Details
 If you have any question with this reward model and also any question about reward modeling, feel free to drop me an email with wx13@illinois.edu. I would be happy to chat!