weqweasdas committed: Update README.md (commit 06bd94a, parent e3c1d3f)

README.md CHANGED
@@ -8,7 +8,7 @@
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-Thanks for your intersts in this reward model! We recommed you to use [weqweasdas/RM-Gemma-2B](https://huggingface.co/weqweasdas/RM-Gemma-2B) instead
+**Thanks for your interest in this reward model! We recommend using [weqweasdas/RM-Gemma-2B](https://huggingface.co/weqweasdas/RM-Gemma-2B) instead.**
 
 In this repo, we present a reward model trained by the framework [LMFlow](https://github.com/OptimalScale/LMFlow). The reward model is for the [HH-RLHF dataset](Dahoas/full-hh-rlhf) (helpful part only), and is trained from the base model [openlm-research/open_llama_3b](https://huggingface.co/openlm-research/open_llama_3b).
 
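Since the README describes a reward model trained on HH-RLHF preference pairs, a minimal sketch of the pairwise (Bradley–Terry style) loss that such frameworks typically optimize may be helpful. This is an illustration only, not LMFlow's actual code: the function name and the toy reward values are made up, and the scalars stand in for the rewards the model would assign to the chosen and rejected responses of each pair.

```python
import math

def pairwise_reward_loss(chosen_rewards, rejected_rewards):
    """Mean of -log(sigmoid(r_chosen - r_rejected)) over a batch of preference pairs.

    Using the identity -log(sigmoid(x)) = log(1 + exp(-x)), computed with
    log1p for numerical stability. Lower loss means the model more strongly
    prefers the chosen response over the rejected one.
    """
    assert len(chosen_rewards) == len(rejected_rewards)
    return sum(
        math.log1p(math.exp(-(c - r)))
        for c, r in zip(chosen_rewards, rejected_rewards)
    ) / len(chosen_rewards)

# Toy example: the model scores each chosen response above its rejected pair.
chosen = [1.5, 0.2]
rejected = [0.5, -0.3]
loss = pairwise_reward_loss(chosen, rejected)
```

Minimizing this loss pushes the scalar reward of the preferred ("chosen") response above that of the "rejected" one, which is why the resulting model can be used to rank or score candidate completions.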