Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # bt-rm
 
-This model was trained from
+This model was trained from LLaMA 3.1 8B Instruct with dataset `hendrydong/preference_700K` (Preprocessed dataset `RyanYr/preference_700K_llama31_tokenized`). Training script is https://github.com/yurun-yuan/RLHF-Reward-Modeling/blob/4b827117dc9a85062c396eb62200b48e6dbfd596/bradley-terry-rm/llama3_rm.py
 
 ## Model description
 
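The model name `bt-rm` and the script path `bradley-terry-rm/llama3_rm.py` suggest a Bradley-Terry reward model trained on preference pairs. As a rough illustration of that kind of objective, the sketch below computes the pairwise Bradley-Terry loss for a single chosen/rejected pair of scalar rewards. It is not taken from the linked training script; the function name `bradley_terry_loss` and its parameters are hypothetical and exist only for this example.

```python
import math


def bradley_terry_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Negative log-likelihood that the chosen response is preferred.

    Under the Bradley-Terry model, P(chosen > rejected) = sigmoid(margin),
    where margin = chosen_reward - rejected_reward.
    (Hypothetical helper for illustration, not the linked script's code.)
    """
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)).
    # Branch on the sign so that exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # The loss shrinks as the chosen response is scored further above the rejected one.
    for chosen, rejected in [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]:
        print(chosen, rejected, bradley_terry_loss(chosen, rejected))
```

In a real training loop the rewards would be the scalar outputs of the reward model for the two responses in each preference pair, and the loss would typically be averaged over a batch.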