RuiyangSun committed on
Commit
5ea6c15
1 Parent(s): 8def050

docs: update readme

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -34,6 +34,7 @@ It can play a role in the safe RLHF algorithm, helping the Beaver model become m
 - **Beaver:** <https://huggingface.co/PKU-Alignment/beaver-7b-v1.0>
 - **Dataset:** <https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF>
 - **Reward Model:** <https://huggingface.co/PKU-Alignment/beaver-7b-v1.0-reward>
+- **Cost Model:** <https://huggingface.co/PKU-Alignment/beaver-7b-v1.0-cost>
 - **Paper:** *Coming soon...*
 
 ## How to Use the Reward Model