theblackcat102/electra-large-reward-model

theblackcat102 committed on Jan 1, 2023

Commit 7d67d6f • 1 Parent(s): c9b42a8

Update README.md

Files changed (1)

README.md +8 -0

@@ -1,3 +1,11 @@
 ---
 license: mit
 ---
+Reward model pretrained on openai/webgpt_comparison and human-feedback summary data. Unlike the other electra-large model, this model is trained with a rank loss and one additional dataset.
+On the validation dataset, the results are much more stable than usual.
+See this [wandb run](https://wandb.ai/theblackcat102/reward-model/runs/1d4e4oi2?workspace=) for more details.
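
The README mentions a rank loss but does not give its form. As a rough illustration only, the sketch below assumes the common pairwise formulation, where the loss for each comparison is `-log(sigmoid(score_chosen - score_rejected))`. The function names `softplus` and `pairwise_rank_loss` are hypothetical and are not taken from this repository; the actual training code may differ.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_rank_loss(chosen_scores, rejected_scores) -> float:
    """Mean pairwise rank loss over (chosen, rejected) score pairs.

    Assumed formulation (not confirmed by the README):
        loss = -log(sigmoid(chosen - rejected)) = softplus(rejected - chosen)
    The loss shrinks as the preferred answer is scored above the other one.
    """
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("score lists must have the same length")
    if not chosen_scores:
        raise ValueError("at least one pair is required")
    losses = [softplus(r - c) for c, r in zip(chosen_scores, rejected_scores)]
    return sum(losses) / len(losses)


# Hypothetical reward scores for two comparisons.
well_ranked = pairwise_rank_loss([2.0, 1.5], [0.5, -1.0])
badly_ranked = pairwise_rank_loss([0.5, -1.0], [2.0, 1.5])
# Equal scores give log(2), the loss of an uninformative model.
tied = pairwise_rank_loss([0.0], [0.0])
```

Under this assumption only the difference between the two scores matters, so the reward model's outputs are meaningful for comparing answers to the same prompt rather than as absolute values.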