weqweasdas/RM-Mistral-7B


weqweasdas committed on Mar 23

Commit 47d0c20 • 1 Parent(s): 5519e53

Update README.md

Files changed (1)

README.md +1 -2

@@ -18,8 +18,7 @@ If you have any question with this reward model and also any question about rewa
 <!-- Provide a longer summary of what this model is. -->
-The model is trained on a mixture of the dataset similar to [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it).
+The model is trained on a mixture of the following datasets. We also provide the mixture in [weqweasdas/preference_dataset_mixture2_and_safe_pku](weqweasdas/preference_dataset_mixture2_and_safe_pku).
 - [HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf)
 - [SHP](https://huggingface.co/datasets/stanfordnlp/SHP)
 - [UltraFeedback](https://huggingface.co/datasets/openbmb/UltraFeedback)