Hugging Face
Models
Datasets
Spaces
Posts
Docs
Solutions
Pricing
Log In
Sign Up
RLHFlow
's Collections
Standard-format-preference-dataset
Mixture-of-preference-reward-modeling
RM-Bradley-Terry
PM-pair
Online RLHF
Standard-format-preference-dataset
updated
11 days ago
We collect the open-source datasets and process them into the standard format.
Upvote
6
RLHFlow/UltraFeedback-preference-standard
Viewer
•
Updated
21 days ago
•
90
•
1
RLHFlow/Helpsteer-preference-standard
Viewer
•
Updated
21 days ago
•
4
RLHFlow/HH-RLHF-Helpful-standard
Viewer
•
Updated
21 days ago
•
33
RLHFlow/Orca-distibalel-standard
Viewer
•
Updated
21 days ago
•
18
RLHFlow/Capybara-distibalel-Filter-standard
Viewer
•
Updated
21 days ago
•
6
RLHFlow/CodeUltraFeedback-standard
Viewer
•
Updated
21 days ago
•
8
RLHFlow/UltraInteract-filtered-standard
Viewer
•
Updated
21 days ago
•
10
RLHFlow/PKU-SafeRLHF-30K-standard
Viewer
•
Updated
19 days ago
•
7
RLHFlow/Argilla-Math-DPO-standard
Viewer
•
Updated
18 days ago
•
17
RLHFlow/Prometheus2-preference-standard
Viewer
•
Updated
14 days ago
•
1
•
1
RLHFlow/SHP-standard
Viewer
•
Updated
10 days ago
•
5
RLHFlow/HH-RLHF-Harmless-and-RedTeam-standard
Viewer
•
Updated
11 days ago
•
31
Upvote
6
+2
Share collection
View history
Collection guide
Browse collections