Standard-format-preference-dataset - a RLHFlow Collection

RLHFlow 's Collections

RLHFlow MATH Process Reward Model

Standard-format-preference-dataset

Mixture-of-preference-reward-modeling

RM-Bradley-Terry

PM-pair

RLHFLow Reward Models

Standard-format-preference-dataset

updated May 8

We collect the open-source datasets and process them into the standard format.