The reference model, obtained by supervised fine-tuning on the chosen responses.
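As a rough illustration of what fine-tuning on the chosen responses can look like, the sketch below turns preference records into supervised examples by keeping only the preferred completion. The field names `prompt`, `chosen`, and `rejected`, the helper `build_sft_examples`, and the sample row are assumptions made for illustration; this card does not state the dataset schema or the training code actually used.

```python
from typing import Dict, List


def build_sft_examples(preference_rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the chosen response of each preference pair.

    Assumes each row has "prompt", "chosen", and "rejected" keys
    (a hypothetical schema, not confirmed by the model card).
    """
    sft_rows = []
    for row in preference_rows:
        # The rejected response is dropped: SFT only sees the preferred answer.
        sft_rows.append({"prompt": row["prompt"], "completion": row["chosen"]})
    return sft_rows


# Hypothetical preference record, for illustration only.
example_rows = [
    {
        "prompt": "What is 2 + 2?",
        "chosen": "2 + 2 equals 4.",
        "rejected": "2 + 2 equals 5.",
    }
]

sft_examples = build_sft_examples(example_rows)
print(sft_examples)
```

A model fine-tuned on examples like these could then be used as the frozen reference policy in a later preference-optimization stage.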


Dataset used to train honggen/hard_dpo